WorldWideScience

Sample records for hierarchical multivariate regression

  1. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu

    2014-06-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.
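
    The L1 penalties mentioned above are what drive individual matrix entries exactly to zero. As a minimal sketch of that mechanism only (not the authors' ECM procedure), the soft-thresholding operator below is the proximal operator of the L1 penalty; the name `soft_threshold`, the parameter `lam`, and the toy coefficient matrix are illustrative assumptions.

```python
import numpy as np

def soft_threshold(a, lam):
    """Proximal operator of lam * |x|: shrink every entry toward zero by lam,
    and set entries whose magnitude is below lam exactly to zero."""
    a = np.asarray(a, dtype=float)
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

# Hypothetical 2x2 coefficient matrix; small entries are zeroed out.
coef = np.array([[2.5, -0.3],
                 [0.1, -1.2]])
sparse_coef = soft_threshold(coef, lam=0.5)
```

    Larger values of `lam` zero out more entries, which is how the penalty weight trades fit against sparsity.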

  2. Multivariate sparse group lasso for the multivariate multiple linear regression with an arbitrary group structure.

    Science.gov (United States)

    Li, Yanming; Nan, Bin; Zhu, Ji

    2015-06-01

    We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functional groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study. © 2015, The International Biometric Society.

  3. Multivariate and semiparametric kernel regression

    OpenAIRE

    Härdle, Wolfgang; Müller, Marlene

    1997-01-01

    The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
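
    For orientation, the Nadaraya-Watson estimator named above is a kernel-weighted average of the observed responses (the local constant case of local polynomial fitting). The sketch below assumes a Gaussian product kernel with a single shared bandwidth; the function name and the toy data are illustrative, not taken from the paper.

```python
import numpy as np

def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Nadaraya-Watson regression estimate at each row of x_query.

    x_train: (n, d) covariates, y_train: (n,) responses,
    x_query: (m, d) evaluation points, bandwidth: positive scalar.
    """
    diff = x_query[:, None, :] - x_train[None, :, :]   # (m, n, d) differences
    sq_dist = (diff ** 2).sum(axis=-1)                 # (m, n) squared distances
    weights = np.exp(-0.5 * sq_dist / bandwidth ** 2)  # Gaussian kernel weights
    return (weights @ y_train) / weights.sum(axis=1)

# Toy data: the query point is equidistant from both observations,
# so the estimate is the plain average of the two responses.
x_train = np.array([[0.0, 0.0], [1.0, 1.0]])
y_train = np.array([0.0, 2.0])
pred = nadaraya_watson(x_train, y_train, np.array([[0.5, 0.5]]), bandwidth=0.7)
```

    The bandwidth controls the smoothness of the fit, which is why bandwidth selection gets separate attention in the abstract.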

  4. Hierarchical regression analysis in structural Equation Modeling

    NARCIS (Netherlands)

    de Jong, P.F.

    1999-01-01

    In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main

  5. Retro-regression--another important multivariate regression improvement.

    Science.gov (United States)

    Randić, M

    2001-01-01

    We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.

  6. AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION

    OpenAIRE

    Krzyśko, Mirosław; Smaga, Łukasz

    2017-01-01

    In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...

  7. Hierarchical Decompositions for the Computation of High-Dimensional Multivariate Normal Probabilities

    KAUST Repository

    Genton, Marc G.

    2017-09-07

    We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from O(n^2) to O(mn + kn log(n/m)), where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.

  8. Hierarchical Decompositions for the Computation of High-Dimensional Multivariate Normal Probabilities

    KAUST Repository

    Genton, Marc G.; Keyes, David E.; Turkiyyah, George

    2017-01-01

    We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from O(n^2) to O(mn + kn log(n/m)), where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.

  9. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain competitive. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and the Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data into churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create the Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model performs significantly better than the two other hierarchical models.

  10. Multivariate Regression Analysis and Slaughter Livestock,

    Science.gov (United States)

    AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY

  11. Regression Models For Multivariate Count Data.

    Science.gov (United States)

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  12. Bayesian Inference of a Multivariate Regression Model

    Directory of Open Access Journals (Sweden)

    Marick S. Sinay

    2014-01-01

    We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
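
    The matrix-logarithm parameterization described above can be illustrated with a short sketch. It assumes the covariance matrix is symmetric positive definite, so its logarithm can be computed from an eigendecomposition; the helper names `spd_log` and `spd_exp` and the example matrix are hypothetical, and the prior and sampler from the paper are not reproduced here.

```python
import numpy as np

def spd_log(cov):
    """Matrix logarithm of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def spd_exp(sym):
    """Matrix exponential of a symmetric matrix (always positive definite)."""
    eigvals, eigvecs = np.linalg.eigh(sym)
    return (eigvecs * np.exp(eigvals)) @ eigvecs.T

cov = np.array([[2.0, 0.6],
                [0.6, 1.0]])
log_cov = spd_log(cov)

# Unique (upper-triangular) elements: an unconstrained real vector on which
# a multivariate normal prior could be placed.
theta = log_cov[np.triu_indices(2)]
recovered = spd_exp(log_cov)
```

    Because any real symmetric matrix exponentiates to a valid covariance matrix, the elements of `theta` need no positivity or definiteness constraints.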

  13. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi

    2014-01-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both

  14. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks.

    Science.gov (United States)

    Feng, Liang; Yuan, Shuai; Zhang, Liang-Liang; Tan, Kui; Li, Jia-Luo; Kirchon, Angelo; Liu, Ling-Mei; Zhang, Peng; Han, Yu; Chabal, Yves J; Zhou, Hong-Cai

    2018-02-14

    Sufficient pore size, appropriate stability, and hierarchical porosity are three prerequisites for open frameworks designed for drug delivery, enzyme immobilization, and catalysis involving large molecules. Herein, we report a powerful and general strategy, linker thermolysis, to construct ultrastable hierarchically porous metal-organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores by generating crystal defects throughout a microporous MOF crystal via thermolysis. The crystallinity and stability of HP-MOFs remain after thermolabile linkers are selectively removed from multivariate metal-organic frameworks (MTV-MOFs) through a decarboxylation process. A domain-based linker spatial distribution was found to be critical for creating hierarchical pores inside MTV-MOFs. Furthermore, linker thermolysis promotes the formation of ultrasmall metal oxide nanoparticles immobilized in an open framework that exhibits high catalytic activity for Lewis acid-catalyzed reactions. Most importantly, this work provides fresh insights into the connection between linker apportionment and vacancy distribution, which may shed light on probing the disordered linker apportionment in multivariate systems, a long-standing challenge in the study of MTV-MOFs.

  15. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    Directory of Open Access Journals (Sweden)

    Omholt Stig W

    2011-06-01

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback

  16. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    Science.gov (United States)

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. HC-PLSR is a promising approach for

  17. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit; Genton, Marc G.

    2017-01-01

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.

  18. Depth-weighted robust multivariate regression with application to sparse data

    KAUST Repository

    Dutta, Subhajit

    2017-04-05

    A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.

  19. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks

    KAUST Repository

    Feng, Liang

    2018-01-18

    Sufficient pore size, appropriate stability and hierarchical porosity are three prerequisites for open frameworks designed for drug delivery, enzyme immobilization and catalysis involving large molecules. Herein, we report a powerful and general strategy, linker thermolysis, to construct ultra-stable hierarchically porous metal-organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores by generating crystal defects throughout a microporous MOF crystal via thermolysis. The crystallinity and stability of HP-MOFs remain after thermolabile linkers are selectively removed from multivariate metal-organic frameworks (MTV-MOFs) through a decarboxylation process. A domain-based linker spatial distribution was found to be critical for creating hierarchical pores inside MTV-MOFs. Furthermore, linker thermolysis promotes the formation of ultra-small metal oxide (MO) nanoparticles immobilized in an open framework that exhibits high catalytic activity for Lewis acid catalyzed reactions. Most importantly, this work provides fresh insights into the connection between linker apportionment and vacancy distribution, which may shed light on probing the disordered linker apportionment in multivariate systems, a long-standing challenge in the study of MTV-MOFs.

  20. An Efficient Local Algorithm for Distributed Multivariate Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm is designed for distributed...

  1. A Scalable Local Algorithm for Distributed Multivariate Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm can be used for distributed...

  2. Sunspot Cycle Prediction Using Multivariate Regression and Binary ...

    Indian Academy of Sciences (India)

    Multivariate regression model has been derived based on the available cycles 1 .... The flare index correlates well with various parameters of the solar activity. ...... 32) Sabarinath A and Anilkumar A K 2011 A stochastic prediction model for the.

  3. Higher-order Multivariable Polynomial Regression to Estimate Human Affective States

    Science.gov (United States)

    Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin

    2016-03-01

    Over the past decade, computational models such as multivariate linear regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.

  4. Multivariate Regression of Liver on Intestine of Mice: A ...

    African Journals Online (AJOL)

    Multivariate Regression of Liver on Intestine of Mice: A Chemotherapeutic Evaluation of Plant ... Using an analysis of covariance model, the effects ... The findings revealed, with the aid of likelihood-ratio statistic, a marked improvement in

  5. Asymptotics of Multivariate Regression with Consecutively Added Dependent Variables

    NARCIS (Netherlands)

    Raats, V.M.; van der Genugten, B.B.; Moors, J.J.A.

    2004-01-01

    We consider multivariate regression where new dependent variables are consecutively added during the experiment (or in time). So, viewed at the end of the experiment, the number of observations decreases with each added variable. The explanatory variables are observed throughout. In a previous paper

  6. Supremum Norm Posterior Contraction and Credible Sets for Nonparametric Multivariate Regression

    NARCIS (Netherlands)

    Yoo, W.W.; Ghosal, S

    2016-01-01

    In the setting of nonparametric multivariate regression with unknown error variance, we study asymptotic properties of a Bayesian method for estimating a regression function f and its mixed partial derivatives. We use a random series of tensor product of B-splines with normal basis coefficients as a

  7. A joint model for multivariate hierarchical semicontinuous data with replications.

    Science.gov (United States)

    Kassahun-Yimer, Wondwosen; Albert, Paul S; Lipsky, Leah M; Nansel, Tonja R; Liu, Aiyi

    2017-01-01

    Longitudinal data are often collected in biomedical applications in such a way that measurements on more than one response are taken from a given subject repeatedly over time. For some problems, these multiple profiles need to be modeled jointly to get insight on the joint evolution and/or association of these responses over time. In practice, such longitudinal outcomes may have many zeros that need to be accounted for in the analysis. For example, in dietary intake studies, as we focus on in this paper, some food components are eaten daily by almost all subjects, while others are consumed episodically, where individuals have time periods where they do not eat these components followed by periods where they do. These episodically consumed foods need to be adequately modeled to account for the many zeros that are encountered. In this paper, we propose a joint model to analyze multivariate hierarchical semicontinuous data characterized by many zeros and more than one replicate observation at each measurement occasion. This approach allows for different probability mechanisms for describing the zero behavior as compared with the mean intake given that the individual consumes the food. To deal with the potentially large number of multivariate profiles, we use a pairwise model fitting approach that was developed in the context of multivariate Gaussian random effects models with a large number of multivariate components. The novelty of the proposed approach is that it incorporates: (1) multivariate, possibly correlated, response variables; (2) within subject correlation resulting from repeated measurements taken from each subject; (3) many zero observations; (4) overdispersion; and (5) replicate measurements at each visit time.

  8. Multivariate Local Polynomial Regression with Application to Shenzhen Component Index

    Directory of Open Access Journals (Sweden)

    Liyun Su

    2011-01-01

    This study attempts to characterize and predict stock index series in the Shenzhen stock market using the concepts of multivariate local polynomial regression. Based on the nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all of which use the concept of phase space reconstruction according to Takens' Theorem, are considered. To fit the stock index series, the single series is changed into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results obtained for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate one and much better than that of three existing methods. Even if only the last half of the training data is used in the multivariate predictor, the prediction mean squared error is smaller than that of the univariate predictor. The multivariate local polynomial prediction model for nonsingle time series is a useful tool for stock market price prediction.
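
    Phase space reconstruction in the sense of Takens' Theorem is usually carried out by delay embedding: each reconstructed state vector stacks lagged copies of the scalar series. The sketch below shows that step only; the function name, the embedding dimension `dim`, and the delay `tau` are illustrative assumptions, and the forward-lag ordering used here is one of two equivalent conventions.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Return delay vectors [x(t), x(t + tau), ..., x(t + (dim - 1) * tau)]
    as rows of a 2-D array."""
    series = np.asarray(series, dtype=float)
    n_vectors = len(series) - (dim - 1) * tau
    if n_vectors <= 0:
        raise ValueError("series too short for this dim and tau")
    return np.column_stack(
        [series[i * tau : i * tau + n_vectors] for i in range(dim)]
    )

# Toy series 0, 1, ..., 9 embedded in 3 dimensions with delay 2.
x = np.arange(10.0)
embedded = delay_embed(x, dim=3, tau=2)
```

    A local polynomial predictor would then be fitted on the rows of `embedded` that lie near the current state vector.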

  9. Latent Variable Regression 4-Level Hierarchical Model Using Multisite Multiple-Cohorts Longitudinal Data. CRESST Report 801

    Science.gov (United States)

    Choi, Kilchan

    2011-01-01

    This report explores a new latent variable regression 4-level hierarchical model for monitoring school performance over time using multisite multiple-cohorts longitudinal data. This kind of data set has a 4-level hierarchical structure: time-series observation nested within students who are nested within different cohorts of students. These…

  10. REGSTEP - stepwise multivariate polynomial regression with singular extensions

    International Nuclear Information System (INIS)

    Davierwalla, D.M.

    1977-09-01

    The program REGSTEP determines a polynomial approximation, in the least squares sense, to tabulated data. The polynomial may be univariate or multivariate. The computational method is that of stepwise regression. A variable is inserted into the regression basis if it is significant with respect to an appropriate F-test at a preselected risk level. In addition, should a variable already in the basis, become nonsignificant (again with respect to an appropriate F-test) after the entry of a new variable, it is expelled from the model. Thus only significant variables are retained in the model. Although written expressly to be incorporated into CORCOD, a code for predicting nuclear cross sections for given values of power, temperature, void fractions, Boron content etc. there is nothing to limit the use of REGSTEP to nuclear applications, as the examples demonstrate. A separate version has been incorporated into RSYST for the general user. (Auth.)
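
    The entry and expulsion logic described above can be sketched as follows. This is not REGSTEP itself: for simplicity it assumes fixed partial-F cutoffs (`f_enter`, `f_remove`) in place of an F-test at a preselected risk level, and all names are hypothetical. The columns of `X` stand for candidate terms, which in a polynomial setting would be powers and products of the original variables.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares for a least-squares fit on an intercept plus cols."""
    A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def stepwise(X, y, f_enter=4.0, f_remove=3.9, max_steps=50):
    n, p = X.shape
    basis = []
    for _ in range(max_steps):
        changed = False
        # Entry: add the candidate with the largest partial F, if it is significant.
        rss_cur = rss(X, y, basis)
        best_f, best_c = 0.0, None
        for c in (c for c in range(p) if c not in basis):
            rss_new = rss(X, y, basis + [c])
            f = (rss_cur - rss_new) / (rss_new / (n - len(basis) - 2))
            if f > best_f:
                best_f, best_c = f, c
        if best_c is not None and best_f > f_enter:
            basis.append(best_c)
            changed = True
        # Expulsion: drop a term that became nonsignificant after the entry.
        rss_full = rss(X, y, basis)
        for c in list(basis):
            reduced = [b for b in basis if b != c]
            f = (rss(X, y, reduced) - rss_full) / (rss_full / (n - len(basis) - 1))
            if f < f_remove:
                basis.remove(c)
                changed = True
                break
        if not changed:
            break
    return sorted(basis)

# Synthetic data: only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + 0.1 * rng.normal(size=200)
selected = stepwise(X, y)
```

    Keeping `f_remove` below `f_enter` prevents a term from being expelled immediately after it enters.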

  11. Multivariate nonparametric regression and visualization with R and applications to finance

    CERN Document Server

    Klemelä, Jussi

    2014-01-01

    A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio

  12. Entrepreneurial intention modeling using hierarchical multiple regression

    Directory of Open Access Journals (Sweden)

    Marina Jeger

    2014-12-01

    Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
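
The contribution "over and above" earlier predictors that hierarchical regression measures is usually summarized by the change in R-squared between nested models. As a general textbook formulation (not taken from the article), with n observations, p predictors in the full model and q predictors added in the last block:

```latex
\Delta R^2 = R^2_{\text{full}} - R^2_{\text{reduced}},
\qquad
F_{\text{change}} = \frac{\Delta R^2 / q}{\left(1 - R^2_{\text{full}}\right) / (n - p - 1)}
```

Under the null hypothesis that the added block has no effect, the statistic follows an F distribution with q and n - p - 1 degrees of freedom. Because the value of the increment depends on which predictors entered earlier, the order of entry matters, as the abstract notes.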

  13. Collision prediction models using multivariate Poisson-lognormal regression.

    Science.gov (United States)

    El-Basyouny, Karim; Sayed, Tarek

    2009-07-01

    This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with the independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform which facilitates computation of posterior distributions as well as providing a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a superior fit than the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis was restricted to the univariate models.
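
A common way to write a multivariate Poisson-lognormal model of this kind is given below. The notation is an assumption for illustration (i indexes locations, k indexes severity levels such as PDO and I+F) and may differ from the paper's own parameterization:

```latex
y_{ik} \mid \lambda_{ik} \sim \operatorname{Poisson}(\lambda_{ik}),
\qquad
\ln \lambda_{ik} = \mathbf{x}_i^{\top} \boldsymbol{\beta}_k + \varepsilon_{ik},
\qquad
\boldsymbol{\varepsilon}_i = (\varepsilon_{i1}, \dots, \varepsilon_{iK})^{\top} \sim N_K(\mathbf{0}, \boldsymbol{\Sigma})
```

The off-diagonal entries of the covariance matrix carry the correlation between the latent effects of the severity levels; setting them to zero recovers independent univariate PLN models.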

  14. Fourier transform infrared spectroscopic imaging and multivariate regression for prediction of proteoglycan content of articular cartilage.

    Directory of Open Access Journals (Sweden)

    Lassi Rieppo

    Full Text Available Fourier Transform Infrared (FT-IR) spectroscopic imaging has been earlier applied for the spatial estimation of the collagen and the proteoglycan (PG) contents of articular cartilage (AC). However, earlier studies have been limited to the use of univariate analysis techniques. Current analysis methods lack the needed specificity for collagen and PGs. The aim of the present study was to evaluate the suitability of partial least squares regression (PLSR) and principal component regression (PCR) methods for the analysis of the PG content of AC. Multivariate regression models were compared with earlier used univariate methods and tested with a sample material consisting of healthy and enzymatically degraded steer AC. Chondroitinase ABC enzyme was used to increase the variation in PG content levels as compared to intact AC. Digital densitometric measurements of Safranin O-stained sections provided the reference for PG content. The results showed that multivariate regression models predict PG content of AC significantly better than earlier used absorbance spectrum (i.e. the area of carbohydrate region with or without amide I normalization) or second derivative spectrum univariate parameters. Increased molecular specificity favours the use of multivariate regression models, but they require more knowledge of chemometric analysis and extended laboratory resources for gathering reference data for establishing the models. When true molecular specificity is required, the multivariate models should be used.

  15. Preference learning with evolutionary Multivariate Adaptive Regression Spline model

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll

    2015-01-01

    This paper introduces a novel approach for pairwise preference learning through combining an evolutionary method with Multivariate Adaptive Regression Spline (MARS). Collecting users' feedback through pairwise preferences is recommended over other ranking approaches as this method is more appealing...... for function approximation as well as being relatively easy to interpret. MARS models are evolved based on their efficiency in learning pairwise data. The method is tested on two datasets that collectively provide pairwise preference data of five cognitive states expressed by users. The method is analysed...

  16. Boosted regression trees, multivariate adaptive regression splines and their two-step combinations with multiple linear regression or partial least squares to predict blood-brain barrier passage: a case study.

    Science.gov (United States)

    Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y

    2008-02-18

    The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structural and pharmacological diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.

  17. Multivariate Frequency-Severity Regression Models in Insurance

    Directory of Open Access Journals (Sweden)

    Edward W. Frees

    2016-02-01

    Full Text Available In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i) property; (ii) motor vehicle; and (iii) contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.
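
The reason a copula preserves existing marginal models can be seen from Sklar's theorem, stated here in general form (the symbols are illustrative and not the paper's notation). For outcomes Y_1, ..., Y_p with marginal distribution functions F_1, ..., F_p:

```latex
F(y_1, \dots, y_p) = C\bigl(F_1(y_1), \dots, F_p(y_p)\bigr)
```

Each marginal F_j can be any previously established regression model for frequency or severity, while the copula C alone carries the dependence among the outcomes.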

  18. Regression Analysis for Multivariate Dependent Count Data Using Convolved Gaussian Processes

    OpenAIRE

    Sofro, A'yunin; Shi, Jian Qing; Cao, Chunzheng

    2017-01-01

    Research on Poisson regression analysis for dependent data has developed rapidly in the last decade. One of the difficult problems in the multivariate case is how to construct a cross-correlation structure while ensuring that the covariance matrix is positive definite. To address the issue, we propose to use convolved Gaussian process (CGP) in this paper. The approach provides a semi-parametric model and offers a natural framework for modeling common mean structure and covarianc...

  19. Prognostic factors in inoperable adenocarcinoma of the lung: A multivariate regression analysis of 259 patients

    DEFF Research Database (Denmark)

    Sørensen, Jens Benn; Badsberg, Jens Henrik; Olsen, Jens

    1989-01-01

    The prognostic factors for survival in advanced adenocarcinoma of the lung were investigated in a consecutive series of 259 patients treated with chemotherapy. Twenty-eight pretreatment variables were investigated by use of Cox's multivariate regression model, including histological subtypes and ...

  20. Multivariate Linear Regression and CART Regression Analysis of TBM Performance at Abu Hamour Phase-I Tunnel

    Science.gov (United States)

    Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.

    2017-12-01

    The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by Data Acquisition and Evaluation System. The authors coupled collected TBM drive data with available information on rock mass properties, cleansed, completed with secondary variables and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were performed for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable and the data were screened with different computational approaches allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and responding variables, to search for best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.

  1. High-throughput quantitative biochemical characterization of algal biomass by NIR spectroscopy; multiple linear regression and multivariate linear regression analysis.

    Science.gov (United States)

    Laurens, L M L; Wolfrum, E J

    2013-12-18

    One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.

  2. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies.

    NARCIS (Netherlands)

    Kromhout, D.

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the

  3. Parameter estimation of multivariate multiple regression model using bayesian with non-informative Jeffreys’ prior distribution

    Science.gov (United States)

    Saputro, D. R. S.; Amalia, F.; Widyaningsih, P.; Affan, R. C.

    2018-05-01

    The Bayesian method can be used to estimate the parameters of a multivariate multiple regression model. It involves two distributions, the prior and the posterior, and the posterior is influenced by the choice of prior. Jeffreys’ prior is a non-informative prior, used when no information about the parameters is available. Combining the non-informative Jeffreys’ prior with the sample information yields the posterior distribution, which is used to estimate the parameters. The purpose of this research is to estimate the parameters of the multivariate regression model using the Bayesian method with the non-informative Jeffreys’ prior. The estimates of β and Σ are obtained as the expected values of their marginal posterior distributions, which are multivariate normal and inverse Wishart, respectively. However, computing these expected values involves integrals that are difficult to evaluate. The expected values are therefore approximated by generating random samples from the posterior distribution of each parameter using the Markov chain Monte Carlo (MCMC) Gibbs sampling algorithm.

  4. A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

    Directory of Open Access Journals (Sweden)

    Chong Wei

    2015-01-01

    Full Text Available Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.

  5. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    Science.gov (United States)

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
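
The idea of resampling rows of PIT-residuals and mapping them back to data can be sketched for Poisson margins. This is an illustrative reading of the abstract, not the authors' code: the function names are hypothetical, the means are assumed to come from an already fitted model, and the randomized residual (a uniform draw between F(y-1) and F(y)) is the usual device for discrete data.

```python
import math
import random


def poisson_cdf(y, mu):
    """P(Y <= y) for Y ~ Poisson(mu); 0 for y < 0."""
    if y < 0:
        return 0.0
    term = total = math.exp(-mu)
    for k in range(1, y + 1):
        term *= mu / k
        total += term
    return min(total, 1.0)


def pit_residual(y, mu, rng):
    """Randomized PIT residual: uniform between F(y - 1) and F(y)."""
    lo, hi = poisson_cdf(y - 1, mu), poisson_cdf(y, mu)
    return lo + rng.random() * (hi - lo)


def poisson_quantile(u, mu):
    """Smallest y with F(y) >= u, which inverts the PIT step."""
    u = min(u, 1.0 - 1e-12)  # keep the search finite for u next to 1
    y = 0
    while poisson_cdf(y, mu) < u:
        y += 1
    return y


def pit_trap_resample(counts, means, rng):
    """One bootstrap data set; rows are observations, columns are responses."""
    resid = [[pit_residual(y, m, rng) for y, m in zip(yr, mr)]
             for yr, mr in zip(counts, means)]
    n = len(counts)
    picks = [rng.randrange(n) for _ in range(n)]  # resample whole rows
    # Row i keeps its own fitted means but receives the residuals of row picks[i].
    return [[poisson_quantile(u, m) for u, m in zip(resid[j], means[i])]
            for i, j in enumerate(picks)]
```

Resampling whole rows keeps the residuals of one observation together, which is how correlation between responses would be carried into the bootstrap sample without being modelled.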

  6. Production optimisation in the petrochemical industry by hierarchical multivariate modelling

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, Magnus; Furusjoe, Erik; Jansson, Aasa

    2004-06-01

    This project demonstrates the advantages of applying hierarchical multivariate modelling in the petrochemical industry in order to increase knowledge of the total process. The models indicate possible ways to optimise the process regarding the use of energy and raw material, which is directly linked to the environmental impact of the process. The refinery of Nynaes Refining AB (Goeteborg, Sweden) has acted as a demonstration site in this project. The models developed for the demonstration site resulted in: Detection of an unknown process disturbance and suggestions of possible causes; Indications on how to increase the yield in combination with energy savings; The possibility to predict product quality from on-line process measurements, making the results available at a higher frequency than customary laboratory analysis; Quantification of the gradually lowered efficiency of heat transfer in the furnace and increased fuel consumption as an effect of soot build-up on the furnace coils; Increased knowledge of the relation between production rate and the efficiency of the heat exchangers. This report is one of two reports from the project. It contains a technical discussion of the result with some degree of detail. A shorter and more easily accessible report is also available, see IVL report B1586-A.

  7. Comparing treatment effects after adjustment with multivariable Cox proportional hazards regression and propensity score methods

    NARCIS (Netherlands)

    Martens, Edwin P; de Boer, Anthonius; Pestman, Wiebe R; Belitser, Svetlana V; Stricker, Bruno H Ch; Klungel, Olaf H

    PURPOSE: To compare adjusted effects of drug treatment for hypertension on the risk of stroke from propensity score (PS) methods with a multivariable Cox proportional hazards (Cox PH) regression in an observational study with censored data. METHODS: From two prospective population-based cohort

  8. Multivariate linear regression of high-dimensional fMRI data with multiple target variables.

    Science.gov (United States)

    Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia

    2014-05-01

    Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Copyright © 2013 Wiley Periodicals, Inc.

  9. Bias correction in the hierarchical likelihood approach to the analysis of multivariate survival data.

    Science.gov (United States)

    Jeon, Jihyoun; Hsu, Li; Gorfine, Malka

    2012-07-01

    Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low, however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
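
The multiplicative action of the frailty on the hazard can be written as follows, using generic notation that is an assumption for illustration (i indexes clusters, j members within a cluster):

```latex
\lambda_{ij}(t \mid u_i) = u_i \, \lambda_0(t) \, \exp\!\left(\mathbf{x}_{ij}^{\top} \boldsymbol{\beta}\right)
```

Here u_i is the latent frailty shared by all members of cluster i and \lambda_0(t) is the baseline hazard. In the H-likelihood approach described above, the u_i are estimated jointly with \boldsymbol{\beta}.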

  10. Non-proportional odds multivariate logistic regression of ordinal family data.

    Science.gov (United States)

    Zaloumis, Sophie G; Scurrah, Katrina J; Harrap, Stephen B; Ellis, Justine A; Gurrin, Lyle C

    2015-03-01

    Methods to examine whether genetic and/or environmental sources can account for the residual variation in ordinal family data usually assume proportional odds. However, standard software to fit the non-proportional odds model to ordinal family data is limited because the correlation structure of family data is more complex than for other types of clustered data. To perform these analyses we propose the non-proportional odds multivariate logistic regression model and take a simulation-based approach to model fitting using Markov chain Monte Carlo methods, such as partially collapsed Gibbs sampling and the Metropolis algorithm. We applied the proposed methodology to male pattern baldness data from the Victorian Family Heart Study. © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
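
The difference between the proportional and non-proportional odds assumptions can be stated for a single ordinal outcome Y with categories 1, ..., K. The notation and sign convention are assumptions for illustration, and the family correlation structure of the paper is not shown:

```latex
\text{proportional odds:} \quad
\operatorname{logit} P(Y \le k \mid \mathbf{x}) = \alpha_k - \mathbf{x}^{\top} \boldsymbol{\beta},
\qquad
\text{non-proportional odds:} \quad
\operatorname{logit} P(Y \le k \mid \mathbf{x}) = \alpha_k - \mathbf{x}^{\top} \boldsymbol{\beta}_k,
\qquad k = 1, \dots, K - 1
```

Allowing the coefficient vector to vary with the cut-point k relaxes the assumption that a covariate shifts all cumulative odds by the same factor.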

  11. Linear Multivariable Regression Models for Prediction of Eddy Dissipation Rate from Available Meteorological Data

    Science.gov (United States)

    MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.

    2005-01-01

    Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
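
The model form described above, an intercept plus one power term per regression variable, can be written as follows. The symbols are assumptions for illustration (x_j are the regression variables, p_j and q_j the fixed powers, b_j and c_j the fitted coefficients):

```latex
\text{day:} \quad \mathrm{EDR} = b_0 + \sum_{j} b_j \, x_j^{\,p_j},
\qquad
\text{night:} \quad \ln(\mathrm{EDR}) = c_0 + \sum_{j} c_j \, x_j^{\,q_j}
```

Because the powers are fixed in advance, each model remains linear in its coefficients and can be fitted by ordinary least squares.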

  12. Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm

    Science.gov (United States)

    Ulbrich, Norbert Manfred

    2013-01-01

    A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.

  13. Multivariate regression analysis for determining short-term values of radon and its decay products from filter measurements

    International Nuclear Information System (INIS)

    Kraut, W.; Schwarz, W.; Wilhelm, A.

    1994-01-01

    A multivariate regression analysis is applied to decay measurements of α or β filter activity. Activity concentrations for Po-218, Pb-214 and Bi-214, and for the Rn-222 equilibrium equivalent concentration, are obtained explicitly. The regression analysis properly takes into account the variances of the measured count rates and their influence on the resulting activity concentrations. (orig.) [de]

  14. PM10 modeling in the Oviedo urban area (Northern Spain) by using multivariate adaptive regression splines

    Science.gov (United States)

    Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza

    2014-10-01

    The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of
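
MARS expresses the fitted relationship as a sum of hinge basis functions, which is what makes the resulting model easy to write down mathematically. The sketch below evaluates such a model for a single input; the knots and coefficients are made-up illustrative values, not results from the study, and the forward and backward passes that select the terms are not shown.

```python
def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) or, with direction=-1, max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate a one-variable MARS model; terms are (coefficient, knot, direction)."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: 1 + 2 * max(0, x - 3) - 1 * max(0, 3 - x)
model = [(2.0, 3.0, 1), (-1.0, 3.0, -1)]
print(mars_predict(5, 1.0, model))
```

A full MARS model would also allow products of hinge functions of different inputs, which is how interactions between pollutants could be represented.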

  15. Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.

    Science.gov (United States)

    Houpt, Joseph W; Bittner, Jennifer L

    2018-05-10

    Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.

  16. Correcting for multivariate measurement error by regression calibration in meta-analyses of epidemiological studies

    DEFF Research Database (Denmark)

    Tybjærg-Hansen, Anne

    2009-01-01

    Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...
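
    The core idea of regression calibration can be sketched in its simplest univariate form, which is an assumption made here for illustration; the record above concerns the multivariate, meta-analytic extension. Under classical measurement error, a naive coefficient is divided by a regression dilution ratio estimated from repeat measurements. The function names and the toy numbers in the usage note are hypothetical.

```python
def dilution_ratio(baseline, repeat):
    """Slope from regressing the repeat measurement on the baseline one.

    Under classical measurement error this slope estimates the
    regression dilution ratio (often written lambda).
    """
    n = len(baseline)
    mean_b = sum(baseline) / n
    mean_r = sum(repeat) / n
    cov = sum((b - mean_b) * (r - mean_r) for b, r in zip(baseline, repeat))
    var = sum((b - mean_b) ** 2 for b in baseline)
    return cov / var


def calibrate(naive_beta, ratio):
    """Correct a naive (attenuated) coefficient by the dilution ratio."""
    return naive_beta / ratio
```

    For instance, `dilution_ratio([1, 2, 3, 4], [1.5, 2, 2.5, 3])` gives a ratio of 0.5, so a naive coefficient of 0.2 would be corrected to 0.4.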

  17. Structural brain connectivity and cognitive ability differences: A multivariate distance matrix regression analysis.

    Science.gov (United States)

    Ponsoda, Vicente; Martínez, Kenia; Pineda-Pardo, José A; Abad, Francisco J; Olea, Julio; Román, Francisco J; Barbey, Aron K; Colom, Roberto

    2017-02-01

    Neuroimaging research involves analyses of huge amounts of biological data that might or might not be related with cognition. This relationship is usually approached using univariate methods, and, therefore, correction methods are mandatory for reducing false positives. Nevertheless, the probability of false negatives is also increased. Multivariate frameworks have been proposed for helping to alleviate this balance. Here we apply multivariate distance matrix regression for the simultaneous analysis of biological and cognitive data, namely, structural connections among 82 brain regions and several latent factors estimating cognitive performance. We tested whether cognitive differences predict distances among individuals regarding their connectivity pattern. Beginning with 3,321 connections among regions, the 36 edges better predicted by the individuals' cognitive scores were selected. Cognitive scores were related to connectivity distances in both the full (3,321) and reduced (36) connectivity patterns. The selected edges connect regions distributed across the entire brain and the network defined by these edges supports high-order cognitive processes such as (a) (fluid) executive control, (b) (crystallized) recognition, learning, and language processing, and (c) visuospatial processing. This multivariate study suggests that one widespread, but limited number, of regions in the human brain, supports high-level cognitive ability differences. Hum Brain Mapp 38:803-816, 2017. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.

  18. Conceptual hierarchical modeling to describe wetland plant community organization

    Science.gov (United States)

    Little, A.M.; Guntenspergen, G.R.; Allen, T.F.H.

    2010-01-01

    Using multivariate analysis, we created a hierarchical modeling process that describes how differently-scaled environmental factors interact to affect wetland-scale plant community organization in a system of small, isolated wetlands on Mount Desert Island, Maine. We followed the procedure: 1) delineate wetland groups using cluster analysis, 2) identify differently scaled environmental gradients using non-metric multidimensional scaling, 3) order gradient hierarchical levels according to spatiotemporal scale of fluctuation, and 4) assemble hierarchical model using group relationships with ordination axes and post-hoc tests of environmental differences. Using this process, we determined 1) large wetland size and poor surface water chemistry led to the development of shrub fen wetland vegetation, 2) Sphagnum and water chemistry differences affected fen vs. marsh / sedge meadows status within small wetlands, and 3) small-scale hydrologic differences explained transitions between forested vs. non-forested and marsh vs. sedge meadow vegetation. This hierarchical modeling process can help explain how upper level contextual processes constrain biotic community response to lower-level environmental changes. It creates models with more nuanced spatiotemporal complexity than classification and regression tree procedures. Using this process, wetland scientists will be able to generate more generalizable theories of plant community organization, and useful management models. © Society of Wetland Scientists 2009.

  19. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression.

    Science.gov (United States)

    Delwiche, Stephen R; Reeves, James B

    2010-01-01

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various
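
    As a rough illustration of the kind of preprocessing described above, the sketch below applies the smallest case mentioned (5 centrally symmetric points, quadratic polynomial) using the standard Savitzky-Golay convolution weights. It assumes evenly spaced spectral points with unit spacing and simply drops the two points at each end; the helper name is hypothetical and this is not the authors' code.

```python
# Standard 5-point Savitzky-Golay weights for a quadratic polynomial.
SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
# First-derivative weights (unit spacing between points assumed).
DERIV_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]


def convolve_interior(signal, weights):
    """Apply a centrally symmetric window to every interior point.

    Edge points without a full window are dropped, so the result is
    shorter than the input by len(weights) - 1 values.
    """
    half = len(weights) // 2
    return [
        sum(w * signal[i + k - half] for k, w in enumerate(weights))
        for i in range(half, len(signal) - half)
    ]
```

    Because the weights come from a local quadratic fit, an exactly quadratic signal passes through the smoother unchanged, and the derivative window returns its exact slope; real spectra with noise will of course be altered.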

  20. Principal Covariates Clusterwise Regression (PCCR): Accounting for Multicollinearity and Population Heterogeneity in Hierarchically Organized Data.

    Science.gov (United States)

    Wilderjans, Tom Frans; Vande Gaer, Eva; Kiers, Henk A L; Van Mechelen, Iven; Ceulemans, Eva

    2017-03-01

    In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases, ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three reasons: first, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yields insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with for instance, observations being nested into persons or participants within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1-3):155-164, 1992) and CR (Späth in Computing 22(4):367-373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.

  1. Remote-sensing data processing with the multivariate regression analysis method for iron mineral resource potential mapping: a case study in the Sarvian area, central Iran

    Science.gov (United States)

    Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran

    2018-03-01

    This paper uses multivariate regression analysis as a mineral prospectivity mapping (MPM) method to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main target of this paper is to apply multivariate regression analysis to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the results of the probability value (p value), coefficient of determination value (R²) and adjusted determination coefficient (R²adj), the second regression model (which consists of multiple UIVs) fitted better than other models. The accuracy of the model was confirmed by iron outcrops map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).

  2. Multivariate Analysis and Prediction of Dioxin-Furan ...

    Science.gov (United States)

    Peer Review Draft of Regional Methods Initiative Final Report. Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE

  3. Determinants of falls in community-dwelling elderly: hierarchical analysis.

    Science.gov (United States)

    Brito, Thais Alves; Coqueiro, Raildo da Silva; Fernandes, Marcos Henrique; de Jesus, Cleber Souza

    2014-01-01

    To analyze the fall-related factors in community-dwelling elderly. Epidemiologic cross-sectional population-based household study with hierarchical interrelationships among the potential risk factors. The sample was made up of noninstitutionalized individuals over age 60 who were residents of a city in Brazil's Northeast Region. The dependent variable was fall occurrence in the last 12 months; independent variables were sociodemographic, behavioral, health, and functional status factors. Multivariate hierarchical Poisson regression analysis was used based on a proposed theoretic model. Three hundred and sixteen (89.0%) elderly participated in the survey, average age 74.2 years; the majority were female, with limited literacy and low-medium family income. The fall prevalence was 25.8%; occurrence was related to depression symptoms (PR = 1.55) and balance limitation (PR = 1.56). The high fall prevalence among the elderly necessitates the identification of fall-related factors for planning prevention programs with this group. © 2014 Wiley Periodicals, Inc.

  4. Sparse multivariate factor analysis regression models and its applications to integrative genomics analysis.

    Science.gov (United States)

    Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K

    2017-01-01

    The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.

  5. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

    This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
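
    The AUC values used above to compare the two models can be understood through a small sketch. It computes the area under the ROC curve as the probability that a randomly chosen positive location receives a higher score than a randomly chosen negative one, counting ties as one half. The function name and the toy labels in the usage note are hypothetical; the study itself worked in R and GIS.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. a productive well), 0 otherwise.
    scores: model output, higher meaning more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0      # positive ranked above negative
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))
```

    With labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.3, 0.1]`, three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75. The pairwise loop is quadratic in the number of cases, which is fine for a sketch but slow for large validation sets.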

  6. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    Science.gov (United States)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP

  7. Bayesian nonparametric hierarchical modeling.

    Science.gov (United States)

    Dunson, David B

    2009-04-01

    In biomedical research, hierarchical models are very widely used to accommodate dependence in multivariate and longitudinal data and for borrowing of information across data from different sources. A primary concern in hierarchical modeling is sensitivity to parametric assumptions, such as linearity and normality of the random effects. Parametric assumptions on latent variable distributions can be challenging to check and are typically unwarranted, given available prior knowledge. This article reviews some recent developments in Bayesian nonparametric methods motivated by complex, multivariate and functional data collected in biomedical studies. The author provides a brief review of flexible parametric approaches relying on finite mixtures and latent class modeling. Dirichlet process mixture models are motivated by the need to generalize these approaches to avoid assuming a fixed finite number of classes. Focusing on an epidemiology application, the author illustrates the practical utility and potential of nonparametric Bayes methods.

  8. Investigating the effects of climate variations on bacillary dysentery incidence in northeast China using ridge regression and hierarchical cluster analysis

    Directory of Open Access Journals (Sweden)

    Guo Junqiao

    2008-09-01

    Background: The effects of climate variations on bacillary dysentery incidence have gained more recent concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.

  9. Predicting multi-level drug response with gene expression profile in multiple myeloma using hierarchical ordinal regression.

    Science.gov (United States)

    Zhang, Xinyan; Li, Bingzong; Han, Huiying; Song, Sha; Xu, Hongxia; Hong, Yating; Yi, Nengjun; Zhuang, Wenzhuo

    2018-05-10

    Multiple myeloma (MM), like other cancers, is caused by the accumulation of genetic abnormalities. Heterogeneity exists in the patients' response to treatments, for example, bortezomib. This urges efforts to identify biomarkers from numerous molecular features and build predictive models for identifying patients that can benefit from a certain treatment scheme. However, previous studies treated the multi-level ordinal drug response as a binary response where only responsive and non-responsive groups are considered. It is desirable to directly analyze the multi-level drug response, rather than combining the response to two groups. In this study, we present a novel method to identify significantly associated biomarkers and then develop ordinal genomic classifier using the hierarchical ordinal logistic model. The proposed hierarchical ordinal logistic model employs the heavy-tailed Cauchy prior on the coefficients and is fitted by an efficient quasi-Newton algorithm. We apply our hierarchical ordinal regression approach to analyze two publicly available datasets for MM with five-level drug response and numerous gene expression measures. Our results show that our method is able to identify genes associated with the multi-level drug response and to generate powerful predictive models for predicting the multi-level response. The proposed method allows us to jointly fit numerous correlated predictors and thus build efficient models for predicting the multi-level drug response. The predictive model for the multi-level drug response can be more informative than the previous approaches. Thus, the proposed approach provides a powerful tool for predicting multi-level drug response and has important impact on cancer studies.
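
    To make the notion of a multi-level ordinal response concrete, the sketch below computes category probabilities under a standard cumulative-logit (proportional odds) model. The parameterization P(Y <= k) = logistic(cutpoint_k - eta) is a common convention and an assumption here, not necessarily the authors'; the Cauchy prior and the quasi-Newton fitting step are not shown, and all names are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probabilities(linear_predictor, cutpoints):
    """Category probabilities for an ordinal response.

    cutpoints must be increasing; K - 1 cutpoints give K categories.
    """
    cumulative = [logistic(c - linear_predictor) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)   # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = value
    return probs
```

    Four cutpoints yield five categories, matching a five-level drug response. Collapsing such a response to two groups discards the ordering information that this formulation keeps.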

  10. Predictive Ability of Pender's Health Promotion Model for Physical Activity and Exercise in People with Spinal Cord Injuries: A Hierarchical Regression Analysis

    Science.gov (United States)

    Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi

    2012-01-01

    The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). Quantitative descriptive research design using hierarchical regression analysis (HRA) was used. A total of 126 individuals with SCI were recruited…

  11. Exploratory multivariate analysis by example using R

    CERN Document Server

    Husson, Francois; Pages, Jerome

    2010-01-01

    Full of real-world case studies and practical advice, Exploratory Multivariate Analysis by Example Using R focuses on four fundamental methods of multivariate exploratory data analysis that are most suitable for applications. It covers principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, and hierarchical cluster analysis. The authors take a geometric point of view that provides a unified vision for exploring multivariate data tables. Within this framework, they present the prin

  12. Real estate value prediction using multivariate regression models

    Science.gov (United States)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it is a prime field in which to apply machine learning concepts to optimize and predict prices with high accuracy. Therefore, in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models, using various features to have lower Residual Sum of Squares error. While using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus directs to the best application of regression models in addition to other techniques to optimize the result.
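
    The combination of polynomial features and ridge regression mentioned above can be sketched as follows. This is a minimal illustration assuming a single raw feature, centered predictors, an unpenalized intercept, and the closed-form ridge solution; the function names are hypothetical and nothing here reproduces the paper's models or data.

```python
def polynomial_features(x, degree):
    """Expand one raw feature into powers 1..degree."""
    return [[v ** d for d in range(1, degree + 1)] for v in x]


def solve(a, b):
    """Solve a linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge_fit(features, y, lam):
    """Ridge regression: solve (Xc'Xc + lam * I) beta = Xc'yc on centered data."""
    n, p = len(features), len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(p)]
    y_mean = sum(y) / n
    xc = [[row[j] - means[j] for j in range(p)] for row in features]
    yc = [v - y_mean for v in y]
    gram = [[sum(r[j] * r[k] for r in xc) + (lam if j == k else 0.0)
             for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * t for r, t in zip(xc, yc)) for j in range(p)]
    coef = solve(gram, rhs)
    intercept = y_mean - sum(c * mu for c, mu in zip(coef, means))
    return intercept, coef
```

    With `lam = 0` this reduces to ordinary least squares; increasing `lam` shrinks the coefficients toward zero, which is how the penalty limits overfitting from high-degree polynomial terms. In practice the features would usually be standardized first so that the penalty treats them comparably.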

  13. Extending multivariate distance matrix regression with an effect size measure and the asymptotic null distribution of the test statistic.

    Science.gov (United States)

    McArtor, Daniel B; Lubke, Gitta H; Bergeman, C S

    2017-12-01

    Person-centered methods are useful for studying individual differences in terms of (dis)similarities between response profiles on multivariate outcomes. Multivariate distance matrix regression (MDMR) tests the significance of associations of response profile (dis)similarities and a set of predictors using permutation tests. This paper extends MDMR by deriving and empirically validating the asymptotic null distribution of its test statistic, and by proposing an effect size for individual outcome variables, which is shown to recover true associations. These extensions alleviate the computational burden of permutation tests currently used in MDMR and render more informative results, thus making MDMR accessible to new research domains.
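
    The permutation approach that this record aims to replace can be sketched for the simplest case of one predictor. The statistic below is the pseudo-F ratio tr(HG) / tr((I - H)G), with G the Gower-centered matrix built from squared distances and H the hat matrix of the centered predictor; degrees-of-freedom scaling is omitted because it does not change a permutation p-value. Function names are hypothetical, and the effect size and asymptotic null distribution proposed by the authors are not implemented.

```python
import random


def gower_center(dist):
    """Gower-centered matrix G from a symmetric distance matrix."""
    n = len(dist)
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in a]
    grand = sum(row_mean) / n
    return [[a[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]


def pseudo_f(dist, x):
    """Pseudo-F ratio for a single predictor x (unscaled by df)."""
    n = len(x)
    g = gower_center(dist)
    x_mean = sum(x) / n
    xc = [v - x_mean for v in x]
    sxx = sum(v * v for v in xc)
    # tr(HG) with H = xc xc' / sxx; the intercept adds nothing since G is centered.
    explained = sum(xc[i] * g[i][j] * xc[j]
                    for i in range(n) for j in range(n)) / sxx
    total = sum(g[i][i] for i in range(n))
    return explained / (total - explained)


def permutation_p_value(dist, x, n_perm=999, seed=0):
    """Share of permuted statistics at least as large as the observed one."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, x)
    shuffled = list(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

    For Euclidean distances on a single outcome, the ratio reduces to the regression sum of squares over the residual sum of squares, which is a convenient sanity check. Each permutation recomputes the statistic, which illustrates the computational burden the record refers to.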

  14. Determination of boiling point of petrochemicals by gas chromatography-mass spectrometry and multivariate regression analysis of structural activity relationship.

    Science.gov (United States)

    Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A

    2014-08-01

    Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively, obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
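
    The error metric quoted above can be illustrated with a short sketch. It assumes the usual reading of RMS%RE, namely the square root of the mean squared percent relative error between predicted and reference values; the abstract does not spell out the formula, so this definition and the function name are assumptions.

```python
import math


def rms_percent_relative_error(predicted, reference):
    """Root-mean-square of percent relative errors.

    Each error is 100 * (predicted - reference) / reference, so the
    reference values must be nonzero.
    """
    squared = [((p - r) / r * 100.0) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))
```

    For example, predictions of 110 and 180 against references of 100 and 200 are off by +10% and -10%, giving an RMS%RE of 10%. Because the errors are relative, results depend on the temperature scale used for the boiling points.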

  15. Inference for multivariate regression model based on multiply imputed synthetic data generated via posterior predictive sampling

    Science.gov (United States)

    Moura, Ricardo; Sinha, Bimal; Coelho, Carlos A.

    2017-06-01

    The recent popularity of the use of synthetic data as a Statistical Disclosure Control technique has enabled the development of several methods of generating and analyzing such data, but almost always relying on asymptotic distributions and in consequence not being adequate for small sample datasets. Thus, a likelihood-based exact inference procedure is derived for the matrix of regression coefficients of the multivariate regression model, for multiply imputed synthetic data generated via Posterior Predictive Sampling. Since it is based on exact distributions, this procedure may even be used in small sample datasets. Simulation studies compare the results obtained from the proposed exact inferential procedure with the results obtained from an adaptation of Reiter's combination rule to multiply imputed synthetic datasets, and an application to the 2000 Current Population Survey is discussed.

  16. On the degrees of freedom of reduced-rank estimators in multivariate regression.

    Science.gov (United States)

    Mukherjee, A; Chen, K; Wang, N; Zhu, J

    We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example.

  17. Stepwise versus Hierarchical Regression: Pros and Cons

    Science.gov (United States)

    Lewis, Mitzi

    2007-01-01

    Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…

  18. Prediction of road accidents: A Bayesian hierarchical approach

    DEFF Research Database (Denmark)

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T.

    2013-01-01

    In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks... the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link...

  19. Hierarchical Multinomial Processing Tree Models: A Latent-Trait Approach

    Science.gov (United States)

    Klauer, Karl Christoph

    2010-01-01

    Multinomial processing tree models are widely used in many areas of psychology. A hierarchical extension of the model class is proposed, using a multivariate normal distribution of person-level parameters with the mean and covariance matrix to be estimated from the data. The hierarchical model allows one to take variability between persons into…

  20. Reduced Rank Regression

    DEFF Research Database (Denmark)

    Johansen, Søren

    2008-01-01

    The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
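
    A minimal sketch of the rank constraint described above, assuming the simplest (identity-weighted) least-squares variant: the ordinary least-squares coefficients are projected onto the leading right singular vectors of the fitted values. The canonical-correlation formulation mentioned in the abstract additionally weights by covariance matrices, which this sketch omits. The function name and the simulated data are hypothetical.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least-squares coefficient matrix (identity weighting)."""
    # Unconstrained least-squares solution, shape (p, q).
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Right singular vectors of the fitted values span the best rank-r response subspace.
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # shape (q, rank)
    # Projecting onto that subspace gives a coefficient matrix of rank at most `rank`.
    return B_ols @ V_r @ V_r.T

# Simulated data whose true coefficient matrix has rank 1 (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
B_true = rng.normal(size=(5, 1)) @ rng.normal(size=(1, 4))
Y = X @ B_true + 0.1 * rng.normal(size=(200, 4))
B_rr = reduced_rank_regression(X, Y, rank=1)
```

    With `rank` equal to the number of responses the projection is the identity, so the function returns the ordinary least-squares estimate.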

  1. Multivariate Multiple Regression Models for a Big Data-Empowered SON Framework in Mobile Wireless Networks

    Directory of Open Access Journals (Sweden)

    Yoonsu Shin

    2016-01-01

    In the 5G era, the operational cost of mobile wireless networks will significantly increase. Further, massive network capacity and zero latency will be needed because everything will be connected to mobile networks. Thus, self-organizing networks (SON) are needed, which expedite automatic operation of mobile wireless networks, but have challenges to satisfy the 5G requirements. Therefore, researchers have proposed a framework to empower SON using big data. The recent framework of a big data-empowered SON analyzes the relationship between key performance indicators (KPIs) and related network parameters (NPs) using machine-learning tools, and it develops regression models using a Gaussian process with those parameters. The problem, however, is that the methods of finding the NPs related to the KPIs differ individually. Moreover, the Gaussian process regression model cannot determine the relationship between a KPI and its various related NPs. In this paper, to solve these problems, we propose multivariate multiple regression models to determine the relationship between various KPIs and NPs. If we treat one KPI and multiple NPs as one set, the proposed models help us process multiple sets at one time. Also, we can find out whether some KPIs are conflicting or not. We implement the proposed models using MapReduce.
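
    To illustrate what processing several KPI sets "at one time" can look like, the sketch below fits a multivariate multiple linear regression by ordinary least squares with a matrix-valued response. This is an assumption about the general technique, not the authors' MapReduce implementation; all variable names and the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_nps, n_kpis = 100, 3, 2

# Hypothetical network parameters (predictors), with an intercept column prepended.
nps = np.column_stack([np.ones(n_samples), rng.normal(size=(n_samples, n_nps))])
coef_true = rng.normal(size=(n_nps + 1, n_kpis))
# Hypothetical KPIs (responses): one column per KPI.
kpis = nps @ coef_true + 0.05 * rng.normal(size=(n_samples, n_kpis))

# A single least-squares solve estimates the coefficients for every KPI at once.
coef_hat, *_ = np.linalg.lstsq(nps, kpis, rcond=None)

# Column j of coef_hat equals the fit obtained by regressing KPI j on its own.
coef_first_kpi, *_ = np.linalg.lstsq(nps, kpis[:, 0], rcond=None)
```

    Under plain least squares the joint fit reproduces the separate per-KPI fits; the multivariate view matters once the residual covariance between KPIs is modelled or tested.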

  2. Multivariate research in areas of phosphorus cast-iron brake shoes manufacturing using the statistical analysis and the multiple regression equations

    Science.gov (United States)

    Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.

    2017-05-01

    The braking system is one of the most important and complex subsystems of railway vehicles, especially where safety is concerned. Therefore, installing efficient, safe brakes on modern railway vehicles is essential. Attention is nowadays devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the investigated subjects are the cast-iron brake shoes, which are still widely used on freight wagons. Therefore, the cast-iron brake shoes - with lamellar graphite and with a high content of phosphorus (0.8-1.1%) - need a special investigation. In order to establish the optimal condition for the cast-iron brake shoes we proposed a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in areas of cast-iron brake shoes manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulties in comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for

  3. Regressão múltipla stepwise e hierárquica em Psicologia Organizacional: aplicações, problemas e soluções Stepwise and hierarchical multiple regression in organizational psychology: Applications, problems and solutions

    Directory of Open Access Journals (Sweden)

    Gardênia Abbad

    2002-01-01

    This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying type I and II errors, and solutions to potential problems that may arise from such errors, are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical).

  4. A generalized multivariate regression model for modelling ocean wave heights

    Science.gov (United States)

    Wang, X. L.; Feng, Y.; Swail, V. R.

    2012-04-01

    In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of Pierce skill score, frequency bias index, and correlation skill score. Being not normally distributed, wave heights are subjected to a data adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be and is accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.
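
    The two preprocessing ideas named above, a Box-Cox transformation of the non-normal wave heights and a lag-1 autocorrelation term, can be sketched as follows. The study describes a data-adaptive transformation; here the exponent `lam` is simply fixed by hand, and the wave-height values are hypothetical.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of strictly positive data for a given exponent lam."""
    x = np.asarray(x, dtype=float)
    if lam == 0:
        # The lam -> 0 limit of (x**lam - 1) / lam is the natural logarithm.
        return np.log(x)
    return (x ** lam - 1.0) / lam

def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    centered = np.asarray(series, dtype=float) - np.mean(series)
    return float(np.sum(centered[1:] * centered[:-1]) / np.sum(centered ** 2))

# Hypothetical 6-hourly significant wave heights in metres (not from the study).
hs = np.array([1.2, 1.5, 2.1, 3.4, 2.8, 2.0, 1.6, 1.3])
hs_transformed = box_cox(hs, lam=0.5)
rho1 = lag1_autocorrelation(hs_transformed)
```

    A value of `rho1` far from zero indicates that successive 6-hourly residuals are not independent, which is why the abstract stresses accounting for autocorrelation.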

  5. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  6. Bayesian Modeling of Air Pollution Extremes Using Nested Multivariate Max-Stable Processes

    KAUST Repository

    Vettori, Sabrina; Huser, Raphaël; Genton, Marc G.

    2018-01-01

    Capturing the potentially strong dependence among the peak concentrations of multiple air pollutants across a spatial region is crucial for assessing the related public health risks. In order to investigate the multivariate spatial dependence properties of air pollution extremes, we introduce a new class of multivariate max-stable processes. Our proposed model admits a hierarchical tree-based formulation, in which the data are conditionally independent given some latent nested $\alpha$-stable random factors. The hierarchical structure facilitates Bayesian inference and offers a convenient and interpretable characterization. We fit this nested multivariate max-stable model to the maxima of air pollution concentrations and temperatures recorded at a number of sites in the Los Angeles area, showing that the proposed model succeeds in capturing their complex tail dependence structure.

  7. Bayesian Modeling of Air Pollution Extremes Using Nested Multivariate Max-Stable Processes

    KAUST Repository

    Vettori, Sabrina

    2018-03-18

    Capturing the potentially strong dependence among the peak concentrations of multiple air pollutants across a spatial region is crucial for assessing the related public health risks. In order to investigate the multivariate spatial dependence properties of air pollution extremes, we introduce a new class of multivariate max-stable processes. Our proposed model admits a hierarchical tree-based formulation, in which the data are conditionally independent given some latent nested $\alpha$-stable random factors. The hierarchical structure facilitates Bayesian inference and offers a convenient and interpretable characterization. We fit this nested multivariate max-stable model to the maxima of air pollution concentrations and temperatures recorded at a number of sites in the Los Angeles area, showing that the proposed model succeeds in capturing their complex tail dependence structure.

  8. Creating Hierarchical Pores by Controlled Linker Thermolysis in Multivariate Metal-Organic Frameworks

    KAUST Repository

    Feng, Liang; Yuan, Shuai; Zhang, Liang-Liang; Tan, Kui; Li, Jia-Luo; Kirchon, Angelo; Liu, Ling-Mei; Zhang, Peng; Han, Yu; Chabal, Yves J.; Zhou, Hong-Cai

    2018-01-01

    strategy, linker thermolysis, to construct ultra-stable hierarchically porous metal−organic frameworks (HP-MOFs) with tunable pore size distribution. Linker instability, usually an undesirable trait of MOFs, was exploited to create mesopores

  9. [Multivariate ordinal logistic regression analysis on the association between consumption of fried food and both esophageal cancer and precancerous lesions].

    Science.gov (United States)

    Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B

    2017-12-10

    Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all the residents aged 40-69 years from 11 counties (cities) where cancer screening of upper gastrointestinal cancer had been conducted in rural areas of Henan province, were recruited as the subjects of study. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination and biopsy samples were diagnosed pathologically, under standardized criteria. Subjects with high risk were divided into the groups based on their different pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total number of 8 792 cases with normal esophagus, 3 680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, when compared with those who did not eat fried food, the intake of fried food (food appeared a risk factor for both esophageal cancer and precancerous lesions.

  10. Hierarchical spatial point process analysis for a plant community with high biodiversity

    DEFF Research Database (Denmark)

    Illian, Janine B.; Møller, Jesper; Waagepetersen, Rasmus

    2009-01-01

    A complex multivariate spatial point pattern of a plant community with high biodiversity is modelled using a hierarchical multivariate point process model. In the model, interactions between plants with different post-fire regeneration strategies are of key interest. We consider initially a maxim...

  11. What are hierarchical models and how do we analyze them?

    Science.gov (United States)

    Royle, Andy

    2016-01-01

    In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10).

  12. Hierarchical control of a nuclear reactor using uncertain dynamics techniques

    International Nuclear Information System (INIS)

    Rovere, L.A.; Otaduy, P.J.; Brittain, C.R.; Perez, R.B.

    1988-01-01

    Recent advances in the nonlinear optimal control area are opening new possibilities towards its implementation in process control. Algorithms for multivariate control, hierarchical decomposition, parameter tracking, model uncertainties, actuator saturation effects and physical limits to state variables can be implemented on the basis of a consistent mathematical formulation. In this paper, good agreement is shown between a centralized and a hierarchical implementation of a controller for a hypothetical nuclear power plant subject to multiple demands. The performance of the hierarchical distributed system in the presence of localized subsystem failures is analyzed. 4 refs., 13 figs

  13. Assessing the response of area burned to changing climate in western boreal North America using a Multivariate Adaptive Regression Splines (MARS) approach

    Science.gov (United States)

    Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo

    2009-01-01

    We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude x longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...

  14. Semiparametric regression during 2003–2007

    KAUST Repository

    Ruppert, David; Wand, M.P.; Carroll, Raymond J.

    2009-01-01

    Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.

  15. Robust multivariate analysis

    CERN Document Server

    Olive, David J

    2017-01-01

    This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given. The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory. The robust techniques are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis. A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...

  16. Should metacognition be measured by logistic regression?

    Science.gov (United States)

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Forecasting the daily power output of a grid-connected photovoltaic system based on multivariate adaptive regression splines

    International Nuclear Information System (INIS)

    Li, Yanting; He, Yong; Su, Yan; Shu, Lianjie

    2016-01-01

    Highlights: • Suggests a nonparametric model based on MARS for output power prediction. • Compare the MARS model with a wide variety of prediction models. • Show that the MARS model is able to provide an overall good performance in both the training and testing stages. - Abstract: Both linear and nonlinear models have been proposed for forecasting the power output of photovoltaic systems. Linear models are simple to implement but less flexible. Due to the stochastic nature of the power output of PV systems, nonlinear models tend to provide better forecast than linear models. Motivated by this, this paper suggests a fairly simple nonlinear regression model known as multivariate adaptive regression splines (MARS), as an alternative to forecasting of solar power output. The MARS model is a data-driven modeling approach without any assumption about the relationship between the power output and predictors. It maintains simplicity of the classical multiple linear regression (MLR) model while possessing the capability of handling nonlinearity. It is simpler in format than other nonlinear models such as ANN, k-nearest neighbors (KNN), classification and regression tree (CART), and support vector machine (SVM). The MARS model was applied on the daily output of a grid-connected 2.1 kW PV system to provide the 1-day-ahead mean daily forecast of the power output. The comparisons with a wide variety of forecast models show that the MARS model is able to provide reliable forecast performance.
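
    MARS builds its nonlinear fit from mirrored hinge functions of the form max(0, x - t) and max(0, t - x). The sketch below only illustrates that basis with a single, hand-picked knot on synthetic data; a real MARS fit searches for knots and prunes terms adaptively, which is not shown. All names and values are hypothetical.

```python
import numpy as np

def hinge_pair(x, knot):
    """Mirrored hinge basis functions max(0, x - knot) and max(0, knot - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)

# Synthetic piecewise-linear signal with a kink at x = 5 (illustrative only).
x = np.linspace(0.0, 10.0, 101)
y = 2.0 + 3.0 * np.maximum(0.0, x - 5.0)

# With the knot fixed, the model is linear in its coefficients,
# so it can be fitted by ordinary least squares.
right, left = hinge_pair(x, knot=5.0)
design = np.column_stack([np.ones_like(x), right, left])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
```

    This is the sense in which MARS keeps the simplicity of multiple linear regression: once the hinge terms are chosen, estimation is an ordinary linear fit.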

  18. Polynomial regression analysis and significance test of the regression function

    International Nuclear Information System (INIS)

    Gao Zhengming; Zhao Juan; He Shengping

    2012-01-01

    In order to analyze the decay heating power of a certain radioactive isotope per kilogram with the polynomial regression method, the paper first demonstrates the broad usage of polynomial functions and derives their parameters by ordinary least squares estimation. Then a significance test method for the polynomial regression function is derived, considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and a significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram in accordance with the authors' real work. (authors)
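
    The similarity the abstract relies on is that a polynomial in x is a multiple linear regression on the columns 1, x, x^2, and so on, so the usual overall F statistic applies. The sketch below assumes that standard construction; the function name and the simulated data are hypothetical and unrelated to the decay-heat measurements.

```python
import numpy as np

def polynomial_f_statistic(x, y, degree):
    """Least-squares polynomial fit and the overall F statistic of the regression."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Columns x^0 ... x^degree turn the polynomial into a multiple linear regression.
    design = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_reg = np.sum((fitted - y.mean()) ** 2)   # explained sum of squares
    ss_res = np.sum((y - fitted) ** 2)          # residual sum of squares
    dof_res = len(y) - degree - 1
    f_stat = (ss_reg / degree) / (ss_res / dof_res)
    return coef, float(f_stat)

# Simulated quadratic data with small noise (illustrative only).
rng = np.random.default_rng(2)
x = np.linspace(0.0, 4.0, 30)
y = 1.0 - 0.5 * x + 0.25 * x ** 2 + 0.05 * rng.normal(size=x.size)
coef, f_stat = polynomial_f_statistic(x, y, degree=2)
```

    The statistic would then be compared with a critical value of the F distribution with (degree, n - degree - 1) degrees of freedom.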

  19. A Matlab program for stepwise regression

    Directory of Open Access Journals (Sweden)

    Yanhong Qi

    2016-03-01

    Stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.

  20. Multivariate Regression of Liver on Intestine of Mice: A ...

    African Journals Online (AJOL)

    FIRST LADY

    pairs recovered. Linear, semi-logarithmic and logarithmic-logarithmic (log- log) regressions were performed. He chose the log-log curves because its variance was more uniform. The statistical comparison of .... E(U1| U2 = u2) is the regression function of U1 on U2, and Var (U1|U2 = u2) is the conditional covariance matrix.

  1. Advanced statistics: linear regression, part II: multiple linear regression.

    Science.gov (United States)

    Marill, Keith A

    2004-01-01

    The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.

  2. Risk factors for pedicled flap necrosis in hand soft tissue reconstruction: a multivariate logistic regression analysis.

    Science.gov (United States)

    Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun

    2018-03-01

    Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.

  3. Scale and shape mixtures of multivariate skew-normal distributions

    KAUST Repository

    Arellano-Valle, Reinaldo B.

    2018-02-26

    We introduce a broad and flexible class of multivariate distributions obtained by both scale and shape mixtures of multivariate skew-normal distributions. We present the probabilistic properties of this family of distributions in detail and lay down the theoretical foundations for subsequent inference with this model. In particular, we study linear transformations, marginal distributions, selection representations, stochastic representations and hierarchical representations. We also describe an EM-type algorithm for maximum likelihood estimation of the parameters of the model and demonstrate its implementation on a wind dataset. Our family of multivariate distributions unifies and extends many existing models of the literature that can be seen as submodels of our proposal.

  4. Multivariate analysis with LISREL

    CERN Document Server

    Jöreskog, Karl G; Wallentin, Fan Y

    2016-01-01

    This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of the example. The book is intended for masters and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.

  5. Hierarchical multivariate covariance analysis of metabolic connectivity.

    Science.gov (United States)

    Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J

    2014-12-01

    Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI).
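
    The point that a correlation coefficient mixes covariance and variance can be illustrated with a small sketch. The data below are hypothetical toy values, not ADNI measurements, and the function names are chosen here for illustration only. Both groups have the same seed-to-target covariance, yet their correlations differ because the target variance differs.

```python
import math

def covariance(x, y):
    """Sample covariance with the usual n - 1 denominator."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

def correlation_terms(x, y):
    """Return the individual terms that make up Pearson's r."""
    cov = covariance(x, y)
    var_x = covariance(x, x)
    var_y = covariance(y, y)
    r = cov / math.sqrt(var_x * var_y)
    return {"cov": cov, "var_x": var_x, "var_y": var_y, "r": r}

# Hypothetical toy data: one seed region, one target region, two groups.
seed = [1.0, 2.0, 3.0, 4.0, 5.0]
target_group_a = [1.0, 3.0, 2.0, 5.0, 4.0]
# Group B adds variation that is uncorrelated with the seed,
# so the covariance is unchanged while the target variance grows.
target_group_b = [3.0, -1.0, 6.0, 1.0, 6.0]

terms_a = correlation_terms(seed, target_group_a)
terms_b = correlation_terms(seed, target_group_b)
print(terms_a)
print(terms_b)
```

    Comparing only `r` would suggest weaker connectivity in group B, whereas inspecting `cov` and `var_y` separately shows that the covariance is identical and only the variance term has changed.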

  6. Hierarchical probabilistic regionalization of volcanism for Sengan region in Japan using multivariate statistical techniques and geostatistical interpolation techniques

    International Nuclear Information System (INIS)

    Park, Jinyong; Balasingham, P.; McKenna, Sean Andrew; Kulatilake, Pinnaduwa H. S. W.

    2004-01-01

    Sandia National Laboratories, under contract to Nuclear Waste Management Organization of Japan (NUMO), is performing research on regional classification of given sites in Japan with respect to potential volcanic disruption using multivariate statistics and geo-statistical interpolation techniques. This report provides results obtained for hierarchical probabilistic regionalization of volcanism for the Sengan region in Japan by applying multivariate statistical techniques and geostatistical interpolation techniques on the geologic data provided by NUMO. A workshop report produced in September 2003 by Sandia National Laboratories (Arnold et al., 2003) on volcanism lists a set of most important geologic variables as well as some secondary information related to volcanism. Geologic data extracted for the Sengan region in Japan from the data provided by NUMO revealed that data are not available at the same locations for all the important geologic variables. In other words, the geologic variable vectors were found to be incomplete spatially. However, it is necessary to have complete geologic variable vectors to perform multivariate statistical analyses. As a first step towards constructing complete geologic variable vectors, the Universal Transverse Mercator (UTM) zone 54 projected coordinate system and a 1 km square regular grid system were selected. The data available for each geologic variable on a geographic coordinate system were transferred to the aforementioned grid system. Also the recorded data on volcanic activity for Sengan region were produced on the same grid system. Each geologic variable map was compared with the recorded volcanic activity map to determine the geologic variables that are most important for volcanism. In the regionalized classification procedure, this step is known as the variable selection step. 
    The following variables were determined as most important for volcanism: geothermal gradient, groundwater temperature, heat discharge, groundwater

  7. Endpoint in plasma etch process using new modified w-multivariate charts and windowed regression

    Science.gov (United States)

    Zakour, Sihem Ben; Taleb, Hassen

    2017-09-01

    Endpoint detection is essential for understanding and verifying that a plasma etching process has run correctly, especially when the etched area is very small (0.1%). It is a crucial part of delivering repeatable results on every wafer. The endpoint is reached when the film being etched has been completely cleared. To ensure the desired device performance of the produced integrated circuit, an optical emission spectroscopy (OES) sensor is employed. The large number of gathered wavelengths (profiles) is then analyzed and pre-processed using a newly proposed simple algorithm named Spectra Peak Selection (SPS) to select the important wavelengths; wavelet analysis (WA) is then applied to enhance detection performance by suppressing noise and redundant information. The selected and treated OES wavelengths are then used in modified multivariate control charts (MEWMA and Hotelling) for three statistics (mean, SD and CV) and in windowed polynomial regression for the mean. The use of these three statistics is motivated by the need to monitor mean shift, variance shift and their ratio (CV) when both mean and SD are unstable. The control charts perform well in detecting the endpoint, especially the W-mean Hotelling chart, while the CV statistic gives the worst result. Because the W-Hotelling mean statistic gives the best endpoint detection, it is used to construct a windowed wavelet Hotelling polynomial regression. The latter can only identify the window containing the endpoint phenomenon.

  8. Multivariate methods and forecasting with IBM SPSS statistics

    CERN Document Server

    Aljandali, Abdulkader

    2017-01-01

    This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...

  9. On Bayesian shared component disease mapping and ecological regression with errors in covariates.

    Science.gov (United States)

    MacNab, Ying C

    2010-05-20

    Recent literature on Bayesian disease mapping presents shared component models (SCMs) for joint spatial modeling of two or more diseases with common risk factors. In this study, Bayesian hierarchical formulations of shared component disease mapping and ecological models are explored and developed in the context of ecological regression, taking into consideration errors in covariates. A review of multivariate disease mapping models (MultiVMs) such as the multivariate conditional autoregressive models that are also part of the more recent Bayesian disease mapping literature is presented. Some insights into the connections and distinctions between the SCM and MultiVM procedures are communicated. Important issues surrounding (appropriate) formulation of shared- and disease-specific components, consideration/choice of spatial or non-spatial random effects priors, and identification of model parameters in SCMs are explored and discussed in the context of spatial and ecological analysis of small area multivariate disease or health outcome rates and associated ecological risk factors. The methods are illustrated through an in-depth analysis of four-variate road traffic accident injury (RTAI) data: gender-specific fatal and non-fatal RTAI rates in 84 local health areas in British Columbia (Canada). Fully Bayesian inference via Markov chain Monte Carlo simulations is presented. Copyright 2010 John Wiley & Sons, Ltd.

  10. Regional trends in short-duration precipitation extremes: a flexible multivariate monotone quantile regression approach

    Science.gov (United States)

    Cannon, Alex

    2017-04-01

    univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
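
    The "quantile regression error function" mentioned above is commonly the check (pinball) loss. The sketch below assumes that standard form; the abstract does not state the exact variant used in MMQR (for example, a smoothed approximation), and the rainfall values and function names are hypothetical.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean check (pinball) loss for quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        err = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * err if err >= 0 else (tau - 1.0) * err
    return total / len(y_true)

def best_constant(values, tau):
    """Pick the sample value that minimizes the pinball loss as a constant fit."""
    return min(values, key=lambda c: pinball_loss(values, [c] * len(values), tau))

# Hypothetical rainfall amounts with one extreme event.
rain = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(rain, 0.5))  # constant fit at the median level
print(best_constant(rain, 0.9))  # constant fit at an upper quantile
```

    Minimizing this loss with `tau = 0.5` recovers the median, while a higher `tau` pulls the fit toward the upper tail, which is why the loss suits the study of extremes.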

  11. Multivariate regression models for the simultaneous quantitative analysis of calcium and magnesium carbonates and magnesium oxide through DRIFTS data

    Directory of Open Access Journals (Sweden)

    Marder Luciano

    2006-01-01

    In the present work, multivariate regression models were developed for the quantitative analysis of ternary systems using Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS) to determine the concentration by weight of calcium carbonate, magnesium carbonate and magnesium oxide. Nineteen standard samples, previously defined in a ternary diagram by mixture design, were prepared and their mid-infrared diffuse reflectance spectra were recorded. The partial least squares (PLS) regression method was applied to the model. The spectral set was preprocessed by either mean-centering and variance-scaling (model 2) or mean-centering only (model 1). The results, based on the prediction performance on the external validation set expressed by RMSEP (root mean square error of prediction), demonstrated that it is possible to develop good models to simultaneously determine calcium carbonate, magnesium carbonate and magnesium oxide content in powdered samples, which can be used in the study of the thermal decomposition of dolomite rocks.
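
    The two preprocessing options and the RMSEP figure of merit can be sketched as follows. This is a minimal illustration with hypothetical numbers and helper names; it does not fit a PLS model and does not reproduce the authors' data.

```python
import math

def column_stats(rows):
    """Per-column mean and sample standard deviation of a list of rows."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return means, sds

def preprocess(rows, means, sds=None):
    """Mean-center each column; if sds is given, also scale to unit variance."""
    out = []
    for row in rows:
        centered = [v - m for v, m in zip(row, means)]
        if sds is not None:
            centered = [v / s for v, s in zip(centered, sds)]
        out.append(centered)
    return out

def rmsep(y_true, y_pred):
    """Root mean square error of prediction on a validation set."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

# Hypothetical two-wavenumber "spectra" for three samples.
spectra = [[1.0, 10.0], [2.0, 30.0], [3.0, 50.0]]
means, sds = column_stats(spectra)
model1_input = preprocess(spectra, means)        # mean-centered only
model2_input = preprocess(spectra, means, sds)   # mean-centered and variance-scaled

# Hypothetical reference and predicted concentrations (weight %).
print(rmsep([10.0, 20.0, 30.0], [12.0, 18.0, 30.0]))
```

    Variance scaling gives every wavenumber equal weight, which can help when informative bands are weak but can also amplify noisy regions; this trade-off is one reason to compare both preprocessing options by RMSEP.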

  12. Linear regression

    CERN Document Server

    Olive, David J

    2017-01-01

    This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...

  13. Partitioning of Multivariate Phenotypes using Regression Trees Reveals Complex Patterns of Adaptation to Climate across the Range of Black Cottonwood (Populus trichocarpa)

    Directory of Open Access Journals (Sweden)

    Regis Wendpouire Oubida

    2015-03-01

    Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13 to 0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.

  14. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel.

    Science.gov (United States)

    Grapov, Dmitry; Newman, John W

    2012-09-01

    Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enable interactive and dynamic analyses of large data by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010).

  15. Leadership styles across hierarchical levels in nursing departments.

    Science.gov (United States)

    Stordeur, S; Vandenberghe, C; D'hoore, W

    2000-01-01

    Some researchers have reported on the cascading effect of transformational leadership across hierarchical levels. One study examined this effect in nursing, but it was limited to a single hospital. To examine the cascading effect of leadership styles across hierarchical levels in a sample of nursing departments and to investigate the effect of hierarchical level on the relationships between leadership styles and various work outcomes. Based on a sample of eight hospitals, the cascading effect was tested using correlation analysis. The main sources of variation among leadership scores were determined with analyses of variance (ANOVA), and the interaction effect of hierarchical level and leadership styles on criterion variables was tested with moderated regression analysis. No support was found for a cascading effect of leadership across hierarchical levels. Rather, the variation of leadership scores was explained primarily by the organizational context. Transformational leadership had a stronger impact on criterion variables than transactional leadership. Interaction effects between leadership styles and hierarchical level were observed only for perceived unit effectiveness. The hospital's structure and culture are major determinants of leadership styles.

  16. Reporting quality of multivariable logistic regression in selected Indian medical journals.

    Science.gov (United States)

    Kumar, R; Indrayan, A; Chhabra, P

    2012-01-01

    Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Analysis of published literature. Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between the years 1994 and 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. SPSS 17 software and non-parametric tests (Kruskal-Wallis H, Mann-Whitney U, Spearman correlation). One hundred and nine articles using MLR for analyzing the data were finally found in the selected eight journals. The number of such articles gradually increased after 2003, but the quality score remained almost the same over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR were reported in 75.2% of the articles, and sufficient cases (>10) per covariate of limiting sample size were reported in 58.7% of the articles. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of a statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Reporting of MLR in many Indian journals is incomplete. Only one article managed to score 8 out of 10 among the 109 articles under review. All others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician, may improve the quality of reporting.

  17. Multivariate Adaptive Regression Splines (MARS), an alternative for the analysis of time series

    Directory of Open Access Journals (Sweden)

    Jairo Vanegas

    2017-05-01

    Multivariate Adaptive Regression Splines (MARS) is a nonparametric modeling method that extends the linear model by incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of prediction models by selecting relevant variables, transforming the predictor variables, handling missing values and preventing overfitting through a self-test. It also allows prediction that takes into account structural factors that might influence the response variable, generating hypothetical models. The final result can be used to identify relevant cut-off points in data series. It is rarely used in the health field, so it is proposed as an additional tool for evaluating relevant public health indicators. For demonstration purposes, mortality data series for children under 5 years of age in Costa Rica for the period 1978-2008 were used.
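
    MARS builds its nonlinear fits from hinge functions of the form max(0, x - c) and max(0, c - x), where the knot c acts as a cut-off point. The sketch below only evaluates a model with one hand-picked knot; the coefficients, the knot year and the function names are hypothetical and are not taken from the Costa Rican series, and no knot search or pruning is performed.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) if direction is +1, max(0, knot - x) if -1."""
    return max(0.0, direction * (x - knot))

def mars_like_prediction(x, intercept, terms):
    """Evaluate an additive model of hinge terms; terms are (coefficient, knot, direction)."""
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)

# Hypothetical model: a steep decline before 1990 and a slow decline afterwards.
intercept = 20.0
terms = [
    (1.5, 1990.0, -1),   # active only for years before the knot
    (-0.2, 1990.0, +1),  # active only for years after the knot
]

for year in (1980.0, 1990.0, 2000.0):
    print(year, mars_like_prediction(year, intercept, terms))
```

    The knot at which the slope changes is what the abstract refers to as a relevant cut-off point in the data series.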

  18. Adaptive estimation of multivariate functions using conditionally Gaussian tensor-product spline priors

    NARCIS (Netherlands)

    Jonge, de R.; Zanten, van J.H.

    2012-01-01

    We investigate posterior contraction rates for priors on multivariate functions that are constructed using tensor-product B-spline expansions. We prove that using a hierarchical prior with an appropriate prior distribution on the partition size and Gaussian prior weights on the B-spline

  19. Modeling the potential risk factors of bovine viral diarrhea prevalence in Egypt using univariable and multivariable logistic regression analyses

    Directory of Open Access Journals (Sweden)

    Abdelfattah M. Selim

    2018-03-01

    Aim: The present cross-sectional study was conducted to determine the seroprevalence and potential risk factors associated with bovine viral diarrhea virus (BVDV) disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR) models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected between November 2012 and March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with an indirect ELISA test for antibody detection. Data were analyzed with different statistical approaches such as the Chi-square test, odds ratios (OR), and univariable and multivariable LR models. Results: Results revealed a non-significant association between being seropositive for BVDV and all risk factors, except for species of animal. Seroprevalence percentages were 40% and 23% for cattle and buffaloes, respectively. ORs for all categories were close to one, with the highest OR, 2.237, for cattle relative to buffaloes. Likelihood ratio tests showed a significant drop in the -2LL from univariable LR to multivariable LR models. Conclusion: There was evidence of high seroprevalence of BVDV among cattle as compared with buffaloes, with the possibility of infection in different age groups of animals. In addition, the multivariable LR model was shown to provide more information for association and prediction purposes relative to univariable LR models and Chi-square tests when there is more than one predictor.
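
    As a rough check, the odds ratio for cattle relative to buffaloes can be approximated from the two rounded seroprevalence figures alone. This sketch assumes an unadjusted odds ratio; the reported value of 2.237 was presumably computed from the unrounded counts, which the abstract does not give.

```python
def odds(p):
    """Convert a proportion into odds."""
    return p / (1.0 - p)

def odds_ratio(p_exposed, p_reference):
    """Unadjusted odds ratio between two groups, given their proportions."""
    return odds(p_exposed) / odds(p_reference)

# Rounded seroprevalences quoted in the abstract.
cattle, buffaloes = 0.40, 0.23
approx_or = odds_ratio(cattle, buffaloes)
print(round(approx_or, 2))
```

    The result is roughly 2.23, in line with the reported 2.237 once rounding of the prevalences is taken into account.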

  20. Scale of association: hierarchical linear models and the measurement of ecological systems

    Science.gov (United States)

    Sean M. McMahon; Jeffrey M. Diez

    2007-01-01

    A fundamental challenge to understanding patterns in ecological systems lies in employing methods that can analyse, test and draw inference from measured associations between variables across scales. Hierarchical linear models (HLM) use advanced estimation algorithms to measure regression relationships and variance-covariance parameters in hierarchically structured...

  1. Hierarchical Matching and Regression with Application to Photometric Redshift Estimation

    Science.gov (United States)

    Murtagh, Fionn

    2017-06-01

    This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.

  2. Geoelectrical parameter-based multivariate regression borehole yield model for predicting aquifer yield in managing groundwater resource sustainability

    Directory of Open Access Journals (Sweden)

    Kehinde Anthony Mogaji

    2016-07-01

    This study developed a GIS-based multivariate regression (MVR) yield rate prediction model of groundwater resource sustainability in the hard-rock geology terrain of southwestern Nigeria. This model can economically manage the aquifer yield rate potential predictions that are often overlooked in groundwater resources development. The proposed model relates the borehole yield rate inventory of the area to geoelectrically derived parameters. Three sets of borehole yield rate conditioning geoelectrically derived parameters, namely aquifer unit resistivity (ρ), aquifer unit thickness (D) and coefficient of anisotropy (λ), were determined from the acquired and interpreted geophysical data. The extracted borehole yield rate values and the geoelectrically derived parameter values were regressed to develop the MVR relationship model by applying linear regression and GIS techniques. The sensitivity analysis results of the MVR model evaluated at P ⩽ 0.05 for the predictors ρ, D and λ provided values of 2.68 × 10^-5, 2 × 10^-2 and 2.09 × 10^-6, respectively. The accuracy and predictive power tests conducted on the MVR model using the Theil inequality coefficient measurement approach, coupled with the sensitivity analysis results, confirmed the model's yield rate estimation and prediction capability. The MVR borehole yield prediction model estimates were processed in a GIS environment to model an aquifer yield potential prediction map of the area. The information on the prediction map can serve as a scientific basis for predicting aquifer yield potential rates relevant in groundwater resources sustainability management. The developed MVR borehole yield rate prediction model provides a good alternative to other methods used for this purpose.
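
    The Theil inequality coefficient used for the accuracy test can be sketched as below. The abstract does not say which variant was applied, so this assumes the common bounded form, U = RMSE / (RMS of predictions + RMS of observations), which lies between 0 (perfect fit) and 1. The yield values and names are hypothetical.

```python
import math

def theil_u(actual, predicted):
    """Theil inequality coefficient in its bounded form (0 = perfect, 1 = worst)."""
    n = len(actual)
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
    rms_pred = math.sqrt(sum(p ** 2 for p in predicted) / n)
    rms_act = math.sqrt(sum(a ** 2 for a in actual) / n)
    return rmse / (rms_pred + rms_act)

# Hypothetical observed and model-estimated borehole yield rates.
observed = [1.0, 2.0, 3.0]
estimated = [1.1, 1.9, 3.2]
print(theil_u(observed, estimated))
```

    Values near zero indicate that estimated yields track observed yields closely, which is the sense in which such a coefficient supports a model's predictive capability.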

  3. Hierarchical Hidden Markov Models for Multivariate Integer-Valued Time-Series

    DEFF Research Database (Denmark)

    Catania, Leopoldo; Di Mari, Roberto

    2018-01-01

    We propose a new flexible dynamic model for multivariate nonnegative integer-valued time-series. Observations are assumed to depend on the realization of two additional unobserved integer-valued stochastic variables which control for the time- and cross-dependence of the data. An Expectation-Maximization algorithm for maximum likelihood estimation of the model's parameters is derived. We provide conditional and unconditional (cross)-moments implied by the model, as well as the limiting distribution of the series. A Monte Carlo experiment investigates the finite sample properties of our estimation...

  4. Hierarchical Bayesian modelling of mobility metrics for hazard model input calibration

    Science.gov (United States)

    Calder, Eliza; Ogburn, Sarah; Spiller, Elaine; Rutarindwa, Regis; Berger, Jim

    2015-04-01

    In this work we present a method to constrain flow mobility input parameters for pyroclastic flow models using hierarchical Bayes modeling of standard mobility metrics such as H/L and flow volume. The advantage of hierarchical modeling is that it can leverage the information in a global dataset for a particular mobility metric in order to reduce the uncertainty in modeling an individual volcano, which is especially important where individual volcanoes have only sparse datasets. We use compiled pyroclastic flow runout data from Colima, Merapi, Soufriere Hills, Unzen and Semeru volcanoes, presented in an open-source database FlowDat (https://vhub.org/groups/massflowdatabase). While the exact relationship between flow volume and friction varies somewhat between volcanoes, dome collapse flows originating from the same volcano exhibit similar mobility relationships. Instead of fitting separate regression models for each volcano dataset, we use a variation of the hierarchical linear model (Kass and Steffey, 1989). The model has a hierarchical structure with two levels: all dome collapse flows, and dome collapse flows at specific volcanoes. The hierarchical model allows us to assume that the flows at specific volcanoes share a common distribution of regression slopes, then solves for that distribution. We present comparisons of the 95% confidence intervals on the individual regression lines for the data set from each volcano as well as those obtained from the hierarchical model. The results clearly demonstrate the advantage of considering global datasets using this technique. The technique developed is demonstrated here for mobility metrics, but can be applied to many other global datasets of volcanic parameters. In particular, such methods can provide a means to better constrain parameters for volcanoes for which we only have sparse data, a ubiquitous problem in volcanology.
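
    The way a shared distribution of slopes reduces uncertainty for sparsely sampled volcanoes can be illustrated with a normal-normal shrinkage sketch. This is a simplification under stated assumptions: the population mean and spread of the slopes are treated as known, whereas a full hierarchical Bayes analysis would estimate them, and all numbers and names are hypothetical.

```python
def shrink_slope(slope, se, pop_mean, pop_sd):
    """Posterior mean of one volcano's slope under a normal-normal model.

    slope, se : the volcano's own regression slope and its standard error
    pop_mean, pop_sd : assumed mean and spread of slopes across volcanoes
    """
    w_data = 1.0 / se ** 2       # precision of the volcano's own estimate
    w_pop = 1.0 / pop_sd ** 2    # precision of the shared slope distribution
    return (w_data * slope + w_pop * pop_mean) / (w_data + w_pop)

# Hypothetical slopes: same raw estimate, different amounts of data behind it.
sparse_volcano = shrink_slope(slope=2.0, se=1.0, pop_mean=1.0, pop_sd=1.0)
well_sampled_volcano = shrink_slope(slope=2.0, se=0.1, pop_mean=1.0, pop_sd=1.0)
print(sparse_volcano, well_sampled_volcano)
```

    The sparsely sampled volcano is pulled strongly toward the global mean, while the well-sampled one stays close to its own estimate.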

  5. New strategy for determination of anthocyanins, polyphenols and antioxidant capacity of Brassica oleracea liquid extract using infrared spectroscopies and multivariate regression

    Science.gov (United States)

    de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.

    2018-04-01

    A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for building the PLS models. This is the first time this strategy has been applied to building multivariate regression models. Reference analyses and spectral data were obtained from the diluted extracts. The properties determined were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by the ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 provided more predictive models than PLS-2 regression. The PLS-OPS and PLS-GA models presented excellent prediction results, with correlation coefficients higher than 0.98. However, the best models were obtained using PLS with variable selection by the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. These models thus provide a simple, rapid and accurate method for determining the antioxidant properties of red cabbage extract and its suitability for use in the food industry.

  6. Simultaneous chemometric determination of pyridoxine hydrochloride and isoniazid in tablets by multivariate regression methods.

    Science.gov (United States)

    Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru

    2010-08-01

    The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set of 20 randomly chosen binary mixtures of PYR and ISO was prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and the absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. These results strongly encourage the use of both methods for the quality control and routine analysis of marketed tablets containing PYR and ISO. Copyright © 2010 John Wiley & Sons, Ltd.

  7. Stock price forecasting for companies listed on Tehran stock exchange using multivariate adaptive regression splines model and semi-parametric splines technique

    Science.gov (United States)

    Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad

    2015-11-01

    One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used the multivariate adaptive regression splines (MARS) model and the semi-parametric splines technique for predicting stock price in this study. The MARS model, as a nonparametric method, is an adaptive method for regression and is suited to problems with high dimensions and several variables. The semi-parametric splines technique was also used in this study; smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and using the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as variables influencing the prediction of stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS Forecast and P/E Ratio) were selected as variables effective in forecasting stock prices.
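
    MARS builds its fit from pairs of hinge functions max(0, x - t) and max(0, t - x) placed at data-driven knots t. The sketch below shows only that building block for a single predictor: it searches candidate knots and keeps the one with the smallest residual sum of squares. It is not the full forward/backward MARS procedure and not the model from this study; the data and all function names are made up for illustration.

```python
def hinge_design(x, knot):
    """Design matrix columns: intercept, max(0, x - knot), max(0, knot - x)."""
    return [[1.0, max(0.0, xi - knot), max(0.0, knot - xi)] for xi in x]

def least_squares(A, y):
    """Solve the normal equations A'A b = A'y by Gauss-Jordan elimination."""
    k = len(A[0])
    M = [[sum(r[i] * r[j] for r in A) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(A, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(k):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][k] / M[i][i] for i in range(k)]

def best_single_knot(x, y, candidates):
    """Return (rss, knot, coefficients) for the best hinge pair."""
    best = None
    for knot in candidates:
        A = hinge_design(x, knot)
        beta = least_squares(A, y)
        rss = sum((yi - sum(b * a for b, a in zip(beta, row))) ** 2
                  for row, yi in zip(A, y))
        if best is None or rss < best[0]:
            best = (rss, knot, beta)
    return best

# Made-up piecewise-linear data with a kink at x = 4.
x = [0.5 * i for i in range(21)]
y = [1.0 + 2.0 * max(0.0, xi - 4.0) - 0.5 * max(0.0, 4.0 - xi) for xi in x]

# Only interior points are tried so that both hinge columns are non-degenerate.
rss, knot, beta = best_single_knot(x, y, candidates=x[2:-2])
```

    A full MARS fit repeats this kind of search over all predictors, adds interaction terms, and then prunes basis functions with a generalized cross-validation criterion.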

  8. Modelling lecturer performance index of private university in Tulungagung by using survival analysis with multivariate adaptive regression spline

    Science.gov (United States)

    Hasyim, M.; Prastyo, D. D.

    2018-03-01

    Survival analysis models the relationship between independent variables and survival time as the dependent variable. In practice, not all survival data can be recorded completely, for various reasons; such data are called censored data. Moreover, several models for survival analysis require assumptions. One approach in survival analysis is the nonparametric approach, which imposes more relaxed assumptions. In this research, the nonparametric approach employed is multivariate adaptive regression splines (MARS). This study aims to measure the performance of private university lecturers. The survival time in this study is the duration needed by a lecturer to obtain their professional certificate. The results show that research activity is a significant factor, along with developing course materials, good publication in international or national journals, and activity in research collaboration.

  9. Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.

    Science.gov (United States)

    Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo

    2017-09-01

    Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The resulting discriminative yet compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results have shown that the proposed SDL can achieve high multivariate estimation accuracy on all tasks and largely outperforms state-of-the-art algorithms. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.

  10. Prediction of road accidents: A Bayesian hierarchical approach.

    Science.gov (United States)

    Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H

    2013-03-01

    In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any
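
    Step (1), gamma-updating of occurrence rates, is assumed here to mean the standard conjugate Gamma-Poisson update: with a Gamma(shape, rate) prior on the accident rate and a Poisson count observed over a known exposure, the posterior is again a Gamma distribution. The sketch below shows only that textbook update; the prior values, the exposure unit, and the function names are hypothetical and are not taken from the paper.

```python
def gamma_update(shape, rate, count, exposure):
    """Conjugate update for a Poisson rate with a Gamma(shape, rate) prior.

    count    : number of accidents observed on the segment
    exposure : amount of exposure over which they were observed
               (for example million vehicle-kilometres; unit is an assumption)
    """
    return shape + count, rate + exposure

def gamma_mean(shape, rate):
    """Mean of a Gamma distribution in the shape/rate parameterization."""
    return shape / rate

# Hypothetical prior: mean rate 0.5 accidents per unit of exposure.
prior_shape, prior_rate = 2.0, 4.0

# Hypothetical observation: 3 injury accidents over 2.0 units of exposure.
post_shape, post_rate = gamma_update(prior_shape, prior_rate, count=3, exposure=2.0)
posterior_mean = gamma_mean(post_shape, post_rate)
```

    The posterior mean lies between the prior mean and the observed rate (3 / 2.0), and moves toward the observed rate as exposure accumulates.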

  11. Simultaneous determination of estrogens (ethinylestradiol and norgestimate) concentrations in human and bovine serum albumin by use of fluorescence spectroscopy and multivariate regression analysis.

    Science.gov (United States)

    Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O

    2016-05-15

    The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in increase in emission characteristics of HSA and BSA and a significant blue spectra shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R² > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation
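
    The RMS%RE figure quoted above can be computed as sketched below, assuming the usual definition: the percent relative error of each validation sample is squared, averaged, and square-rooted. The concentrations in the example are invented, and the function name is hypothetical.

```python
import math

def rms_percent_relative_error(actual, predicted):
    """Root-mean-square percent relative error over paired values.

    Assumes every actual value is non-zero.
    """
    percent_errors = [100.0 * (p - a) / a for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in percent_errors) / len(percent_errors))

# Invented validation values: errors of +10% and -5%.
actual = [1.0, 2.0]
predicted = [1.1, 1.9]
rms_re = rms_percent_relative_error(actual, predicted)
```

    Because errors are squared before averaging, one badly predicted sample raises RMS%RE more than several small errors of the same total size.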

  12. Proposing a Hierarchical Utility Package with Reference to Mobile Advertising

    OpenAIRE

    Shalini N. Tripathi; Masood H. Siddiqui

    2011-01-01

    Mobile advertising is a powerful tool for direct and interactive marketing. However, effective marketing requires examining consumers’ psyche. This study proposes a hierarchical utility package (in the consumers’ perception) with reference to mobile advertising, thus enhancing its acceptance. Confirmatory factor analysis revealed four consolidated utility dimensions (with reference to mobile advertising). Binary logistic regression was used to create a hierarchical utility package with res...

  13. A comparison between univariate probabilistic and multivariate (logistic regression) methods for landslide susceptibility analysis: the example of the Febbraro valley (Northern Alps, Italy)

    Science.gov (United States)

    Rossi, M.; Apuani, T.; Felletti, F.

    2009-04-01

    The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks crop out. On the eastern part of the studied basin a Quaternary cover, represented by colluvial and, secondarily, by glacial deposits, is dominant. In this study 110 earth flows, mainly located toward the NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m². Both statistical methods require establishing a spatial database, in which each landslide is described by several parameters whose values are taken at the central point of the landslide's main scarp. The spatial database is constructed using a Geographical Information System (GIS). Based on a bibliographic review, a total of 15 predisposing factors were utilized. The width of the intervals into which the maps of the predisposing factors have to be reclassified has been defined assuming constant intervals for: elevation (100 m), slope (5°), solar radiation (0.1 MJ/cm²/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters, the results of the probability-probability plot analysis and the statistical indexes of landslide sites have been used. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷ 27265), Topographic Wetness Index (0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6.00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9

  14. EXPLORATORY DATA ANALYSIS AND MULTIVARIATE STRATEGIES FOR REVEALING MULTIVARIATE STRUCTURES IN CLIMATE DATA

    Directory of Open Access Journals (Sweden)

    2016-12-01

    This paper is on data analysis strategy in a complex, multidimensional, and dynamic domain. The focus is on the use of data mining techniques to explore the importance of multivariate structures, using climate variables which influence climate change. Techniques involved in a data mining exercise vary according to the data structures. The multivariate analysis strategy considered here involved choosing an appropriate tool to analyze a process. Factor analysis is introduced into the data mining technique in order to reveal the influencing impacts of the factors involved as well as to address the multicollinearity effect among the variables. The temporal nature and multidimensionality of the target variables is revealed in the model using multidimensional regression estimates. The strategy of integrating several statistical techniques, using climate variables in Nigeria, was employed. An R² of 0.518 was obtained from the ordinary least squares regression analysis carried out and the test was not significant at the 5% level of significance. However, the factor analysis regression strategy gave a good fit with an R² of 0.811 and the test was significant at the 5% level of significance. Based on this study, model building should go beyond the usual confirmatory data analysis (CDA); rather, it should be complemented with exploratory data analysis (EDA) in order to achieve a desired result.

  15. Online Monitoring of Copper Damascene Electroplating Bath by Voltammetry: Selection of Variables for Multiblock and Hierarchical Chemometric Analysis of Voltammetric Data

    Directory of Open Access Journals (Sweden)

    Aleksander Jaworski

    2017-01-01

    The Real Time Analyzer (RTA) utilizing DC- and AC-voltammetric techniques is an in situ, online monitoring system that provides a complete chemical analysis of different electrochemical deposition solutions. The RTA employs multivariate calibration when predicting concentration parameters from a multivariate data set. Although the hierarchical and multiblock Principal Component Regression (PCR)- and Partial Least Squares (PLS)-based methods can handle data sets even when the number of variables significantly exceeds the number of samples, it can be advantageous to reduce the number of variables to obtain improvement of the model predictions and better interpretation. This presentation focuses on the introduction of a multistep, rigorous method of data-selection-based Least Squares Regression, Simple Modeling of Class Analogy modeling power, and, as a novel application in electroanalysis, Uninformative Variable Elimination by PLS and by PCR, Variable Importance in the Projection coupled with PLS, Interval PLS, Interval PCR, and Moving Window PLS. Selection criteria of the optimum decomposition technique for the specific data are also demonstrated. The chief goal of this paper is to introduce to the community of electroanalytical chemists numerous variable selection methods which are well established in spectroscopy and can be successfully applied to voltammetric data analysis.

  16. Hierarchical species distribution models

    Science.gov (United States)

    Hefley, Trevor J.; Hooten, Mevin B.

    2016-01-01

    Determining the distribution pattern of a species is important to increase scientific knowledge, inform management decisions, and conserve biodiversity. To infer spatial and temporal patterns, species distribution models have been developed for use with many sampling designs and types of data. Recently, it has been shown that count, presence-absence, and presence-only data can be conceptualized as arising from a point process distribution. Therefore, it is important to understand properties of the point process distribution. We examine how the hierarchical species distribution modeling framework has been used to incorporate a wide array of regression and theory-based components while accounting for the data collection process and making use of auxiliary information. The hierarchical modeling framework allows us to demonstrate how several commonly used species distribution models can be derived from the point process distribution, highlight areas of potential overlap between different models, and suggest areas where further research is needed.

  17. A MULTIVARIATE ANALYSIS OF CROATIAN COUNTIES ENTREPRENEURSHIP

    Directory of Open Access Journals (Sweden)

    Elza Jurun

    2012-12-01

    The focus of this paper is a multivariate analysis of entrepreneurship in Croatian counties. The complete database made available by official statistical institutions at the national and regional level is used. Modern econometric methodology, ranging from a comparative analysis via multiple regression to multivariate cluster analysis, is carried out, as well as the analysis of successful or inefficacious entrepreneurship measured by indicators of efficiency, profitability and productivity. The time horizons of the comparative analysis are 2004 and 2010. In the multiple regression model, the accelerators of socio-economic development (number of entrepreneur investors, investment in fixed assets, and current assets ratio) are analytically filtered from twenty-six independent variables as the variables with dominant influence on GDP per capita in 2010, the dependent variable. Results of the multivariate cluster analysis of twenty-one Croatian counties are also interpreted in the sense of the three Croatian NUTS 2 regions according to the European nomenclature of regional territorial division of Croatia.

  18. Application of multivariate adaptive regression spline-assisted objective function on optimization of heat transfer rate around a cylinder

    Energy Technology Data Exchange (ETDEWEB)

    Dey, Prasenjit; Dad, Ajoy K. [Mechanical Engineering Department, National Institute of Technology, Agartala (India)

    2016-12-15

    The present study aims to predict the heat transfer characteristics around a square cylinder with different corner radii using multivariate adaptive regression splines (MARS). Further, the MARS-generated objective function is optimized by particle swarm optimization. The data for the prediction are taken from the recently published article by the present authors [P. Dey, A. Sarkar, A.K. Das, Development of GEP and ANN model to predict the unsteady forced convection over a cylinder, Neural Comput. Appl. (2015)]. Further, the MARS model is compared with artificial neural network and gene expression programming. It has been found that the MARS model is very efficient in predicting the heat transfer characteristics. It has also been found that MARS is more efficient than artificial neural network and gene expression programming in predicting the forced convection data, and also particle swarm optimization can efficiently optimize the heat transfer rate.

  19. Analysis of multi-species point patterns using multivariate log Gaussian Cox processes

    DEFF Research Database (Denmark)

    Waagepetersen, Rasmus; Guan, Yongtao; Jalilian, Abdollah

    Multivariate log Gaussian Cox processes are flexible models for multivariate point patterns. However, they have so far only been applied in bivariate cases. In this paper we move beyond the bivariate case in order to model multi-species point patterns of tree locations. In particular we address the problems of identifying parsimonious models and of extracting biologically relevant information from the fitted models. The latent multivariate Gaussian field is decomposed into components given in terms of random fields common to all species and components which are species specific. This allows... of the data. The selected number of common latent fields provides an index of complexity of the multivariate covariance structure. Hierarchical clustering is used to identify groups of species with similar patterns of dependence on the common latent fields.

  20. A Model for Shovel Capital Cost Estimation, Using a Hybrid Model of Multivariate Regression and Neural Networks

    Directory of Open Access Journals (Sweden)

    Abdolreza Yazdani-Chamzini

    2017-12-01

    Cost estimation is an essential issue in feasibility studies in civil engineering. Many different methods can be applied to modelling costs. These methods can be divided into several main groups: (1) artificial intelligence, (2) statistical methods, and (3) analytical methods. In this paper, the multivariate regression (MVR) method, which is one of the most popular linear models, and the artificial neural network (ANN) method, which is widely applied to solving different prediction problems with a high degree of accuracy, have been combined to provide a cost estimate model for a shovel machine. This hybrid methodology is proposed, taking the advantages of MVR and ANN models in linear and nonlinear modelling, respectively. In the proposed model, the unique advantages of the MVR model in linear modelling are used first to recognize the existing linear structure in data, and, then, the ANN for determining nonlinear patterns in preprocessed data is applied. The results with three indices indicate that the proposed model is efficient and capable of increasing the prediction accuracy.

  1. A multivariate decision tree analysis of biophysical factors in tropical forest fire occurrence

    Science.gov (United States)

    Rey S. Ofren; Edward Harvey

    2000-01-01

    A multivariate decision tree model was used to quantify the relative importance of complex hierarchical relationships between biophysical variables and the occurrence of tropical forest fires. The study site is the Huai Kha Khaeng wildlife sanctuary, a World Heritage Site in northwestern Thailand where annual fires are common and particularly destructive. Thematic...

  2. Mixture of Regression Models with Single-Index

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2016-01-01

    In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...

  3. Local bilinear multiple-output quantile/depth regression

    Czech Academy of Sciences Publication Activity Database

    Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav

    2015-01-01

    Vol. 21, No. 3 (2015), pp. 1435-1466 ISSN 1350-7265 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords : conditional depth * growth chart * halfspace depth * local bilinear regression * multivariate quantile * quantile regression * regression depth Subject RIV: BA - General Mathematics Impact factor: 1.372, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf

  4. Applied multivariate statistics with R

    CERN Document Server

    Zelterman, Daniel

    2015-01-01

    This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source, shareware program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...

  5. Prospective surveillance of multivariate spatial disease data

    Science.gov (United States)

    Corberán-Vallet, A

    2012-01-01

    Surveillance systems are often focused on more than one disease within a predefined area. On those occasions when outbreaks of disease are likely to be correlated, the use of multivariate surveillance techniques integrating information from multiple diseases allows us to improve the sensitivity and timeliness of outbreak detection. In this article, we present an extension of the surveillance conditional predictive ordinate to monitor multivariate spatial disease data. The proposed surveillance technique, which is defined for each small area and time period as the conditional predictive distribution of those counts of disease higher than expected given the data observed up to the previous time period, alerts us to both small areas of increased disease incidence and the diseases causing the alarm within each area. We investigate its performance within the framework of Bayesian hierarchical Poisson models using a simulation study. An application to diseases of the respiratory system in South Carolina is finally presented. PMID:22534429

  6. Research on refugees and immigrants social integration in Yunnan Border Area: An empirical analysis on the multivariable linear regression model

    Directory of Open Access Journals (Sweden)

    Peng Nai

    2016-03-01

    A large immigrant population resides permanently in the Yunnan Border Area of China. To some extent, these people count as refugees or immigrants under international rules, which is a significant feature of the social diversity of this area. However, this kind of social diversity often impairs the social order. Therefore, research on local immigrant integration would positively influence local social governance. This essay attempts to acquire data on the living situation of these border-area immigrants and refugees. The social integration of refugees and immigrants in the Yunnan border area of China is analyzed through multivariable linear regression modeling based on these data, in order to propose more achievable solutions.

  7. TYPE Ia SUPERNOVA COLORS AND EJECTA VELOCITIES: HIERARCHICAL BAYESIAN REGRESSION WITH NON-GAUSSIAN DISTRIBUTIONS

    International Nuclear Information System (INIS)

    Mandel, Kaisey S.; Kirshner, Robert P.; Foley, Ryan J.

    2014-01-01

    We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocity (NV) supernovae exhibit significant discrepancies for B – V and B – R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B – V and B – R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of –0.021 ± 0.006 and –0.030 ± 0.009 mag (10³ km s⁻¹)⁻¹ for intrinsic B – V and B – R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color-velocity correlation results in corrections to A_V extinction estimates as large as –0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances.

  8. Determination of sulfamethoxazole and trimethoprim mixtures by multivariate electronic spectroscopy

    OpenAIRE

    Cordeiro, Gilcélia A.; Peralta-Zamora, Patricio; Nagata, Noemi; Pontarollo, Roberto

    2008-01-01

    In this work a multivariate spectroscopic methodology is proposed for quantitative determination of sulfamethoxazole and trimethoprim in pharmaceutical associations. The multivariate model was developed by partial least-squares regression, using twenty synthetic mixtures and the spectral region between 190 and 350 nm. In the validation stage, which involved the analysis of five synthetic mixtures, prediction errors lower than 3% were observed. The predictive capacity of the multivariate model...

  9. Assessment of Genetic Heterogeneity in Structured Plant Populations Using Multivariate Whole-Genome Regression Models.

    Science.gov (United States)

    Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo

    2015-09-01

    Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.

  10. Multivariate analysis of nystatin and metronidazole in a semi-solid matrix by means of diffuse reflectance NIR spectroscopy and PLS regression.

    Science.gov (United States)

    Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A

    2006-01-23

    A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in antifungal and antibacterial treatment, is usually analyzed quantitatively through microbiological tests (nystatin) and HPLC (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has proven to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.

  11. A land use regression model for ambient ultrafine particles in Montreal, Canada: A comparison of linear regression and a machine learning approach.

    Science.gov (United States)

    Weichenthal, Scott; Ryswyk, Keith Van; Goldstein, Alon; Bagg, Scott; Shekkarizfard, Maryam; Hatzopoulou, Marianne

    2016-04-01

    Existing evidence suggests that ambient ultrafine particles (UFPs) (regression model for UFPs in Montreal, Canada using mobile monitoring data collected from 414 road segments during the summer and winter months between 2011 and 2012. Two different approaches were examined for model development including standard multivariable linear regression and a machine learning approach (kernel-based regularized least squares (KRLS)) that learns the functional form of covariate impacts on ambient UFP concentrations from the data. The final models included parameters for population density, ambient temperature and wind speed, land use parameters (park space and open space), length of local roads and rail, and estimated annual average NOx emissions from traffic. The final multivariable linear regression model explained 62% of the spatial variation in ambient UFP concentrations whereas the KRLS model explained 79% of the variance. The KRLS model performed slightly better than the linear regression model when evaluated using an external dataset (R(2)=0.58 vs. 0.55) or a cross-validation procedure (R(2)=0.67 vs. 0.60). In general, our findings suggest that the KRLS approach may offer modest improvements in predictive performance compared to standard multivariable linear regression models used to estimate spatial variations in ambient UFPs. However, differences in predictive performance were not statistically significant when evaluated using the cross-validation procedure. Crown Copyright © 2015. Published by Elsevier Inc. All rights reserved.

  12. imDEV: a graphical user interface to R multivariate analysis tools in Microsoft Excel

    Science.gov (United States)

    Grapov, Dmitry; Newman, John W.

    2012-01-01

    Summary: Interactive modules for Data Exploration and Visualization (imDEV) is a Microsoft Excel spreadsheet embedded application providing an integrated environment for the analysis of omics data through a user-friendly interface. Individual modules enable interactive and dynamic analyses of large data by interfacing R's multivariate statistics and highly customizable visualizations with the spreadsheet environment, aiding robust inferences and generating information-rich data visualizations. This tool provides access to multiple comparisons with false discovery correction, hierarchical clustering, principal and independent component analyses, partial least squares regression and discriminant analysis, through an intuitive interface for creating high-quality two- and three-dimensional visualizations including scatter plot matrices, distribution plots, dendrograms, heat maps, biplots, trellis biplots and correlation networks. Availability and implementation: Freely available for download at http://sourceforge.net/projects/imdev/. Implemented in R and VBA and supported by Microsoft Excel (2003, 2007 and 2010). Contact: John.Newman@ars.usda.gov Supplementary Information: Installation instructions, tutorials and users manual are available at http://sourceforge.net/projects/imdev/. PMID:22815358

  13. Nonparametric Regression Estimation for Multivariate Null Recurrent Processes

    Directory of Open Access Journals (Sweden)

    Biqing Cai

    2015-04-01

    This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in the presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process, and that the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of the Federal funds rate with 3-month and 5-year T-bill rates and discover nonlinearity in the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than that of the linear model.
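
    The bandwidth selection step mentioned above can be sketched in a few lines. This is only an illustration under simplifying assumptions: a univariate regressor, a Gaussian kernel, and a Nadaraya-Watson mean estimator, none of which the abstract specifies. The function names and the synthetic data are hypothetical.

```python
import math

def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel.

    `skip` is the index of one observation to leave out (used for LOO-CV).
    """
    num = den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = math.exp(-0.5 * ((x0 - x) / h) ** 2)
        num += w * y
        den += w
    return num / den if den > 0 else float("nan")

def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    errs = [(ys[i] - nw_estimate(xs[i], xs, ys, h, skip=i)) ** 2
            for i in range(len(xs))]
    return sum(errs) / len(errs)

def select_bandwidth(xs, ys, candidates):
    """Pick the candidate bandwidth with the smallest LOO-CV score."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))

# Synthetic, noise-free data (an assumption made purely for illustration).
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) for x in xs]
best_h = select_bandwidth(xs, ys, [0.05, 0.2, 1.0, 5.0])
```

    Leaving each observation out of its own prediction keeps the criterion from collapsing to the smallest bandwidth, which would otherwise reproduce each observation exactly.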

  14. Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]

    Science.gov (United States)

    Warner, Rebecca M.

    2007-01-01

    This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…

  15. The Realized Hierarchical Archimedean Copula in Risk Modelling

    Directory of Open Access Journals (Sweden)

    Ostap Okhrin

    2017-06-01

    This paper introduces the concept of the realized hierarchical Archimedean copula (rHAC). The proposed approach inherits the ability of the copula to capture the dependencies among financial time series, and combines it with additional information contained in high-frequency data. The considered model does not suffer from the curse of dimensionality, and is able to accurately predict high-dimensional distributions. This flexibility is obtained by using a hierarchical structure in the copula. The time variability of the model is provided by daily forecasts of the realized correlation matrix, which is used to estimate the structure and the parameters of the rHAC. Extensive simulation studies show the validity of the estimator based on this realized correlation matrix, and its performance in comparison to the benchmark models. The application of the estimator to one-day-ahead Value at Risk (VaR) prediction using high-frequency data exhibits good forecasting properties for a multivariate portfolio.

  16. Trochanteric entry femoral nails yield better femoral version and lower revision rates-A large cohort multivariate regression analysis.

    Science.gov (United States)

    Yoon, Richard S; Gage, Mark J; Galos, David K; Donegan, Derek J; Liporace, Frank A

    2017-06-01

    Intramedullary nailing (IMN) has become the standard of care for the treatment of most femoral shaft fractures. Different IMN options include trochanteric and piriformis entry as well as retrograde nails, which may result in varying degrees of femoral rotation. The objective of this study was to analyze postoperative femoral version among three types of nails and to delineate any significant differences in femoral version (DFV) and revision rates. Over a 10-year period, 417 patients underwent IMN of a diaphyseal femur fracture (AO/OTA 32A-C). Of these patients, 316 met inclusion criteria and obtained postoperative computed tomography (CT) scanograms to calculate femoral version and were thus included in the study. In this study, our main outcome measure was the difference in femoral version (DFV) between the uninjured limb and the injured limb. The effects of the following variables on DFV and revision rates were determined via univariate, multivariate, and ordinal regression analyses: gender, age, BMI, ethnicity, mechanism of injury, operative side, open fracture, and table type/position. Statistical significance was set at pregression analysis revealed that a lower BMI was significantly associated with a lower DFV (p=0.006). Controlling for possible covariables, multivariate analysis yielded a significantly lower DFV for trochanteric entry nails than piriformis or retrograde nails (7.9±6.10 vs. 9.5±7.4 vs. 9.4±7.8°, pregression analysis. However, this is not to state that the other nail types exhibited abnormal DFV. Translation to the clinical impact of a few degrees of DFV is also unknown. Future studies that examine the intricacies of femoral version in more depth may lead to improved technology in addition to potentially improved clinical outcomes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  17. A Hierarchical Multivariate Bayesian Approach to Ensemble Model output Statistics in Atmospheric Prediction

    Science.gov (United States)

    2017-09-01

    application of statistical inference. Even when human forecasters leverage their professional experience, which is often gained through long periods of... application throughout statistics and Bayesian data analysis. The multivariate form of 2( , )  (e.g., Figure 12) is similarly analytically...data (i.e., no systematic manipulations with analytical functions), it is common in the statistical literature to apply mathematical transformations

  18. Vector regression introduced

    Directory of Open Access Journals (Sweden)

    Mok Tik

    2014-06-01

    This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
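
    The idea of encoding planar vectors as complex numbers and estimating vector-valued coefficients by least squares can be illustrated with a minimal sketch. It assumes a single explanatory vector variable and solves the problem directly in complex arithmetic, rather than through the real-valued transformation the abstract describes. The function name and the data are hypothetical.

```python
def fit_complex_line(xs, ys):
    """Least-squares fit of y = a + b * x with complex a, b, x, y.

    Each planar vector (u, v) is encoded as the complex number u + v*1j,
    so the estimated coefficients a and b are themselves planar vectors.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Hermitian cross-product and squared norm of the centered predictor.
    sxy = sum((x - x_bar).conjugate() * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum(abs(x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical noise-free data generated from known coefficients.
xs = [complex(1, 0), complex(0, 2), complex(-1, 1), complex(3, -2)]
true_a, true_b = complex(1, 2), complex(0.5, -1)
ys = [true_a + true_b * x for x in xs]
a, b = fit_complex_line(xs, ys)
```

    Multiplying by the complex coefficient `b` rotates and scales the explanatory vector, which is what distinguishes this model from one with scalar coefficients.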

  19. Local Strategy Combined with a Wavelength Selection Method for Multivariate Calibration

    Directory of Open Access Journals (Sweden)

    Haitao Chang

    2016-06-01

    One of the essential factors influencing the prediction accuracy of multivariate calibration models is the quality of the calibration data. A local regression strategy, together with a wavelength selection approach, is proposed to build the multivariate calibration models based on partial least squares regression. The local algorithm is applied to create a calibration set of spectra similar to the spectrum of an unknown sample; the synthetic degree of grey relation coefficient is used to evaluate the similarity. A wavelength selection method based on simple-to-use interactive self-modeling mixture analysis minimizes the influence of noisy variables, and the most informative variables of the most similar samples are selected to build the multivariate calibration model based on partial least squares regression. To validate the performance of the proposed method, ultraviolet-visible absorbance spectra of mixed solutions of food coloring analytes in a concentration range of 20–200 µg/mL were measured. Experimental results show that the proposed method can not only enhance the prediction accuracy of the calibration model, but also greatly reduce its complexity.

  20. A New Predictive Model Based on the ABC Optimized Multivariate Adaptive Regression Splines Approach for Predicting the Remaining Useful Life in Aircraft Engines

    Directory of Open Access Journals (Sweden)

    Paulino José García Nieto

    2016-05-01

    Remaining useful life (RUL) estimation is considered one of the most central points in prognostics and health management (PHM). The present paper describes a nonlinear hybrid ABC–MARS-based model for the prediction of the remaining useful life of aircraft engines. Indeed, it is well known that an accurate RUL estimation allows failure prevention in a more controllable way so that effective maintenance can be carried out in appropriate time to correct impending faults. The proposed hybrid model combines multivariate adaptive regression splines (MARS), which have been successfully adopted for regression problems, with the artificial bee colony (ABC) technique. This optimization technique involves parameter setting in the MARS training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been successfully predicted here by using the hybrid ABC–MARS-based model from the remaining measured parameters (input variables) for aircraft engines. A correlation coefficient equal to 0.92 was obtained when this hybrid ABC–MARS-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. The main advantage of this predictive model is that it does not require information about the previous operation states of the aircraft engine.

  1. Clinical, laboratory, and demographic determinants of hospitalization due to dengue in 7613 patients: A retrospective study based on hierarchical models.

    Science.gov (United States)

    da Silva, Natal Santos; Undurraga, Eduardo A; da Silva Ferreira, Elis Regina; Estofolete, Cássia Fernanda; Nogueira, Maurício Lacerda

    2018-01-01

    In Brazil, the incidence of hospitalization due to dengue, as an indicator of severity, has drastically increased since 1998. The objective of our study was to identify risk factors associated with subsequent hospitalization related to dengue. We analyzed 7613 dengue cases confirmed via serology (ELISA), non-structural protein 1, or polymerase chain reaction amplification. We used a hierarchical framework to generate a multivariate logistic regression based on a variety of risk variables. This was followed by multiple statistical analyses to assess hierarchical model accuracy, variance, goodness of fit, and whether or not this model reliably represented the population. The final model, which included age, sex, ethnicity, previous dengue infection, hemorrhagic manifestations, plasma leakage, and organ failure, showed that all measured parameters, with the exception of previous dengue, were statistically significant. The presence of organ failure was associated with the highest risk of subsequent dengue hospitalization (OR=5·75; CI=3·53-9·37). Therefore, plasma leakage and organ failure were the main indicators of hospitalization due to dengue, although other variables of minor importance should also be considered when referring dengue patients to hospital treatment, which may lead to a reduction in avoidable deaths as well as costs related to dengue. Copyright © 2017 Elsevier B.V. All rights reserved.
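
    For readers unfamiliar with how an odds ratio and its confidence interval are read, the following sketch computes an unadjusted odds ratio with a 95% Wald interval from a 2x2 table. It is not the study's method: the abstract reports estimates from a multivariable logistic regression, and the counts below are invented purely for illustration.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed and hospitalized      b: exposed, not hospitalized
    c: unexposed and hospitalized    d: unexposed, not hospitalized
    """
    or_hat = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper

# Hypothetical counts, not taken from the study.
or_hat, lower, upper = odds_ratio_wald(20, 80, 10, 90)
```

    An interval that excludes 1, as the reported 3·53-9·37 does, indicates an association that is statistically significant at the corresponding level.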

  2. Linear models of coregionalization for multivariate lattice data: Order-dependent and order-free cMCARs.

    Science.gov (United States)

    MacNab, Ying C

    2016-08-01

    This paper is concerned with multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressive models is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on a spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.

  3. Using Multivariate Adaptive Regression Spline and Artificial Neural Network to Simulate Urbanization in Mumbai, India

    Science.gov (United States)

    Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.

    2015-12-01

    Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure, or the model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS in simulating urban areas in Mumbai, India.
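
    The area under the ROC curve used for the comparison above can be computed without plotting the curve, as the fraction of (changed, unchanged) cell pairs that a model ranks correctly. The sketch below uses that pairwise definition; the function name and the toy labels and scores are hypothetical and unrelated to the Mumbai data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a positive case (e.g. a cell that urbanized), 0 otherwise.
    scores: model outputs, where a higher score means "more likely positive".
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ranked correctly
```

    The double loop is quadratic in the number of cases, so a rank-based formula would be preferable for raster-sized data sets; the pairwise form is shown because it makes the interpretation explicit.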

  4. Using Apparent Density of Paper from Hardwood Kraft Pulps to Predict Sheet Properties, based on Unsupervised Classification and Multivariable Regression Techniques

    Directory of Open Access Journals (Sweden)

    Ofélia Anjos

    2015-07-01

    Paper properties determine the product application potential and depend on the raw material, pulping conditions, and pulp refining. The aim of this study was to construct mathematical models that predict quantitative relations between the paper density and various mechanical and optical properties of the paper. A dataset of properties of paper handsheets produced with pulps of Acacia dealbata, Acacia melanoxylon, and Eucalyptus globulus beaten at 500, 2500, and 4500 revolutions was used. Unsupervised classification techniques were combined to assess the need for separate prediction models for each species, and multivariable regression techniques were used to establish such prediction models. It was possible to develop models with a high goodness of fit using paper density as the independent variable (or predictor) for all variables except tear index and zero-span tensile strength, both dry and wet.

  5. On directional multiple-output quantile regression

    Czech Academy of Sciences Publication Activity Database

    Paindaveine, D.; Šiman, Miroslav

    2011-01-01

    Vol. 102, No. 2 (2011), pp. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others: Commission EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords: multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at-risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf

  6. Elliptical multiple-output quantile regression and convex optimization

    Czech Academy of Sciences Publication Activity Database

    Hallin, M.; Šiman, Miroslav

    2016-01-01

    Vol. 109, No. 1 (2016), pp. 232-237 ISSN 0167-7152 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: quantile regression * elliptical quantile * multivariate quantile * multiple-output regression Subject RIV: BA - General Mathematics Impact factor: 0.540, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/siman-0458243.pdf

  7. Unsupervised classification of multivariate geostatistical data: Two algorithms

    Science.gov (United States)

    Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques

    2015-12-01

    With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, describe a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
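
    The first algorithm's proximity condition, under which two clusters may merge only if a graph edge connects them, can be sketched as follows. This is a toy version under stated assumptions: a single variable per sample, a user-supplied neighbour graph, and a linkage criterion based on the difference of cluster means. The abstract does not specify the authors' criterion, and all names here are hypothetical.

```python
def constrained_agglomerative(values, edges, n_clusters):
    """Agglomerative clustering in which only graph-adjacent clusters may merge.

    values: one measurement per sample.
    edges: pairs (i, j) of sample indices that are spatial neighbours.
    Merging stops at n_clusters, or earlier if no adjacent clusters remain.
    """
    clusters = {i: [i] for i in range(len(values))}
    links = {frozenset(e) for e in edges if e[0] != e[1]}

    def mean(c):
        members = clusters[c]
        return sum(values[i] for i in members) / len(members)

    while len(clusters) > n_clusters and links:
        # Most similar pair among clusters that share at least one edge.
        a, b = min((tuple(sorted(link)) for link in links),
                   key=lambda p: abs(mean(p[0]) - mean(p[1])))
        clusters[a].extend(clusters.pop(b))
        # Redirect edges of the absorbed cluster b to a; drop self-links.
        relinked = set()
        for link in links:
            moved = frozenset(a if c == b else c for c in link)
            if len(moved) == 2:
                relinked.add(moved)
        links = relinked
    return sorted(sorted(c) for c in clusters.values())

# Six samples along a line; sample 5 resembles samples 0-2 but is far from them.
values = [1.0, 1.1, 0.9, 5.0, 5.2, 1.0]
chain = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
result = constrained_agglomerative(values, chain, 3)
```

    In this toy run, sample 5 stays in its own cluster even though its value matches samples 0 to 2, because no edge connects it to them. That is the spatial coherence an unconstrained method would not guarantee.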

  8. Determining Predictor Importance in Hierarchical Linear Models Using Dominance Analysis

    Science.gov (United States)

    Luo, Wen; Azen, Razia

    2013-01-01

    Dominance analysis (DA) is a method used to evaluate the relative importance of predictors that was originally proposed for linear regression models. This article proposes an extension of DA that allows researchers to determine the relative importance of predictors in hierarchical linear models (HLM). Commonly used measures of model adequacy in…

  9. Modelling daily dissolved oxygen concentration using least square support vector machine, multivariate adaptive regression splines and M5 model tree

    Science.gov (United States)

    Heddam, Salim; Kisi, Ozgur

    2018-04-01

    In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T), are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality variables data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The water quality data selected, consisting of daily measurements of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI, cfs), were used as inputs to the LSSVM, MARS and M5T models. The three models were applied for each station separately and compared to each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model among all others differs from one station to another.

  10. application of multilinear regression analysis in modeling of soil

    African Journals Online (AJOL)

    Windows User

    Accordingly [1, 3] in their work, they applied linear regression ... (MLRA) is a statistical technique that uses several explanatory ... order to check this, they adopted bivariate correlation analysis .... groups, namely A-1 through A-7, based on their relative expected ..... Multivariate Regression in Gorgan Province North of Iran” ...

  11. Semiparametric Allelic Tests for Mapping Multiple Phenotypes: Binomial Regression and Mahalanobis Distance.

    Science.gov (United States)

    Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh

    2015-12-01

    Binary phenotypes commonly arise due to multiple underlying quantitative precursors, and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al.), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with that of the genotype-level test MultiPhen. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data sets and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE.
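
    The quantity underlying DAMP, a Mahalanobis distance between the mean phenotype vectors of two allele groups, can be illustrated with a small sketch. It assumes two phenotypes and a pooled within-group covariance matrix. The abstract does not state which covariance estimate or test statistic the authors use, so this shows the distance only, not their test. Function names and data are hypothetical.

```python
def mean_vec(rows):
    """Column-wise mean of a list of equal-length tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def pooled_cov_2d(g1, g2):
    """Pooled within-group covariance matrix for two groups of 2-D points."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (g1, g2):
        m = mean_vec(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(g1) + len(g2) - 2
    return [[v / dof for v in row] for row in s]

def mahalanobis_sq_2d(g1, g2):
    """Squared Mahalanobis distance between the two group means."""
    m1, m2 = mean_vec(g1), mean_vec(g2)
    (a, b), (c, d) = pooled_cov_2d(g1, g2)
    det = a * d - b * c  # assumed non-zero, i.e. phenotypes not collinear
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [m1[0] - m2[0], m1[1] - m2[1]]
    return sum(diff[i] * inv[i][j] * diff[j]
               for i in range(2) for j in range(2))

# Hypothetical phenotype pairs for carriers of each allele.
allele_a = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
allele_b = [(3.0, 0.0), (5.0, 0.0), (3.0, 2.0), (5.0, 2.0)]
dist_sq = mahalanobis_sq_2d(allele_a, allele_b)
```

    Scaling the mean difference by the inverse covariance is what lets correlated phenotypes be combined into a single measure of separation between the two alleles.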

  12. On the Optimality of Multivariate S-Estimators

    NARCIS (Netherlands)

    Croux, C.; Dehon, C.; Yadine, A.

    2010-01-01

    In this paper we maximize the efficiency of a multivariate S-estimator under a constraint on the breakdown point. In the linear regression model, it is known that the highest possible efficiency of a maximum breakdown S-estimator is bounded above by 33% for Gaussian errors. We prove the surprising

  13. Precision Index in the Multivariate Context

    Czech Academy of Sciences Publication Activity Database

    Šiman, Miroslav

    2014-01-01

    Vol. 43, No. 2 (2014), pp. 377-387 ISSN 0361-0926 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords: data depth * multivariate quantile * process capability index * precision index * regression quantile Subject RIV: BA - General Mathematics Impact factor: 0.274, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/siman-0425059.pdf

  14. Function approximation with polynomial regression splines

    International Nuclear Information System (INIS)

    Urbanski, P.

    1996-01-01

    Principles of the polynomial regression splines as well as algorithms and programs for their computation are presented. The programs prepared using software package MATLAB are generally intended for approximation of the X-ray spectra and can be applied in the multivariate calibration of radiometric gauges. (author)

  15. Robust methods for multivariate data analysis A1

    DEFF Research Database (Denmark)

    Frosch, Stina; Von Frese, J.; Bro, Rasmus

    2005-01-01

    Outliers may hamper proper classical multivariate analysis, and lead to incorrect conclusions. To remedy the problem of outliers, robust methods are developed in statistics and chemometrics. Robust methods reduce or remove the effect of outlying data points and allow the "good" data to primarily...... determine the result. This article reviews the most commonly used robust multivariate regression and exploratory methods that have appeared since 1996 in the field of chemometrics. Special emphasis is put on the robust versions of chemometric standard tools like PCA and PLS and the corresponding robust...

  16. Multilevel Hierarchical Modeling of Benthic Macroinvertebrate Responses to Urbanization in Nine Metropolitan Regions across the Conterminous United States

    Science.gov (United States)

    Kashuba, Roxolana; Cha, YoonKyung; Alameddine, Ibrahim; Lee, Boknam; Cuffney, Thomas F.

    2010-01-01

    Multilevel hierarchical modeling methodology has been developed for use in ecological data analysis. The effect of urbanization on stream macroinvertebrate communities was measured across a gradient of basins in each of nine metropolitan regions across the conterminous United States. The hierarchical nature of this dataset was harnessed in a multi-tiered model structure, predicting both invertebrate response at the basin scale and differences in invertebrate response at the region scale. Ordination site scores, total taxa richness, Ephemeroptera, Plecoptera, Trichoptera (EPT) taxa richness, and richness-weighted mean tolerance of organisms at a site were used to describe invertebrate responses. Percentage of urban land cover was used as a basin-level predictor variable. Regional mean precipitation, air temperature, and antecedent agriculture were used as region-level predictor variables. Multilevel hierarchical models were fit to both levels of data simultaneously, borrowing statistical strength from the complete dataset to reduce uncertainty in regional coefficient estimates. Additionally, whereas non-hierarchical regressions were only able to show differing relations between invertebrate responses and urban intensity separately for each region, the multilevel hierarchical regressions were able to explain and quantify those differences within a single model. In this way, this modeling approach directly establishes the importance of antecedent agricultural conditions in masking the response of invertebrates to urbanization in metropolitan regions such as Milwaukee-Green Bay, Wisconsin; Denver, Colorado; and Dallas-Fort Worth, Texas. Also, these models show that regions with high precipitation, such as Atlanta, Georgia; Birmingham, Alabama; and Portland, Oregon, start out with better regional background conditions of invertebrates prior to urbanization but experience faster negative rates of change with urbanization. Ultimately, this urbanization

  17. Advances in Applications of Hierarchical Bayesian Methods with Hydrological Models

    Science.gov (United States)

    Alexander, R. B.; Schwarz, G. E.; Boyer, E. W.

    2017-12-01

    Mechanistic and empirical watershed models are increasingly used to inform water resource decisions. Growing access to historical stream measurements and data from in-situ sensor technologies has increased the need for improved techniques for coupling models with hydrological measurements. Techniques that account for the intrinsic uncertainties of both models and measurements are especially needed. Hierarchical Bayesian methods provide an efficient modeling tool for quantifying model and prediction uncertainties, including those associated with measurements. Hierarchical methods can also be used to explore spatial and temporal variations in model parameters and uncertainties that are informed by hydrological measurements. We used hierarchical Bayesian methods to develop a hybrid (statistical-mechanistic) SPARROW (SPAtially Referenced Regression On Watershed attributes) model of long-term mean annual streamflow across diverse environmental and climatic drainages in 18 U.S. hydrological regions. Our application illustrates the use of a new generation of Bayesian methods that offer more advanced computational efficiencies than the prior generation. Evaluations of the effects of hierarchical (regional) variations in model coefficients and uncertainties on model accuracy indicate improved prediction accuracies (median of 10-50%), but primarily in humid eastern regions, where model uncertainties are one-third of those in arid western regions. Generally moderate regional variability is observed for most hierarchical coefficients. Accounting for measurement and structural uncertainties, using hierarchical state-space techniques, revealed the effects of spatially heterogeneous, latent hydrological processes in the "localized" drainages between calibration sites; this improved model precision, with only minor changes in regional coefficients. Our study can inform advances in the use of hierarchical methods with hydrological models to improve their integration with stream

  18. Bayesian nonlinear regression for large p small n problems

    KAUST Repository

    Chakraborty, Sounak; Ghosh, Malay; Mallick, Bani K.

    2012-01-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p, small n problem. The problem is more complicated still when there are multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
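
    As a small illustration of the ε-insensitive loss named in the abstract above, the following sketch computes the loss for a single prediction. It is only the standard textbook form of the loss, not the Bayesian model of the paper; the function name and the default value of `eps` are assumptions made for this example.

```python
def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Vapnik's epsilon-insensitive loss for one observation.

    Residuals smaller than eps in absolute value cost nothing;
    larger residuals are penalised linearly beyond the eps tube.
    The default eps=0.1 is an arbitrary illustrative choice.
    """
    return max(0.0, abs(y_true - y_pred) - eps)


# A residual inside the tube is free, one outside is charged linearly.
inside = eps_insensitive_loss(1.0, 1.05, eps=0.1)
outside = eps_insensitive_loss(2.0, 1.0, eps=0.25)
```

    The flat region around zero is what makes the fitted function depend only on a subset of the observations (the support vectors).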

  19. Bayesian nonlinear regression for large p small n problems

    KAUST Repository

    Chakraborty, Sounak

    2012-07-01

    Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p, small n problem. The problem is more complicated still when there are multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM, which relies on type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.

  20. Processing data collected from radiometric experiments by multivariate technique

    International Nuclear Information System (INIS)

    Urbanski, P.; Kowalska, E.; Machaj, B.; Jakowiuk, A.

    2005-01-01

    Multivariate techniques applied to processing data collected from radiometric experiments can provide more efficient extraction of the information contained in the spectra. Several techniques are considered: (i) multivariate calibration using Partial Least Squares Regression and Artificial Neural Networks, (ii) standardization of the spectra, (iii) smoothing of collected spectra, where the autocorrelation function and bootstrap were used for the assessment of the processed data, and (iv) image processing using Principal Component Analysis. Application of these techniques is illustrated with examples of some industrial applications. (author)

  1. Using Multivariate Regression Model with Least Absolute Shrinkage and Selection Operator (LASSO) to Predict the Incidence of Xerostomia after Intensity-Modulated Radiotherapy for Head and Neck Cancer

    Science.gov (United States)

    Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan

    2014-01-01

    Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R2, chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R2 was satisfactory and corresponded well with the expected values. Conclusions

  2. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals.

    Science.gov (United States)

    Elfaki, Tayseer Elamin Mohamed; Arndts, Kathrin; Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A; Atti El Mekki, Misk El Yemen A; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E

    2016-05-01

    In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regard to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers in order to

  3. Multivariable Regression Analysis in Schistosoma mansoni-Infected Individuals in the Sudan Reveals Unique Immunoepidemiological Profiles in Uninfected, egg+ and Non-egg+ Infected Individuals.

    Directory of Open Access Journals (Sweden)

    Tayseer Elamin Mohamed Elfaki

    2016-05-01

    In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regard to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers

  4. Neighborhood social capital and crime victimization: comparison of spatial regression analysis and hierarchical regression analysis.

    Science.gov (United States)

    Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro

    2012-11-01

    Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan. 

  5. Hierarchical ordering with partial pairwise hierarchical relationships on the macaque brain data sets.

    Directory of Open Access Journals (Sweden)

    Woosang Lim

    Hierarchical organizations of information processing in the brain networks have been known to exist and widely studied. To find proper hierarchical structures in the macaque brain, the traditional methods need the entire pairwise hierarchical relationships between cortical areas. In this paper, we present a new method that discovers hierarchical structures of macaque brain networks by using partial information of pairwise hierarchical relationships. Our method uses a graph-based manifold learning to exploit inherent relationship, and computes pseudo distances of hierarchical levels for every pair of cortical areas. Then, we compute hierarchy levels of all cortical areas by minimizing the sum of squared hierarchical distance errors with the hierarchical information of few cortical areas. We evaluate our method on the macaque brain data sets whose true hierarchical levels are known as the FV91 model. The experimental results show that hierarchy levels computed by our method are similar to the FV91 model, and its errors are much smaller than the errors of hierarchical clustering approaches.

  6. Directional quantile regression in R

    Czech Academy of Sciences Publication Activity Database

    Boček, Pavel; Šiman, Miroslav

    2017-01-01

    Vol. 53, No. 3 (2017), pp. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OECD field: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf

  7. Field applications of stand-off sensing using visible/NIR multivariate optical computing

    Science.gov (United States)

    Eastwood, DeLyle; Soyemi, Olusola O.; Karunamuni, Jeevanandra; Zhang, Lixia; Li, Hongli; Myrick, Michael L.

    2001-02-01

    A novel multivariate visible/NIR optical computing approach applicable to standoff sensing will be demonstrated with porphyrin mixtures as examples. The ultimate goal is to develop environmental or counter-terrorism sensors for chemicals such as organophosphorus (OP) pesticides or chemical warfare simulants in the near infrared spectral region. The mathematical operation that characterizes prediction of properties via regression from optical spectra is a calculation of inner products between the spectrum and the pre-determined regression vector. The result is scaled appropriately and offset to correspond to the basis from which the regression vector is derived. The process involves collecting spectroscopic data and synthesizing a multivariate vector using a pattern recognition method. Then, an interference coating is designed that reproduces the pattern of the multivariate vector in its transmission or reflection spectrum, and appropriate interference filters are fabricated. High and low refractive index materials such as Nb2O5 and SiO2 are excellent choices for the visible and near infrared regions. The proof of concept has now been established for this system in the visible and will later be extended to chemicals such as OP compounds in the near and mid-infrared.
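
    The prediction step described in the abstract above (an inner product between the spectrum and a regression vector, then a scale and an offset) can be sketched numerically as follows. The four-channel spectrum, the regression vector, and the scale and offset values are all hypothetical numbers chosen for illustration; in the optical approach this product is computed by the interference filter rather than in software.

```python
def predict_property(spectrum, regression_vector, scale=1.0, offset=0.0):
    """Predict a property as a scaled, offset inner product.

    spectrum and regression_vector are equal-length sequences of
    per-wavelength values; scale and offset map the raw inner
    product back onto the units of the calibrated property.
    """
    if len(spectrum) != len(regression_vector):
        raise ValueError("spectrum and regression vector must have equal length")
    inner = sum(s * b for s, b in zip(spectrum, regression_vector))
    return scale * inner + offset


# Hypothetical four-channel spectrum and regression vector.
spectrum = [0.2, 0.5, 0.1, 0.4]
regression_vector = [1.0, -2.0, 0.5, 3.0]
prediction = predict_property(spectrum, regression_vector, scale=2.0, offset=0.1)
```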

  8. The importance of trait emotional intelligence and feelings in the prediction of perceived and biological stress in adolescents: hierarchical regressions and fsQCA models.

    Science.gov (United States)

    Villanueva, Lidón; Montoya-Castilla, Inmaculada; Prado-Gascó, Vicente

    2017-07-01

    The purpose of this study is to analyze the combined effects of trait emotional intelligence (EI) and feelings on healthy adolescents' stress. Identifying the extent to which adolescent stress varies with trait emotional differences and the feelings of adolescents is of considerable interest in the development of intervention programs for fostering youth well-being. To attain this goal, self-reported questionnaires (perceived stress, trait EI, and positive/negative feelings) and biological measures of stress (hair cortisol concentrations, HCC) were collected from 170 adolescents (12-14 years old). Two different methodologies were conducted, which included hierarchical regression models and a fuzzy-set qualitative comparative analysis (fsQCA). The results support trait EI as a protective factor against stress in healthy adolescents and suggest that feelings reinforce this relation. However, the debate continues regarding the possibility of optimal levels of trait EI for effective and adaptive emotional management, particularly in the emotional attention and clarity dimensions and for female adolescents.

  9. TMVA(Toolkit for Multivariate Analysis) new architectures design and implementation.

    CERN Document Server

    Zapata Mesa, Omar Andres

    2016-01-01

    Toolkit for Multivariate Analysis (TMVA) is a package in ROOT providing machine learning algorithms for classification and regression of the events in the detectors. In TMVA, we are developing new high-level algorithms to perform multivariate analysis, such as cross validation, hyperparameter optimization, and variable importance. Almost all of these algorithms are computationally expensive and designed to process a huge amount of data. It is very important to implement the new parallel computing technologies to reduce the processing times.

  10. On generalized elliptical quantiles in the nonlinear quantile regression setup

    Czech Academy of Sciences Publication Activity Database

    Hlubinka, D.; Šiman, Miroslav

    2015-01-01

    Vol. 24, No. 2 (2015), pp. 249-264 ISSN 1133-0686 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: multivariate quantile * elliptical quantile * quantile regression * multivariate statistical inference * portfolio optimization Subject RIV: BA - General Mathematics Impact factor: 1.207, year: 2015 http://library.utia.cas.cz/separaty/2014/SI/siman-0434510.pdf

  11. Sparse reduced-rank regression with covariance estimation

    KAUST Repository

    Chen, Lisha

    2014-12-08

    Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.

  12. Sparse reduced-rank regression with covariance estimation

    KAUST Repository

    Chen, Lisha; Huang, Jianhua Z.

    2014-01-01

    Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.

  13. Utilização de regressão multivariada para avaliação espectrofotométrica da demanda química de oxigênio em amostras de relevância ambiental [Use of multivariate regression in spectrophotometric evaluation of chemical oxygen demand in samples of environmental relevance]

    Directory of Open Access Journals (Sweden)

    Patricio Peralta-Zamora

    2005-10-01

    In this work, a partial least squares regression routine was used to develop a multivariate calibration model to predict the chemical oxygen demand (COD) in substrates of environmental relevance (paper effluents and landfill leachates) from UV-Vis spectral data. The calibration models permit the fast determination of the COD with typical relative errors lower than 10% with respect to the conventional methodology.

  14. Growing hierarchical probabilistic self-organizing graphs.

    Science.gov (United States)

    López-Rubio, Ezequiel; Palomo, Esteban José

    2011-07-01

    Since the introduction of the growing hierarchical self-organizing map, much work has been done on self-organizing neural models with a dynamic structure. These models allow adjusting the layers of the model to the features of the input dataset. Here we propose a new self-organizing model which is based on a probabilistic mixture of multivariate Gaussian components. The learning rule is derived from the stochastic approximation framework, and a probabilistic criterion is used to control the growth of the model. Moreover, the model is able to adapt to the topology of each layer, so that a hierarchy of dynamic graphs is built. This overcomes the limitations of the self-organizing maps with a fixed topology, and gives rise to a faithful visualization method for high-dimensional data.

  15. Control Multivariable por Desacoplo [Multivariable Decoupling Control]

    Directory of Open Access Journals (Sweden)

    Fernando Morilla

    2013-01-01

    results obtained by the authors after several years of research giving priority to the problem generalization and practical issues like ease of implementation and utilization of PID controllers as elementary blocks. This combination of interests makes it difficult to obtain perfect decoupling in all cases, although it is possible to achieve an important interaction reduction at the basic level of the control pyramid in such a way that other control systems at higher hierarchical levels benefit from it. This article summarizes the main aspects of decoupling control and presents its application to two illustrative examples: an experimental quadruple tank process and a 4×4 model of a heating, ventilation and air conditioning system. Keywords: Process control, multivariable control, decoupling control, PID control

  16. Location optimization of solar plants by an integrated hierarchical DEA PCA approach

    International Nuclear Information System (INIS)

    Azadeh, A.; Ghaderi, S.F.; Maghsoudi, A.

    2008-01-01

    Unique features of renewable energies such as solar energy have caused increasing demand for such resources. In order to use solar energy as a natural resource, environmental circumstances and geographical location related to solar intensity must be considered. Different factors may affect the selection of a suitable location for solar plants. These factors must be considered concurrently for optimum location identification of solar plants. This article presents an integrated hierarchical approach for location of solar plants by data envelopment analysis (DEA), principal component analysis (PCA) and numerical taxonomy (NT). Furthermore, an integrated hierarchical DEA approach incorporating the most relevant parameters of solar plants is introduced. Moreover, two multivariable methods, namely PCA and NT, are used to validate the results of the DEA model. The prescribed approach is tested for 25 different cities in Iran with 6 different regions within each city. This is the first study that considers an integrated hierarchical DEA approach for geographical location optimization of solar plants. Implementation of the proposed approach would enable energy policy makers to select the best possible location for construction of a solar power plant with the lowest possible costs.

  17. Predicting allergic contact dermatitis: a hierarchical structure activity relationship (SAR) approach to chemical classification using topological and quantum chemical descriptors

    Science.gov (United States)

    Basak, Subhash C.; Mills, Denise; Hawkins, Douglas M.

    2008-06-01

    A hierarchical classification study was carried out based on a set of 70 chemicals—35 which produce allergic contact dermatitis (ACD) and 35 which do not. This approach was implemented using a regular ridge regression computer code, followed by conversion of regression output to binary data values. The hierarchical descriptor classes used in the modeling include topostructural (TS), topochemical (TC), and quantum chemical (QC), all of which are based solely on chemical structure. The concordance, sensitivity, and specificity are reported. The model based on the TC descriptors was found to be the best, while the TS model was extremely poor.
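
    The two-step procedure mentioned in the abstract above (ridge regression, then conversion of the regression output to binary values) can be sketched in a few lines. This is a generic illustration, not the authors' code: the one-descriptor data set, the penalty `lam`, and the 0.5 cut-off are assumptions, and for brevity the penalty is applied to the intercept column as well.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge coefficients from the normal equations (X'X + lam*I) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def classify(X, beta, threshold=0.5):
    """Convert continuous regression output to 0/1 labels at a cut-off."""
    return [1 if sum(b * x for b, x in zip(beta, r)) >= threshold else 0
            for r in X]


# Hypothetical data: an intercept column plus one structural descriptor.
X = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.8], [1.0, 0.9]]
y = [0, 0, 1, 1]
beta = ridge_fit(X, y, lam=0.01)
labels = classify(X, beta)
```

    Concordance, sensitivity, and specificity would then be computed by comparing `labels` with the known classes, ideally on held-out chemicals rather than the training set used here.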

  18. Application of Hierarchical Linear Models/Linear Mixed-Effects Models in School Effectiveness Research

    Science.gov (United States)

    Ker, H. W.

    2014-01-01

    Multilevel data are very common in educational research. Hierarchical linear models/linear mixed-effects models (HLMs/LMEs) are now often used to analyze multilevel data. This paper discusses the problems of utilizing ordinary regressions for modeling multilevel educational data, compares the data analytic results from three regression…

  19. A multivariate tobit analysis of highway accident-injury-severity rates.

    Science.gov (United States)

    Anastasopoulos, Panagiotis Ch; Shankar, Venky N; Haddock, John E; Mannering, Fred L

    2012-03-01

    Relatively recent research has illustrated the potential that tobit regression has in studying factors that affect vehicle accident rates (accidents per distance traveled) on specific roadway segments. Tobit regression has been used because accident rates on specific roadway segments are continuous data that are left-censored at zero (they are censored because accidents may not be observed on all roadway segments during the period over which data are collected). This censoring may arise from a number of sources, one of which being the possibility that less severe crashes may be under-reported and thus may be less likely to appear in crash databases. Traditional tobit-regression analyses have dealt with the overall accident rate (all crashes regardless of injury severity), so the issue of censoring by the severity of crashes has not been addressed. However, a tobit-regression approach that considers accident rates by injury-severity level, such as the rate of no-injury, possible injury and injury accidents per distance traveled (as opposed to all accidents regardless of injury-severity), can potentially provide new insights, and address the possibility that censoring may vary by crash-injury severity. Using five-year data from highways in Washington State, this paper estimates a multivariate tobit model of accident-injury-severity rates that addresses the possibility of differential censoring across injury-severity levels, while also accounting for the possible contemporaneous error correlation resulting from commonly shared unobserved characteristics across roadway segments. The empirical results show that the multivariate tobit model outperforms its univariate counterpart, is practically equivalent to the multivariate negative binomial model, and has the potential to provide a fuller understanding of the factors determining accident-injury-severity rates on specific roadway segments. Published by Elsevier Ltd.
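
    To make the left-censoring at zero concrete, the sketch below evaluates the log-likelihood of a standard univariate tobit model: observed positive rates contribute a normal density term, and censored (zero) rates contribute the probability that the latent rate falls at or below zero. It is a textbook single-equation form and does not reproduce the paper's multivariate model with correlated errors; the function names and the tiny inputs are assumptions for illustration.

```python
import math


def norm_logpdf(z):
    """Log density of the standard normal distribution at z."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal CDF, written with erfc for better accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def tobit_loglik(beta, sigma, X, y):
    """Log-likelihood of a tobit model left-censored at zero.

    Latent rate: y* = x'beta + e, with e ~ N(0, sigma^2).
    Observed rate: y = max(0, y*).
    """
    total = 0.0
    for row, yi in zip(X, y):
        mu = sum(b * x for b, x in zip(beta, row))
        if yi > 0.0:
            # Uncensored segment: normal density of the observed rate.
            total += norm_logpdf((yi - mu) / sigma) - math.log(sigma)
        else:
            # Censored segment: probability that the latent rate is <= 0.
            total += math.log(norm_cdf(-mu / sigma))
    return total
```

    In practice the parameters would be found by maximising this function numerically; a multivariate version additionally has to handle the joint distribution of the error terms across severity levels.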

  20. Distributed Monitoring of the R2 Statistic for Linear Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...

  1. Adaptive metric kernel regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    2000-01-01

    Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...

  2. A retrospective study: Multivariate logistic regression analysis of the outcomes after pressure sores reconstruction with fasciocutaneous, myocutaneous, and perforator flaps.

    Science.gov (United States)

    Chiu, Yu-Jen; Liao, Wen-Chieh; Wang, Tien-Hsiang; Shih, Yu-Chung; Ma, Hsu; Lin, Chih-Hsun; Wu, Szu-Hsien; Perng, Cherng-Kang

    2017-08-01

    Despite significant advances in medical care and surgical techniques, pressure sore reconstruction is still prone to elevated rates of complication and recurrence. We conducted a retrospective study to investigate not only complication and recurrence rates following pressure sore reconstruction but also preoperative risk stratification. This study included 181 ulcers that underwent flap operations between January 2002 and December 2013. We fitted a multivariable logistic regression model, which offers a regression-based method accounting for the within-patient correlation of the success or failure of each flap. The overall complication and recurrence rates for all flaps were 46.4% and 16.0%, respectively, with a mean follow-up period of 55.4 ± 38.0 months. No statistically significant differences in complication and recurrence rates were observed among the three reconstruction methods. In subsequent analysis, albumin ≤3.0 g/dl and paraplegia were significantly associated with higher postoperative complication. The anatomic factor, ischial wound location, significantly trended toward the development of ulcer recurrence. In the fasciocutaneous group, paraplegia had significant correlation to higher complication and recurrence rates. In the musculocutaneous flap group, variables had no significant correlation to complication and recurrence rates. In the free-style perforator group, ischial wound location and malnourished status correlated with significantly higher complication rates; ischial wound location also correlated with significantly higher recurrence rate. Ultimately, our review of a noteworthy cohort with lengthy follow-up helped identify and confirm certain risk factors that can facilitate a more informed and thoughtful pre- and postoperative decision-making process for patients with pressure ulcers. Copyright © 2017 British Association of Plastic, Reconstructive and Aesthetic Surgeons. Published by Elsevier Ltd. All

  3. Aortic and Hepatic Contrast Enhancement During Hepatic-Arterial and Portal Venous Phase Computed Tomography Scanning: Multivariate Linear Regression Analysis Using Age, Sex, Total Body Weight, Height, and Cardiac Output.

    Science.gov (United States)

    Masuda, Takanori; Nakaura, Takeshi; Funama, Yoshinori; Higaki, Toru; Kiguchi, Masao; Imada, Naoyuki; Sato, Tomoyasu; Awai, Kazuo

    We evaluated the effect of the age, sex, total body weight (TBW), height (HT) and cardiac output (CO) of patients on aortic and hepatic contrast enhancement during hepatic-arterial phase (HAP) and portal venous phase (PVP) computed tomography (CT) scanning. This prospective study received institutional review board approval; prior informed consent to participate was obtained from all 168 patients. All were examined using our routine protocol; the contrast material was 600 mg/kg iodine. Cardiac output was measured with a portable electrical velocimeter within 5 minutes of starting the CT scan. We calculated contrast enhancement (per gram of iodine: ΔHU/gI) of the abdominal aorta during the HAP and of the liver parenchyma during the PVP. We performed univariate and multivariate linear regression analysis between all patient characteristics and the ΔHU/gI of aortic- and liver parenchymal enhancement. Univariate linear regression analysis demonstrated statistically significant correlations between the ΔHU/gI and the age, sex, TBW, HT, and CO (all P linear regression analysis showed that only the TBW and CO were of independent predictive value (P linear regression analysis only the TBW and CO were significantly correlated with aortic and liver parenchymal enhancement; the age, sex, and HT were not. The CO was the only independent factor affecting aortic and liver parenchymal enhancement at hepatic CT when the protocol was adjusted for the TBW.

  4. Adaptive Metric Kernel Regression

    DEFF Research Database (Denmark)

    Goutte, Cyril; Larsen, Jan

    1998-01-01

    Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...

  5. Directional quantile regression in Octave (and MATLAB)

    Czech Academy of Sciences Publication Activity Database

    Boček, Pavel; Šiman, Miroslav

    2016-01-01

    Vol. 52, No. 1 (2016), pp. 28-51 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: quantile regression * multivariate quantile * depth contour * Matlab Subject RIV: IN - Informatics, Computer Science Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf

  6. Comparing lagged linear correlation, lagged regression, Granger causality, and vector autoregression for uncovering associations in EHR data.

    Science.gov (United States)

    Levine, Matthew E; Albers, David J; Hripcsak, George

    2016-01-01

    Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
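
    To make the idea of a lagged regression concrete, here is a minimal sketch that regresses a lab series on a drug series shifted by a fixed lag, using ordinary least squares with one predictor. It is a univariate illustration only; the function name `lagged_ols` and its arguments are hypothetical, and the study's actual models (with autoregressive and context terms) are richer than this.

```python
def lagged_ols(x, y, lag):
    """Fit y[t] = intercept + slope * x[t - lag] by ordinary least squares.

    x and y are equally long, evenly sampled series (e.g. drug orders and a
    lab measurement); lag is a non-negative number of time steps.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if lag < 0 or lag > len(x) - 2:
        raise ValueError("lag leaves fewer than two paired observations")
    xs = x[:len(x) - lag]   # predictor values at time t - lag
    ys = y[lag:]            # response values at time t
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    if sxx == 0.0:
        raise ValueError("lagged predictor has zero variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope
```

    Scanning `lag` over a range of values and comparing the fitted slopes is one simple way to look for a delayed association; a multivariate version would add further lagged predictors to the same regression.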

  7. Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice River basin (western Sicily, Italy)

    Science.gov (United States)

    Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio

    2015-08-01

    In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: Logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has already been demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km² and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km². To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of topography before the landslide occurrence. This was achieved by preparing a digital terrain model (DTM) where altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the algorithm topo-to-raster. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS-models (AUC between 0.881 and 0.912) with respect to LR-models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when changes of the learning and validation samples are
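
    The AUC used above to compare models can be computed directly from predicted susceptibility scores and observed labels. The sketch below uses the pairwise (Mann-Whitney) definition of the AUC; the function name `roc_auc` and the 0/1 label coding are assumptions made for this example, not details taken from the paper.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    scores: predicted susceptibility for each mapping unit.
    labels: 1 for units with a landslide, 0 for stable units (assumed coding).
    """
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(positives) * len(negatives))
```

    The double loop is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be preferable for large validation sets.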

  8. Principal Covariates Clusterwise Regression (PCCR) : Accounting for multicollinearity and population heterogeneity in hierarchically organized data.

    NARCIS (Netherlands)

    Wilderjans, Tom F.; Van de Gaer, E.; Kiers, H.A.L.; Van Mechelen, Iven; Ceulemans, Eva

    In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases, ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three

  9. Modelo hierárquico multivariado da inatividade física em crianças de escolas públicas Multivariate hierarchical model for physical inactivity among public school children

    Directory of Open Access Journals (Sweden)

    Mario M. Bracco

    2006-08-01

    Full Text Available OBJECTIVE: To identify biological and sociodemographic factors associated with physical inactivity in public school children. METHODS: Parents of 2,519 children (49.3% of whom were girls), aged 7 to 10 years (mean = 7.6±0.9 years), from eight public schools in São Paulo, Brazil, completed a self-administered questionnaire. We used multiple correspondence analysis to identify groups of responses related to levels of physical activity and inactivity and to obtain an optimal scale. The cluster analysis identified groups of active and inactive children. Analysis of the receiver operating characteristic (ROC) curve, used to study the diagnostic properties of a simplified physical inactivity scale derived from the optimal scale, showed that a cutoff point of 3 had the best sensitivity and specificity; it was therefore used as the outcome variable in the regression model. A multivariate hierarchical model was built, treating categorical variables as distal and proximal, adopting p

  10. A hierarchical bayesian model to quantify uncertainty of stream water temperature forecasts.

    Directory of Open Access Journals (Sweden)

    Guillaume Bal

    Full Text Available Providing generic and cost effective modelling approaches to reconstruct and forecast freshwater temperature using predictors such as air temperature and water discharge is a prerequisite to understanding ecological processes underlying the impact of water temperature and of global warming on continental aquatic ecosystems. Using air temperature as a simple linear predictor of water temperature can lead to significant bias in forecasts as it does not disentangle seasonality and long term trends in the signal. Here, we develop an alternative approach based on hierarchical Bayesian statistical time series modelling of water temperature, air temperature and water discharge using seasonal sinusoidal periodic signals and time varying means and amplitudes. Fitting and forecasting performances of this approach are compared with those of simple linear regression between water and air temperatures using (i) an emotive simulated example, (ii) application to three French coastal streams with contrasting bio-geographical conditions and sizes. The time series modelling approach fits the data better and does not exhibit forecasting bias in long term trends, contrary to the linear regression. This new model also allows for more accurate forecasts of water temperature than linear regression, together with a fair assessment of the uncertainty around forecasting. Warming of water temperature forecast by our hierarchical Bayesian model was slower and more uncertain than that expected with the classical regression approach. These new forecasts are in a form that is readily usable in further ecological analyses and will allow weighting of outcomes from different scenarios to manage climate change impacts on freshwater wildlife.
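
    The seasonal sinusoidal signal mentioned above can be illustrated with a single-harmonic least-squares fit. This is only the deterministic building block, with a constant mean and amplitude; it is not the hierarchical Bayesian model of the paper, which lets means and amplitudes vary in time. The function name `fit_seasonal` and the requirement of evenly spaced data covering whole periods are assumptions of this sketch.

```python
import math

def fit_seasonal(y, period):
    """Least-squares fit of y[t] = mean + a*cos(w t) + b*sin(w t), w = 2*pi/period.

    Assumes evenly spaced observations t = 0, 1, ..., n-1 covering a whole
    number of periods (period >= 3). Under that assumption the constant,
    cosine and sine regressors are orthogonal, so the coefficients have the
    closed forms used below.
    """
    n = len(y)
    if period < 3 or n == 0 or n % period != 0:
        raise ValueError("need period >= 3 and a whole number of periods")
    w = 2.0 * math.pi / period
    mean = sum(y) / n
    a = 2.0 / n * sum(yi * math.cos(w * t) for t, yi in enumerate(y))
    b = 2.0 / n * sum(yi * math.sin(w * t) for t, yi in enumerate(y))
    return mean, a, b
```

    The seasonal amplitude is `math.hypot(a, b)`. Fitting this separately to water and air temperature series gives a quick look at how their seasonal cycles differ before any trend is modelled.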

  11. Is ovarian hyperstimulation associated with higher blood pressure in 4-year-old IVF offspring? Part I: multivariable regression analysis.

    Science.gov (United States)

    Seggers, Jorien; Haadsma, Maaike L; La Bastide-Van Gemert, Sacha; Heineman, Maas Jan; Middelburg, Karin J; Roseboom, Tessa J; Schendelaar, Pamela; Van den Heuvel, Edwin R; Hadders-Algra, Mijna

    2014-03-01

    Does ovarian hyperstimulation, the in vitro procedure, or a combination of these two negatively influence blood pressure (BP) and anthropometrics of 4-year-old children born following IVF? Higher systolic blood pressure (SBP) percentiles were found in 4-year-old children born following conventional IVF with ovarian hyperstimulation compared with children born following IVF without ovarian hyperstimulation. Increasing evidence suggests that IVF, which has an increased incidence of preterm birth and low birthweight, is associated with higher BP and altered body fat distribution in offspring but the underlying mechanisms are largely unknown. We performed a prospective, assessor-blinded follow-up study in which 194 children were assessed. The attrition rate up until the 4-year-old assessment was 10%. We measured BP and anthropometrics of 4-year-old singletons born following conventional IVF with controlled ovarian hyperstimulation (COH-IVF, n = 63), or born following modified natural cycle IVF (MNC-IVF, n = 52), or born to subfertile couples who conceived naturally (Sub-NC, n = 79). Both IVF and ICSI were performed. Primary outcome measures were the SBP percentiles and diastolic BP (DBP) percentiles. Anthropometric measures included triceps and subscapular skinfold thickness. Several multivariable regression analyses were applied in order to correct for subsets of confounders. The value 'B' is the unstandardized regression coefficient. SBP percentiles were significantly lower in the MNC-IVF group (mean 59, SD 24) than in the COH-IVF (mean 68, SD 22) and Sub-NC groups (mean 70, SD 16). The difference in SBP between COH-IVF and MNC-IVF remained significant after correction for current, early life and parental characteristics (B: 14.09; 95% confidence interval (CI): 5.39-22.79), whereas the difference between MNC-IVF and Sub-NC did not. DBP percentiles did not differ between groups. After correction for early life factors, subscapular skinfold thickness was thicker in the

  12. Use of multivariate extensions of generalized linear models in the analysis of data from clinical trials

    OpenAIRE

    ALONSO ABAD, Ariel; Rodriguez, O.; TIBALDI, Fabian; CORTINAS ABRAHANTES, Jose

    2002-01-01

    In medical studies, categorical endpoints are quite common. Even though some models for handling these multicategorical variables have now been developed, their use is not common. This work shows an application of multivariate generalized linear models to the analysis of clinical trials data. After a theoretical introduction, models for ordinal and nominal responses are applied and the main results are discussed. multivariate analysis; multivariate logistic regression; multicategor...

  13. Study of risk factors affecting both hypertension and obesity outcome by using multivariate multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Sepedeh Gholizadeh

    2016-07-01

    Full Text Available Background: Obesity and hypertension are the most important non-communicable diseases; in many studies, their prevalence and risk factors have been analyzed univariately in each geographic region. The study of factors affecting both obesity and hypertension may play an important role, and it is addressed in this study. Materials & Methods: This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times, and the average was considered as one of the response variables. Hypertension was defined as systolic blood pressure ≥140 and/or diastolic blood pressure ≥90, and obesity was defined as body mass index ≥25. Data were analyzed using a multilevel, multivariate logistic regression model in MLwiN software. Results: Intraclass correlations at the cluster level were 33% for high blood pressure and 37% for obesity, so a two-level model was fitted to the data. The prevalence of obesity and hypertension was 43.6% (95% CI: 40.6-46.5) and 29.4% (95% CI: 26.6-32.1), respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption, and physical activity were the factors affecting blood pressure (p≤0.05). Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity, and place of residence affected obesity (p≤0.05). Conclusion: Multilevel models that take the distribution of levels into account provide more precise estimates. Because obesity and hypertension are major risk factors for cardiovascular disease, knowing the high-risk groups allows careful planning for the prevention of non-communicable diseases and the promotion of public health.
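
    The abstract does not say how the intraclass correlations were computed. One common choice for two-level logistic models is the latent-variable approximation, in which the level-1 residual variance is fixed at π²/3 (the variance of the standard logistic distribution). The sketch below assumes that approximation; the function names and the argument `cluster_variance` are hypothetical.

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# residual variance under the latent-variable approximation (an assumption
# of this sketch, not a detail reported in the abstract).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3.0

def icc_logistic(cluster_variance):
    """Intraclass correlation for a two-level random-intercept logistic model."""
    if cluster_variance < 0.0:
        raise ValueError("variance must be non-negative")
    return cluster_variance / (cluster_variance + LOGISTIC_RESIDUAL_VARIANCE)

def cluster_variance_from_icc(icc):
    """Invert icc_logistic: the random-intercept variance implied by an ICC."""
    if not 0.0 <= icc < 1.0:
        raise ValueError("icc must lie in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1.0 - icc)
```

    Under this approximation, an ICC of 33% would correspond to a random-intercept variance of about 1.62 on the logit scale, which indicates substantial clustering and is consistent with the decision to fit a two-level model.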

  14. Decentralized Hierarchical Controller Design for Selective Damping of Inter Area Oscillations Using PMU Signals

    Directory of Open Access Journals (Sweden)

    Ashfaque Ahmed Hashmani

    2011-07-01

    Full Text Available This paper deals with the decentralized hierarchical PSS (Power System Stabilizer controller design to achieve a better damping of specific inter-area oscillations. The two-level decentralized hierarchical structure consists of two PSS controllers. The first level controller is a local PSS controller for each generator to damp local mode in the area where controller is located. This controller uses only local signals as input signals. The local signal comes from the generator at which the controller is located. The secondary level controller is a multivariable decentralized global PSS controller to damp inter-area modes. This controller uses selected suitable wide area PMU (Phasor Measurement Units signals as inputs. The PMU or global signals are taken from network locations where the oscillations are well observable. The global controller uses only those global input signals in which the assigned single inter-area mode is most observable and is located at a generator that is most effective in controlling the assigned mode. The global controller works mainly in a frequency band given by the natural frequency of the assigned mode. The effectiveness of the resulting hierarchical controller is demonstrated through simulation studies conducted on a test power system.

  15. Hierarchical Network Design

    DEFF Research Database (Denmark)

    Thomadsen, Tommy

    2005-01-01

    Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...... with changing and increasing demands. Two-layer networks consist of one backbone network, which interconnects cluster networks. The clusters consist of nodes and links, which connect the nodes. One node in each cluster is a hub node, and the backbone interconnects the hub nodes of each cluster and thus...... the clusters. The design of hierarchical networks involves clustering of nodes, hub selection, and network design, i.e. selection of links and routing of flows. Hierarchical networks have been in use for decades, but integrated design of these networks has only been considered for very special types of networks...

  16. Application of least square support vector machine and multivariate adaptive regression spline models in long term prediction of river water pollution

    Science.gov (United States)

    Kisi, Ozgur; Parmar, Kulwinder Singh

    2016-03-01

    This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost the same accuracy and that they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using the MARS model, respectively. Adding the TC input to the models did not increase their accuracy in modeling COD, while adding the FC and pH inputs generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
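
    The RMSE comparison reported above can be reproduced for any pair of models with a few lines of code. The sketch below computes the RMSE and the percentage by which one model reduces another model's RMSE; the function names are hypothetical, and the formula for the percentage reduction is the usual relative difference, which the abstract does not spell out.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and equally long")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def percent_reduction(rmse_reference, rmse_new):
    """Percentage by which rmse_new improves on rmse_reference."""
    if rmse_reference <= 0.0:
        raise ValueError("reference RMSE must be positive")
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference
```

    For example, a reference RMSE of 10.0 and a new RMSE of 8.09 (made-up numbers) give a reduction of 19.1%.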

  17. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    Science.gov (United States)

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  18. Theory of net analyte signal vectors in inverse regression

    DEFF Research Database (Denmark)

    Bro, R.; Andersen, Charlotte Møller

    2003-01-01

    The net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares...

  19. Nonparametric regression using the concept of minimum energy

    International Nuclear Information System (INIS)

    Williams, Mike

    2011-01-01

    It has recently been shown that an unbinned distance-based statistic, the energy, can be used to construct an extremely powerful nonparametric multivariate two sample goodness-of-fit test. An extension to this method that makes it possible to perform nonparametric regression using multiple multivariate data sets is presented in this paper. The technique, which is based on the concept of minimizing the energy of the system, permits determination of parameters of interest without the need for parametric expressions of the parent distributions of the data sets. The application and performance of this new method is discussed in the context of some simple example analyses.
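
    For readers unfamiliar with distance-based two-sample statistics, the sketch below computes a Euclidean energy distance between two multivariate samples. It is one standard form of such a statistic and is shown only as an illustration; the weighting function used for the "energy" in the paper may differ, and the function names here are hypothetical.

```python
import math

def mean_pairwise_distance(a, b):
    """Average Euclidean distance over all pairs (p, q) with p in a, q in b."""
    if not a or not b:
        raise ValueError("samples must be non-empty")
    total = sum(math.dist(p, q) for p in a for q in b)
    return total / (len(a) * len(b))

def energy_distance(x, y):
    """Unbinned two-sample statistic 2*E|X-Y| - E|X-X'| - E|Y-Y'|.

    x and y are lists of equal-dimension points (tuples of floats). The
    within-sample averages include the zero self-distances (V-statistic
    form), so the value is zero when both samples are identical.
    """
    return (2.0 * mean_pairwise_distance(x, y)
            - mean_pairwise_distance(x, x)
            - mean_pairwise_distance(y, y))
```

    A regression in the spirit of the abstract would choose the parameter values that minimize such a statistic between the data and a parameter-dependent comparison sample, but that minimization step is not shown here.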

  20. Detecting Hierarchical Structure in Networks

    DEFF Research Database (Denmark)

    Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard

    2012-01-01

    Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data have focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose...... a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure....... On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account of the network....

  1. A binary logistic regression model with complex sampling design of ...

    African Journals Online (AJOL)

    2017-09-03

    Sep 3, 2017 ... Bi-variable and multi-variable binary logistic regression models with complex sampling design were fitted. ... Data were entered into STATA-12 and analyzed using SPSS-21. ...

  2. Ultracentrifuge separative power modeling with multivariate regression using covariance matrix

    International Nuclear Information System (INIS)

    Migliavacca, Elder

    2004-01-01

    In this work, the least-squares methodology with covariance matrix is applied to determine a data curve fitting to obtain a performance function for the separative power δU of an ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 460 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related to these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process variables that significantly influence the δU values are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow rate F, cut θ and product line pressure P_p. After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence and mainly the existence of residual heteroscedasticity with any explained regression model variable. The surface curves are made relating the separative power with the control variables F, θ and P_p to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)

  3. Dental age assessment of young Iranian adults using third molars: A multivariate regression study.

    Science.gov (United States)

    Bagherpour, Ali; Anbiaee, Najmeh; Partovi, Parnia; Golestani, Shayan; Afzalinasab, Shakiba

    2012-10-01

    In recent years, a noticeable increase in forensic age estimations of living individuals has been observed. Radiologic assessment of the mineralisation stage of third molars is of particular importance, with regard to the relevant age group. To attain a referral database and regression equations for dental age estimation of unaccompanied minors in an Iranian population was the goal of this study. Moreover, determination was made concerning the probability of an individual being over the age of 18 in case of full third molar(s) development. Using the scoring system of Gleiser and Hunt, modified by Köhler, an investigation of a cross-sectional sample of 1274 orthopantomograms of 885 females and 389 males aged between 15 and 22 years was carried out. Using kappa statistics, intra-observer reliability was tested. With Spearman correlation coefficient, correlation between the scores of all four wisdom teeth, was evaluated. We also carried out the Wilcoxon signed-rank test on asymmetry and calculated the regression formulae. A strong intra-observer agreement was displayed by the kappa value. No significant difference (p-value for upper and lower jaws were 0.07 and 0.59, respectively) was discovered by Wilcoxon signed-rank test for left and right asymmetry. The developmental stage of upper right and upper left third molars yielded the greatest correlation coefficient. The probability of an individual being over the age of 18 is 95.6% for males and 100.0% for females in case four fully developed third molars are present. Taking into consideration gender, location and number of wisdom teeth, regression formulae were arrived at. Use of population-specific standards is recommended as a means of improving the accuracy of forensic age estimates based on third molars mineralisation. To obtain more exact regression formulae, wider age range studies are recommended. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.

  4. Order Selection for General Expression of Nonlinear Autoregressive Model Based on Multivariate Stepwise Regression

    Science.gov (United States)

    Shi, Jinfei; Zhu, Songqing; Chen, Ruwen

    2017-12-01

    An order selection method based on multiple stepwise regressions is proposed for General Expression of Nonlinear Autoregressive model which converts the model order problem into the variable selection of multiple linear regression equation. The partial autocorrelation function is adopted to define the linear term in GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to study the improvements of both the new introduced and originally existed variables for the model characteristics, which are adopted to determine the model variables to retain or eliminate. So the optimal model is obtained through data fitting effect measurement or significance test. The simulation and classic time-series data experiment results show that the method proposed is simple, reliable and can be applied to practical engineering.

  5. LiDAR based prediction of forest biomass using hierarchical models with spatially varying coefficients

    Science.gov (United States)

    Chad Babcock; Andrew O. Finley; John B. Bradford; Randy Kolka; Richard Birdsey; Michael G. Ryan

    2015-01-01

    Many studies and production inventory systems have shown the utility of coupling covariates derived from Light Detection and Ranging (LiDAR) data with forest variables measured on georeferenced inventory plots through regression models. The objective of this study was to propose and assess the use of a Bayesian hierarchical modeling framework that accommodates both...

  6. Mental and physical health correlates among family caregivers of patients with newly-diagnosed incurable cancer: a hierarchical linear regression analysis.

    Science.gov (United States)

    Shaffer, Kelly M; Jacobs, Jamie M; Nipp, Ryan D; Carr, Alaina; Jackson, Vicki A; Park, Elyse R; Pirl, William F; El-Jawahri, Areej; Gallagher, Emily R; Greer, Joseph A; Temel, Jennifer S

    2017-03-01

    Caregiver, relational, and patient factors have been associated with the health of family members and friends providing care to patients with early-stage cancer. Little research has examined whether findings extend to family caregivers of patients with incurable cancer, who experience unique and substantial caregiving burdens. We examined correlates of mental and physical health among caregivers of patients with newly-diagnosed incurable lung or non-colorectal gastrointestinal cancer. At baseline for a trial of early palliative care, caregivers of participating patients (N = 275) reported their mental and physical health (Medical Outcome Survey-Short Form-36); patients reported their quality of life (Functional Assessment of Cancer Therapy-General). Analyses used hierarchical linear regression with two-tailed significance tests. Caregivers' mental health was worse than the U.S. national population (M = 44.31, p caregiver, relational, and patient factors simultaneously revealed that younger (B = 0.31, p = .001), spousal caregivers (B = -8.70, p = .003), who cared for patients reporting low emotional well-being (B = 0.51, p = .01) reported worse mental health; older (B = -0.17, p = .01) caregivers with low educational attainment (B = 4.36, p family caregivers of patients with incurable cancer, caregiver demographics, relational factors, and patient-specific factors were all related to caregiver mental health, while caregiver demographics were primarily associated with caregiver physical health. These findings help identify characteristics of family caregivers at highest risk of poor mental and physical health who may benefit from greater supportive care.

  7. Functional annotation of hierarchical modularity.

    Directory of Open Access Journals (Sweden)

    Kanchana Padmanabhan

    In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our

  8. Assessing exposure to violence using multiple informants: application of hierarchical linear model.

    Science.gov (United States)

    Kuo, M; Mohler, B; Raudenbush, S L; Earls, F J

    2000-11-01

    The present study assesses the effects of demographic risk factors on children's exposure to violence (ETV) and how these effects vary by informants. Data on exposure to violence of 9-, 12-, and 15-year-olds were collected from both child participants (N = 1880) and parents (N = 1776), as part of the assessment of the Project on Human Development in Chicago Neighborhoods (PHDCN). A two-level hierarchical linear model (HLM) with multivariate outcomes was employed to analyze information obtained from these two different groups of informants. The findings indicate that parents generally report less ETV than do their children and that associations of age, gender, and parent education with ETV are stronger in the self-reports than in the parent reports. The findings support a multivariate approach when information obtained from different sources is being integrated. The application of HLM allows an assessment of interactions between risk factors and informants and uses all available data, including data from one informant when data from the other informant is missing.

  9. Regression analysis for LED color detection of visual-MIMO system

    Science.gov (United States)

    Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo

    2018-04-01

    Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region, and detect the LED using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. In this paper, we show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm using training and test R-Square (%) values and the percentage of closeness between transmitted and predicted colors, and we also report the number of distorted test data points from an analysis of the distortion bar graph in the CIE1931 color space.

  10. Multivariate return periods of sea storms for coastal erosion risk assessment

    Directory of Open Access Journals (Sweden)

    S. Corbella

    2012-08-01

    The erosion of a beach depends on various storm characteristics. Ideally, the risk associated with a storm would be described by a single multivariate return period that is also representative of the erosion risk, i.e. a 100 yr multivariate storm return period would cause a 100 yr erosion return period. Unfortunately, a specific probability level may be associated with numerous combinations of storm characteristics. These combinations, despite having the same multivariate probability, may cause very different erosion outcomes. This paper explores this ambiguity problem in the context of copula-based multivariate return periods and using a case study at Durban on the east coast of South Africa. Simulations were used to correlate multivariate return periods of historical events to return periods of estimated storm-induced erosion volumes. In addition, the relationship of the most-likely design event (Salvadori et al., 2011) to coastal erosion was investigated. It was found that the multivariate return periods for wave height and duration had the highest correlation to erosion return periods. The most-likely design event was found to be an inadequate design method in its current form. We explore the inclusion of conditions based on the physical realizability of wave events and the use of multivariate linear regression to relate storm parameters to erosion computed from a process-based model. Establishing a link between storm statistics and erosion consequences can resolve the ambiguity between multivariate storm return periods and associated erosion return periods.

  11. Catalysis with hierarchical zeolites

    DEFF Research Database (Denmark)

    Holm, Martin Spangsberg; Taarning, Esben; Egeblad, Kresten

    2011-01-01

    Hierarchical (or mesoporous) zeolites have attracted significant attention during the first decade of the 21st century, and so far this interest continues to increase. There have already been several reviews giving detailed accounts of the developments emphasizing different aspects of this research topic. Until now, the main reason for developing hierarchical zeolites has been to achieve heterogeneous catalysts with improved performance but this particular facet has not yet been reviewed in detail. Thus, the present paper summarizes and categorizes the catalytic studies utilizing hierarchical zeolites that have been reported hitherto. Prototypical examples from some of the different categories of catalytic reactions that have been studied using hierarchical zeolite catalysts are highlighted. This clearly illustrates the different ways that improved performance can be achieved with this family...

  12. Multivariate calibration applied to the quantitative analysis of infrared spectra

    Energy Technology Data Exchange (ETDEWEB)

    Haaland, D.M.

    1991-01-01

    Multivariate calibration methods are very useful for improving the precision, accuracy, and reliability of quantitative spectral analyses. Spectroscopists can more effectively use these sophisticated statistical tools if they have a qualitative understanding of the techniques involved. A qualitative picture of the factor analysis multivariate calibration methods of partial least squares (PLS) and principal component regression (PCR) is presented using infrared calibrations based upon spectra of phosphosilicate glass thin films on silicon wafers. Comparisons of the relative prediction abilities of four different multivariate calibration methods are given based on Monte Carlo simulations of spectral calibration and prediction data. The success of multivariate spectral calibrations is demonstrated for several quantitative infrared studies. The infrared absorption and emission spectra of thin-film dielectrics used in the manufacture of microelectronic devices demonstrate rapid, nondestructive at-line and in-situ analyses using PLS calibrations. Finally, the application of multivariate spectral calibrations to reagentless analysis of blood is presented. We have found that the determination of glucose in whole blood taken from diabetics can be precisely monitored from the PLS calibration of either mid- or near-infrared spectra of the blood. Progress toward the non-invasive determination of glucose levels in diabetics is an ultimate goal of this research. 13 refs., 4 figs.
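
As a rough illustration of principal component regression (PCR), one of the calibration methods named in this abstract, the sketch below regresses a response on the first `k` principal-component scores of the centered predictor matrix. The function name `pcr_fit` and the synthetic data are assumptions for illustration; nothing here reproduces the report's spectra or results.

```python
import numpy as np

def pcr_fit(X, y, k):
    """Principal component regression with k components (illustrative).

    Returns coefficients in the original predictor space and an intercept.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # SVD of the centered predictors: rows of Vt are the loading vectors.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                          # loadings, shape (p, k)
    scores = Xc @ V_k                       # scores, shape (n, k)
    # Scores are orthogonal, so each coefficient is a simple projection.
    gamma = (scores.T @ yc) / s[:k] ** 2
    beta = V_k @ gamma                      # back to predictor space
    intercept = y_mean - x_mean @ beta
    return beta, intercept

# Synthetic stand-in for calibration data (not real spectra).
rng = np.random.default_rng(1)
X_cal = rng.normal(size=(50, 4))
y_cal = X_cal @ np.array([1.0, -2.0, 0.5, 0.0]) + 0.05 * rng.normal(size=50)
beta_full, b0_full = pcr_fit(X_cal, y_cal, k=4)
```

With all components retained, PCR coincides with ordinary least squares; the benefit for spectral data comes from choosing `k` well below the number of wavelengths.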

  13. Parallel hierarchical radiosity rendering

    Energy Technology Data Exchange (ETDEWEB)

    Carter, Michael [Iowa State Univ., Ames, IA (United States)

    1993-07-01

    In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.

  14. Hierarchical prisoner’s dilemma in hierarchical game for resource competition

    Science.gov (United States)

    Fujimoto, Yuma; Sagawa, Takahiro; Kaneko, Kunihiko

    2017-07-01

    Dilemmas in cooperation are one of the major concerns in game theory. In a public goods game, each individual cooperates by paying a cost or defecting without paying it, and receives a reward from the group out of the collected cost. Thus, defecting is beneficial for each individual, while cooperation is beneficial for the group. Now, groups (say, countries) consisting of individuals also play games. To study such a multi-level game, we introduce a hierarchical game in which multiple groups compete for limited resources by utilizing the collected cost in each group, where the power to appropriate resources increases with the population of the group. Analyzing this hierarchical game, we found a hierarchical prisoner’s dilemma, in which groups choose the defecting policy (say, armament) as a Nash strategy to optimize each group’s benefit, while cooperation optimizes the total benefit. On the other hand, for each individual, refusing to pay the cost (say, tax) is a Nash strategy, which turns out to be a cooperation policy for the group, thus leading to a hierarchical dilemma. Here the group reward increases with the group size. However, we find that there exists an optimal group size that maximizes the individual payoff. Furthermore, when the population asymmetry between two groups is large, the smaller group will choose a cooperation policy (say, disarmament) to avoid excessive response from the larger group, and the prisoner’s dilemma between the groups is resolved. Accordingly, the relevance of this hierarchical game on policy selection in society and the optimal size of human or animal groups are discussed.
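
The single-group dilemma described at the start of this abstract can be sketched with a standard public goods payoff. The multiplier applied to the collected cost and all parameter values are assumptions for illustration; the paper's hierarchical game between groups is not modeled here.

```python
def payoff(contributes, n_cooperators, group_size, cost=1.0, multiplier=3.0):
    """Payoff of one individual in a standard public goods game.

    `n_cooperators` counts every contributor in the group, including the
    focal individual when `contributes` is True. The collected cost is
    scaled by `multiplier` (an assumed parameter) and shared equally.
    """
    pot = multiplier * cost * n_cooperators
    share = pot / group_size
    return share - (cost if contributes else 0.0)

group_size = 5
# The focal individual faces four cooperating group members.
coop_payoff = payoff(True, 5, group_size)     # joins the cooperators
defect_payoff = payoff(False, 4, group_size)  # free-rides on them
all_defect_payoff = payoff(False, 0, group_size)
```

Whenever the multiplier lies between 1 and the group size, defecting pays more for the individual while universal cooperation pays more for the group, which is the dilemma the abstract starts from.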

  15. Computational Tools for Probing Interactions in Multiple Linear Regression, Multilevel Modeling, and Latent Curve Analysis

    Science.gov (United States)

    Preacher, Kristopher J.; Curran, Patrick J.; Bauer, Daniel J.

    2006-01-01

    Simple slopes, regions of significance, and confidence bands are commonly used to evaluate interactions in multiple linear regression (MLR) models, and the use of these techniques has recently been extended to multilevel or hierarchical linear modeling (HLM) and latent curve analysis (LCA). However, conducting these tests and plotting the…
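
For orientation, the quantities named in this abstract take the following standard form in a two-predictor MLR model with an interaction term. The symbols below are generic notation, not taken from the article.

```latex
% Model with focal predictor x, moderator z, and their interaction
\hat{y} = b_0 + b_1 x + b_2 z + b_3 x z

% Simple slope of x at a chosen value of z, and its sampling variance
\omega(z) = b_1 + b_3 z, \qquad
\operatorname{Var}\!\big(\omega(z)\big)
  = \operatorname{Var}(b_1) + 2 z \operatorname{Cov}(b_1, b_3) + z^2 \operatorname{Var}(b_3)

% Test statistic and confidence band for the simple slope
t = \frac{\omega(z)}{\sqrt{\operatorname{Var}(\omega(z))}}, \qquad
\omega(z) \pm t_{\mathrm{crit}} \sqrt{\operatorname{Var}(\omega(z))}
```

The region of significance is the set of values of z for which |t| exceeds the critical value, and the confidence band traces the interval above across the range of z.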

  16. Social Influence on Information Technology Adoption and Sustained Use in Healthcare: A Hierarchical Bayesian Learning Method Analysis

    Science.gov (United States)

    Hao, Haijing

    2013-01-01

    Information technology adoption and diffusion is currently a significant challenge in the healthcare delivery setting. This thesis includes three papers that explore social influence on information technology adoption and sustained use in the healthcare delivery environment using conventional regression models and novel hierarchical Bayesian…

  17. Micromechanics of hierarchical materials

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon, Jr.

    2012-01-01

    A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...

  18. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    Science.gov (United States)

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  19. How hierarchical is language use?

    Science.gov (United States)

    Frank, Stefan L.; Bod, Rens; Christiansen, Morten H.

    2012-01-01

    It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent neurophysiological, behavioural and computational studies show that sequential sentence structure has considerable explanatory power and that hierarchical processing is often not involved. In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions. If linguistic phenomena can be explained by sequential rather than hierarchical structure, this will have considerable impact in a wide range of fields, such as linguistics, ethology, cognitive neuroscience, psychology and computer science. PMID:22977157

  20. On a Robust MaxEnt Process Regression Model with Sample-Selection

    Directory of Open Access Journals (Sweden)

    Hea-Jung Kim

    2018-04-01

    In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt) process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR) model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR) model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction, is demonstrated through results in simulations that attest to its good finite-sample performance.

  1. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age

    Directory of Open Access Journals (Sweden)

    Marko Wilke

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1–75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php. Keywords: MRI template creation, Multivariate adaptive regression splines, DARTEL, Structural MRI

  2. Assessment of Differential Item Functioning in Health-Related Outcomes: A Simulation and Empirical Analysis with Hierarchical Polytomous Data

    Directory of Open Access Journals (Sweden)

    Zahra Sharafi

    2017-01-01

    Background. The purpose of this study was to evaluate the effectiveness of two methods of detecting differential item functioning (DIF) in the presence of multilevel data and polytomously scored items. The assessment of DIF with multilevel data (e.g., patients nested within hospitals, hospitals nested within districts) from large-scale assessment programs has received considerable attention but very few studies evaluated the effect of hierarchical structure of data on DIF detection for polytomously scored items. Methods. The ordinal logistic regression (OLR) and hierarchical ordinal logistic regression (HOLR) were utilized to assess DIF in simulated and real multilevel polytomous data. Six factors (DIF magnitude, grouping variable, intraclass correlation coefficient, number of clusters, number of participants per cluster, and item discrimination parameter) with a fully crossed design were considered in the simulation study. Furthermore, data of Pediatric Quality of Life Inventory™ (PedsQL™) 4.0 collected from 576 healthy school children were analyzed. Results. Overall, results indicate that both methods performed equivalently in terms of controlling Type I error and detection power rates. Conclusions. The current study showed negligible difference between OLR and HOLR in detecting DIF with polytomously scored items in a hierarchical structure. Implications and considerations while analyzing real data were also discussed.

  3. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas both for accuracy of imperviousness coverage evaluation in individual points in time and accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  4. Regularized principal covariates regression and its application to finding coupled patterns in climate fields

    Science.gov (United States)

    Fischer, M. J.

    2014-02-01

    There are many different methods for investigating the coupling between two climate fields, which are all based on the multivariate regression model. Each different method of solving the multivariate model has its own attractive characteristics, but often the suitability of a particular method for a particular problem is not clear. Continuum regression methods search the solution space between the conventional methods and thus can find regression model subspaces that mix the attractive characteristics of the end-member subspaces. Principal covariates regression is a continuum regression method that is easily applied to climate fields and makes use of two end-members: principal components regression and redundancy analysis. In this study, principal covariates regression is extended to additionally span a third end-member (partial least squares or maximum covariance analysis). The new method, regularized principal covariates regression, has several attractive features including the following: it easily applies to problems in which the response field has missing values or is temporally sparse, it explores a wide range of model spaces, and it seeks a model subspace that will, for a set number of components, have a predictive skill that is the same or better than conventional regression methods. The new method is illustrated by applying it to the problem of predicting the southern Australian winter rainfall anomaly field using the regional atmospheric pressure anomaly field. Regularized principal covariates regression identifies four major coupled patterns in these two fields. The two leading patterns, which explain over half the variance in the rainfall field, are related to the subtropical ridge and features of the zonally asymmetric circulation.

  5. Hierarchical architecture of active knits

    International Nuclear Information System (INIS)

    Abel, Julianna; Luntz, Jonathan; Brei, Diann

    2013-01-01

    Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture—active knits—provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions are displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm. (paper)

  6. Association between parental guilt and oral health problems in preschool children: a hierarchical approach.

    Science.gov (United States)

    Gomes, Monalisa Cesarino; Clementino, Marayza Alves; Pinto-Sarmento, Tassia Cristina de Almeida; Martins, Carolina Castro; Granville-Garcia, Ana Flávia; Paiva, Saul Martins

    2014-08-16

    Dental caries and traumatic dental injury (TDI) can play an important role in the emergence of parental guilt, since parents feel responsible for their child's health. The aim of the present study was to evaluate the influence of oral health problems among preschool children on parental guilt. A preschool-based, cross-sectional study was carried out with 832 preschool children between three and five years of age in the city of Campina Grande, Brazil. Parents/caregivers answered the Brazilian version of the Early Childhood Oral Health Impact Scale (B-ECOHIS). The item "parental guilt" was the dependent variable. Questionnaires addressing socio-demographic variables (child's sex, child's age, parent's/caregiver's age, mother's schooling, type of preschool and household income), history of toothache and health perceptions (general and oral) were also administered. Clinical exams for dental caries and TDI were performed by three dentists who had undergone a training and calibration exercise (Kappa: 0.85-0.90). Poisson hierarchical regression was used to determine the significance of associations between parental guilt and oral health problems (α = 5%). The multivariate model was carried out on three levels using a hierarchical approach from distal to proximal determinants: 1) socio-demographic aspects; 2) health perceptions; and 3) oral health problems. The frequency of parental guilt was 22.8%. The following variables were significantly associated with parental guilt: parental perception of child's oral health as poor (PR = 2.010; 95% CI: 1.502-2.688), history of toothache (PR = 2.344; 95% CI: 1.755-3.130), cavitated lesions (PR = 2.002; 95% CI: 1.388-2.887), avulsion/luxation (PR = 2.029; 95% CI: 1.141-3.610) and tooth discoloration (PR = 1.540; 95% CI: 1.169-2.028). Based on the present findings, parental guilt increases with the occurrence of oral health problems that require treatment, such as dental caries and TDI of greater severity. Parental perceptions of
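
The prevalence ratios (PR) and 95% confidence intervals quoted in this abstract are the kind of output obtained by exponentiating a Poisson regression coefficient and its interval limits. The helper below is a minimal sketch of that conversion; the function name and the example inputs are hypothetical and are not values from the study.

```python
import math

def prevalence_ratio(beta, se, z=1.959964):
    """Convert a log-scale Poisson regression coefficient and its standard
    error into a prevalence ratio with a Wald confidence interval.

    `z` defaults to the normal quantile for a 95% interval.
    """
    pr = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return pr, lower, upper

# Hypothetical coefficient and standard error, for illustration only.
pr_demo, lo_demo, hi_demo = prevalence_ratio(math.log(2.0), 0.15)
```

Because the interval is symmetric on the log scale, the PR is the geometric mean of its two limits, which gives a quick consistency check on reported values.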

  7. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius

    2017-07-03

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.
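
For readers unfamiliar with the class being generalized, an Archimax copula is usually written as below. The notation is generic and assumed here, not quoted from the paper: ψ denotes an Archimedean generator and ℓ a stable tail dependence function.

```latex
C(u_1, \dots, u_d)
  = \psi\!\big( \ell\big( \psi^{-1}(u_1), \dots, \psi^{-1}(u_d) \big) \big),
  \qquad (u_1, \dots, u_d) \in [0,1]^d

% Special cases:
%   \ell(x_1, \dots, x_d) = x_1 + \dots + x_d  gives an Archimedean copula;
%   \psi(t) = e^{-t}                           gives an extreme-value copula.
```

Nesting, as described in the abstract, introduces hierarchy either in the stable tail dependence function or in the frailty that underlies the generator.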

  8. Nested and Hierarchical Archimax copulas

    KAUST Repository

    Hofert, Marius; Huser, Raphaël; Prasad, Avinash

    2017-01-01

    The class of Archimax copulas is generalized to nested and hierarchical Archimax copulas in several ways. First, nested extreme-value copulas or nested stable tail dependence functions are introduced to construct nested Archimax copulas based on a single frailty variable. Second, a hierarchical construction of d-norm generators is presented to construct hierarchical stable tail dependence functions and thus hierarchical extreme-value copulas. Moreover, one can, by itself or additionally, introduce nested frailties to extend Archimax copulas to nested Archimax copulas in a similar way as nested Archimedean copulas extend Archimedean copulas. Further results include a general formula for the density of Archimax copulas.

  9. Using multiobjective tradeoff sets and Multivariate Regression Trees to identify critical and robust decisions for long term water utility planning

    Science.gov (United States)

    Smith, R.; Kasprzyk, J. R.; Balaji, R.

    2017-12-01

    In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.

  10. Hierarchical Bayesian Markov switching models with application to predicting spawning success of shovelnose sturgeon

    Science.gov (United States)

    Holan, S.H.; Davis, G.M.; Wildhaber, M.L.; DeLonay, A.J.; Papoulias, D.M.

    2009-01-01

    The timing of spawning in fish is tightly linked to environmental factors; however, these factors are not very well understood for many species. Specifically, little information is available to guide recruitment efforts for endangered species such as the sturgeon. Therefore, we propose a Bayesian hierarchical model for predicting the success of spawning of the shovelnose sturgeon which uses both biological and behavioural (longitudinal) data. In particular, we use data that were produced from a tracking study that was conducted in the Lower Missouri River. The data that were produced from this study consist of biological variables associated with readiness to spawn along with longitudinal behavioural data collected by using telemetry and archival data storage tags. These high frequency data are complex both biologically and in the underlying behavioural process. To accommodate such complexity we developed a hierarchical linear regression model that uses an eigenvalue predictor, derived from the transition probability matrix of a two-state Markov switching model with generalized auto-regressive conditional heteroscedastic dynamics. Finally, to minimize the computational burden that is associated with estimation of this model, a parallel computing approach is proposed. © Journal compilation 2009 Royal Statistical Society.

  11. Pattern recognition by the use of multivariate statistical evaluation of macro- and micro-PIXE results

    International Nuclear Information System (INIS)

    Tapper, U.A.S.; Malmqvist, K.G.; Loevestam, N.E.G.; Swietlicki, E.; Salford, L.G.

    1991-01-01

    The importance of statistical evaluation of multielemental data is illustrated using the data collected in a macro- and micro-PIXE analysis of human brain tumours. By employing a multivariate statistical classification methodology (SIMCA) it was shown that the total information collected from each specimen separates three types of tissue: High malignant, less malignant and normal brain tissue. This makes a classification of a given specimen possible based on the elemental concentrations. Partial least squares regression (PLS), a multivariate regression method, made it possible to study the relative importance of the examined nine trace elements, the dry/wet weight ratio and the age of the patient in predicting the survival time after operation for patients with the high malignant form, astrocytomas grade III-IV. The elemental maps from a microprobe analysis were also subjected to multivariate analysis. This showed that the six elements sorted into maps could be presented in three maps containing all the relevant information. The intensity in these maps is proportional to the value (score) of the actual pixel along the calculated principal components. (orig.)

  12. Production optimisation in the petrochemical industry by hierarchical multivariate modelling. Phase 2: On-line implementation

    Energy Technology Data Exchange (ETDEWEB)

    Nilsson, Aasa; Persson, Fredrik; Andersson, Magnus

    2009-07-15

    IVL, together with Emerson Process Management, has developed a decision support system (DSS) based on multivariate statistical process models. The system was implemented at Nynas AB's refinery in order to provide real-time TBP curves and to enable the operator to optimise the process with regard to product quality and energy consumption. The project resulted in the following proven benefits at the industrial reference site, Nynas Refinery in Gothenburg: an increase in yield of up to 14 % (relative terms) for the most valuable product, and a decrease in energy consumption of 8 %. Validation of model predictions compared to the laboratory analysis showed that the prediction error lay within 1 deg C throughout the whole test period.

  13. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study

    Directory of Open Access Journals (Sweden)

    Tania Dehesh

    2015-01-01

    Background. Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was proposing four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Result. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights advantages of MGLS meta-analysis on UM approach. The results suggested the use of MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.

  14. The Covariance Adjustment Approaches for Combining Incomparable Cox Regressions Caused by Unbalanced Covariates Adjustment: A Multivariate Meta-Analysis Study.

    Science.gov (United States)

    Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi

    2015-01-01

    Univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was proposing four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that MMC approach was the most accurate procedure compared to EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights advantages of MGLS meta-analysis on UM approach. The results suggested the use of MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.
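
    The generalized least squares pooling that this record refers to can be sketched as follows. This is a minimal fixed-effect illustration, not the authors' procedure: the function name `gls_pool`, the two made-up studies and the "zero-correlation" stand-in (which simply discards off-diagonal covariance entries) are assumptions introduced here for illustration.

```python
import numpy as np


def gls_pool(estimates, covariances):
    """Fixed-effect GLS pooling of study-level coefficient vectors.

    pooled = (sum_i S_i^-1)^-1 * sum_i S_i^-1 b_i
    cov    = (sum_i S_i^-1)^-1
    """
    precisions = [np.linalg.inv(S) for S in covariances]
    pooled_cov = np.linalg.inv(sum(precisions))
    weighted_sum = sum(W @ b for W, b in zip(precisions, estimates))
    return pooled_cov @ weighted_sum, pooled_cov


# Two hypothetical studies, each reporting two log hazard ratios (made-up numbers).
b = [np.array([0.40, 0.10]), np.array([0.60, 0.30])]
S = [np.array([[0.04, 0.01], [0.01, 0.09]]),
     np.array([[0.02, 0.00], [0.00, 0.05]])]

# Pooling with the full covariance matrices.
pooled_full, cov_full = gls_pool(b, S)

# Assumed stand-in for a zero-correlation approximation:
# keep the reported variances, drop the covariances.
pooled_zc, cov_zc = gls_pool(b, [np.diag(np.diag(Si)) for Si in S])
```

    In the one-dimensional case the same function reduces to ordinary inverse-variance weighting, which is a convenient sanity check.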

  15. The effect of hospital mergers on long-term sickness absence among hospital employees: a fixed effects multivariate regression analysis using panel data.

    Science.gov (United States)

    Kjekshus, Lars Erik; Bernstrøm, Vilde Hoff; Dahl, Espen; Lorentzen, Thomas

    2014-02-03

    Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Mergers have a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers.

  16. Understanding gendered aspects of migration aspiration and motives of university students by multivariate statistical methods

    Directory of Open Access Journals (Sweden)

    Đula Borozan

    2014-03-01

    The paper deals with the application of multivariate analysis of variance and logistic regression in measuring, explaining and evaluating (i) gender differences in expressing migration aspirations, and (ii) a gender effect on migration motivation of university students in Croatia. The results supported the thesis that migration is a complex gendering process that assumes subjective assessment of the whole set of interrelated motives. According to logistic regression, gender is a significant predictor of migration aspirations among the selected demographic and socio-economic variables. A multivariate analysis of variance showed that gender and migration aspirations in interaction matter when it comes to migration motives, particularly related to the perceived importance of social networks. Females, and especially those who aspire to migrate, assessed these motives as more important than males.

  17. Assessing risk factors for periodontitis using regression

    Science.gov (United States)

    Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa

    2013-10-01

    Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using regression, linear and logistic models, we can assess the relevance, as risk factors for periodontitis disease, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the correspondent 95% confidence intervals.
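
    As a rough illustration of the association measure named in this record, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2x2 table. It is an assumption-laden simplification: the record's odds ratios come from a logistic model with several covariates, whereas the function `odds_ratio_ci` and the counts used here are hypothetical and cover only the single-factor case.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed cases,    b: exposed non-cases,
    c: unexposed cases,  d: unexposed non-cases.
    The interval is built on the log scale and then exponentiated.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: smokers vs non-smokers, cases defined by the AL threshold.
or_hat, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
```

    Because the interval is symmetric on the log scale, the product of its two limits equals the squared odds ratio, which gives a quick check on the arithmetic.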

  18. Forecasting building energy consumption with hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system

    Energy Technology Data Exchange (ETDEWEB)

    Li, Kangji [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China); School of Electricity Information Engineering, Jiangsu University, Zhenjiang 212013 (China); Su, Hongye [Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027 (China)

    2010-11-15

    There are several ways to forecast building energy consumption, varying from simple regression to models based on physical principles. In this paper, a new method, namely, the hybrid genetic algorithm-hierarchical adaptive network-based fuzzy inference system (GA-HANFIS) model is developed. In this model, hierarchical structure decreases the rule base dimension. Both clustering and rule base parameters are optimized by GAs and neural networks (NNs). The model is applied to predict a hotel's daily air conditioning consumption for a period over 3 months. The results obtained by the proposed model are presented and compared with regular method of NNs, which indicates that GA-HANFIS model possesses better performance than NNs in terms of their forecasting accuracy. (author)

  19. Bayesian Hierarchical Distributed Lag Models for Summer Ozone Exposure and Cardio-Respiratory Mortality

    OpenAIRE

    Yi Huang; Francesca Dominici; Michelle Bell

    2004-01-01

    In this paper, we develop Bayesian hierarchical distributed lag models for estimating associations between daily variations in summer ozone levels and daily variations in cardiovascular and respiratory (CVDRESP) mortality counts for 19 U.S. large cities included in the National Morbidity Mortality Air Pollution Study (NMMAPS) for the period 1987 - 1994. At the first stage, we define a semi-parametric distributed lag Poisson regression model to estimate city-specific relative rates of CVDRESP ...

  20. Extracting bb Higgs Decay Signals using Multivariate Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Smith, W Clarke; /George Washington U. /SLAC

    2012-08-28

    For low-mass Higgs boson production at ATLAS at $\sqrt{s} = 7$ TeV, the hard subprocess $gg \to h^0 \to b\bar{b}$ dominates but is in turn drowned out by background. We seek to exploit the intrinsic few-MeV mass width of the Higgs boson to observe it above the background in $b\bar{b}$-dijet mass plots. The mass resolution of existing mass-reconstruction algorithms is insufficient for this purpose due to jet combinatorics, that is, the algorithms cannot identify every jet that results from $b\bar{b}$ Higgs decay. We combine these algorithms using the neural net (NN) and boosted regression tree (BDT) multivariate methods in attempt to improve the mass resolution. Events involving $gg \to h^0 \to b\bar{b}$ are generated using Monte Carlo methods with Pythia and then the Toolkit for Multivariate Analysis (TMVA) is used to train and test NNs and BDTs. For a 120 GeV Standard Model Higgs boson, the $m_{h^0}$-reconstruction width is reduced from 8.6 to 6.5 GeV. Most importantly, however, the methods used here allow for more advanced $m_{h^0}$-reconstructions to be created in the future using multivariate methods.

  1. Prediction of periodically correlated processes by wavelet transform and multivariate methods with applications to climatological data

    Science.gov (United States)

    Ghanbarzadeh, Mitra; Aminghafari, Mina

    2015-05-01

    This article studies the prediction of periodically correlated processes using wavelet transform and multivariate methods with applications to climatological data. Periodically correlated processes can be reformulated as multivariate stationary processes. Considering this fact, two new prediction methods are proposed. In the first method, we use stepwise regression between the principal components of the multivariate stationary process and past wavelet coefficients of the process to get a prediction. In the second method, we propose its multivariate version without principal component analysis a priori. Also, we study a generalization of the prediction methods dealing with a deterministic trend using exponential smoothing. Finally, we illustrate the performance of the proposed methods on simulated and real climatological data (ozone amounts, flows of a river, solar radiation, and sea levels) compared with the multivariate autoregressive model. The proposed methods give good results as we expected.
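
    The reformulation mentioned in this record, in which a periodically correlated series with period T is rewritten as a T-variate series with one row per cycle, can be sketched as below. The helper `block_by_period` and the toy data are assumptions made here for illustration; the wavelet and regression steps of the article are not reproduced.

```python
import numpy as np


def block_by_period(x, period):
    """Reshape a univariate series into a (cycles, period) array.

    Row n holds x[n*period], ..., x[n*period + period - 1], so each
    column collects one "season" of the periodic process. Trailing
    observations that do not fill a whole cycle are dropped.
    """
    x = np.asarray(x, dtype=float)
    n_cycles = x.size // period
    return x[: n_cycles * period].reshape(n_cycles, period)


# Toy series whose variance repeats with period 4 (illustrative data only).
rng = np.random.default_rng(0)
period = 4
scale = np.array([1.0, 2.0, 0.5, 3.0])
x = rng.standard_normal(400) * np.tile(scale, 100)

blocks = block_by_period(x, period)   # shape (100, 4)
season_std = blocks.std(axis=0)       # one spread estimate per season
```

    Each column of `blocks` can then be treated as one component of a multivariate series observed once per cycle.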

  2. Identification of Civil Engineering Structures using Multivariate ARMAV and RARMAV Models

    DEFF Research Database (Denmark)

    Kirkegaard, Poul Henning; Andersen, P.; Brincker, Rune

    This paper presents how to make system identification of civil engineering structures using multivariate auto-regressive moving-average vector (ARMAV) models. Further, the ARMAV technique is extended to a recursive technique (RARMAV). The ARMAV model is used to identify measured stationary data....... The results show the usefulness of the approaches for identification of civil engineering structures excited by natural excitation...

  3. Introduction to multivariate discrimination

    Science.gov (United States)

    Kégl, Balázs

    2013-07-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written to an average engineering, computer science, or statistics graduate student; most of them are also accessible for an average physics student with some background on computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms, neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview on the form of the functions these methods learn and on the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  4. Introduction to multivariate discrimination

    International Nuclear Information System (INIS)

    Kegl, B.

    2013-01-01

    Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written to an average engineering, computer science, or statistics graduate student; most of them are also accessible for an average physics student with some background on computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to a practitioner experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyper-parameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms, neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview on the form of the functions these methods learn and on the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either

  5. A multivariate nonlinear mixed effects method for analyzing energy partitioning in growing pigs

    DEFF Research Database (Denmark)

    Strathe, Anders Bjerring; Danfær, Allan Christian; Chwalibog, André

    2010-01-01

    to the multivariate nonlinear regression model because the MNLME method accounted for correlated errors associated with PD and LD measurements and could also include the random effect of animal. It is recommended that multivariate models used to quantify energy metabolism in growing pigs should account for animal......Simultaneous equations have become increasingly popular for describing the effects of nutrition on the utilization of ME for protein (PD) and lipid deposition (LD) in animals. The study developed a multivariate nonlinear mixed effects (MNLME) framework and compared it with an alternative method...... for estimating parameters in simultaneous equations that described energy metabolism in growing pigs, and then proposed new PD and LD equations. The general statistical framework was implemented in the NLMIXED procedure in SAS. Alternative PD and LD equations were also developed, which assumed...

  6. Sparse Reduced-Rank Regression for Simultaneous Dimension Reduction and Variable Selection

    KAUST Repository

    Chen, Lisha

    2012-12-01

    The reduced-rank regression is an effective method in predicting multiple response variables from the same set of predictor variables. It reduces the number of model parameters and takes advantage of interrelations between the response variables and hence improves predictive accuracy. We propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty. We apply a group-lasso type penalty that treats each row of the matrix of the regression coefficients as a group and show that this penalty satisfies certain desirable invariance properties. We develop two numerical algorithms to solve the penalized regression problem and establish the asymptotic consistency of the proposed method. In particular, the manifold structure of the reduced-rank regression coefficient matrix is considered and studied in our theoretical analysis. In our simulation study and real data analysis, the new method is compared with several existing variable selection methods for multivariate regression and exhibits competitive performance in prediction and variable selection. © 2012 American Statistical Association.
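
    The classical (unpenalized) reduced-rank regression that this record builds on can be sketched in a few lines. This is only the rank-constrained least squares baseline with identity weighting; the group-lasso penalty and the two algorithms proposed in the article are not implemented. The function name `reduced_rank_regression` and the simulated data are assumptions introduced for illustration.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank):
    """Rank-constrained least squares.

    Minimises ||Y - X C||_F subject to rank(C) <= rank by projecting the
    ordinary least squares coefficients onto the leading right singular
    vectors of the fitted values X @ C_ols.
    """
    C_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    _, _, Vt = np.linalg.svd(X @ C_ols, full_matrices=False)
    V_r = Vt[:rank].T                 # (q, rank) leading response directions
    return C_ols @ V_r @ V_r.T        # (p, q) coefficient matrix of rank <= rank


# Simulated data with a rank-2 coefficient matrix (illustrative only).
rng = np.random.default_rng(1)
n, p, q, true_rank = 200, 6, 5, 2
X = rng.standard_normal((n, p))
C_true = rng.standard_normal((p, true_rank)) @ rng.standard_normal((true_rank, q))
Y = X @ C_true + 0.1 * rng.standard_normal((n, q))

C_hat = reduced_rank_regression(X, Y, rank=2)
```

    The estimate uses `rank * (p + q - rank)` free parameters instead of `p * q`, which is the parameter reduction the abstract refers to.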

  7. Integrated environmental monitoring and multivariate data analysis-A case study.

    Science.gov (United States)

    Eide, Ingvar; Westad, Frank; Nilssen, Ingunn; de Freitas, Felipe Sales; Dos Santos, Natalia Gomes; Dos Santos, Francisco; Cabral, Marcelo Montenegro; Bicego, Marcia Caruso; Figueira, Rubens; Johnsen, Ståle

    2017-03-01

    The present article describes integration of environmental monitoring and discharge data and interpretation using multivariate statistics, principal component analysis (PCA), and partial least squares (PLS) regression. The monitoring was carried out at the Peregrino oil field off the coast of Brazil. One sensor platform and 3 sediment traps were placed on the seabed. The sensors measured current speed and direction, turbidity, temperature, and conductivity. The sediment trap samples were used to determine suspended particulate matter that was characterized with respect to a number of chemical parameters (26 alkanes, 16 PAHs, N, C, calcium carbonate, and Ba). Data on discharges of drill cuttings and water-based drilling fluid were provided on a daily basis. The monitoring was carried out during 7 campaigns from June 2010 to October 2012, each lasting 2 to 3 months due to the capacity of the sediment traps. The data from the campaigns were preprocessed, combined, and interpreted using multivariate statistics. No systematic difference could be observed between campaigns or traps despite the fact that the first campaign was carried out before drilling, and 1 of 3 sediment traps was located in an area not expected to be influenced by the discharges. There was a strong covariation between suspended particulate matter and total N and organic C suggesting that the majority of the sediment samples had a natural and biogenic origin. Furthermore, the multivariate regression showed no correlation between discharges of drill cuttings and sediment trap or turbidity data taking current speed and direction into consideration. Because of this lack of correlation with discharges from the drilling location, a more detailed evaluation of chemical indicators providing information about origin was carried out in addition to numerical modeling of dispersion and deposition. The chemical indicators and the modeling of dispersion and deposition support the conclusions from the multivariate

  8. Prediction of diffuse solar irradiance using machine learning and multivariable regression

    International Nuclear Information System (INIS)

    Lou, Siwei; Li, Danny H.W.; Lam, Joseph C.; Chan, Wilco W.H.

    2016-01-01

    Highlights: • 54.9% of the annual global irradiance is composed by its diffuse part in HK. • Hourly diffuse irradiance was predicted by accessible variables. • The importance of variable in prediction was assessed by machine learning. • Simple prediction equations were developed with the knowledge of variable importance. - Abstract: The paper studies the horizontal global, direct-beam and sky-diffuse solar irradiance data measured in Hong Kong from 2008 to 2013. A machine learning algorithm was employed to predict the horizontal sky-diffuse irradiance and conduct sensitivity analysis for the meteorological variables. Apart from the clearness index (horizontal global/extra atmospheric solar irradiance), we found that predictors including solar altitude, air temperature, cloud cover and visibility are also important in predicting the diffuse component. The mean absolute error (MAE) of the logistic regression using the aforementioned predictors was less than 21.5 W/m² and 30 W/m² for Hong Kong and Denver, USA, respectively. With the systematic recording of the five variables for more than 35 years, the proposed model would be appropriate for estimating long-term diffuse solar radiation, studying climate change and developing typical meteorological years in Hong Kong and places with similar climates.

  9. A Hierarchical Approach Using Machine Learning Methods in Solar Photovoltaic Energy Production Forecasting

    OpenAIRE

    Zhaoxuan Li; SM Mahbobur Rahman; Rolando Vega; Bing Dong

    2016-01-01

    We evaluate and compare two common methods, artificial neural networks (ANN) and support vector regression (SVR), for predicting energy productions from a solar photovoltaic (PV) system in Florida 15 min, 1 h and 24 h ahead of time. A hierarchical approach is proposed based on the machine learning algorithms tested. The production data used in this work corresponds to 15 min averaged power measurements collected from 2014. The accuracy of the model is determined using computing error statisti...

  10. Neutrosophic Hierarchical Clustering Algoritms

    Directory of Open Access Journals (Sweden)

    Rıdvan Şahin

    2014-03-01

    Interval neutrosophic set (INS) is a generalization of interval valued intuitionistic fuzzy set (IVIFS), whose membership and non-membership values of elements consist of fuzzy range, while single valued neutrosophic set (SVNS) is regarded as extension of intuitionistic fuzzy set (IFS). In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify an interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.

  11. A comparative study of artificial neural network and multivariate regression analysis to analyze optimum renal stone fragmentation by extracorporeal shock wave lithotripsy

    Directory of Open Access Journals (Sweden)

    Goyal Neeraj

    2010-01-01

    To compare the accuracy of artificial neural network (ANN) analysis and multi-variate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of trained ANN was tested on 80 subsequent patients. The input data include age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. Of these 80 patients, the input was analyzed and output was also calculated by MVRA. The output values (predicted values) from both the methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using 1:1 slope line. The results were calculated as coefficient of correlation (COC) (r²). For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL.

  12. A comparative study of artificial neural network and multivariate regression analysis to analyze optimum renal stone fragmentation by extracorporeal shock wave lithotripsy

    International Nuclear Information System (INIS)

    Neeraj K Goyal, Abhay Kumar; Sameer Trivedi

    2010-01-01

    To compare the accuracy of artificial neural network (ANN) analysis and multivariate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of trained ANN was tested on 80 subsequent patients. The input data include age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. Of these 80 patients, the input was analyzed and output was also calculated by MVRA. The output values (predicted values) from both the methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using 1:1 slope line. The results were calculated as coefficient of correlation (COC) (r²). For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL (Author).
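
    The comparison of predicted and observed values by r² described in this abstract can be illustrated with a minimal sketch. It assumes the reported COC is the squared Pearson correlation between observed and predicted values; the function name `r_squared` and all numbers below are hypothetical and are not taken from the study.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov * cov / (var_o * var_p)


# Hypothetical numbers of shocks, for illustration only (not study data).
observed = [1000, 1500, 2000, 2500, 3000]
model_a = [1100, 1400, 2100, 2400, 3100]  # predictions that track the observations
model_b = [2000, 1900, 2100, 2000, 2050]  # nearly flat predictions

print("model A r^2:", round(r_squared(observed, model_a), 3))
print("model B r^2:", round(r_squared(observed, model_b), 3))
```

    A model whose predictions follow the 1:1 line more closely yields the larger r², which is the sense in which the abstract ranks ANN above MVRA.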

  13. The Hierarchical Perspective

    Directory of Open Access Journals (Sweden)

    Daniel Sofron

    2015-05-01

    Full Text Available This paper is focused on the hierarchical perspective, one of the methods for representing space that was used before the discovery of the Renaissance linear perspective. The hierarchical perspective has a more or less pronounced scientific character and its study offers us a clear image of the way the representatives of the cultures that developed it used to perceive the sensitive reality. This type of perspective is an original method of representing three-dimensional space on a flat surface, which characterises the art of Ancient Egypt and much of the art of the Middle Ages, being identified in the Eastern European Byzantine art, as well as in the Western European Pre-Romanesque and Romanesque art. At the same time, the hierarchical perspective is also present in naive painting and infantile drawing. Reminiscences of this method can be recognised also in the works of some precursors of the Italian Renaissance. The hierarchical perspective can be viewed as a subjective ranking criterion, according to which the elements are visually represented by taking into account their relevance within the image while perception is ignored. This paper aims to show how the main objective of the artists of those times was not to faithfully represent the objective reality, but rather to emphasize the essence of the world and its perennial aspects. This may represent a possible explanation for the refusal of perspective in the Egyptian, Romanesque and Byzantine painting, characterised by a marked two-dimensionality.

  14. Semiparametric Bernstein–von Mises for the error standard deviation

    OpenAIRE

    Jonge, de, R.; Zanten, van, J.H.

    2013-01-01

    We study Bayes procedures for nonparametric regression problems with Gaussian errors, giving conditions under which a Bernstein–von Mises result holds for the marginal posterior distribution of the error standard deviation. We apply our general results to show that a single Bayes procedure using a hierarchical spline-based prior on the regression function and an independent prior on the error variance, can simultaneously achieve adaptive, rate-optimal estimation of a smooth, multivariate regr...

  15. Comparisons of Flow Patterns over a Hierarchical and a Non-hierarchical Surface in Relation to Biofouling Control

    Directory of Open Access Journals (Sweden)

    Bin Ahmad Fawzan Mohammed Ridha

    2018-01-01

    Full Text Available Biofouling can be defined as the unwanted deposition and development of organisms on submerged surfaces. It is a major problem as it causes water contamination, infrastructure damage and increases in maintenance and operational costs, especially in the shipping industry. There are a few methods that can prevent this problem. One of the most effective methods, the use of chemicals, particularly tributyltin, has been banned due to adverse effects on the environment. One of the non-toxic methods found to be effective is surface modification, which involves altering the surface topography so that it becomes a low-fouling or non-stick surface to biofouling organisms. The current literature suggests that non-hierarchical topographies have lower antifouling performance than hierarchical topographies. It is still unclear whether the effects of the flow on these topographies could have aided their antifouling properties. This research uses Computational Fluid Dynamics (CFD) simulations to study the flow over these two topographies, which also involves a comparison of the topographies used. The results show that the hierarchical topography has higher antifouling performance than the non-hierarchical topography. This is because the fluid characteristics at the hierarchical topography are more favorable for controlling biofouling. In addition, the hierarchical topography has a higher wall shear stress distribution than the non-hierarchical topography.

  16. Adaptive hierarchical multi-agent organizations

    NARCIS (Netherlands)

    Ghijsen, M.; Jansweijer, W.N.H.; Wielinga, B.J.; Babuška, R.; Groen, F.C.A.

    2010-01-01

    In this chapter, we discuss the design of adaptive hierarchical organizations for multi-agent systems (MAS). Hierarchical organizations have a number of advantages such as their ability to handle complex problems and their scalability to large organizations. By introducing adaptivity in the

  17. Symbolic Regression Via Genetic Programming as a Discovery Engine: Insights on Outliers and Prototypes

    Science.gov (United States)

    Kotanchek, Mark E.; Vladislavleva, Ekaterina Y.; Smits, Guido F.

    In this chapter we illustrate a framework based on symbolic regression to generate and sharpen the questions about the nature of the underlying system and provide additional context and understanding based on multi-variate numeric data.

  18. Finding Similarities in Ancient Ceramics by EDXRF and Multivariate Methods

    International Nuclear Information System (INIS)

    Civici, N.; Stamati, F.

    1999-01-01

    We have studied 39 samples of fragments from ceramic roof tiles with different stamps (Diamalas and Heraion), dated between 330 and 170 BC and found at the archaeological site of Dimales, some 30 km from the Adriatic coast. The data from these samples were compared with those obtained from 7 samples of similar objects and period with the stamp "Heraion", found at the archaeological site of Apollonia. The samples were analyzed by energy-dispersive X-ray fluorescence (EDXRF), using the ratio of the x-ray lines of the elements to the intensity of the Compton peak. The results have been treated with diverse multivariate methods. The application of hierarchical cluster analysis and factor analysis permitted the identification of two main clusters. The first cluster is composed of the "Heraion" samples discovered in Apollonia, while the second comprises all the samples discovered in Dimale independent of their stamp. (authors)

  19. Soft sensor design by multivariate fusion of image features and process measurements

    DEFF Research Database (Denmark)

    Lin, Bao; Jørgensen, Sten Bay

    2011-01-01

    This paper presents a multivariate data fusion procedure for design of dynamic soft sensors where suitably selected image features are combined with traditional process measurements to enhance the performance of data-driven soft sensors. A key issue of fusing multiple sensor data, i.e. to determine...... with a multivariate analysis technique from RGB pictures. The color information is also transformed to hue, saturation and intensity components. Both sets of image features are combined with traditional process measurements to obtain an inferential model by partial least squares (PLS) regression. A dynamic PLS model...... oxides (NOx) emission of cement kilns. On-site tests demonstrate improved performance over soft sensors based on conventional process measurements only....

  20. Application of Multivariate Adaptive Regression Splines to Sheet Metal Bending Process for Springback Compensation

    Directory of Open Access Journals (Sweden)

    Dilan Rasim Aşkın

    2016-01-01

    Full Text Available An intelligent regression technique is applied for sheet metal bending processes to improve bending performance. This study is a part of another extensive study, automated sheet bending assistance for press brakes. Data related to material properties of sheet metal is collected in an online manner and fed to an intelligent system for determining the most accurate punch displacement without any offline iteration or calibration. The overall system aims to reduce the production time while increasing the performance of press brakes.

  1. Models and Inference for Multivariate Spatial Extremes

    KAUST Repository

    Vettori, Sabrina

    2017-12-07

    The development of flexible and interpretable statistical methods is necessary in order to provide appropriate risk assessment measures for extreme events and natural disasters. In this thesis, we address this challenge by contributing to the developing research field of Extreme-Value Theory. We initially study the performance of existing parametric and non-parametric estimators of extremal dependence for multivariate maxima. As the dimensionality increases, non-parametric estimators are more flexible than parametric methods but present some loss in efficiency that we quantify under various scenarios. We introduce a statistical tool which imposes the required shape constraints on non-parametric estimators in high dimensions, significantly improving their performance. Furthermore, by embedding the tree-based max-stable nested logistic distribution in the Bayesian framework, we develop a statistical algorithm that identifies the most likely tree structures representing the data's extremal dependence using the reversible jump Markov chain Monte Carlo method. A mixture of these trees is then used for uncertainty assessment in prediction through Bayesian model averaging. The computational complexity of full likelihood inference is significantly decreased by deriving a recursive formula for the nested logistic model likelihood. The algorithm performance is verified through simulation experiments which also compare different likelihood procedures. Finally, we extend the nested logistic representation to the spatial framework in order to jointly model multivariate variables collected across a spatial region. This situation emerges often in environmental applications but is not often considered in the current literature. Simulation experiments show that the new class of multivariate max-stable processes is able to detect both the cross and inner spatial dependence of a number of extreme variables at a relatively low computational cost, thanks to its Bayesian hierarchical

  2. Hierarchical video summarization

    Science.gov (United States)

    Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.

    1998-12-01

    We address the problem of key-frame summarization of video in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
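
    The idea of coarsening a key-frame summary under a temporal consecutiveness constraint can be sketched as follows. This is not the authors' pairwise K-means procedure, whose details the abstract does not give; it swaps in a simple greedy merge of temporally adjacent groups, and the function name `coarsen_keyframes` and the one-dimensional features are hypothetical.

```python
def coarsen_keyframes(features, n_clusters):
    """Group temporally ordered key-frame features into contiguous clusters.

    features: list of equal-length feature vectors in temporal order.
    Returns a list of (start, end) index pairs, with end exclusive.
    """
    if not 1 <= n_clusters <= len(features):
        raise ValueError("n_clusters must be between 1 and len(features)")

    # Start with one segment per key-frame.
    segments = [(i, i + 1) for i in range(len(features))]

    def centroid(segment):
        start, end = segment
        dim = len(features[start])
        return [sum(features[i][d] for i in range(start, end)) / (end - start)
                for d in range(dim)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    while len(segments) > n_clusters:
        # Only temporally adjacent segments are candidates for merging,
        # so every cluster remains a contiguous run of key-frames.
        costs = [squared_distance(centroid(segments[i]), centroid(segments[i + 1]))
                 for i in range(len(segments) - 1)]
        i = costs.index(min(costs))
        segments[i:i + 2] = [(segments[i][0], segments[i + 1][1])]
    return segments


# Hypothetical one-dimensional color features for six key-frames.
frames = [[0.0], [0.1], [0.2], [5.0], [5.1], [9.0]]
print(coarsen_keyframes(frames, 3))
```

    Calling the function with decreasing values of `n_clusters` gives progressively coarser levels, which mirrors the coarse-to-fine browsing described in the abstract.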

  3. Efficient Actor-Critic Algorithm with Hierarchical Model Learning and Planning

    Science.gov (United States)

    Fu, QiMing

    2016-01-01

    To improve the convergence rate and the sample efficiency, two efficient learning methods AC-HMLP and RAC-HMLP (AC-HMLP with ℓ2-regularization) are proposed by combining actor-critic algorithm with hierarchical model learning and planning. The hierarchical models consisting of the local and the global models, which are learned at the same time during learning of the value function and the policy, are approximated by local linear regression (LLR) and linear function approximation (LFA), respectively. Both the local model and the global model are applied to generate samples for planning; the former is used only if the state-prediction error does not surpass the threshold at each time step, while the latter is utilized at the end of each episode. The purpose of taking both models is to improve the sample efficiency and accelerate the convergence rate of the whole algorithm through fully utilizing the local and global information. Experimentally, AC-HMLP and RAC-HMLP are compared with three representative algorithms on two Reinforcement Learning (RL) benchmark problems. The results demonstrate that they perform best in terms of convergence rate and sample efficiency. PMID:27795704

  4. Regression analysis of radiological parameters in nuclear power plants

    International Nuclear Information System (INIS)

    Bhargava, Pradeep; Verma, R.K.; Joshi, M.L.

    2003-01-01

    Indian Pressurized Heavy Water Reactors (PHWRs) have now attained maturity in their operations. Indian PHWR operation started in the year 1972. At present there are 12 operating PHWRs collectively producing nearly 2400 MWe. Sufficient radiological data are available for analysis to draw inferences which may be utilised for better understanding of radiological parameters influencing the collective internal dose. Tritium is the main contributor to the occupational internal dose originating in PHWRs. An attempt has been made to establish the relationship between radiological parameters, which may be useful to draw inferences about the internal dose. Regression analysis has been done to find out the relationship, if it exists, among the following variables: A. Specific tritium activity of heavy water (Moderator and PHT) and tritium concentration in air at various work locations. B. Internal collective occupational dose and tritium release to environment through air route. C. Specific tritium activity of heavy water (Moderator and PHT) and collective internal occupational dose. For this purpose multivariate regression analysis has been carried out. D. Tritium concentration in air at various work locations and tritium release to environment through air route. For this purpose multivariate regression analysis has been carried out. This analysis reveals that collective internal dose has a very good correlation with the tritium activity release to the environment through air route, whereas no correlation has been found between specific tritium activity in the heavy water systems and collective internal occupational dose. A good correlation has been found in case D, and the F test reveals that it is not by chance. (author)

  5. Concurrent and convergent validity of the mobility- and multidimensional-hierarchical disability categorization models with physical performance in community older adults.

    Science.gov (United States)

    Hu, Ming-Hsia; Yeh, Chih-Jun; Chen, Tou-Rong; Wang, Ching-Yi

    2014-01-01

    A valid, time-efficient and easy-to-use instrument is important for busy clinical settings, large scale surveys, or community screening use. The purpose of this study was to validate the mobility hierarchical disability categorization model (an abbreviated model) by investigating its concurrent validity with the multidimensional hierarchical disability categorization model (a comprehensive model) and triangulating both models with physical performance measures in older adults. 604 community-dwelling older adults of at least 60 years in age volunteered to participate. Self-reported function on mobility, instrumental activities of daily living (IADL) and activities of daily living (ADL) domains were recorded and then the disability status determined based on both the multidimensional hierarchical categorization model and the mobility hierarchical categorization model. The physical performance measures, consisting of grip strength and usual and fastest gait speeds (UGS, FGS), were collected on the same day. Both categorization models showed high correlation (γs = 0.92, p categorization models. The results of multiple regression analysis indicated that both models individually explain similar amount of variance on all physical performances, with adjustments for age, sex, and number of comorbidities. Our results found that the mobility hierarchical disability categorization model is a valid and time efficient tool for large survey or screening use.

  6. On the relation between S-Estimators and M-Estimators of multivariate location and covariance

    NARCIS (Netherlands)

    Lopuhaa, H.P.

    1987-01-01

    We discuss the relation between S-estimators and M-estimators of multivariate location and covariance. As in the case of the estimation of a multiple regression parameter, S-estimators are shown to satisfy first-order conditions of M-estimators. We show that the influence function IF (x;S F) of

  7. Dirichlet Component Regression and its Applications to Psychiatric Data

    OpenAIRE

    Gueorguieva, Ralitza; Rosenheck, Robert; Zelterman, Daniel

    2008-01-01

    We describe a Dirichlet multivariable regression method useful for modeling data representing components as a percentage of a total. This model is motivated by the unmet need in psychiatry and other areas to simultaneously assess the effects of covariates on the relative contributions of different components of a measure. The model is illustrated using the Positive and Negative Syndrome Scale (PANSS) for assessment of schizophrenia symptoms which, like many other metrics in psychiatry, is com...

  8. Study of cyanotoxins presence from experimental cyanobacteria concentrations using a new data mining methodology based on multivariate adaptive regression splines in Trasona reservoir (Northern Spain).

    Science.gov (United States)

    Garcia Nieto, P J; Sánchez Lasheras, F; de Cos Juez, F J; Alonso Fernández, J R

    2011-11-15

    There is an increasing need to describe cyanobacteria blooms since some cyanobacteria produce toxins, termed cyanotoxins. These can be toxic and dangerous to humans as well as to other animals and life in general. Cyanobacteria reproduce explosively under certain conditions. This results in algae blooms, which can become harmful to other species if the cyanobacteria involved produce cyanotoxins. In this research work, the evolution of cyanotoxins in Trasona reservoir (Principality of Asturias, Northern Spain) was studied successfully using a data mining methodology based on the multivariate adaptive regression splines (MARS) technique. The results of the present study are twofold. On the one hand, the importance of the different kinds of cyanobacteria for the presence of cyanotoxins in the reservoir is presented through the MARS model; on the other hand, a predictive model able to forecast the possible presence of cyanotoxins in the short term was obtained. The agreement of the MARS model with experimental data confirmed its good performance. Finally, conclusions of this innovative research are presented. Copyright © 2011 Elsevier B.V. All rights reserved.

  9. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

    Science.gov (United States)

    Ye, Lanhan; Song, Kunlin; Shen, Tingting

    2018-01-01

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
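
    Of the three pre-processing methods named in this abstract, the standard normal variate (SNV) is simple enough to sketch. The version below follows the usual definition (each spectrum is centered on its own mean and scaled by its own standard deviation); the abstract does not say which variant the authors used, so the choice of the sample standard deviation, the function name `snv` and the intensities are assumptions.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    if len(spectrum) < 2:
        raise ValueError("a spectrum needs at least two intensity values")
    mean = statistics.fmean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (assumption)
    if std == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(value - mean) / std for value in spectrum]


# Hypothetical intensities for two shots of the same sample; the second shot
# is the first one with a multiplicative gain and an additive offset.
shot_1 = [120.0, 135.0, 410.0, 150.0, 128.0]
shot_2 = [2.0 * v + 30.0 for v in shot_1]

corrected_1 = snv(shot_1)
corrected_2 = snv(shot_2)
print(max(abs(a - b) for a, b in zip(corrected_1, corrected_2)))
```

    Because SNV removes a per-spectrum offset and gain, the two hypothetical shots become practically identical after correction, which is how it reduces shot-to-shot fluctuation.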

  10. Evaluating Hierarchical Structure in Music Annotations.

    Science.gov (United States)

    McFee, Brian; Nieto, Oriol; Farbood, Morwaread M; Bello, Juan Pablo

    2017-01-01

    Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for "flat" descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  11. Evaluating Hierarchical Structure in Music Annotations

    Directory of Open Access Journals (Sweden)

    Brian McFee

    2017-08-01

    Full Text Available Music exhibits structure at multiple scales, ranging from motifs to large-scale functional components. When inferring the structure of a piece, different listeners may attend to different temporal scales, which can result in disagreements when they describe the same piece. In the field of music informatics research (MIR), it is common to use corpora annotated with structural boundaries at different levels. By quantifying disagreements between multiple annotators, previous research has yielded several insights relevant to the study of music cognition. First, annotators tend to agree when structural boundaries are ambiguous. Second, this ambiguity seems to depend on musical features, time scale, and genre. Furthermore, it is possible to tune current annotation evaluation metrics to better align with these perceptual differences. However, previous work has not directly analyzed the effects of hierarchical structure because the existing methods for comparing structural annotations are designed for “flat” descriptions, and do not readily generalize to hierarchical annotations. In this paper, we extend and generalize previous work on the evaluation of hierarchical descriptions of musical structure. We derive an evaluation metric which can compare hierarchical annotations holistically across multiple levels. Using this metric, we investigate inter-annotator agreement on the multilevel annotations of two different music corpora, investigate the influence of acoustic properties on hierarchical annotations, and evaluate existing hierarchical segmentation algorithms against the distribution of inter-annotator agreement.

  12. Seasonal variation of benzo(a)pyrene in the Spanish airborne PM10. Multivariate linear regression model applied to estimate BaP concentrations.

    Science.gov (United States)

    Callén, M S; López, J M; Mastral, A M

    2010-08-15

    The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less than or equal to 10 microns (PM10) was carried out during 2008-2009 in four locations of Spain to determine BaP concentrations experimentally by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R² = 0.817, PRESS/SSY = 0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²(CV) = 0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²(ext) = 0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q²(ext) = 0.551, PRESS/SSY = 0.449). Copyright 2010 Elsevier B.V. All rights reserved.

  13. Seasonal variation of benzo(a)pyrene in the Spanish airborne PM10. Multivariate linear regression model applied to estimate BaP concentrations

    International Nuclear Information System (INIS)

    Callen, M.S.; Lopez, J.M.; Mastral, A.M.

    2010-01-01

    The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less than or equal to 10 microns (PM10) was carried out during 2008-2009 in four locations of Spain to determine BaP concentrations experimentally by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R² = 0.817, PRESS/SSY = 0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²(CV) = 0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²(ext) = 0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q²(ext) = 0.551, PRESS/SSY = 0.449).
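
    Each reported pair in this abstract sums to one (for example 0.813 + 0.187), which is consistent with Q² = 1 − PRESS/SSY. The sketch below shows how such a cross-validated figure can be obtained by leave-one-out refitting. It is a simplification under stated assumptions: a single predictor instead of the several used in the study, leave-one-out as the cross-validation scheme, and hypothetical data; the function names `fit_line` and `q2_leave_one_out` are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("predictor values must not all be equal")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - slope * mean_x, slope


def q2_leave_one_out(xs, ys):
    """Return (PRESS/SSY, Q2) from leave-one-out cross validation."""
    xs, ys = list(xs), list(ys)
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three observations")
    press = 0.0
    for i in range(n):
        # Refit without observation i, then predict the held-out value.
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    mean_y = sum(ys) / n
    ssy = sum((y - mean_y) ** 2 for y in ys)
    if ssy == 0:
        raise ValueError("response values must not all be equal")
    ratio = press / ssy
    return ratio, 1.0 - ratio


# Hypothetical PM10 and BaP values, for illustration only (not study data).
pm10 = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
bap = [0.11, 0.19, 0.32, 0.38, 0.52, 0.61]

ratio, q2 = q2_leave_one_out(pm10, bap)
print("PRESS/SSY:", round(ratio, 3), "Q2:", round(q2, 3))
```

    The same two quantities computed on an independent campaign instead of held-out observations would correspond to the external figures, Q²(ext), quoted in the abstract.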

  14. Using Logistic Regression To Predict the Probability of Debris Flows Occurring in Areas Recently Burned By Wildland Fires

    Science.gov (United States)

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.

    2003-01-01

    Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity
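
    The abstract describes modeling the probability of a binary outcome (debris flow did or did not occur) from several basin variables. A minimal sketch of the standard logistic form, p = 1 / (1 + exp(−z)) with z a linear combination of the predictors, is given below. The intercept, coefficients and predictor values are hypothetical and are not the fitted model from the study; the function name `debris_flow_probability` is illustrative.

```python
import math


def debris_flow_probability(intercept, coefficients, predictors):
    """Logistic model: probability of the event given predictor values."""
    if len(coefficients) != len(predictors):
        raise ValueError("one coefficient is required per predictor")
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Use the algebraically equivalent form for negative z to avoid overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# Hypothetical coefficients, for illustration only (not from the study).
intercept = -4.0
coefficients = [0.03, 0.05]  # per percent of basin steeper than 30 percent,
                             # per percent burned at medium and high severity

gentle_basin = debris_flow_probability(intercept, coefficients, [10.0, 20.0])
steep_burned_basin = debris_flow_probability(intercept, coefficients, [60.0, 80.0])
print(round(gentle_basin, 3), round(steep_burned_basin, 3))
```

    Evaluating such a function for every basin is, in outline, what entering the model into a GIS to produce a probability map amounts to.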

  15. Hierarchically Nanostructured Materials for Sustainable Environmental Applications

    Directory of Open Access Journals (Sweden)

    Zheng eRen

    2013-11-01

    Full Text Available This article presents a comprehensive overview of the hierarchical nanostructured materials with either geometry or composition complexity in environmental applications. The hierarchical nanostructures offer advantages of high surface area, synergistic interactions and multiple functionalities towards water remediation, environmental gas sensing and monitoring as well as catalytic gas treatment. Recent advances in synthetic strategies for various hierarchical morphologies such as hollow spheres and urchin-shaped architectures have been reviewed. In addition to the chemical synthesis, the physical mechanisms associated with the materials design and device fabrication have been discussed for each specific application. The development and application of hierarchical complex perovskite oxide nanostructures have also been introduced in photocatalytic water remediation, gas sensing and catalytic converter. Hierarchical nanostructures will open up many possibilities for materials design and device fabrication in environmental chemistry and technology.

  16. Econometric analysis of realised covariation: high frequency covariance, regression and correlation in financial economics

    OpenAIRE

    Ole E. Barndorff-Nielsen; Neil Shephard

    2002-01-01

    This paper analyses multivariate high frequency financial data using realised covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis and covariance. It will be based on a fixed interval of time (e.g. a day or week), allowing the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions and covariances change through time. In particular w...

  17. The pathways for intelligible speech: multivariate and univariate perspectives.

    Science.gov (United States)

    Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

    2014-09-01

    An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT,Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20: 2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.

  18. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong; Wu, Tao

    2017-01-01

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCMs) with asymmetric and hierarchical pore architecture that can be produced

  19. Multivariate statistical analysis of a multi-step industrial processes

    DEFF Research Database (Denmark)

    Reinikainen, S.P.; Høskuldsson, Agnar

    2007-01-01

    Monitoring and quality control of industrial processes often produce information on how the data have been obtained. In batch processes, for instance, the process is carried out in stages; some process or control parameters are set at each stage. However, the obtained data might not be utilized efficiently, even if this information may reveal significant knowledge about process dynamics or ongoing phenomena. When studying the process data, it may be important to analyse the data in the light of the physical or time-wise development of each process step. In this paper, a unified approach to analyse multivariate multi-step processes, where results from each step are used to evaluate future results, is presented. The methods presented are based on Priority PLS Regression. The basic idea is to compute the weights in the regression analysis for given steps, but adjust all data by the resulting score vectors...

  20. A review of multivariate analyses in imaging genetics

    Directory of Open Access Journals (Sweden)

    Jingyu Liu

    2014-03-01

    Recent advances in neuroimaging technology and molecular genetics provide the unique opportunity to investigate genetic influence on the variation of brain attributes. Since the year 2000, when the initial publication on brain imaging and genetics was released, imaging genetics has been a rapidly growing research approach with increasing publications every year. Several reviews have been offered to the research community focusing on various study designs. In addition to study design, analytic tools and their proper implementation are also critical to the success of a study. In this review, we survey recent publications using data from neuroimaging and genetics, focusing on methods capturing multivariate effects accommodating the large number of variables from both imaging data and genetic data. We group the analyses of genetic or genomic data into either a prior driven or data driven approach, including gene-set enrichment analysis, multifactor dimensionality reduction, principal component analysis, independent component analysis (ICA), and clustering. For the analyses of imaging data, ICA and extensions of ICA are the most widely used multivariate methods. Given detailed reviews of multivariate analyses of imaging data available elsewhere, we provide a brief summary here that includes a recently proposed method known as independent vector analysis. Finally, we review methods focused on bridging the imaging and genetic data by establishing multivariate and multiple genotype-phenotype associations, including sparse partial least squares, sparse canonical correlation analysis, sparse reduced rank regression and parallel ICA. These methods are designed to extract latent variables from both genetic and imaging data, which become new genotypes and phenotypes, and the links between the new genotype-phenotype pairs are maximized using different cost functions. The relationship between these methods along with their assumptions, advantages, and

  1. Hierarchically organized layout for visualization of biochemical pathways.

    Science.gov (United States)

    Tsay, Jyh-Jong; Wu, Bo-Liang; Jeng, Yu-Sen

    2010-01-01

    Many complex pathways are described as hierarchical structures in which a pathway is recursively partitioned into several sub-pathways, and organized hierarchically as a tree. The hierarchical structure provides a natural way to visualize the global structure of a complex pathway. However, none of the previous research on pathway visualization explores the hierarchical structures provided by many complex pathways. In this paper, we aim to develop algorithms that can take advantage of hierarchical structures, and give layouts that explore the global structures as well as local structures of pathways. We present a new hierarchically organized layout algorithm to produce layouts for hierarchically organized pathways. Our algorithm first decomposes a complex pathway into sub-pathway groups along the hierarchical organization, and then partitions each sub-pathway group into basic components. It then applies conventional layout algorithms, such as hierarchical layout and force-directed layout, to compute the layout of each basic component. Finally, component layouts are joined to form a final layout of the pathway. Our main contribution is the development of algorithms for decomposing pathways and joining layouts. Experiments show that our algorithm is able to give comprehensible visualizations for pathways with hierarchies, cycles as well as complex structures. It clearly renders the global component structures as well as the local structure in each component. In addition, it runs very fast, and gives better visualization for many examples from previous related research. 2009 Elsevier B.V. All rights reserved.

  2. Hierarchical screening for multiple mental disorders.

    Science.gov (United States)

    Batterham, Philip J; Calear, Alison L; Sunderland, Matthew; Carragher, Natacha; Christensen, Helen; Mackinnon, Andrew J

    2013-10-01

    There is a need for brief, accurate screening when assessing multiple mental disorders. Two-stage hierarchical screening, consisting of brief pre-screening followed by a battery of disorder-specific scales for those who meet diagnostic criteria, may increase the efficiency of screening without sacrificing precision. This study tested whether more efficient screening could be gained using two-stage hierarchical screening than by administering multiple separate tests. Two Australian adult samples (N=1990) with high rates of psychopathology were recruited using Facebook advertising to examine four methods of hierarchical screening for four mental disorders: major depressive disorder, generalised anxiety disorder, panic disorder and social phobia. Using K6 scores to determine whether full screening was required did not increase screening efficiency. However, pre-screening based on two decision tree approaches or item gating led to considerable reductions in the mean number of items presented per disorder screened, with estimated item reductions of up to 54%. The sensitivity of these hierarchical methods approached 100% relative to the full screening battery. Further testing of the hierarchical screening approach based on clinical criteria and in other samples is warranted. The results demonstrate that a two-phase hierarchical approach to screening multiple mental disorders leads to considerable efficiency gains without reducing accuracy. Screening programs should take advantage of prescreeners based on gating items or decision trees to reduce the burden on respondents. © 2013 Elsevier B.V. All rights reserved.

  3. Self-assembled biomimetic superhydrophobic hierarchical arrays.

    Science.gov (United States)

    Yang, Hongta; Dou, Xuan; Fang, Yin; Jiang, Peng

    2013-09-01

    Here, we report a simple and inexpensive bottom-up technology for fabricating superhydrophobic coatings with hierarchical micro-/nano-structures, which are inspired by the binary periodic structure found on the superhydrophobic compound eyes of some insects (e.g., mosquitoes and moths). Binary colloidal arrays consisting of exemplary large (4 and 30 μm) and small (300 nm) silica spheres are first assembled by a scalable Langmuir-Blodgett (LB) technology in a layer-by-layer manner. After surface modification with fluorosilanes, the self-assembled hierarchical particle arrays become superhydrophobic with an apparent water contact angle (CA) larger than 150°. The throughput of the resulting superhydrophobic coatings with hierarchical structures can be significantly improved by templating the binary periodic structures of the LB-assembled colloidal arrays into UV-curable fluoropolymers by a soft lithography approach. Superhydrophobic perfluoroether acrylate hierarchical arrays with large CAs and small CA hysteresis can be faithfully replicated onto various substrates. Both experiments and theoretical calculations based on the Cassie's dewetting model demonstrate the importance of the hierarchical structure in achieving the final superhydrophobic surface states. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Effects of univariate and multivariate regression on the accuracy of hydrogen quantification with laser-induced breakdown spectroscopy

    Science.gov (United States)

    Ytsma, Cai R.; Dyar, M. Darby

    2018-01-01

    Hydrogen (H) is a critical element to measure on the surface of Mars because its presence in mineral structures is indicative of past hydrous conditions. The Curiosity rover uses the laser-induced breakdown spectrometer (LIBS) on the ChemCam instrument to analyze rocks for their H emission signal at 656.6 nm, from which H can be quantified. Previous LIBS calibrations for H used small data sets measured on standards and/or manufactured mixtures of hydrous minerals and rocks and applied univariate regression to spectra normalized in a variety of ways. However, matrix effects common to LIBS make these calibrations of limited usefulness when applied to the broad range of compositions on the Martian surface. In this study, 198 naturally-occurring hydrous geological samples covering a broad range of bulk compositions with directly-measured H content are used to create more robust prediction models for measuring H in LIBS data acquired under Mars conditions. Both univariate and multivariate prediction models, including partial least square (PLS) and the least absolute shrinkage and selection operator (Lasso), are compared using several different methods for normalization of H peak intensities. Data from the ChemLIBS Mars-analog spectrometer at Mount Holyoke College are compared against spectra from the same samples acquired using a ChemCam-like instrument at Los Alamos National Laboratory and the ChemCam instrument on Mars. Results show that all current normalization and data preprocessing variations for quantifying H result in models with statistically indistinguishable prediction errors (accuracies) ca. ± 1.5 weight percent (wt%) H2O, limiting the applications of LIBS in these implementations for geological studies. This error is too large to allow distinctions among the most common hydrous phases (basalts, amphiboles, micas) to be made, though some clays (e.g., chlorites with ≈ 12 wt% H2O, smectites with 15-20 wt% H2O) and hydrated phases (e.g., gypsum with ≈ 20

  5. Hierarchically structured, nitrogen-doped carbon membranes

    KAUST Repository

    Wang, Hong

    2017-08-03

    The present invention is a structure, method of making and method of use for a novel macroscopic hierarchically structured, nitrogen-doped, nano-porous carbon membrane (HNDCM) with asymmetric and hierarchical pore architecture that can be produced on a large scale. The unique HNDCMs hold great promise as components in separation and advanced carbon devices because they could offer unconventional fluidic transport phenomena on the nanoscale. Overall, the invention set forth herein covers hierarchically structured, nitrogen-doped carbon membranes and methods of making and using such membranes.

  6. Hierarchical Rhetorical Sentence Categorization for Scientific Papers

    Science.gov (United States)

    Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.

    2018-03-01

    Important information in scientific papers can be found in rhetorical sentences that are organized into certain categories. To extract this information, text categorization should be conducted. Previous work on this task has employed word frequency, semantic word similarity, hierarchical classification, and other techniques. This paper therefore presents rhetorical sentence categorization for scientific papers that employs TF-IDF and Word2Vec to capture word frequency and semantic word similarity, together with hierarchical classification. Every experiment is tested with two classifiers, namely Naïve Bayes and linear SVM. The paper shows that the hierarchical classifier is better than the flat classifier with either TF-IDF or Word2Vec, although the improvement is only almost 2%, from 27.82% with the flat classifier to 29.61% with the hierarchical classifier. It also shows that a different learning model for each child category can be built by the hierarchical classifier.

  7. A multivariate multilevel Gaussian model with a mixed effects structure in the mean and covariance part.

    Science.gov (United States)

    Li, Baoyue; Bruyneel, Luk; Lesaffre, Emmanuel

    2014-05-20

    A traditional Gaussian hierarchical model assumes a nested multilevel structure for the mean and a constant variance at each level. We propose a Bayesian multivariate multilevel factor model that assumes a multilevel structure for both the mean and the covariance matrix. That is, in addition to a multilevel structure for the mean we also assume that the covariance matrix depends on covariates and random effects. This allows one to explore whether the covariance structure depends on the values of the higher levels and as such models heterogeneity in the variances and correlation structure of the multivariate outcome across the higher level values. The approach is applied to the three-dimensional vector of burnout measurements collected on nurses in a large European study to answer the research question whether the covariance matrix of the outcomes depends on recorded system-level features in the organization of nursing care, but also on not-recorded factors that vary with countries, hospitals, and nursing units. Simulations illustrate the performance of our modeling approach. Copyright © 2013 John Wiley & Sons, Ltd.

  8. Intermediate and advanced topics in multilevel logistic regression analysis.

    Science.gov (United States)

    Austin, Peter C; Merlo, Juan

    2017-09-10

    Multilevel data occur frequently in health services, population and public health, and epidemiologic research. In such research, binary outcomes are common. Multilevel logistic regression models allow one to account for the clustering of subjects within clusters of higher-level units when estimating the effect of subject and cluster characteristics on subject outcomes. A search of the PubMed database demonstrated that the use of multilevel or hierarchical regression models is increasing rapidly. However, our impression is that many analysts simply use multilevel regression models to account for the nuisance of within-cluster homogeneity that is induced by clustering. In this article, we describe a suite of analyses that can complement the fitting of multilevel logistic regression models. These ancillary analyses permit analysts to estimate the marginal or population-average effect of covariates measured at the subject and cluster level, in contrast to the within-cluster or cluster-specific effects arising from the original multilevel logistic regression model. We describe the interval odds ratio and the proportion of opposed odds ratios, which are summary measures of effect for cluster-level covariates. We describe the variance partition coefficient and the median odds ratio which are measures of components of variance and heterogeneity in outcomes. These measures allow one to quantify the magnitude of the general contextual effect. We describe an R² measure that allows analysts to quantify the proportion of variation explained by different multilevel logistic regression models. We illustrate the application and interpretation of these measures by analyzing mortality in patients hospitalized with a diagnosis of acute myocardial infarction. © 2017 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
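
    Two of the measures named in this abstract have simple closed forms for a two-level random-intercept logistic model, and a short sketch may make them concrete. It assumes the common latent-variable formulation, in which the subject-level variance is fixed at π²/3, and uses the usual expression for the median odds ratio, exp(sqrt(2σ²) · Φ⁻¹(0.75)). The function names and the example variance are illustrative assumptions, not taken from the article.

```python
import math
from statistics import NormalDist


def variance_partition_coefficient(cluster_variance: float) -> float:
    """Latent-variable VPC: share of total variance due to clusters."""
    # pi^2 / 3 is the variance of the standard logistic distribution,
    # used here as the (assumed) subject-level variance.
    subject_variance = math.pi ** 2 / 3
    return cluster_variance / (cluster_variance + subject_variance)


def median_odds_ratio(cluster_variance: float) -> float:
    """Median odds ratio between two randomly chosen clusters."""
    z75 = NormalDist().inv_cdf(0.75)  # 75th percentile of N(0, 1)
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# Hypothetical random-intercept variance on the log-odds scale.
example_variance = 1.0
vpc = variance_partition_coefficient(example_variance)
mor = median_odds_ratio(example_variance)
```

    A cluster variance of zero gives a VPC of 0 and a median odds ratio of 1, which is the no-clustering baseline against which fitted values are usually read.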

  9. Hierarchical Recurrent Neural Hashing for Image Retrieval With Hierarchical Convolutional Features.

    Science.gov (United States)

    Lu, Xiaoqiang; Chen, Yaxiong; Li, Xuelong

    Hashing has been an important and effective technology in image retrieval due to its computational efficiency and fast search speed. The traditional hashing methods usually learn hash functions to obtain binary codes by exploiting hand-crafted features, which cannot optimally represent the information of the sample. Recently, deep learning methods can achieve better performance, since deep learning architectures can learn more effective image representation features. However, these methods only use semantic features to generate hash codes by shallow projection but ignore texture details. In this paper, we propose a novel hashing method, namely hierarchical recurrent neural hashing (HRNH), to exploit hierarchical recurrent neural network to generate effective hash codes. There are three contributions of this paper. First, a deep hashing method is proposed to extensively exploit both spatial details and semantic information, in which we leverage hierarchical convolutional features to construct image pyramid representation. Second, our proposed deep network can directly exploit convolutional feature maps as input to preserve the spatial structure of convolutional feature maps. Finally, we propose a new loss function that considers the quantization error of binarizing the continuous embeddings into the discrete binary codes, and simultaneously maintains the semantic similarity and balanceable property of hash codes. Experimental results on four widely used data sets demonstrate that the proposed HRNH can achieve superior performance over other state-of-the-art hashing methods.
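
    The quantization error mentioned in this abstract is the gap between a continuous embedding and the binary code obtained from it. The sketch below shows one generic way to measure that gap, assuming sign binarization to codes in {-1, +1} and a squared-error penalty. It is an illustration of the idea only and is not the loss function defined in the paper.

```python
def binarize(embedding):
    """Map each continuous value to +1 or -1 by its sign (0 maps to +1)."""
    return [1 if value >= 0 else -1 for value in embedding]


def quantization_error(embedding):
    """Squared distance between an embedding and its binarized code."""
    codes = binarize(embedding)
    return sum((value - code) ** 2 for value, code in zip(embedding, codes))


# Hypothetical 4-dimensional embedding, made up for illustration.
example_embedding = [0.9, -0.8, 0.1, -0.2]
example_code = binarize(example_embedding)
example_error = quantization_error(example_embedding)
```

    Under this assumed penalty, embeddings whose entries already sit near +1 or -1 incur little error, which is why a term of this kind pushes continuous outputs toward binary values during training.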

  10. Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression

    Science.gov (United States)

    Yang, Aiyuan; Yan, Chunxia; Zhu, Feng; Zhao, Zhongmeng; Cao, Zhi

    2013-01-01

    Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds. PMID:23984382

  11. Identifying Interacting Genetic Variations by Fish-Swarm Logic Regression

    Directory of Open Access Journals (Sweden)

    Xuanping Zhang

    2013-01-01

    Understanding associations between genotypes and complex traits is a fundamental problem in human genetics. A major open problem in mapping phenotypes is that of identifying a set of interacting genetic variants, which might contribute to complex traits. Logic regression (LR) is a powerful multivariant association tool. Several LR-based approaches have been successfully applied to different datasets. However, these approaches are not adequate with regard to accuracy and efficiency. In this paper, we propose a new LR-based approach, called fish-swarm logic regression (FSLR), which improves the logic regression process by incorporating swarm optimization. In our approach, a school of fish agents are conducted in parallel. Each fish agent holds a regression model, while the school searches for better models through various preset behaviors. A swarm algorithm improves the accuracy and the efficiency by speeding up the convergence and preventing it from dropping into local optimums. We apply our approach on a real screening dataset and a series of simulation scenarios. Compared to three existing LR-based approaches, our approach outperforms them by having lower type I and type II error rates, being able to identify more preset causal sites, and performing at faster speeds.

  12. Prosthetic alignment after total knee replacement is not associated with dissatisfaction or change in Oxford Knee Score: A multivariable regression analysis.

    Science.gov (United States)

    Huijbregts, Henricus J T A M; Khan, Riaz J K; Fick, Daniel P; Jarrett, Olivia M; Haebich, Samantha

    2016-06-01

    Approximately 18% of the patients are dissatisfied with the result of total knee replacement. However, the relation between dissatisfaction and prosthetic alignment has not been investigated before. We retrospectively analysed prospectively gathered data of all patients who had a primary TKR, preoperative and one-year postoperative Oxford Knee Scores (OKS) and postoperative computed tomography (CT). The CT protocol measures hip-knee-ankle (HKA) angle, and coronal, sagittal and axial component alignment. Satisfaction was defined using a five-item Likert scale. We dichotomised dissatisfaction by combining '(very) dissatisfied' and 'neutral/not sure'. Associations with dissatisfaction and change in OKS were calculated using multivariable logistic and linear regression models. 230 TKRs were implanted in 105 men and 106 women. At one year, 12% were (very) dissatisfied and 10% neutral. Coronal alignment of the femoral component was 0.5 degrees more accurate in patients who were satisfied at one year. The other alignment measurements were not different between satisfied and dissatisfied patients. All radiographic measurements had a P-value>0.10 on univariate analyses. At one year, dissatisfaction was associated with the three-months OKS. Change in OKS was associated with three-months OKS, preoperative physical SF-12, preoperative pain and cruciate retaining design. Neither mechanical axis, nor component alignment, is associated with dissatisfaction at one year following TKR. Patients get the best outcome when pain reduction and function improvement are optimal during the first three months and when the indication to embark on surgery is based on physical limitations rather than on a high pain score. 2. Copyright © 2016 Elsevier B.V. All rights reserved.

  13. Zeolitic materials with hierarchical porous structures.

    Science.gov (United States)

    Lopez-Orozco, Sofia; Inayat, Amer; Schwab, Andreas; Selvam, Thangaraj; Schwieger, Wilhelm

    2011-06-17

    During the past several years, different kinds of hierarchical structured zeolitic materials have been synthesized due to their highly attractive properties, such as superior mass/heat transfer characteristics, lower restriction of the diffusion of reactants in the mesopores, and low pressure drop. Our contribution provides general information regarding types and preparation methods of hierarchical zeolitic materials and their relative advantages and disadvantages. Thereafter, recent advances in the preparation and characterization of hierarchical zeolitic structures within the crystallites by post-synthetic treatment methods, such as dealumination or desilication; and structured devices by in situ and ex situ zeolite coatings on open-cellular ceramic foams as (non-reactive as well as reactive) supports are highlighted. Specific advantages of using hierarchical zeolitic catalysts/structures in selected catalytic reactions, such as benzene to phenol (BTOP) and methanol to olefins (MTO) are presented. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. Processing of hierarchical syntactic structure in music.

    Science.gov (United States)

    Koelsch, Stefan; Rohrmeier, Martin; Torrecuso, Renzo; Jentschke, Sebastian

    2013-09-17

    Hierarchical structure with nested nonlocal dependencies is a key feature of human language and can be identified theoretically in most pieces of tonal music. However, previous studies have argued against the perception of such structures in music. Here, we show processing of nonlocal dependencies in music. We presented chorales by J. S. Bach and modified versions in which the hierarchical structure was rendered irregular whereas the local structure was kept intact. Brain electric responses differed between regular and irregular hierarchical structures, in both musicians and nonmusicians. This finding indicates that, when listening to music, humans apply cognitive processes that are capable of dealing with long-distance dependencies resulting from hierarchically organized syntactic structures. Our results reveal that a brain mechanism fundamental for syntactic processing is engaged during the perception of music, indicating that processing of hierarchical structure with nested nonlocal dependencies is not just a key component of human language, but a multidomain capacity of human cognition.

  15. Bayesian semiparametric regression models to characterize molecular evolution

    Directory of Open Access Journals (Sweden)

    Datta Saheli

    2012-10-01

    Background: Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results: The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions: The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.

  16. Hierarchical Nanoceramics for Industrial Process Sensors

    Energy Technology Data Exchange (ETDEWEB)

    Ruud, James, A.; Brosnan, Kristen, H.; Striker, Todd; Ramaswamy, Vidya; Aceto, Steven, C.; Gao, Yan; Willson, Patrick, D.; Manoharan, Mohan; Armstrong, Eric, N.; Wachsman, Eric, D.; Kao, Chi-Chang

    2011-07-15

    This project developed a robust, tunable, hierarchical nanoceramics materials platform for industrial process sensors in harsh-environments. Control of material structure at multiple length scales from nano to macro increased the sensing response of the materials to combustion gases. These materials operated at relatively high temperatures, enabling detection close to the source of combustion. It is anticipated that these materials can form the basis for a new class of sensors enabling widespread use of efficient combustion processes with closed loop feedback control in the energy-intensive industries. The first phase of the project focused on materials selection and process development, leading to hierarchical nanoceramics that were evaluated for sensing performance. The second phase focused on optimizing the materials processes and microstructures, followed by validation of performance of a prototype sensor in a laboratory combustion environment. The objectives of this project were achieved by: (1) synthesizing and optimizing hierarchical nanostructures; (2) synthesizing and optimizing sensing nanomaterials; (3) integrating sensing functionality into hierarchical nanostructures; (4) demonstrating material performance in a sensing element; and (5) validating material performance in a simulated service environment. The project developed hierarchical nanoceramic electrodes for mixed potential zirconia gas sensors with increased surface area and demonstrated tailored electrocatalytic activity operable at high temperatures enabling detection of products of combustion such as NOx close to the source of combustion. Methods were developed for synthesis of hierarchical nanostructures with high, stable surface area, integrated catalytic functionality within the structures for gas sensing, and demonstrated materials performance in harsh lab and combustion gas environments.

  17. Multivariate statistical methods a primer

    CERN Document Server

    Manly, Bryan FJ

    2004-01-01

    THE MATERIAL OF MULTIVARIATE ANALYSIS: Examples of Multivariate Data; Preview of Multivariate Methods; The Multivariate Normal Distribution; Computer Programs; Graphical Methods; Chapter Summary; References. MATRIX ALGEBRA: The Need for Matrix Algebra; Matrices and Vectors; Operations on Matrices; Matrix Inversion; Quadratic Forms; Eigenvalues and Eigenvectors; Vectors of Means and Covariance Matrices; Further Reading; Chapter Summary; References. DISPLAYING MULTIVARIATE DATA: The Problem of Displaying Many Variables in Two Dimensions; Plotting Index Variables; The Draftsman's Plot; The Representation of Individual Data Points; Profiles o

  18. Simple and Multivariate Relationships Between Spiritual Intelligence with General Health and Happiness.

    Science.gov (United States)

    Amirian, Mohammad-Elyas; Fazilat-Pour, Masoud

    2016-08-01

    The present study examined simple and multivariate relationships of spiritual intelligence with general health and happiness. The method was descriptive and correlational. King's Spiritual Quotient scales, the GHQ-28, and the Oxford Happiness Inventory were completed by a sample of 384 students selected by stratified random sampling from the students of Shahid Bahonar University of Kerman. Data were subjected to descriptive and inferential statistics, including correlations and multivariate regressions. Bivariate correlations supported a positive and significant predictive value of spiritual intelligence for general health and happiness. Further analysis showed that, among the spiritual intelligence subscales, existential critical thinking inversely predicted general health and happiness. In addition, happiness was positively predicted by generation of personal meaning and transcendental awareness. The findings are discussed in line with previous studies and the relevant theoretical background.

  19. COMPARISON OF PARTIAL LEAST SQUARES REGRESSION METHOD ALGORITHMS: NIPALS AND PLS-KERNEL AND AN APPLICATION

    Directory of Open Access Journals (Sweden)

    ELİF BULUT

    2013-06-01

    Partial Least Squares Regression (PLSR) is a multivariate statistical method that combines partial least squares and multiple linear regression analysis. Explanatory variables, X, that exhibit multicollinearity are reduced to components that explain a large share of the covariance between the explanatory and response variables. These components are few in number and do not suffer from multicollinearity. Multiple linear regression analysis is then applied to those components to model the response variable Y. There are various PLSR algorithms. In this study, the NIPALS and PLS-Kernel algorithms are studied and illustrated on a real data set.
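    The component extraction described in this abstract can be sketched for a single response variable. The code below is a minimal illustration of the textbook NIPALS iteration for PLS1, not the authors' implementation; the function name `nipals_pls1`, the toy data, and the stopping tolerance are assumptions made for this example. Each pass extracts one component (a score vector) and then deflates X and y before the next pass.

```python
import math

def nipals_pls1(X, y, n_components):
    """PLS1 via NIPALS (illustrative sketch, single response).

    X: list of rows, y: list of responses. Both are mean-centered internally.
    Returns (scores, fitted): the score vectors t_a and the fitted y values.
    """
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                   # y residual
    scores = []
    fitted = [y_mean] * n
    for _ in range(n_components):
        # Weight vector: X-space direction with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # assumed tolerance: nothing left to explain
            break
        w = [v / norm for v in w]
        # Score vector (the component) and its loadings on X and y.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflation: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * p_load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        fitted = [fitted[i] + q * t[i] for i in range(n)]
        scores.append(t)
    return scores, fitted

# Toy data (hypothetical): the second column is nearly twice the first,
# so the two explanatory variables are strongly collinear.
X = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
y = [2.0 * r[0] + 0.5 * r[1] + 1.0 for r in X]
scores, fitted = nipals_pls1(X, y, 2)
```

    Because each score vector is computed from the deflated X, successive components are mutually orthogonal, which is why the subsequent regression on them is free of the multicollinearity present in the original variables.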

  20. The Case for a Hierarchical Cosmology

    Science.gov (United States)

    Vaucouleurs, G. de

    1970-01-01

    The development of modern theoretical cosmology is presented and some questionable assumptions of orthodox cosmology are pointed out. Suggests that recent observations indicate that hierarchical clustering is a basic factor in cosmology. The implications of hierarchical models of the universe are considered. Bibliography. (LC)

  1. Classification using Hierarchical Naive Bayes models

    DEFF Research Database (Denmark)

    Langseth, Helge; Dyhre Nielsen, Thomas

    2006-01-01

    Classification problems have a long history in the machine learning literature. The Naïve Bayes models are one of the simplest, and yet most consistently well-performing, sets of classifiers. However, an inherent problem with these classifiers is the assumption that all attributes used to describe......, termed Hierarchical Naïve Bayes models. Hierarchical Naïve Bayes models extend the modeling flexibility of Naïve Bayes models by introducing latent variables to relax some of the independence statements in these models. We propose a simple algorithm for learning Hierarchical Naïve Bayes models...

  2. Improved Adhesion and Compliancy of Hierarchical Fibrillar Adhesives.

    Science.gov (United States)

    Li, Yasong; Gates, Byron D; Menon, Carlo

    2015-08-05

    The gecko relies on van der Waals forces to cling onto surfaces with a variety of topography and composition. The hierarchical fibrillar structures on their climbing feet, ranging from mesoscale to nanoscale, are hypothesized to be key elements for the animal to conquer both smooth and rough surfaces. An epoxy-based artificial hierarchical fibrillar adhesive was prepared to study the influence of the hierarchical structures on the properties of a dry adhesive. The presented experiments highlight the advantages of a hierarchical structure despite a reduction of overall density and aspect ratio of nanofibrils. In contrast to an adhesive containing only nanometer-size fibrils, the hierarchical fibrillar adhesives exhibited a higher adhesion force and better compliancy when tested on an identical substrate.

  3. Penalized regression procedures for variable selection in the potential outcomes framework.

    Science.gov (United States)

    Ghosh, Debashis; Zhu, Yeying; Coffman, Donna L

    2015-05-10

    A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple 'impute, then select' class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data, and imputation are drawn. A difference least absolute shrinkage and selection operator algorithm is defined, along with its multiple imputation analogs. The procedures are illustrated using a well-known right-heart catheterization dataset. Copyright © 2015 John Wiley & Sons, Ltd.

  4. Handbook of univariate and multivariate data analysis with IBM SPSS

    CERN Document Server

    Ho, Robert

    2013-01-01

    Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows. New to the Second Edition: Three new chapters on multiple discriminant analysis, logistic regression, and canonical correlation; New section on how to deal with missing data; Coverage of te

  5. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

    Directory of Open Access Journals (Sweden)

    Fei Liu

    2018-02-01

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.

  6. Comparative multivariate analyses of transient otoacoustic emissions and distorsion products in normal and impaired hearing.

    Science.gov (United States)

    Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel

    2015-01-01

    The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors of hearing loss is still debated. Most studies agree that multivariate analyses have better test performance than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure-tone audiogram as the gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, and otoacoustic emission tests. We chose logistic regression as the multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, confirming the performance of the otoacoustic emissions. Each otoacoustic emission test presented high
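    The two quantities this abstract relies on, the logistic score and the area under the receiver operating characteristic curve, can be sketched as follows. This is a generic illustration and not the study's analysis: the function names, the coefficients passed to `logistic_score`, and the example scores are all hypothetical. The area under the curve is computed here through its equivalence with the Mann-Whitney statistic, that is, the fraction of (impaired, normal) pairs in which the impaired case receives the higher score, with ties counted as one half.

```python
import math

def logistic_score(intercept, coefs, x):
    """Logistic score: inputs combined linearly with the fitted coefficients,
    then mapped to (0, 1) by the logistic function."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))

def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney).

    scores: predicted scores (higher means more likely positive)
    labels: 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four ears: two normal (0) and two impaired (1).
example_auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])  # 3 of 4 pairs ranked correctly
```

    An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the areas can be compared directly across competing models.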

  7. Bayesian Poisson hierarchical models for crash data analysis: Investigating the impact of model choice on site-specific predictions.

    Science.gov (United States)

    Khazraee, S Hadi; Johnson, Valen; Lord, Dominique

    2018-08-01

    The Poisson-gamma (PG) and Poisson-lognormal (PLN) regression models are among the most popular means for motor vehicle crash data analysis. Both models belong to the Poisson-hierarchical family of models. While numerous studies have compared the overall performance of alternative Bayesian Poisson-hierarchical models, little research has addressed the impact of model choice on the expected crash frequency prediction at individual sites. This paper sought to examine whether there are any trends among candidate models' predictions, e.g., that an alternative model's prediction for sites with certain conditions tends to be higher (or lower) than that from another model. In addition to the PG and PLN models, this research formulated a new member of the Poisson-hierarchical family of models: the Poisson-inverse gamma (PIGam). Three field datasets (from Texas, Michigan and Indiana) covering a wide range of over-dispersion characteristics were selected for analysis. This study demonstrated that the model choice can be critical when the calibrated models are used for prediction at new sites, especially when the data are highly over-dispersed. For all three datasets, the PIGam model would predict higher expected crash frequencies than would the PLN and PG models, in order, indicating a clear link between the models' predictions and the shape of their mixing distributions (i.e., gamma, lognormal, and inverse gamma, respectively). The thicker tail of the PIGam and PLN models (in order) may provide an advantage when the data are highly over-dispersed. The analysis results also illustrated a major deficiency of the Deviance Information Criterion (DIC) in comparing the goodness-of-fit of hierarchical models; models with drastically different sets of coefficients (and thus predictions for new sites) may yield similar DIC values, because the DIC only accounts for the parameters in the lowest (observation) level of the hierarchy and ignores the higher levels (regression coefficients

  8. Regression Methods for Virtual Metrology of Layer Thickness in Chemical Vapor Deposition

    DEFF Research Database (Denmark)

    Purwins, Hendrik; Barak, Bernd; Nagi, Ahmed

    2014-01-01

    The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (Virtual Metrology). In this paper, a survey on regression methods is given to predict...... average Silicon Nitride cap layer thickness for the Plasma Enhanced Chemical Vapor Deposition (PECVD) dual-layer metal passivation stack process. Process and production equipment Fault Detection and Classification (FDC) data are used as predictor variables. Various variable sets are compared: one most...... algorithm, and Support Vector Regression (SVR). On a test set, SVR outperforms the other methods by a large margin, being more robust towards changes in the production conditions. The method performs better on high-dimensional multivariate input data than on the most predictive variables alone. Process...

  9. Linear regression analysis: part 14 of a series on evaluation of scientific publications.

    Science.gov (United States)

    Schneider, Astrid; Hommel, Gerhard; Blettner, Maria

    2010-11-01

    Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.

  10. A Bayesian hierarchical model for demand curve analysis.

    Science.gov (United States)

    Ho, Yen-Yi; Nhu Vo, Tien; Chu, Haitao; Luo, Xianghua; Le, Chap T

    2018-07-01

    Drug self-administration experiments are a frequently used approach to assessing the abuse liability and reinforcing property of a compound. It has been used to assess the abuse liabilities of various substances such as psychomotor stimulants and hallucinogens, food, nicotine, and alcohol. The demand curve generated from a self-administration study describes how demand of a drug or non-drug reinforcer varies as a function of price. With the approval of the 2009 Family Smoking Prevention and Tobacco Control Act, demand curve analysis provides crucial evidence to inform the US Food and Drug Administration's policy on tobacco regulation, because it produces several important quantitative measurements to assess the reinforcing strength of nicotine. The conventional approach popularly used to analyze the demand curve data is individual-specific non-linear least square regression. The non-linear least square approach sets out to minimize the residual sum of squares for each subject in the dataset; however, this one-subject-at-a-time approach does not allow for the estimation of between- and within-subject variability in a unified model framework. In this paper, we review the existing approaches to analyze the demand curve data, non-linear least square regression, and the mixed effects regression and propose a new Bayesian hierarchical model. We conduct simulation analyses to compare the performance of these three approaches and illustrate the proposed approaches in a case study of nicotine self-administration in rats. We present simulation results and discuss the benefits of using the proposed approaches.

  11. Learning with hierarchical-deep models.

    Science.gov (United States)

    Salakhutdinov, Ruslan; Tenenbaum, Joshua B; Torralba, Antonio

    2013-08-01

    We introduce HD (or “Hierarchical-Deep”) models, a new compositional learning architecture that integrates deep learning models with structured hierarchical Bayesian (HB) models. Specifically, we show how we can learn a hierarchical Dirichlet process (HDP) prior over the activities of the top-level features in a deep Boltzmann machine (DBM). This compound HDP-DBM model learns to learn novel concepts from very few training example by learning low-level generic features, high-level features that capture correlations among low-level features, and a category hierarchy for sharing priors over the high-level features that are typical of different kinds of concepts. We present efficient learning and inference algorithms for the HDP-DBM model and show that it is able to learn new concepts from very few examples on CIFAR-100 object recognition, handwritten character recognition, and human motion capture datasets.

  12. Permutation Tests of Hierarchical Cluster Analyses of Carrion Communities and Their Potential Use in Forensic Entomology.

    Science.gov (United States)

    van der Ham, Joris L

    2016-05-19

    Forensic entomologists can use carrion communities' ecological succession data to estimate the postmortem interval (PMI). Permutation tests of hierarchical cluster analyses of these data provide a conceptual method to estimate part of the PMI, the post-colonization interval (post-CI). This multivariate approach produces a baseline of statistically distinct clusters that reflect changes in the carrion community composition during the decomposition process. Carrion community samples of unknown post-CIs are compared with these baseline clusters to estimate the post-CI. In this short communication, I use data from previously published studies to demonstrate the conceptual feasibility of this multivariate approach. Analyses of these data produce series of significantly distinct clusters, which represent carrion communities during 1- to 20-day periods of the decomposition process. For 33 carrion community samples, collected over an 11-day period, this approach correctly estimated the post-CI within an average range of 3.1 days. © The Authors 2016. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  13. A spline-based regression parameter set for creating customized DARTEL MRI brain templates from infancy to old age.

    Science.gov (United States)

    Wilke, Marko

    2018-02-01

    This dataset contains the regression parameters derived by analyzing segmented brain MRI images (gray matter and white matter) from a large population of healthy subjects, using a multivariate adaptive regression splines approach. A total of 1919 MRI datasets ranging in age from 1-75 years from four publicly available datasets (NIH, C-MIND, fCONN, and IXI) were segmented using the CAT12 segmentation framework, writing out gray matter and white matter images normalized using an affine-only spatial normalization approach. These images were then subjected to a six-step DARTEL procedure, employing an iterative non-linear registration approach and yielding increasingly crisp intermediate images. The resulting six datasets per tissue class were then analyzed using multivariate adaptive regression splines, using the CerebroMatic toolbox. This approach allows for flexibly modelling smoothly varying trajectories while taking into account demographic (age, gender) as well as technical (field strength, data quality) predictors. The resulting regression parameters described here can be used to generate matched DARTEL or SHOOT templates for a given population under study, from infancy to old age. The dataset and the algorithm used to generate it are publicly available at https://irc.cchmc.org/software/cerebromatic.php.

  14. Admissible Estimators in the General Multivariate Linear Model with Respect to Inequality Restricted Parameter Set

    Directory of Open Access Journals (Sweden)

    Shangli Zhang

    2009-01-01

    By using the methods of linear algebra and matrix inequality theory, we obtain the characterization of admissible estimators in the general multivariate linear model with respect to an inequality-restricted parameter set. In the classes of homogeneous and general linear estimators, necessary and sufficient conditions for the admissibility of estimators of the regression coefficient function are established.

  15. Synchrotron-Based Microspectroscopic Analysis of Molecular and Biopolymer Structures Using Multivariate Techniques and Advanced Multi-Components Modeling

    International Nuclear Information System (INIS)

    Yu, P.

    2008-01-01

    More recently, an advanced synchrotron radiation-based bioanalytical technique (SRFTIRM) has been applied as a novel non-invasive analysis tool to study molecular, functional group and biopolymer chemistry, nutrient make-up and structural conformation in biomaterials. This novel synchrotron technique, taking advantage of bright synchrotron light (which is a million times brighter than sunlight), is capable of exploring biomaterials at molecular and cellular levels. However, with the synchrotron RFTIRM technique, a large number of molecular spectral data are usually collected. The objective of this article was to illustrate how to use two multivariate statistical techniques, (1) agglomerative hierarchical cluster analysis (AHCA) and (2) principal component analysis (PCA), and two advanced multicomponent modeling methods, (1) Gaussian and (2) Lorentzian multi-component peak modeling, for molecular spectrum analysis of bio-tissues. The studies indicated that the two multivariate analyses (AHCA, PCA) are able to create molecular spectral corrections by including not just one intensity or frequency point of a molecular spectrum, but by utilizing the entire spectral information. Gaussian and Lorentzian modeling techniques are able to quantify spectral component peaks of molecular structure, functional group and biopolymer. By application of these four statistical methods of the multivariate techniques and Gaussian and Lorentzian modeling, inherent molecular structures, functional group and biopolymer conformation between and among biological samples can be quantified, discriminated and classified with great efficiency.
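    The Gaussian and Lorentzian component peaks named in this abstract have standard closed forms, sketched below. This is a generic illustration and not the article's fitting procedure: the function names, the parameterization (peak height, center, and a width parameter), and the example band positions are assumptions made for this example. A multi-component model is simply the sum of such peaks evaluated at each spectral position.

```python
import math

def gaussian_peak(x, amplitude, center, sigma):
    """Gaussian line shape with height `amplitude` at `center`."""
    return amplitude * math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))

def lorentzian_peak(x, amplitude, center, gamma):
    """Lorentzian line shape with height `amplitude` at `center`;
    `gamma` is the half width at half maximum."""
    return amplitude * gamma ** 2 / ((x - center) ** 2 + gamma ** 2)

def multi_peak(x, peaks):
    """Sum of component peaks. `peaks` is a list of
    (shape, amplitude, center, width) with shape 'gauss' or 'lorentz'."""
    total = 0.0
    for shape, amplitude, center, width in peaks:
        if shape == "gauss":
            total += gaussian_peak(x, amplitude, center, width)
        elif shape == "lorentz":
            total += lorentzian_peak(x, amplitude, center, width)
        else:
            raise ValueError("unknown peak shape: " + shape)
    return total

# Hypothetical two-component band (positions and widths are made up).
components = [("gauss", 1.0, 1650.0, 10.0), ("lorentz", 0.6, 1630.0, 8.0)]
value_at_1650 = multi_peak(1650.0, components)
```

    For equal height and half width, the Lorentzian has much heavier tails than the Gaussian, so the choice of shape changes how overlapping component peaks share the measured intensity.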

  16. Evaluation of the efficiency of continuous wavelet transform as processing and preprocessing algorithm for resolution of overlapped signals in univariate and multivariate regression analyses; an application to ternary and quaternary mixtures

    Science.gov (United States)

    Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany

    2016-07-01

    Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal-processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, in contrast, failed to determine the quaternary mixture components simultaneously and was able to determine only PAR and PAP and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the CWT calculations, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed with both cross-validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.

  17. Hierarchical analysis of acceptable use policies

    Directory of Open Access Journals (Sweden)

    P. A. Laughton

    2008-01-01

    Acceptable use policies (AUPs) are vital tools for organizations to protect themselves and their employees from misuse of the computer facilities provided. A well-structured, thorough AUP is essential for any organization. It is impossible for an effective AUP to deal with every clause and remain readable. For this reason, some sections of an AUP carry more weight than others, denoting importance. The methodology used to develop the hierarchical analysis is a literature review, where various sources were consulted. This hierarchical approach to AUP analysis attempts to highlight important sections and clauses dealt with in an AUP. The emphasis of the hierarchical analysis is to prioritize the objectives of an AUP.

  18. Ensemble preprocessing of near-infrared (NIR) spectra for multivariate calibration

    International Nuclear Information System (INIS)

    Xu Lu; Zhou Yanping; Tang Lijuan; Wu Hailong; Jiang Jianhui; Shen Guoli; Yu Ruqin

    2008-01-01

    Preprocessing of raw near-infrared (NIR) spectral data is indispensable in multivariate calibration when the measured spectra are subject to significant noise, baselines and other undesirable factors. However, due to the lack of sufficient prior information and an incomplete knowledge of the raw data, NIR spectra preprocessing in multivariate calibration is still trial and error. How to select a proper method depends largely on both the nature of the data and the expertise and experience of the practitioners. This might limit the applications of multivariate calibration in many fields, where researchers are not very familiar with the characteristics of many preprocessing methods unique to chemometrics and have difficulty selecting the most suitable methods. Another problem is that many preprocessing methods, when used alone, might degrade the data in certain aspects or lose some useful information while improving certain qualities of the data. In order to tackle these problems, this paper proposes a new concept of data preprocessing, the ensemble preprocessing method, where partial least squares (PLS) models built on differently preprocessed data are combined by Monte Carlo cross validation (MCCV) stacked regression. Little or no prior information about the data or expertise is required. Moreover, fusion of complementary information obtained by different preprocessing methods often leads to a more stable and accurate calibration model. The investigation of two real data sets has demonstrated the advantages of the proposed method.

  19. Virtual timers in hierarchical real-time systems

    NARCIS (Netherlands)

    Heuvel, van den M.M.H.P.; Holenderski, M.J.; Cools, W.A.; Bril, R.J.; Lukkien, J.J.; Zhu, D.

    2009-01-01

    Hierarchical scheduling frameworks (HSFs) provide means for composing complex real-time systems from well-defined subsystems. This paper describes an approach to provide hierarchically scheduled real-time applications with virtual event timers, motivated by the need for integrating priority

  20. Multivariate NIR studies of seed-water interaction in Scots Pine Seeds (Pinus sylvestris L.)

    OpenAIRE

    Lestander, Torbjörn

    2003-01-01

    This thesis describes seed-water interaction using near infrared (NIR) spectroscopy, multivariate regression models and Scots pine seeds. The presented research covers classification of seed viability, prediction of seed moisture content, selection of NIR wavelengths and interpretation of seed-water interaction modelled and analysed by principal component analysis, ordinary least squares (OLS), partial least squares (PLS), bi-orthogonal least squares (BPLS) and genetic algorithms. The potenti...

  1. Hierarchical modeling and analysis for spatial data

    CERN Document Server

    Banerjee, Sudipto; Gelfand, Alan E

    2003-01-01

    Among the many uses of hierarchical modeling, their application to the statistical analysis of spatial and spatio-temporal data from areas such as epidemiology and environmental science has proven particularly fruitful. Yet to date, the few books that address the subject have been either too narrowly focused on specific aspects of spatial analysis, or written at a level often inaccessible to those lacking a strong background in mathematical statistics. Hierarchical Modeling and Analysis for Spatial Data is the first accessible, self-contained treatment of hierarchical methods, modeling, and dat

  2. Multivariate statistics exercises and solutions

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    The authors present tools and concepts of multivariate data analysis by means of exercises and their solutions. The first part is devoted to graphical techniques. The second part deals with multivariate random variables and presents the derivation of estimators and tests for various practical situations. The last part introduces a wide variety of exercises in applied multivariate data analysis. The book demonstrates the application of simple calculus and basic multivariate methods in real life situations. It contains altogether more than 250 solved exercises which can assist a university teacher in setting up a modern multivariate analysis course. All computer-based exercises are available in the R language. All R codes and data sets may be downloaded via the quantlet download center www.quantlet.org or via the Springer webpage. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi.

  3. A climate-based multivariate extreme emulator of met-ocean-hydrological events for coastal flooding

    Science.gov (United States)

    Camus, Paula; Rueda, Ana; Mendez, Fernando J.; Tomas, Antonio; Del Jesus, Manuel; Losada, Iñigo J.

    2015-04-01

    Atmosphere-ocean general circulation models (AOGCMs) are useful to analyze large-scale climate variability (long-term historical periods, future climate projections). However, applications such as coastal flood modeling require climate information at finer scale. Besides, flooding events depend on multiple climate conditions: waves, surge levels from the open ocean and river discharge caused by precipitation. Therefore, a multivariate statistical downscaling approach is adopted, both to reproduce the relationships between variables and because of its low computational cost. The proposed method can be considered as a hybrid approach which combines a probabilistic weather type downscaling model with a stochastic weather generator component. Predictand distributions are reproduced by modeling the relationship with AOGCM predictors based on a physical division in weather types (Camus et al., 2012). The multivariate dependence structure of the predictand (extreme events) is introduced by linking the independent marginal distributions of the variables through a probabilistic copula regression (Ben Ayala et al., 2014). This hybrid approach is applied for the downscaling of AOGCM data to daily precipitation and maximum significant wave height and storm surge in different locations along the Spanish coast. Reanalysis data are used to assess the proposed method. A common predictor for the three variables involved is classified using a regression-guided clustering algorithm. The most appropriate statistical model (general extreme value distribution, Pareto distribution) for daily conditions is fitted. Stochastic simulation of the present climate is performed, obtaining the set of hydraulic boundary conditions needed for high resolution coastal flood modeling. References: Camus, P., Menéndez, M., Méndez, F.J., Izaguirre, C., Espejo, A., Cánovas, V., Pérez, J., Rueda, A., Losada, I.J., Medina, R. (2014b). A weather-type statistical downscaling framework for ocean wave climate. Journal of

  4. Effect of Nonalcoholic Fatty Liver Disease on In-Hospital and Long-Term Outcomes in Patients With ST-Segment Elevation Myocardial Infarction.

    Science.gov (United States)

    Keskin, Muhammed; Hayıroğlu, Mert İlker; Uzun, Ahmet Okan; Güvenç, Tolga Sinan; Şahin, Sinan; Kozan, Ömer

    2017-11-15

    Nonalcoholic fatty liver disease (NAFLD) is a risk factor for coronary artery disease. We investigated the effect of NAFLD grade on in-hospital and long-term outcomes in patients with ST-segment elevation myocardial infarction (STEMI). The study group consisted of 360 patients with STEMI. The patients were classified according to the grade of the NAFLD using ultrasonography. Based on this classification, all patients were divided into 4 subgroups as grade 0 (no fatty liver disease), grade 1, grade 2, and grade 3. Hierarchical logistic regression and Cox proportional regression analysis were used to establish the relation between NAFLD grade and outcomes. In-hospital mortality for grade 0, 1, 2, and 3 NAFLDs were 4.7%, 8.3%, 11.3%, and 33.9%, respectively. Three-year mortality for grade 0, 1, 2, and 3 NAFLDs were 5.6%, 7.8%, 9.5%, and 33.3%, respectively. In the multivariable hierarchical logistic regression analysis, in-hospital mortality risks were higher for patients with grade 3 NAFLD (odds ratio 4.2). In a multivariable Cox proportional regression analysis, the mortality risk was higher for patients with grade 3 NAFLD (hazard ratio 4.0). In conclusion, in patients with STEMI, the presence of NAFLD is associated with unfavorable clinical outcomes. Among these patients, grade 3 NAFLD had the highest mortality rates. The present study supports NAFLD screening in patients with STEMI. Copyright © 2017 Elsevier Inc. All rights reserved.
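
    As a worked illustration of the odds ratio scale used above, the sketch below computes crude (unadjusted) odds ratios from the in-hospital mortality percentages quoted in the abstract. It is not the authors' hierarchical model: the function names are illustrative assumptions, and no covariates are adjusted for, which is why the crude grade 3 value (about 10.4) is much larger than the adjusted odds ratio of 4.2 reported by the study.

```python
def odds(p):
    """Convert an event probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def odds_ratio(p_group, p_reference):
    """Crude odds ratio of a group against a reference group."""
    return odds(p_group) / odds(p_reference)


# In-hospital mortality by NAFLD grade, as quoted in the abstract.
mortality = {0: 0.047, 1: 0.083, 2: 0.113, 3: 0.339}

# Crude odds ratios against grade 0 (no covariate adjustment).
crude_or = {grade: odds_ratio(p, mortality[0]) for grade, p in mortality.items()}

for grade, value in crude_or.items():
    print(f"grade {grade} vs grade 0: crude OR = {value:.2f}")
```

    The gap between the crude and the adjusted figure is the part explained by the other variables in the multivariable model.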

  5. Econometric analysis of realized covariation: high frequency based covariance, regression, and correlation in financial economics

    DEFF Research Database (Denmark)

    Barndorff-Nielsen, Ole Eiler; Shephard, N.

    2004-01-01

    This paper analyses multivariate high frequency financial data using realized covariation. We provide a new asymptotic distribution theory for standard methods such as regression, correlation analysis, and covariance. It will be based on a fixed interval of time (e.g., a day or week), allowing...... the number of high frequency returns during this period to go to infinity. Our analysis allows us to study how high frequency correlations, regressions, and covariances change through time. In particular we provide confidence intervals for each of these quantities....
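
    A minimal sketch of the realized quantities named above, assuming two aligned series of intraday returns over one fixed interval. The function names and the simulated returns are illustrative assumptions, and the asymptotic confidence intervals that are the paper's contribution are not reproduced here.

```python
import math
import random


def realized_covariation(returns_x, returns_y):
    """Sum of products of high frequency returns over a fixed interval."""
    return sum(rx * ry for rx, ry in zip(returns_x, returns_y))


def realized_measures(returns_x, returns_y):
    """Realized covariance, regression slope (y on x) and correlation."""
    var_x = realized_covariation(returns_x, returns_x)
    var_y = realized_covariation(returns_y, returns_y)
    cov_xy = realized_covariation(returns_x, returns_y)
    return {
        "covariance": cov_xy,
        "regression": cov_xy / var_x,
        "correlation": cov_xy / math.sqrt(var_x * var_y),
    }


# Hypothetical one-day sample: 390 one-minute returns, y loads on x with slope 0.5.
rng = random.Random(0)
x = [rng.gauss(0.0, 0.001) for _ in range(390)]
y = [0.5 * rx + rng.gauss(0.0, 0.001) for rx in x]

measures = realized_measures(x, y)
print(measures)
```

    Recomputing these measures interval by interval gives the time series of correlations, regressions and covariances that the abstract describes.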

  6. Hierarchical organization versus self-organization

    OpenAIRE

    Busseniers, Evo

    2014-01-01

    In this paper we try to define the difference between hierarchical organization and self-organization. Organization is defined as a structure with a function. So we can define the difference between hierarchical organization and self-organization both on the structure and on the function. In the next two chapters these two definitions are given. For the structure we will use some existing definitions in graph theory, for the function we will use existing theory on (self-)organization. In the t...

  7. Differentiating regressed melanoma from regressed lichenoid keratosis.

    Science.gov (United States)

    Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A

    2017-04-01

    Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. The process and utility of classification and regression tree methodology in nursing research.

    Science.gov (United States)

    Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda

    2014-06-01

    This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.
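
    The core step of classification and regression tree analysis is a search for the binary split that makes the resulting subgroups as pure as possible. The sketch below shows that single step for one numeric variable and a binary outcome, using Gini impurity. The data are hypothetical and the function names are assumptions; a full tree would apply the same search recursively to each subgroup.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p1 = sum(labels) / n
    return 2.0 * p1 * (1.0 - p1)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best split `value <= threshold`."""
    best = (None, gini(labels))  # no split at all is the baseline
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2.0
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical data: patient age and whether an event occurred (1) or not (0).
ages = [30, 35, 42, 48, 55, 61, 67, 72]
events = [0, 0, 0, 0, 1, 1, 0, 1]

threshold, impurity = best_split(ages, events)
print(f"split at age <= {threshold}, weighted Gini = {impurity:.4f}")
```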

  9. Multivariate covariance generalized linear models

    DEFF Research Database (Denmark)

    Bonat, W. H.; Jørgensen, Bent

    2016-01-01

    We propose a general framework for non-normal multivariate data analysis called multivariate covariance generalized linear models, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices. The method is motivated by three data examples that are not easily handled by existing methods. The first example concerns multivariate count data, the second involves response variables of mixed types, combined with repeated... are fitted by using an efficient Newton scoring algorithm based on quasi-likelihood and Pearson estimating functions, using only second-moment assumptions. This provides a unified approach to a wide variety of types of response variables and covariance structures, including multivariate extensions...

  10. On logistic regression analysis of dichotomized responses.

    Science.gov (United States)

    Lu, Kaifeng

    2017-01-01

    We study the properties of treatment effect estimate in terms of odds ratio at the study end point from logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be able to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd.

  11. Graph-theoretic measures of multivariate association and prediction

    International Nuclear Information System (INIS)

    Friedman, J.H.; Rafsky, L.C.

    1983-01-01

    Interpoint-distance-based graphs can be used to define measures of association that extend Kendall's notion of a generalized correlation coefficient. The authors present particular statistics that provide distribution-free tests of independence sensitive to alternatives involving non-monotonic relationships. Moreover, since ordering plays no essential role, the ideas are fully applicable in a multivariate setting. The authors also define an asymmetric coefficient measuring the extent to which (a vector) X can be used to make single-valued predictions of (a vector) Y. The authors discuss various techniques for proving that such statistics are asymptotically normal. As an example of the effectiveness of their approach, the authors present an application to the examination of residuals from multiple regression. 18 references, 2 figures, 1 table

  12. Deliberate change without hierarchical influence?

    DEFF Research Database (Denmark)

    Nørskov, Sladjana; Kesting, Peter; Ulhøi, John Parm

    2017-01-01

    reveals that deliberate change is indeed achievable in a non-hierarchical collaborative OSS community context. However, it presupposes the presence and active involvement of informal change agents. The paper identifies and specifies four key drivers for change agents’ influence. Originality/value: The findings contribute to organisational analysis by providing a deeper understanding of the importance of leadership in making deliberate change possible in non-hierarchical settings. It points to the importance of “change-by-conviction”, essentially based on voluntary behaviour. This can open the door...

  13. Multiparty hierarchical quantum-information splitting

    International Nuclear Information System (INIS)

    Wang Xinwen; Zhang Dengyu; Tang Shiqing; Xie Lijun

    2011-01-01

    We propose a scheme for multiparty hierarchical quantum-information splitting (QIS) with a multipartite entangled state, where a boss distributes a secret quantum state to two grades of agents asymmetrically. The agents who belong to different grades have different authorities for recovering the boss's secret. Except for the boss's Bell-state measurement, no nonlocal operation is involved. The presented scheme is also shown to be secure against eavesdropping. Such a hierarchical QIS is expected to find useful applications in the field of modern multipartite quantum cryptography.

  14. Biased trapping issue on weighted hierarchical networks

    Indian Academy of Sciences (India)

    archical networks which are based on the classic scale-free hierarchical networks. ... Weighted hierarchical networks; weight-dependent walks; mean first passage ..... The weighted networks can mimic some real-world natural and social systems to ... the Priority Academic Program Development of Jiangsu Higher Education ...

  15. Economic viability in concrete dams by multivariable regression tool for implantation of small hydroelectric plants

    International Nuclear Information System (INIS)

    Lima, Reginaldo Agapito de; Ribeiro Junior, Leopoldo Uberto

    2010-01-01

    In the implementation of a small hydropower plant (SHP), the dam is the main structure, and its sizing represents 30% to 50% of the overall cost of the civil works. It is therefore very important to have a fast, didactic and accurate tool for preparing a budget, one that also allows a quantitative analysis of the costs inherent in the civil construction of concrete dams for small hydropower plants. To this end, the multivariable regression tool is valuable because it allows preliminary, even if approximate, cost estimates for concrete dams to be established quickly and correctly, which simplifies budgeting and guides feasibility decisions on selecting or rejecting new head (fall) alternatives. (author)

  16. Introduction to applied Bayesian statistics and estimation for social scientists

    CERN Document Server

    Lynch, Scott M

    2007-01-01

    "Introduction to Applied Bayesian Statistics and Estimation for Social Scientists" covers the complete process of Bayesian statistical analysis in great detail from the development of a model through the process of making statistical inference. The key feature of this book is that it covers models that are most commonly used in social science research - including the linear regression model, generalized linear models, hierarchical models, and multivariate regression models - and it thoroughly develops each real-data example in painstaking detail. The first part of the book provides a detailed

  17. Multivariate Statistical Process Control Charts: An Overview

    OpenAIRE

    Bersimis, Sotiris; Psarakis, Stelios; Panaretos, John

    2006-01-01

    In this paper we discuss the basic procedures for the implementation of multivariate statistical process control via control charting. Furthermore, we review multivariate extensions for all kinds of univariate control charts, such as multivariate Shewhart-type control charts, multivariate CUSUM control charts and multivariate EWMA control charts. In addition, we review unique procedures for the construction of multivariate control charts, based on multivariate statistical techniques such as p...
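
    As a minimal sketch of a multivariate Shewhart-type chart, the code below computes a Hotelling-type statistic for bivariate observations and compares it with a chi-square control limit. It assumes the in-control mean and covariance are known, which is a simplification. For two variables the upper chi-square quantile has the closed form -2 ln(alpha), so no statistics library is needed. All names and numbers are illustrative assumptions.

```python
import math


def t2_statistic(x, mean, cov):
    """(x - mean)' cov^{-1} (x - mean) for two variables."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def chi2_limit_2df(alpha):
    """Upper alpha quantile of the chi-square distribution with 2 df."""
    return -2.0 * math.log(alpha)


# Assumed in-control parameters: unit variances, correlation 0.8.
mean = (0.0, 0.0)
cov = ((1.0, 0.8), (0.8, 1.0))
limit = chi2_limit_2df(0.0027)  # same false-alarm rate as univariate 3-sigma limits

# Both points lie within 1.5 standard deviations on each axis taken separately.
for point in [(1.5, 1.5), (1.5, -1.5)]:
    t2 = t2_statistic(point, mean, cov)
    status = "out of control" if t2 > limit else "in control"
    print(point, round(t2, 3), status)
```

    The second point breaks the positive correlation between the two variables, so the multivariate chart signals even though neither univariate chart would.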

  18. Anti-hierarchical evolution of the active galactic nucleus space density in a hierarchical universe

    International Nuclear Information System (INIS)

    Enoki, Motohiro; Ishiyama, Tomoaki; Kobayashi, Masakazu A. R.; Nagashima, Masahiro

    2014-01-01

    Recent observations show that the space density of luminous active galactic nuclei (AGNs) peaks at higher redshifts than that of faint AGNs. This downsizing trend in the AGN evolution seems to be contradictory to the hierarchical structure formation scenario. In this study, we present the AGN space density evolution predicted by a semi-analytic model of galaxy and AGN formation based on the hierarchical structure formation scenario. We demonstrate that our model can reproduce the downsizing trend of the AGN space density evolution. The reason for the downsizing trend in our model is a combination of the cold gas depletion as a consequence of star formation, the gas cooling suppression in massive halos, and the AGN lifetime scaling with the dynamical timescale. We assume that a major merger of galaxies causes a starburst, spheroid formation, and cold gas accretion onto a supermassive black hole (SMBH). We also assume that this cold gas accretion triggers AGN activity. Since the cold gas is mainly depleted by star formation and gas cooling is suppressed in massive dark halos, the amount of cold gas accreted onto SMBHs decreases with cosmic time. Moreover, AGN lifetime increases with cosmic time. Thus, at low redshifts, major mergers do not always lead to luminous AGNs. Because the luminosity of AGNs is correlated with the mass of accreted gas onto SMBHs, the space density of luminous AGNs decreases more quickly than that of faint AGNs. We conclude that the anti-hierarchical evolution of the AGN space density is not contradictory to the hierarchical structure formation scenario.

  19. Anti-hierarchical evolution of the active galactic nucleus space density in a hierarchical universe

    Energy Technology Data Exchange (ETDEWEB)

    Enoki, Motohiro [Faculty of Business Administration, Tokyo Keizai University, Kokubunji, Tokyo 185-8502 (Japan); Ishiyama, Tomoaki [Center for Computational Sciences, University of Tsukuba, Tsukuba, Ibaraki 305-8577 (Japan); Kobayashi, Masakazu A. R. [Research Center for Space and Cosmic Evolution, Ehime University, Matsuyama, Ehime 790-8577 (Japan); Nagashima, Masahiro, E-mail: enokimt@tku.ac.jp [Faculty of Education, Nagasaki University, Nagasaki, Nagasaki 852-8521 (Japan)

    2014-10-10

    Recent observations show that the space density of luminous active galactic nuclei (AGNs) peaks at higher redshifts than that of faint AGNs. This downsizing trend in the AGN evolution seems to be contradictory to the hierarchical structure formation scenario. In this study, we present the AGN space density evolution predicted by a semi-analytic model of galaxy and AGN formation based on the hierarchical structure formation scenario. We demonstrate that our model can reproduce the downsizing trend of the AGN space density evolution. The reason for the downsizing trend in our model is a combination of the cold gas depletion as a consequence of star formation, the gas cooling suppression in massive halos, and the AGN lifetime scaling with the dynamical timescale. We assume that a major merger of galaxies causes a starburst, spheroid formation, and cold gas accretion onto a supermassive black hole (SMBH). We also assume that this cold gas accretion triggers AGN activity. Since the cold gas is mainly depleted by star formation and gas cooling is suppressed in massive dark halos, the amount of cold gas accreted onto SMBHs decreases with cosmic time. Moreover, AGN lifetime increases with cosmic time. Thus, at low redshifts, major mergers do not always lead to luminous AGNs. Because the luminosity of AGNs is correlated with the mass of accreted gas onto SMBHs, the space density of luminous AGNs decreases more quickly than that of faint AGNs. We conclude that the anti-hierarchical evolution of the AGN space density is not contradictory to the hierarchical structure formation scenario.

  20. Methods of Multivariate Analysis

    CERN Document Server

    Rencher, Alvin C

    2012-01-01

    Praise for the Second Edition "This book is a systematic, well-written, well-organized text on multivariate analysis packed with intuition and insight . . . There is much practical wisdom in this book that is hard to find elsewhere."-IIE Transactions Filled with new and timely content, Methods of Multivariate Analysis, Third Edition provides examples and exercises based on more than sixty real data sets from a wide variety of scientific fields. It takes a "methods" approach to the subject, placing an emphasis on how students and practitioners can employ multivariate analysis in real-life sit

  1. Continuous multivariate exponential extension

    International Nuclear Information System (INIS)

    Block, H.W.

    1975-01-01

    The Freund-Weinman multivariate exponential extension is generalized to the case of nonidentically distributed marginal distributions. A fatal shock model is given for the resulting distribution. Results in the bivariate case and the concept of constant multivariate hazard rate lead to a continuous distribution related to the multivariate exponential distribution (MVE) of Marshall and Olkin. This distribution is shown to be a special case of the extended Freund-Weinman distribution. A generalization of the bivariate model of Proschan and Sullo leads to a distribution which contains both the extended Freund-Weinman distribution and the MVE
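
    For orientation, the sketch below simulates the standard fatal shock construction of the bivariate Marshall-Olkin exponential distribution (MVE) mentioned above, not the extended Freund-Weinman model itself. Three independent exponential shocks are drawn, and the shared shock kills both components at once. The simultaneous failures give the MVE a singular part on the diagonal, which is what a continuous relative of the MVE avoids. The rates and names are illustrative assumptions.

```python
import random


def marshall_olkin_pair(rng, lam1, lam2, lam12):
    """Draw (X1, X2) from the bivariate Marshall-Olkin fatal shock model."""
    z1 = rng.expovariate(lam1)    # shock that kills component 1 only
    z2 = rng.expovariate(lam2)    # shock that kills component 2 only
    z12 = rng.expovariate(lam12)  # shock that kills both components
    return min(z1, z12), min(z2, z12)


rng = random.Random(42)
lam1, lam2, lam12 = 1.0, 2.0, 1.0
sample = [marshall_olkin_pair(rng, lam1, lam2, lam12) for _ in range(20000)]

# X1 is exponential with rate lam1 + lam12, so its mean is 1 / (lam1 + lam12).
mean_x1 = sum(x1 for x1, _ in sample) / len(sample)

# Both components fail together with probability lam12 / (lam1 + lam2 + lam12).
tie_fraction = sum(1 for x1, x2 in sample if x1 == x2) / len(sample)

print(mean_x1, tie_fraction)
```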

  2. Advanced multivariate data evaluation for Fourier transform infrared spectroscopy

    International Nuclear Information System (INIS)

    Diewok, J.

    2002-12-01

    The objective of the presented dissertation was the evaluation, application and further development of advanced multivariate data evaluation methods for qualitative and quantitative Fourier transform infrared (FT-IR) measurements, especially of aqueous samples. The focus was set on 'evolving systems'; i.e. chemical systems that change gradually with a master variable, such as pH, reaction time, elution time, etc. and that are increasingly encountered in analytical chemistry. FT-IR measurements on such systems yield 2-way and 3-way data sets, i.e. data matrices and cubes. The chemometric methods used were soft-modeling techniques, like multivariate curve resolution - alternating least squares (MCR-ALS) or principal component analysis (PCA), hard modeling of equilibrium systems and two-dimensional correlation spectroscopy (2D-CoS). The research results are presented in six publications and comprise: A new combination of FT-IR flow titrations and second-order calibration by MCR-ALS for the quantitative analysis of mixture samples of organic acids and sugars. A novel combination of MCR-ALS with a hard-modeled equilibrium constraint for second-order quantitation in pH-modulated samples where analytes and interferences show very similar acid-base behavior. A detailed study in which MCR-ALS and 2D-CoS are directly compared for the first time. From the analysis of simulated and experimental acid-base equilibrium systems, the performance and interpretability of the two methods is evaluated. Investigation of the binding process of vancomycin, an important antibiotic, to a cell wall analogue tripeptide by time-resolved FT-IR spectroscopy and detailed chemometric evaluation. Determination of red wine constituents by liquid chromatography with FT-IR detection and MCR-ALS for resolution of overlapped peaks. Classification of red wine cultivars from FT-IR spectroscopy of phenolic wine extracts with hierarchical clustering and soft independent modeling of class analogy (SIMCA

  3. Enforcing Co-expression Within a Brain-Imaging Genomics Regression Framework.

    Science.gov (United States)

    Zille, Pascal; Calhoun, Vince D; Wang, Yu-Ping

    2017-06-28

    Among the challenges arising in brain imaging genetic studies, estimating the potential links between neurological and genetic variability within a population is key. In this work, we propose a multivariate, multimodal formulation for variable selection that leverages co-expression patterns across various data modalities. Our approach is based on an intuitive combination of two widely used statistical models: sparse regression and canonical correlation analysis (CCA). While the former seeks multivariate linear relationships between a given phenotype and associated observations, the latter seeks to extract co-expression patterns between sets of variables belonging to different modalities. In the following, we propose to rely on a 'CCA-type' formulation in order to regularize the classical multimodal sparse regression problem (essentially incorporating both CCA and regression models within a unified formulation). The underlying motivation is to extract discriminative variables that are also co-expressed across modalities. We first show that the simplest formulation of such a model can be expressed as a special case of collaborative learning methods. After discussing its limitations, we propose an extended, more flexible formulation, and introduce a simple and efficient alternating minimization algorithm to solve the associated optimization problem. We explore the parameter space and provide some guidelines regarding parameter selection. Both the original and extended versions are then compared on a simple toy dataset and a more advanced simulated imaging genomics dataset in order to illustrate the benefits of the latter. Finally, we validate the proposed formulation using single nucleotide polymorphisms (SNP) data and functional magnetic resonance imaging (fMRI) data from a population of adolescents (n = 362 subjects, age 16.9 ± 1.9 years from the Philadelphia Neurodevelopmental Cohort) for the study of learning ability. Furthermore, we carry out a significance

  4. Multivariate Formation Pressure Prediction with Seismic-derived Petrophysical Properties from Prestack AVO inversion and Poststack Seismic Motion Inversion

    Science.gov (United States)

    Yu, H.; Gu, H.

    2017-12-01

    A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
    The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then

  5. Determinants of LSIL Regression in Women from a Colombian Cohort

    International Nuclear Information System (INIS)

    Molano, Monica; Gonzalez, Mauricio; Gamboa, Oscar; Ortiz, Natasha; Luna, Joaquin; Hernandez, Gustavo; Posso, Hector; Murillo, Raul; Munoz, Nubia

    2010-01-01

    Objective: To analyze the role of Human Papillomavirus (HPV) and other risk factors in the regression of cervical lesions in women from the Bogota Cohort. Methods: 200 HPV positive women with abnormal cytology were included for regression analysis. The time of lesion regression was modeled using methods for interval censored survival time data. Median duration of total follow-up was 9 years. Results: 80 (40%) women were diagnosed with Atypical Squamous Cells of Undetermined Significance (ASCUS) or Atypical Glandular Cells of Undetermined Significance (AGUS) while 120 (60%) were diagnosed with Low Grade Squamous Intra-epithelial Lesions (LSIL). Globally, 40% of the lesions were still present at the first year of follow-up, while 1.5% were still present at the 5-year check-up. The multivariate model showed similar regression rates for lesions in women with ASCUS/AGUS and women with LSIL (HR=0.82, 95% CI 0.59-1.12). Women infected with HR HPV types and those with mixed infections had lower regression rates for lesions than did women infected with LR types (HR=0.526, 95% CI 0.33-0.84, for HR types and HR=0.378, 95% CI 0.20-0.69, for mixed infections). Furthermore, women over 30 years had a higher lesion regression rate than did women under 30 years (HR=1.53, 95% CI 1.03-2.27). The study showed that the median time for lesion regression was 9 months while the median time for HPV clearance was 12 months. Conclusions: In the studied population, the type of infection and the age of the women are critical factors for the regression of cervical lesions.

  6. Using the Logistic Regression model in supporting decisions of establishing marketing strategies

    Directory of Open Access Journals (Sweden)

    Cristinel CONSTANTIN

    2015-12-01

    This paper presents instrumental research on the use of the Logistic Regression model for data analysis in marketing research. Decision makers in different organisations need relevant information to support their decisions regarding marketing strategies. The data provided by marketing research can be processed in various ways, but multivariate data analysis models can enhance the utility of the information. Among these models is the Logistic Regression model, which is used for dichotomous variables. Our research explains the utility of this model and the interpretation of the resulting information, in order to help practitioners and researchers use it in their future investigations.
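
    A minimal sketch of a logistic regression for a dichotomous outcome, fitted by plain gradient ascent on the log-likelihood so that it needs no external library. The survey data (a 1 to 5 satisfaction score and a yes/no repurchase answer) are hypothetical, and the function names, learning rate and step count are illustrative assumptions rather than anything taken from the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by gradient ascent."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1


# Hypothetical survey: satisfaction score (1-5) and repurchase (1 = yes, 0 = no).
scores = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
repurchase = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

b0, b1 = fit_logistic(scores, repurchase)
odds_ratio_per_point = math.exp(b1)  # multiplicative change in odds per extra point

print(f"intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"odds ratio per satisfaction point = {odds_ratio_per_point:.2f}")
print(f"P(repurchase | score = 5) = {sigmoid(b0 + b1 * 5):.2f}")
```

    The exponentiated slope is the quantity a practitioner would usually report: how much the odds of the outcome change for a one-unit increase in the predictor.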

  7. Discovering hierarchical structure in normal relational data

    DEFF Research Database (Denmark)

    Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten

    2014-01-01

    -parametric generative model for hierarchical clustering of similarity based on multifurcating Gibbs fragmentation trees. This allows us to infer and display the posterior distribution of hierarchical structures that comply with the data. We demonstrate the utility of our method on synthetic data and data of functional...

  8. Ripening-dependent metabolic changes in the volatiles of pineapple (Ananas comosus (L.) Merr.) fruit: II. Multivariate statistical profiling of pineapple aroma compounds based on comprehensive two-dimensional gas chromatography-mass spectrometry.

    Science.gov (United States)

    Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg

    2015-03-01

    Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.

  9. Multilevel regression models describing regional patterns of invertebrate and algal responses to urbanization across the USA

    Science.gov (United States)

    Cuffney, T.F.; Kashuba, R.; Qian, S.S.; Alameddine, I.; Cha, Y.K.; Lee, B.; Coles, J.F.; McMahon, G.

    2011-01-01

    Multilevel hierarchical regression was used to examine regional patterns in the responses of benthic macroinvertebrates and algae to urbanization across 9 metropolitan areas of the conterminous USA. Linear regressions established that responses (intercepts and slopes) to urbanization of invertebrates and algae varied among metropolitan areas. Multilevel hierarchical regression models were able to explain these differences on the basis of region-scale predictors. Regional differences in the type of land cover (agriculture or forest) being converted to urban and climatic factors (precipitation and air temperature) accounted for the differences in the response of macroinvertebrates to urbanization based on ordination scores, total richness, Ephemeroptera, Plecoptera, Trichoptera richness, and average tolerance. Regional differences in climate and antecedent agriculture also accounted for differences in the responses of salt-tolerant diatoms, but differences in the responses of other diatom metrics (% eutraphenic, % sensitive, and % silt tolerant) were best explained by regional differences in soils (mean % clay soils). The effects of urbanization were most readily detected in regions where forest lands were being converted to urban land because agricultural development significantly degraded assemblages before urbanization and made detection of urban effects difficult. The effects of climatic factors (temperature, precipitation) on background conditions (biogeographic differences) and rates of response to urbanization were most apparent after accounting for the effects of agricultural development. The effects of climate and land cover on responses to urbanization provide strong evidence that monitoring, mitigation, and restoration efforts must be tailored for specific regions and that attainment goals (background conditions) may not be possible in regions with high levels of prior disturbance (e.g., agricultural development). © 2011 by The North American

  10. MANCOVA for one way classification with homogeneity of regression coefficient vectors

    Science.gov (United States)

    Mokesh Rayalu, G.; Ravisankar, J.; Mythili, G. Y.

    2017-11-01

    MANOVA and MANCOVA are extensions of the univariate ANOVA and ANCOVA techniques to multidimensional or vector-valued observations. The assumption of a Gaussian distribution is replaced with a multivariate Gaussian distribution for the vector data and residual terms in the statistical models of these techniques. The objective of MANCOVA is to determine if there are statistically reliable mean differences that can be demonstrated between groups later modifying the newly created variable. When random assignment of samples or subjects to groups is not possible, multivariate analysis of covariance (MANCOVA) provides statistical matching of groups by adjusting dependent variables as if all subjects scored the same on the covariates. In this research article, the MANCOVA technique is extended to a larger number of covariates, and homogeneity of regression coefficient vectors is also tested.

  11. Road Network Selection Based on Road Hierarchical Structure Control

    Directory of Open Access Journals (Sweden)

    HE Haiwei

    2015-04-01

    A new road network selection method based on hierarchical structure is studied. Firstly, the road network is built as strokes, which are then classified into hierarchical collections according to their betweenness centrality value (BC value). Secondly, the hierarchical structure of the strokes is enhanced using a structural characteristic identification technique. Thirdly, an importance calculation model is established according to the relationships among the hierarchical structure of the strokes. Finally, the importance values of the strokes are obtained from the model's hierarchical calculation and used to select the road network. Tests are done to verify the advantage of this method by comparing it with other common stroke-oriented methods using three kinds of typical road network data. Comparison of the results shows that this method has little need for semantic data and can largely eliminate the negative influence of edge strokes caused by the BC value criterion. It is therefore better at maintaining the global hierarchical structure of the road network and is suitable for the selection of various kinds of road networks.

  12. Hierarchically Structured Electrospun Fibers

    Directory of Open Access Journals (Sweden)

    Nicole E. Zander

    2013-01-01

    Traditional electrospun nanofibers have a myriad of applications ranging from scaffolds for tissue engineering to components of biosensors and energy harvesting devices. The generally smooth one-dimensional structure of the fibers has stood as a limitation to several interesting novel applications. Control of fiber diameter, porosity and collector geometry will be briefly discussed, as will more traditional methods for controlling fiber morphology and fiber mat architecture. The remainder of the review will focus on new techniques to prepare hierarchically structured fibers. Fibers with hierarchical primary structures (including helical, buckled, and beads-on-a-string fibers), as well as fibers with secondary structures (such as nanopores, nanopillars, nanorods, and internally structured fibers), and their applications will be discussed. These new materials with helical/buckled morphology are expected to possess unique optical and mechanical properties with possible applications for negative refractive index materials, highly stretchable/high-tensile-strength materials, and components in microelectromechanical devices. Core-shell type fibers enable a much wider variety of materials to be electrospun and are expected to be widely applied in the sensing, drug delivery/controlled release fields, and in the encapsulation of live cells for biological applications. Materials with a hierarchical secondary structure are expected to provide new superhydrophobic and self-cleaning materials.

  13. Hierarchical control of electron-transfer

    DEFF Research Database (Denmark)

    Westerhoff, Hans V.; Jensen, Peter Ruhdal; Egger, Louis

    1997-01-01

    In this chapter the role of electron transfer in determining the behaviour of the ATP synthesising enzyme in E. coli is analysed. It is concluded that the latter enzyme lacks control because of special properties of the electron transfer components. These properties range from absence of a strong...... back pressure by the protonmotive force on the rate of electron transfer to hierarchical regulation of the expression of the genes that encode the electron transfer proteins as a response to changes in the bioenergetic properties of the cell. The discussion uses Hierarchical Control Analysis...

  14. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol-lowering drugs.

    Science.gov (United States)

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin

    2013-10-15

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
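
    As a reading aid for the abstract above, the following is a minimal sketch of the standard one-parameter Box-Cox transformation applied separately to each component of a three-dimensional response. It is not the authors' model or sampler: the function name, the lambda values and the lipid values are hypothetical and chosen only for illustration.

```python
import math


def box_cox(y, lam):
    """Standard Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if abs(lam) < 1e-12:
        # The limit of (y**lam - 1) / lam as lam -> 0 is log(y).
        return math.log(y)
    return (y ** lam - 1.0) / lam


# Hypothetical per-response parameters: each response gets its own lambda.
lambdas = {"LDL-C": 0.5, "HDL-C": 0.0, "TG": -0.25}
# Hypothetical measurements for one patient (illustrative numbers only).
patient = {"LDL-C": 130.0, "HDL-C": 50.0, "TG": 150.0}

transformed = {name: box_cox(value, lambdas[name]) for name, value in patient.items()}
print(transformed)
```

    In the model described in the abstract the transformation parameters are given priors and estimated from the data; here they are fixed by hand purely to show the arithmetic.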

  15. Bayesian inference for multivariate meta-analysis Box-Cox transformation models for individual patient data with applications to evaluation of cholesterol lowering drugs

    Science.gov (United States)

    Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin

    2013-01-01

    In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436

  16. Understanding the groundwater dynamics in the Southern Rift Valley Lakes Basin (Ethiopia). Multivariate statistical analysis method, oxygen (δ 18O) and deuterium (δ 2H)

    International Nuclear Information System (INIS)

    Girum Admasu Nadew; Zebene Lakew Tefera

    2013-01-01

    Multivariate statistical analysis is very important to classify waters of different hydrochemical groups. Statistical techniques, such as cluster analysis, can provide a powerful tool for analyzing water chemistry data. This method is used to test water quality data and determine if samples can be grouped into distinct populations that may be significant in the geologic context, as well as from a statistical point of view. Multivariate statistical analysis method is applied to the geochemical data in combination with δ18O and δ2H isotopes with the objective to understand the dynamics of groundwater using hierarchical clustering and isotope analyses. The geochemical and isotope data of the central and southern rift valley lakes have been collected and analyzed from different works. Isotope analysis shows that most springs and boreholes are recharged by July and August rainfalls. The different hydrochemical groups that resulted from the multivariate analysis are described and correlated with the geology of the area and whether it has any interaction with a system or not. (author)

  17. Diagnostic accuracy of atypical p-ANCA in autoimmune hepatitis using ROC- and multivariate regression analysis.

    Science.gov (United States)

    Terjung, B; Bogsch, F; Klein, R; Söhne, J; Reichel, C; Wasmuth, J-C; Beuers, U; Sauerbruch, T; Spengler, U

    2004-09-29

    Antineutrophil cytoplasmic antibodies (atypical p-ANCA) are detected at high prevalence in sera from patients with autoimmune hepatitis (AIH), but their diagnostic relevance for AIH has not been systematically evaluated so far. Here, we studied sera from 357 patients with autoimmune (autoimmune hepatitis n=175, primary sclerosing cholangitis (PSC) n=35, primary biliary cirrhosis n=45), non-autoimmune chronic liver disease (alcoholic liver cirrhosis n=62; chronic hepatitis C virus infection (HCV) n=21) or healthy controls (n=19) for the presence of various non-organ specific autoantibodies. Atypical p-ANCA, antinuclear antibodies (ANA), antibodies against smooth muscles (SMA), antibodies against liver/kidney microsomes (anti-Lkm1) and antimitochondrial antibodies (AMA) were detected by indirect immunofluorescence microscopy, antibodies against the M2 antigen (anti-M2), antibodies against soluble liver antigen (anti-SLA/LP) and anti-Lkm1 by using enzyme linked immunosorbent assays. To define the diagnostic precision of the autoantibodies, results of autoantibody testing were analyzed by receiver operating characteristics (ROC) and forward conditional logistic regression analysis. Atypical p-ANCA were detected at high prevalence in sera from patients with AIH (81%) and PSC (94%). ROC- and logistic regression analysis revealed atypical p-ANCA and SMA, but not ANA as significant diagnostic seromarkers for AIH (atypical p-ANCA: AUC 0.754+/-0.026, odds ratio [OR] 3.4; SMA: 0.652+/-0.028, OR 4.1). Atypical p-ANCA also emerged as the only diagnostically relevant seromarker for PSC (AUC 0.690+/-0.04, OR 3.4). None of the tested antibodies yielded a significant diagnostic accuracy for patients with alcoholic liver cirrhosis, HCV or healthy controls. Atypical p-ANCA along with SMA represent a seromarker with high diagnostic accuracy for AIH and should be explicitly considered in a revised version of the diagnostic score for AIH.
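
    The area under the ROC curve (AUC) reported above can be read as the probability that a randomly chosen diseased patient has a higher marker value than a randomly chosen control. A minimal sketch of that pairwise definition follows; it is not the authors' procedure, and the function name and marker values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correctly ranked pair (Mann-Whitney convention).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical marker values for patients with and without the disease.
diseased = [80, 160, 40, 320]
controls = [20, 40, 10]
print(roc_auc(diseased, controls))
```

    An AUC of 0.5 corresponds to a marker that ranks patients no better than chance, and 1.0 to perfect separation.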

  18. Multi-objective hierarchical genetic algorithms for multilevel redundancy allocation optimization

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Ranjan [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: ranjan.k@ks3.ecs.kyoto-u.ac.jp; Izui, Kazuhiro [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: izui@prec.kyoto-u.ac.jp; Yoshimura, Masataka [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: yoshimura@prec.kyoto-u.ac.jp; Nishiwaki, Shinji [Department of Aeronautics and Astronautics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501 (Japan)], E-mail: shinji@prec.kyoto-u.ac.jp

    2009-04-15

    Multilevel redundancy allocation optimization problems (MRAOPs) occur frequently when attempting to maximize the system reliability of a hierarchical system, and almost all complex engineering systems are hierarchical. Despite their practical significance, limited research has been done concerning the solving of simple MRAOPs. These problems are not only NP-hard but also involve hierarchical design variables. Genetic algorithms (GAs) have been applied in solving MRAOPs, since they are computationally efficient in solving such problems, unlike exact methods, but their application has been confined to single-objective formulations of MRAOPs. This paper proposes a multi-objective formulation of MRAOPs and a methodology for solving such problems. In this methodology, a hierarchical GA framework for multi-objective optimization is proposed by introducing hierarchical genotype encoding for design variables. In addition, we implement the proposed approach by integrating the hierarchical genotype encoding scheme with two popular multi-objective genetic algorithms (MOGAs): the strength Pareto evolutionary genetic algorithm (SPEA2) and the non-dominated sorting genetic algorithm (NSGA-II). In the provided numerical examples, the proposed multi-objective hierarchical approach is applied to solve two hierarchical MRAOPs, a 4-level and a 3-level problem. The proposed method is compared with a single-objective optimization method that uses a hierarchical genetic algorithm (HGA), also applied to solve the 3- and 4-level problems. The results show that a multi-objective hierarchical GA (MOHGA) that includes elitism and a diversity-preserving mechanism performed better than a single-objective GA that only uses elitism, when solving large-scale MRAOPs. Additionally, the experimental results show that the proposed method with NSGA-II outperformed the proposed method with SPEA2 in finding useful Pareto optimal solution sets.
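
    Both NSGA-II and SPEA2 rest on Pareto dominance between candidate solutions. The following minimal sketch shows only that dominance test and the extraction of a non-dominated set; it is not the hierarchical encoding or the algorithms evaluated in the paper. The function names are hypothetical, and the two objectives (cost and unreliability, both minimized) are assumptions, since the abstract does not state which objectives were used.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one.

    All objectives are assumed to be minimized.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical designs as (cost, unreliability) pairs, both to be minimized.
designs = [(1, 5), (2, 2), (3, 3), (5, 1)]
print(pareto_front(designs))
```

    This quadratic-time filter is the simplest form of the idea; NSGA-II additionally ranks the remaining points into successive fronts and uses a crowding measure to preserve diversity.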

  19. Multi-objective hierarchical genetic algorithms for multilevel redundancy allocation optimization

    International Nuclear Information System (INIS)

    Kumar, Ranjan; Izui, Kazuhiro; Yoshimura, Masataka; Nishiwaki, Shinji

    2009-01-01

    Multilevel redundancy allocation optimization problems (MRAOPs) occur frequently when attempting to maximize the system reliability of a hierarchical system, and almost all complex engineering systems are hierarchical. Despite their practical significance, limited research has been done concerning the solving of simple MRAOPs. These problems are not only NP-hard but also involve hierarchical design variables. Genetic algorithms (GAs) have been applied in solving MRAOPs, since they are computationally efficient in solving such problems, unlike exact methods, but their application has been confined to single-objective formulations of MRAOPs. This paper proposes a multi-objective formulation of MRAOPs and a methodology for solving such problems. In this methodology, a hierarchical GA framework for multi-objective optimization is proposed by introducing hierarchical genotype encoding for design variables. In addition, we implement the proposed approach by integrating the hierarchical genotype encoding scheme with two popular multi-objective genetic algorithms (MOGAs): the strength Pareto evolutionary genetic algorithm (SPEA2) and the non-dominated sorting genetic algorithm (NSGA-II). In the provided numerical examples, the proposed multi-objective hierarchical approach is applied to solve two hierarchical MRAOPs, a 4-level and a 3-level problem. The proposed method is compared with a single-objective optimization method that uses a hierarchical genetic algorithm (HGA), also applied to solve the 3- and 4-level problems. The results show that a multi-objective hierarchical GA (MOHGA) that includes elitism and a diversity-preserving mechanism performed better than a single-objective GA that only uses elitism, when solving large-scale MRAOPs. Additionally, the experimental results show that the proposed method with NSGA-II outperformed the proposed method with SPEA2 in finding useful Pareto optimal solution sets

  20. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    Science.gov (United States)

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines of Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7, 2.5, 3.5, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in the author guidelines. Multivariate regression models such as logistic, linear, Cox and Poisson regression are increasingly used in published observational studies, both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.

  1. Multivariate analysis of risk factors for long-term urethroplasty outcome.

    Science.gov (United States)

    Breyer, Benjamin N; McAninch, Jack W; Whitson, Jared M; Eisenberg, Michael L; Mehdizadeh, Jennifer F; Myers, Jeremy B; Voelzke, Bryan B

    2010-02-01

    We studied the patient risk factors that promote urethroplasty failure. Records of patients who underwent urethroplasty at the University of California, San Francisco Medical Center between 1995 and 2004 were reviewed. Cox proportional hazards regression analysis was used to identify multivariate predictors of urethroplasty outcome. Between 1995 and 2004, 443 patients of 495 who underwent urethroplasty had complete comorbidity data and were included in analysis. Median patient age was 41 years (range 18 to 90). Median followup was 5.8 years (range 1 month to 10 years). Stricture recurred in 93 patients (21%). Primary estimated stricture-free survival at 1, 3 and 5 years was 88%, 82% and 79%. After multivariate analysis smoking (HR 1.8, 95% CI 1.0-3.1, p = 0.05), prior direct vision internal urethrotomy (HR 1.7, 95% CI 1.0-3.0, p = 0.04) and prior urethroplasty (HR 1.8, 95% CI 1.1-3.1, p = 0.03) were predictive of treatment failure. On multivariate analysis diabetes mellitus showed a trend toward prediction of urethroplasty failure (HR 2.0, 95% CI 0.8-4.9, p = 0.14). Length of urethral stricture (greater than 4 cm), prior urethroplasty and failed endoscopic therapy are predictive of failure after urethroplasty. Smoking and diabetes mellitus also may predict failure potentially secondary to microvascular damage. Copyright 2010 American Urological Association. Published by Elsevier Inc. All rights reserved.

  2. Hierarchical Traces for Reduced NSM Memory Requirements

    Science.gov (United States)

    Dahl, Torbjørn S.

    This paper presents work on using hierarchical long term memory to reduce the memory requirements of nearest sequence memory (NSM) learning, a previously published, instance-based reinforcement learning algorithm. A hierarchical memory representation reduces the memory requirements by allowing traces to share common sub-sequences. We present moderated mechanisms for estimating discounted future rewards and for dealing with hidden state using hierarchical memory. We also present an experimental analysis of how the sub-sequence length affects the memory compression achieved and show that the reduced memory requirements do not affect the speed of learning. Finally, we analyse and discuss the persistence of the sub-sequences independent of specific trace instances.

  3. Hierarchical subtask discovery with non-negative matrix factorization

    CSIR Research Space (South Africa)

    Earle, AC

    2018-04-01

    Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We...

  4. Hierarchical subtask discovery with non-negative matrix factorization

    CSIR Research Space (South Africa)

    Earle, AC

    2017-08-01

    Hierarchical reinforcement learning methods offer a powerful means of planning flexible behavior in complicated domains. However, learning an appropriate hierarchical decomposition of a domain into subtasks remains a substantial challenge. We...

  5. Statistical Significance for Hierarchical Clustering

    Science.gov (United States)

    Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.

    2017-01-01

    Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990

  6. Hierarchical faunal filters: An approach to assessing effects of habitat and nonnative species on native fishes

    Science.gov (United States)

    Quist, M.C.; Rahel, F.J.; Hubert, W.A.

    2005-01-01

    Understanding factors related to the occurrence of species across multiple spatial and temporal scales is critical to the conservation and management of native fishes, especially for those species at the edge of their natural distribution. We used the concept of hierarchical faunal filters to provide a framework for investigating the influence of habitat characteristics and nonnative piscivores on the occurrence of 10 native fishes in streams of the North Platte River watershed in Wyoming. Three faunal filters were developed for each species: (i) large-scale biogeographic, (ii) local abiotic, and (iii) biotic. The large-scale biogeographic filter, composed of elevation and stream-size thresholds, was used to determine the boundaries within which each species might be expected to occur. Then, a local abiotic filter (i.e., habitat associations), developed using binary logistic-regression analysis, estimated the probability of occurrence of each species from features such as maximum depth, substrate composition, submergent aquatic vegetation, woody debris, and channel morphology (e.g., amount of pool habitat). Lastly, a biotic faunal filter was developed using binary logistic regression to estimate the probability of occurrence of each species relative to the abundance of nonnative piscivores in a reach. Conceptualising fish assemblages within a framework of hierarchical faunal filters is simple and logical, helps direct conservation and management activities, and provides important information on the ecology of fishes in the western Great Plains of North America. © Blackwell Munksgaard, 2004.

  7. Hierarchically Nanoporous Bioactive Glasses for High Efficiency Immobilization of Enzymes

    DEFF Research Database (Denmark)

    He, W.; Min, D.D.; Zhang, X.D.

    2014-01-01

    Bioactive glasses with hierarchical nanoporosity and structures have been heavily involved in immobilization of enzymes. Because of meticulous design and ingenious hierarchical nanostructuration of porosities from yeast cell biotemplates, hierarchically nanostructured porous bioactive glasses can...... and products of catalytic reactions can freely diffuse through open mesopores (2–40 nm). The formation mechanism of hierarchically structured porous bioactive glasses, the immobilization mechanism of enzyme and the catalysis mechanism of immobilized enzyme are then discussed. The novel nanostructure...

  8. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    Science.gov (United States)

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a Logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary Logistic analysis. Multivariate Logistic regression was used for further analysis of significant noninvasive independent variables. Logistic regression model was established and odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of model were evaluated by the curve of receiver operating characteristic (ROC). According to univariate Logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major occurring factors for PHG. The established regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4. The accuracy of model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted Logistic regression model was Logit P=-2.667+2.186X1-2.167X2+0.725X3+0.976X4.
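
    The regression equation reported above gives the log-odds (logit) of PHG; a probability follows by applying the logistic function. The sketch below shows only that conversion. The coefficients are copied from the abstract, but the function name and the input values are hypothetical, because the abstract does not state how X1 to X4 are coded or in what units they are measured.

```python
import math

# Coefficients as reported in the abstract:
# Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4
INTERCEPT = -2.667
COEFFS = (2.186, -2.167, 0.725, 0.976)  # X1 hepatic dysfunction, X2 TB, X3 PLT, X4 splenic vein diameter


def phg_probability(x1, x2, x3, x4):
    """Map the linear predictor (logit) to a probability with the logistic function."""
    logit = INTERCEPT + sum(b * x for b, x in zip(COEFFS, (x1, x2, x3, x4)))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical 0/1 inputs for illustration; the real coding is not given in the abstract.
print(round(phg_probability(1, 0, 1, 1), 3))
```

    Each coefficient also yields an odds ratio as exp(coefficient), which is how the per-factor odds ratios mentioned in the abstract are conventionally derived from a fitted logistic model.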

  9. Three Ways to Link Merge with Hierarchical Concept-Combination

    Directory of Open Access Journals (Sweden)

    Chris Thornton

    2016-11-01

    In the Minimalist Program, language competence is seen to stem from a fundamental ability to construct hierarchical structure, an operation dubbed ‘Merge’. This raises the problem of how to view hierarchical concept-combination. This is a conceptual operation which also builds hierarchical structure. We can conceive of a garden that consists of a lawn and a flower-bed, for example, or a salad consisting of lettuce, fennel and rocket, or a crew consisting of a pilot and engineer. In such cases, concepts are put together in a way that makes one the accommodating element with respect to the others taken in combination. The accommodating element becomes the root of a hierarchical unit. Since this unit is itself a concept, the operation is inherently recursive. Does this mean the mind has two independent systems of hierarchical construction? Or is some form of integration more likely? Following a detailed examination of the operations involved, this paper shows there are three main ways in which Merge might be linked to hierarchical concept-combination. Also examined are the architectural implications that arise in each case.

  10. Hierarchical modeling and its numerical implementation for layered thin elastic structures

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Jin-Rae [Hongik University, Sejong (Korea, Republic of)

    2017-05-15

    Thin elastic structures such as beam- and plate-like structures and laminates are characterized by their small thickness, which leads to classical plate and laminate theories in which the displacement fields through the thickness are assumed to be linear or higher-order polynomials. These classical theories are either insufficient to represent the complex stress variation through the thickness or may encounter the accuracy-computational cost dilemma. In order to overcome this inherent problem of classical theories, the concept of hierarchical modeling has emerged. In hierarchical modeling, hierarchical models with different model levels are selected and combined within a structure domain, so that the modeling error is distributed as uniformly as possible throughout the problem domain. The purpose of the current study is to explore the potential of hierarchical modeling for the effective numerical analysis of layered structures such as laminated composites. To this end, the hierarchical models are constructed and the hierarchical modeling is implemented by selectively adjusting the level of the hierarchical models. In addition, the major characteristics of hierarchical models are investigated through numerical experiments.

  11. Hierarchical surfaces for enhanced self-cleaning applications

    Science.gov (United States)

    Fernández, Ariadna; Francone, Achille; Thamdrup, Lasse H.; Johansson, Alicia; Bilenberg, Brian; Nielsen, Theodor; Guttmann, Markus; Sotomayor Torres, Clivia M.; Kehagias, Nikolaos

    2017-04-01

    In this study we present a flexible and adaptable fabrication method to create complex hierarchical structures over inherently hydrophobic resist materials. We have tested these surfaces for their superhydrophobic behaviour and successfully verified their self-cleaning properties. The approach followed allows us to design and produce superhydrophobic surfaces in a reproducible manner. We have analysed different combinations of hierarchical micro-nanostructures for their application to self-cleaning surfaces. A static contact angle value of 170° with a hysteresis of 4° was achieved without the need for any additional chemical treatment of the fabricated hierarchical structures. Dynamic effects were analysed on these surfaces, obtaining a remarkable self-cleaning effect as well as good robustness against impacting droplets.

  12. An overview of multivariate gamma distributions as seen from a (multivariate) matrix exponential perspective

    DEFF Research Database (Denmark)

    Bladt, Mogens; Nielsen, Bo Friis

    2012-01-01

    Numerous definitions of multivariate exponential and gamma distributions can be retrieved from the literature [4]. These distributions belong to the class of Multivariate Matrix-Exponential Distributions (MVME) whenever their joint Laplace transform is a rational function. The majority of these distributions further belong to an important subclass of MVME distributions [5, 1] where the multivariate random vector can be interpreted as a number of simultaneously collected rewards during sojourns in the states of a Markov chain with one absorbing state, the rest of the states being transient. We... Laplace transform. In a longer perspective stochastic and statistical analysis for MVME will in particular apply to any of the previously defined distributions. Multivariate gamma distributions have been used in a variety of fields like hydrology, [11], [10], [6], space (wind modeling) [9] reliability [3...

  13. Improving satellite-based PM2.5 estimates in China using Gaussian processes modeling in a Bayesian hierarchical setting.

    Science.gov (United States)

    Yu, Wenxi; Liu, Yang; Ma, Zongwei; Bi, Jun

    2017-08-01

    Using satellite-based aerosol optical depth (AOD) measurements and statistical models to estimate ground-level PM2.5 is a promising way to fill the areas that are not covered by ground PM2.5 monitors. The statistical models used in previous studies are primarily Linear Mixed Effects (LME) and Geographically Weighted Regression (GWR) models. In this study, we developed a new regression model between PM2.5 and AOD using Gaussian processes in a Bayesian hierarchical setting. Gaussian processes model the stochastic nature of the spatial random effects, where the mean surface and the covariance function are specified. The spatial stochastic process is incorporated under the Bayesian hierarchical framework to explain the variation of PM2.5 concentrations together with other factors, such as AOD, spatial and non-spatial random effects. We evaluate the results of our model and compare them with those of other, conventional statistical models (GWR and LME) by within-sample model fitting and out-of-sample validation (cross validation, CV). The results show that our model possesses a CV result (R² = 0.81) that reflects higher accuracy than that of GWR and LME (0.74 and 0.48, respectively). Our results indicate that Gaussian process models have the potential to improve the accuracy of satellite-based PM2.5 estimates.

  14. Multivariate Birkhoff interpolation

    CERN Document Server

    Lorentz, Rudolph A

    1992-01-01

    The subject of this book is Lagrange, Hermite and Birkhoff (lacunary Hermite) interpolation by multivariate algebraic polynomials. It unifies and extends a new algorithmic approach to this subject which was introduced and developed by G.G. Lorentz and the author. One particularly interesting feature of this algorithmic approach is that it obviates the necessity of finding a formula for the Vandermonde determinant of a multivariate interpolation in order to determine its regularity (which formulas are practically unknown anyways) by determining the regularity through simple geometric manipulations in the Euclidean space. Although interpolation is a classical problem, it is surprising how little is known about its basic properties in the multivariate case. The book therefore starts by exploring its fundamental properties and its limitations. The main part of the book is devoted to a complete and detailed elaboration of the new technique. A chapter with an extensive selection of finite elements follows as well a...

  15. Genetic Parameters for Body condition score, Body weight, Milk yield and Fertility estimated using random regression models

    NARCIS (Netherlands)

    Berry, D.P.; Buckley, F.; Dillon, P.; Evans, R.D.; Rath, M.; Veerkamp, R.F.

    2003-01-01

    Genetic (co)variances between body condition score (BCS), body weight (BW), milk yield, and fertility were estimated using a random regression animal model extended to multivariate analysis. The data analyzed included 81,313 BCS observations, 91,937 BW observations, and 100,458 milk test-day yields

  16. Multivariate multiscale entropy of financial markets

    Science.gov (United States)

    Lu, Yunfan; Wang, Jun

    2017-11-01

    Multivariate financial time series have attracted wide attention in efforts to quantify the dynamical properties of complex phenomena in financial market systems. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages are demonstrated by numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in the Chinese stock markets is quantified through the interdisciplinary application of this method. We find that the complexity of the trivariate return series in each hour shows a significant decreasing trend as the stock trading time progresses. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, the complexity of global stock markets (Asia, Europe and America) is quantified by analyzing their multivariate returns. Finally, we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.

  17. Hierarchical Micro-Nano Coatings by Painting

    Science.gov (United States)

    Kirveslahti, Anna; Korhonen, Tuulia; Suvanto, Mika; Pakkanen, Tapani A.

    2016-03-01

    In this paper, the wettability properties of coatings with hierarchical surface structures and low surface energy were studied. Hierarchically structured coatings were produced by using hydrophobic fumed silica nanoparticles and polytetrafluoroethylene (PTFE) microparticles as additives in polyester (PES) and polyvinyldifluoride (PVDF). These particles created hierarchical micro-nano structures on the paint surfaces and lowered or supported the already low surface energy of the paint. Two standard application techniques for paint application were employed and the presented coatings are suitable for mass production and use in large surface areas. By regulating the particle concentrations, it was possible to modify wettability properties gradually. Highly hydrophobic surfaces were achieved with the highest contact angle of 165°. Dynamic contact angle measurements were carried out for a set of selected samples and low hysteresis was obtained. Produced coatings possessed long lasting durability in the air and in underwater conditions.

  18. Hierarchical virtual screening approaches in small molecule drug discovery.

    Science.gov (United States)

    Kumar, Ashutosh; Zhang, Kam Y J

    2015-01-01

    Virtual screening has played a significant role in the discovery of small molecule inhibitors of therapeutic targets in the last two decades. Various ligand and structure-based virtual screening approaches are employed to identify small molecule ligands for proteins of interest. These approaches are often combined in either a hierarchical or a parallel manner to take advantage of the strengths and avoid the limitations associated with individual methods. Hierarchical combination of ligand and structure-based virtual screening approaches has achieved noteworthy success in numerous drug discovery campaigns. In hierarchical virtual screening, several filters using ligand and structure-based approaches are sequentially applied to reduce a large screening library to a number small enough for experimental testing. In this review, we focus on different hierarchical virtual screening strategies and their application in the discovery of small molecule modulators of important drug targets. Several virtual screening studies are discussed to demonstrate the successful application of hierarchical virtual screening in small molecule drug discovery. Copyright © 2014 Elsevier Inc. All rights reserved.
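    The sequential-filter idea described in this abstract can be sketched as a simple pipeline in which each stage only sees the survivors of the previous one. This is an illustrative sketch, not a method from the review: the compound records, the field names `similarity` and `docking_score`, and both cutoffs are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Compound = Dict[str, float]                      # hypothetical compound record
Filter = Tuple[str, Callable[[Compound], bool]]  # (stage name, keep-predicate)


def hierarchical_screen(library: List[Compound], filters: List[Filter]):
    """Apply filters in order; each stage shrinks the surviving set."""
    survivors = list(library)
    counts = [("library", len(survivors))]
    for name, keep in filters:
        survivors = [c for c in survivors if keep(c)]
        counts.append((name, len(survivors)))
    return survivors, counts


# Hypothetical library: a cheap ligand-based score and a costlier docking score.
library = [
    {"id": 1, "similarity": 0.82, "docking_score": -9.1},
    {"id": 2, "similarity": 0.35, "docking_score": -10.4},
    {"id": 3, "similarity": 0.77, "docking_score": -6.0},
    {"id": 4, "similarity": 0.91, "docking_score": -8.3},
]

# Cheap filter first, expensive filter last (cutoffs are assumptions).
filters = [
    ("ligand-based: similarity >= 0.7", lambda c: c["similarity"] >= 0.7),
    ("structure-based: docking <= -8.0", lambda c: c["docking_score"] <= -8.0),
]

hits, counts = hierarchical_screen(library, filters)
```

    Ordering the stages from cheapest to most expensive is the usual motivation for a hierarchical rather than parallel combination, since the costly step then runs on a much smaller set.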

  19. MODEL APPLICATION MULTIVARIATE ANALYSIS OF STATISTICAL TECHNIQUES PCA AND HCA ASSESSMENT QUESTIONNAIRE ON CUSTOMER SATISFACTION: CASE STUDY IN A METALLURGICAL COMPANY OF METAL CONTAINERS

    Directory of Open Access Journals (Sweden)

    Cláudio Roberto Rosário

    2012-07-01

    Full Text Available The purpose of this research is to improve the practice of customer satisfaction analysis. The article presents a model for analyzing the answers to a customer satisfaction evaluation in a systematic way with the aid of multivariate statistical techniques, specifically exploratory analysis with PCA (Principal Component Analysis) and HCA (Hierarchical Cluster Analysis). The study evaluated the applicability of the model as a tool to help the company in question identify the value chain perceived by the customer when the customer satisfaction questionnaire is applied. The multivariate statistical analysis revealed similar behavior among customers. It also allowed the company to review the questionnaire items by analyzing the degree of correlation between questions, which was not the company's practice before this research.

  20. Use of multivariate statistics to identify unreliable data obtained using CASA.

    Science.gov (United States)

    Martínez, Luis Becerril; Crispín, Rubén Huerta; Mendoza, Maximino Méndez; Gallegos, Oswaldo Hernández; Martínez, Andrés Aragón

    2013-06-01

    In order to identify unreliable data in a dataset of motility parameters obtained from a pilot study acquired by a veterinarian with experience in boar semen handling, but without experience in the operation of a computer assisted sperm analysis (CASA) system, a multivariate graphical and statistical analysis was performed. Sixteen boar semen samples were aliquoted, then incubated with varying concentrations of progesterone from 0 to 3.33 µg/ml and analyzed in a CASA system. After standardization of the data, Chernoff faces were drawn for each measurement, and a principal component analysis (PCA) was used to reduce the dimensionality and pre-process the data before hierarchical clustering. The first twelve individual measurements showed abnormal features when Chernoff faces were drawn. PCA revealed that principal components 1 and 2 explained 63.08% of the variance in the dataset. Values of principal components for each individual measurement of semen samples were mapped to identify differences among treatments or among boars. Twelve individual measurements presented low values of principal component 1. Confidence ellipses on the map of principal components showed no statistically significant effects for treatment or boar. Hierarchical clustering performed on the first two principal components produced three clusters. Cluster 1 contained evaluations of the first two samples in each treatment, each one from a different boar. With the exception of one individual measurement, all other measurements in cluster 1 were the same as those observed in abnormal Chernoff faces. Unreliable data in cluster 1 are probably related to the operator's inexperience with a CASA system. These findings could be used to objectively evaluate the skill level of an operator of a CASA system. This may be particularly useful in the quality control of semen analysis using CASA systems.

  1. Nanocrystalline Hierarchical ZSM-5: An Efficient Catalyst for the Alkylation of Phenol with Cyclohexene.

    Science.gov (United States)

    Radhika, N P; Selvin, Rosilda; Kakkar, Rita; Roselin, L Selva

    2018-08-01

    In this paper, the authors report the synthesis of nanocrystalline hierarchical zeolite ZSM-5 and its application as a heterogeneous catalyst in the alkylation of phenol with cyclohexene. The catalyst was synthesized by a vacuum-concentration coupled hydrothermal technique in the presence of two templates. This synthetic route could successfully introduce pores of higher hierarchy in the zeolite ZSM-5 structure. Hierarchical ZSM-5 could effectively catalyse the industrially important reaction of cyclohexene with phenol. We ascribe the high efficiency of the catalyst to its conducive structural features such as nanoscale size, high surface area, presence of a hierarchy of pores and existence of Lewis sites along with Brønsted acid sites. The effects of various reaction parameters like duration, catalyst amount, reactant mole ratio and temperature were assessed. Under optimum reaction conditions, the catalyst showed up to 65% selectivity towards the major product, cyclohexyl phenyl ether. There was no discernible decline in percent conversion or selectivity even when the catalyst was re-used for up to four runs. Kinetic studies were done through regression analysis and a mechanistic route based on the LHHW model was suggested.

  2. Adjustment of geochemical background by robust multivariate statistics

    Science.gov (United States)

    Zhou, D.

    1985-01-01

    Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
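    The adjustment procedure described above (a separate regression per lithologic unit, then one common threshold on the pooled residuals) can be sketched as follows. Note the substitutions: this sketch uses ordinary least squares with a single covariate in place of the robust multivariate regression named in the abstract, and the sample values, unit labels and the multiplier `k` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a robust fit)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def flag_anomalies(samples, k=2.0):
    """samples: list of (lithologic_unit, covariate, uranium).

    Fits one regression per unit, pools the residuals, and flags samples
    whose residual exceeds k times the pooled standard deviation.
    """
    by_unit = defaultdict(list)
    for i, (unit, x, u) in enumerate(samples):
        by_unit[unit].append((i, x, u))

    residuals = [0.0] * len(samples)
    for rows in by_unit.values():
        a, b = fit_line([r[1] for r in rows], [r[2] for r in rows])
        for i, x, u in rows:
            residuals[i] = u - (a + b * x)

    threshold = k * pstdev(residuals)  # one common threshold for all units
    flagged = [i for i, r in enumerate(residuals) if r > threshold]
    return flagged, residuals


# Hypothetical data: unit A has one enriched sample (index 2); unit B has none.
samples = [
    ("A", 1, 2.0), ("A", 2, 4.0), ("A", 3, 16.0), ("A", 4, 8.0), ("A", 5, 10.0),
    ("B", 1, 5.0), ("B", 2, 5.5), ("B", 3, 6.0), ("B", 4, 6.5), ("B", 5, 7.0),
]
flagged, residuals = flag_anomalies(samples)
```

    Because each unit gets its own fitted background, a value that is ordinary for one lithology is not mistaken for an anomaly in another; a robust fit would additionally keep the enriched sample from pulling the background line upward.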

  3. Introduction into Hierarchical Matrices

    KAUST Repository

    Litvinenko, Alexander

    2013-01-01

    Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.

  4. Introduction into Hierarchical Matrices

    KAUST Repository

    Litvinenko, Alexander

    2013-12-05

    Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.

  5. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study examines indicators of the predictive power of the Generalized Linear Model (GLM), which are widely used but often subject to some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
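    The traditional measure described above, the correlation between Y and E(Y|X), can be sketched in a few lines. This covers only the traditional coefficient; the abstract does not define the proposed modification, so it is not reproduced here. The data and the coefficients `b0` and `b1` are hypothetical stand-ins for estimates that would normally come from a fitted Poisson regression.

```python
import math


def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def regression_correlation(y, mu):
    """Correlation between observed counts y and fitted means mu = E(Y|X)."""
    return pearson(y, mu)


# Hypothetical data and coefficients (log link: E(Y|X) = exp(b0 + b1*x)).
x = [0.0, 0.5, 1.0, 1.5, 2.0]
y = [1, 2, 2, 5, 7]
b0, b1 = 0.0, 1.0
mu = [math.exp(b0 + b1 * xi) for xi in x]

r = regression_correlation(y, mu)
```

    Values of `r` close to 1 indicate that the fitted means track the observed counts closely.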

  6. Simultaneous determination of rifampicin, isoniazid and pyrazinamide in tablet preparations by multivariate spectrophotometric calibration.

    Science.gov (United States)

    Goicoechea, H C; Olivieri, A C

    1999-08-01

    The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.

  7. Extracting information from two-dimensional electrophoresis gels by partial least squares regression

    DEFF Research Database (Denmark)

    Jessen, Flemming; Lametsch, R.; Bendixen, E.

    2002-01-01

    Two-dimensional gel electrophoresis (2-DE) produces large amounts of data and extraction of relevant information from these data demands a cautious and time consuming process of spot pattern matching between gels. The classical approach of data analysis is to detect protein markers that appear or disappear depending on the experimental conditions. Such biomarkers are found by comparing the relative volumes of individual spots in the individual gels. Multivariate statistical analysis and modelling of 2-DE data for comparison and classification is an alternative approach utilising the combination of all proteins/spots in the gels. In the present study it is demonstrated how information can be extracted by multivariate data analysis. The strategy is based on partial least squares regression followed by variable selection to find proteins that individually or in combination with other proteins vary...

  8. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles
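    The split described in this abstract (fit the classifier on the ground before flight, then evaluate it cheaply in real time) can be illustrated with a small logistic regression trained by gradient descent. This is a sketch under stated assumptions and not the flight software: the two normalized features, the training samples, the learning rate and the 0.5 threshold are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=5000):
    """Ground-side step: batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def should_trigger(w, b, measurement, threshold=0.5):
    """Flight-side step: one dot product and one sigmoid per cycle."""
    z = b + sum(wj * xj for wj, xj in zip(w, measurement))
    return sigmoid(z) >= threshold


# Hypothetical training set: (normalized velocity, normalized baro altitude);
# label 1 means conditions are suitable for deployment.
X = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.7), (0.3, 0.2), (0.2, 0.3), (0.1, 0.1)]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(X, y)
deploy_low = should_trigger(w, b, (0.1, 0.1))
deploy_high = should_trigger(w, b, (0.9, 0.8))
```

    Only the fitted weights and the small `should_trigger` evaluation would need to run in real time in such a design; the iterative training stays on the ground.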

  9. OPLS statistical model versus linear regression to assess sonographic predictors of stroke prognosis.

    Science.gov (United States)

    Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi

    2012-01-01

    The objective of the present study was to assess the comparative applicability of the orthogonal projections to latent structures (OPLS) statistical model vs traditional linear regression in order to investigate the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and later considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.

  10. A Two-Stage Maximum Entropy Prior of Location Parameter with a Stochastic Multivariate Interval Constraint and Its Properties

    Directory of Open Access Journals (Sweden)

    Hea-Jung Kim

    2016-05-01

    Full Text Available This paper proposes a two-stage maximum entropy prior to elicit uncertainty regarding a multivariate interval constraint of the location parameter of a scale mixture of normal model. Using Shannon’s entropy, this study demonstrates how the prior, obtained by using two stages of a prior hierarchy, appropriately accounts for the information regarding the stochastic constraint and suggests an objective measure of the degree of belief in the stochastic constraint. The study also verifies that the proposed prior plays the role of bridging the gap between the canonical maximum entropy prior of the parameter with no interval constraint and that with a certain multivariate interval constraint. It is shown that the two-stage maximum entropy prior belongs to the family of rectangle screened normal distributions that is conjugate for samples from a normal distribution. Some properties of the prior density, useful for developing a Bayesian inference of the parameter with the stochastic constraint, are provided. We also propose a hierarchical constrained scale mixture of normal model (HCSMN), which uses the prior density to estimate the constrained location parameter of a scale mixture of normal model and demonstrates the scope of its applicability.

  11. Multivariate Bias Correction Procedures for Improving Water Quality Predictions from the SWAT Model

    Science.gov (United States)

    Arumugam, S.; Libera, D.

    2017-12-01

    Water quality observations are usually not available on a continuous basis for longer than 1-2 years at a time over a decadal period, given the labor requirements, which makes calibrating and validating mechanistic models difficult. Further, any physical model predictions inherently have bias (i.e., under/over estimation) and require post-simulation techniques to preserve the long-term mean monthly attributes. This study suggests a multivariate bias-correction technique and compares it with a common technique for improving the performance of the SWAT model in predicting daily streamflow and TN loads across the southeast, based on split-sample validation. The approach is a dimension reduction technique, canonical correlation analysis (CCA), that regresses the observed multivariate attributes against the SWAT model simulated values. The common approach is a regression-based technique that uses an ordinary least squares regression to adjust model values. The observed cross-correlation between loadings and streamflow is better preserved when using canonical correlation, while individual biases are simultaneously reduced. Additionally, canonical correlation analysis does a better job of preserving the observed joint likelihood of observed streamflow and loadings. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically, watersheds with sufficiently large drainage areas and numbers of observed data points. The performance of these two approaches is compared for the observed period and over a multi-decadal period using loading estimates from the USGS LOADEST model. Lastly, the CCA technique is applied in a forecasting sense by using 1-month ahead forecasts of P & T from ECHAM4.5 as forcings in the SWAT model. Skill in using the SWAT model for forecasting loadings and streamflow at the monthly and seasonal timescale is also discussed.

  12. Multivariate meta-analysis: Potential and promise

    Science.gov (United States)

    Jackson, Dan; Riley, Richard; White, Ian R

    2011-01-01

    The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052

  13. Hierarchical surfaces for enhanced self-cleaning applications

    International Nuclear Information System (INIS)

    Fernández, Ariadna; Francone, Achille; Sotomayor Torres, Clivia M; Kehagias, Nikolaos; Thamdrup, Lasse H; Johansson, Alicia; Bilenberg, Brian; Nielsen, Theodor; Guttmann, Markus

    2017-01-01

    In this study we present a flexible and adaptable fabrication method to create complex hierarchical structures over inherently hydrophobic resist materials. We have tested these surfaces for their superhydrophobic behaviour and successfully verified their self-cleaning properties. The approach followed allows us to design and produce superhydrophobic surfaces in a reproducible manner. We have analysed different combinations of hierarchical micro-nanostructures for their application to self-cleaning surfaces. A static contact angle value of 170° with a hysteresis of 4° was achieved without the need for any additional chemical treatment of the fabricated hierarchical structures. Dynamic effects were analysed on these surfaces, obtaining a remarkable self-cleaning effect as well as good robustness against impacting droplets. (paper)

  14. Hierarchical processing in the prefrontal cortex in a variety of cognitive domains

    Directory of Open Access Journals (Sweden)

    Hyeon-Ae Jeon

    2014-11-01

    This review scrutinizes several findings on human hierarchical processing within the prefrontal cortex (PFC) in diverse cognitive domains. Converging evidence from previous studies has shown that the PFC, specifically Brodmann area (BA) 44, may function as the essential region for hierarchical processing across the domains. In language fMRI studies, BA 44 was significantly activated for the hierarchical processing of center-embedded sentences and this pattern of activations was also observed in artificial grammar. The same pattern was observed in the visuo-spatial domain where BA 44 was actively involved in the processing of hierarchy for the visual symbol. Musical syntax, which is the rule-based arrangement of musical sets, has also been construed as hierarchical processing as in the language domain such that the activation in BA 44 was observed in a chord sequence paradigm. P600 ERP was also engendered during the processing of musical hierarchy. Along with a longstanding idea that a human’s number faculty is developed as a by-product of language faculty, BA 44 was closely involved in hierarchical processing in mental arithmetic. This review extended its discussion of hierarchical processing to hierarchical behavior, that is, human action which has been referred to as being hierarchically composed. Several lesion and TMS studies supported the involvement of BA 44 for hierarchical processing in the action domain. Lastly, the hierarchical organization of cognitive controls was discussed within the PFC, forming a cascade of top-down hierarchical processes operating along a posterior-to-anterior axis of the lateral PFC including BA 44 within the network. It is proposed that PFC is actively involved in different forms of hierarchical processing and specifically BA 44 may play an integral role in the process. Taking levels of proficiency and subcortical areas into consideration may provide further insight into the functional role of BA 44 for hierarchical

  15. Job stress models, depressive disorders and work performance of engineers in microelectronics industry.

    Science.gov (United States)

    Chen, Sung-Wei; Wang, Po-Chuan; Hsin, Ping-Lung; Oates, Anthony; Sun, I-Wen; Liu, Shen-Ing

    2011-01-01

    Microelectronic engineers are considered valuable human capital contributing significantly toward economic development, but they may encounter stressful work conditions in the context of a globalized industry. The study aims at identifying risk factors of depressive disorders primarily based on job stress models, the Demand-Control-Support and Effort-Reward Imbalance models, and at evaluating whether depressive disorders impair work performance in microelectronics engineers in Taiwan. The case-control study was conducted among 678 microelectronics engineers, 452 controls and 226 cases with depressive disorders, which were defined by a score of 17 or more on the Beck Depression Inventory and a psychiatrist's diagnosis. The self-administered questionnaires included the Job Content Questionnaire, Effort-Reward Imbalance Questionnaire, demography, psychosocial factors, health behaviors and work performance. Hierarchical logistic regression was applied to identify risk factors of depressive disorders. Multivariate linear regressions were used to determine factors affecting work performance. By hierarchical logistic regression, risk factors of depressive disorders are high demands, low work social support, high effort/reward ratio and low frequency of physical exercise. Combining the two job stress models may have better predictive power for depressive disorders than adopting either model alone. Three multivariate linear regressions provide similar results indicating that depressive disorders are associated with impaired work performance in terms of absence, role limitation and social functioning limitation. The results may provide insight into the applicability of job stress models in a globalized high-tech industry considerably focused in non-Western countries, and the design of workplace preventive strategies for depressive disorders in the Asian electronics engineering population.

  16. Linear regression and sensitivity analysis in nuclear reactor design

    International Nuclear Information System (INIS)

    Kumar, Akansha; Tsvetkov, Pavel V.; McClarren, Ryan G.

    2015-01-01

    Highlights: • Presented a benchmark for the applicability of linear regression to complex systems. • Applied linear regression to a nuclear reactor power system. • Performed neutronics, thermal–hydraulics, and energy conversion using Brayton’s cycle for the design of a GCFBR. • Performed detailed sensitivity analysis to a set of parameters in a nuclear reactor power system. • Modeled and developed reactor design using MCNP, regression using R, and thermal–hydraulics in Java. - Abstract: The paper presents a general strategy applicable for sensitivity analysis (SA) and uncertainty quantification analysis (UA) of parameters related to a nuclear reactor design. This work also validates the use of linear regression (LR) for predictive analysis in a nuclear reactor design. The analysis helps to determine the parameters on which an LR model can be fit for predictive analysis. For those parameters, a regression surface is created based on trial data and predictions are made using this surface. A general strategy of SA to determine and identify the influential parameters that affect the operation of the reactor is mentioned. Identification of design parameters and validation of the linearity assumption for the application of LR to reactor design, based on a set of tests, is performed. The testing methods used to determine the behavior of the parameters can be used as a general strategy for UA and SA of nuclear reactor models and thermal hydraulics calculations. A design of a gas cooled fast breeder reactor (GCFBR), with thermal–hydraulics and energy transfer, has been used for the demonstration of this method. MCNP6 is used to simulate the GCFBR design, and perform the necessary criticality calculations. Java is used to build and run input samples, and to extract data from the output files of MCNP6, and R is used to perform regression analysis and other multivariate variance, and analysis of the collinearity of data

  17. Multivariate statistical analysis of radioactive variables in two phosphate ores from Sudan

    International Nuclear Information System (INIS)

    Adam, Abdel Majid A.; Eltayeb, Mohamed Ahmed H.

    2012-01-01

    Multivariate statistical techniques are efficient ways to display complex relationships among many objects. An attempt was made to study the radioactive data in two types of Sudanese phosphate deposits, Kurun and Uro phosphate, using several multivariate statistical methods. The Pearson correlation coefficient revealed that the U-238 distribution in Kurun phosphate is controlled by the variation of K-40 concentration, whereas in Uro phosphate it is controlled by the variation of U-235 and U-234 concentration. Histograms and normal Q–Q plots clearly show that the radioactive variables did not follow a normal distribution. This non-normality may be attributed to the complicating influence of geological factors. The principal components analysis (PCA) gives a model of five components for representing the acquired data from Kurun phosphate, where 89.5% of the total variance is explained. A model of four components was sufficient to represent the acquired data from Uro phosphate, where 87.5% of the total data variance is explained. The hierarchical cluster analysis (HCA) indicates that U-238 behaves in the same manner in the two types of phosphates; it is associated with a group of four radionuclides, U-234, Po-210, Ra-226 and Th-230, which are the most abundant radionuclides and all belong to the uranium-238 decay series. Two parameters have been adopted to differentiate directly between the two phosphates. Firstly, U-238 in Uro phosphate has shown a higher degree of mobility (CV% = 82.6) than that in Kurun phosphate (CV% = 64.7), and secondly, the activity ratio of Th-230/Th-232 in Uro phosphate is nine times that in Kurun phosphate. - Highlights: ► Multivariate statistical techniques were used to characterize radioactive data. ► U-238 in Uro phosphate shows higher degree of mobility (CV% = 82.6). ► U-238 in Kurun phosphate shows lower degree of mobility (CV% = 64.7). ► The radioactive variables did not follow a normal distribution. ► The ratio of Th

  18. Air Quality Pattern Assessment in Malaysia Using Multivariate Techniques

    International Nuclear Information System (INIS)

    Hamza Ahmad Isiyaka; Azman Azid

    2015-01-01

    This study aims to investigate the spatial characteristics in the pattern of air quality monitoring sites, identify the most discriminating parameters contributing to air pollution, and predict the level of air pollution index (API) in Malaysia using multivariate techniques. Five parameters observed for five years (2000-2004) were used. Hierarchical agglomerative cluster analysis classified the five air quality monitoring sites into two independent groups based on the characteristics of activities in the monitoring stations. Discriminant analysis in standard, backward stepwise and forward stepwise modes gave a correct assignation of more than 87 % in the confusion matrix. This result indicates that only three parameters (PM10, SO2 and NO2) with p < 0.0001 discriminate best in polluting the air. The major possible sources of air pollution were identified using principal component analysis that account for more than 58 % and 60 % in the total variance. Based on the findings, anthropogenic activities (vehicular emission, industrial activities, construction sites, bush burning) have a strong influence in the source of air pollution. Furthermore, an artificial neural network (ANN) was used to predict the level of air pollution index at R^2 = 0.8493 and RMSE = 5.9184. This indicates that ANN can predict more than 84 % of the API. (author)
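
    The two fit statistics quoted above can be computed as in the following minimal sketch. It assumes R^2 is the usual coefficient of determination (1 minus the ratio of residual to total sum of squares), which is conventionally read as the share of variance explained; the abstract does not say which definition the authors used. The function names and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical API readings and model predictions, for illustration only.
observed = [40.0, 55.0, 62.0, 48.0, 70.0]
predicted = [42.0, 53.0, 60.0, 50.0, 68.0]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
```

    RMSE is expressed in the units of the index itself, whereas R^2 is unitless, so the two numbers answer different questions about the same fit.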

  19. TOURISM SEGMENTATION BASED ON TOURISTS PREFERENCES: A MULTIVARIATE APPROACH

    Directory of Open Access Journals (Sweden)

    Sérgio Dominique Ferreira

    2010-11-01

    Over the last decades, tourism became one of the most important sectors of the international economy. Specifically in Portugal and Brazil, its contribution to Gross Domestic Product (GDP) and job creation is quite relevant. In this sense, following a strong marketing approach in the management of a country's tourism resources becomes paramount. Such an approach should be based on innovations which help unveil the preferences of tourists with accuracy, turning it into a competitive advantage. In this context, the main objective of the present study is to illustrate the importance and benefits associated with the use of multivariate methodologies for market segmentation. Another objective of this work is to illustrate the importance of a post hoc segmentation. In this work, the authors applied a Cluster Analysis, with a hierarchical method followed by an optimization method. The main results of this study allow the identification of five clusters that are distinguished by assigning special importance to certain tourism attributes at the moment of choosing a specific destination. Thus, the authors present the advantages of post hoc segmentation based on tourists’ preferences, in opposition to an a priori segmentation based on socio-demographic characteristics.

  20. Hierarchical Ag mesostructures for single particle SERS substrate

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Minwei, E-mail: xuminwei@xjtu.edu.cn; Zhang, Yin

    2017-01-30

    Highlights: • Hierarchical Ag mesostructures with the size of 250, 360 and 500 nm are synthesized via a seed-mediated approach. • The Ag mesostructures present the tailorable size and highly roughened surfaces. • The average enhancement factors for individual Ag mesostructures were estimated to be as high as 10^6. - Abstract: Hierarchical Ag mesostructures with highly rough surface morphology have been synthesized at room temperature through a simple seed-mediated approach. Electron microscopy characterizations indicate that the obtained Ag mesostructures exhibit a textured surface morphology with the flower-like architecture. Moreover, the particle size can be tailored easily in the range of 250–500 nm. For the growth process of the hierarchical Ag mesostructures, it is believed that the self-assembly mechanism is more reasonable rather than the epitaxial overgrowth of Ag seed. The oriented attachment of nanoparticles is revealed during the formation of Ag mesostructures. Single particle surface enhanced Raman spectra (sp-SERS) of crystal violet adsorbed on the hierarchical Ag mesostructures were measured. Results reveal that the hierarchical Ag mesostructures can be highly sensitive sp-SERS substrates with good reproducibility. The average enhancement factors for individual Ag mesostructures are estimated to be about 10^6.

  1. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    Science.gov (United States)

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of

  2. Multivariate refined composite multiscale entropy analysis

    International Nuclear Information System (INIS)

    Humeau-Heurtier, Anne

    2016-01-01

    Multiscale entropy (MSE) has become a prevailing method to quantify signal complexity. MSE relies on sample entropy. However, MSE may yield imprecise complexity estimation at large scales, because sample entropy does not give precise estimation of entropy when short signals are processed. A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. Nevertheless, RCMSE is for univariate signals only. The simultaneous analysis of multi-channel (multivariate) data often outperforms studies based on univariate signals. We therefore introduce an extension of RCMSE to multivariate data. Applications of multivariate RCMSE to simulated processes reveal its better performance compared with the standard multivariate MSE. - Highlights: • Multiscale entropy quantifies data complexity but may be inaccurate at large scale. • A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. • Nevertheless, RCMSE is adapted to univariate time series only. • We herein introduce an extension of RCMSE to multivariate data. • It shows better performances than the standard multivariate multiscale entropy.

  3. Multivariate Generalized Multiscale Entropy Analysis

    Directory of Open Access Journals (Sweden)

    Anne Humeau-Heurtier

    2016-11-01

    Multiscale entropy (MSE) was introduced in the 2000s to quantify systems’ complexity. MSE relies on (i) a coarse-graining procedure to derive a set of time series representing the system dynamics on different time scales; (ii) the computation of the sample entropy for each coarse-grained time series. A refined composite MSE (rcMSE), based on the same steps as MSE, also exists. Compared to MSE, rcMSE increases the accuracy of entropy estimation and reduces the probability of inducing undefined entropy for short time series. The multivariate versions of MSE (MMSE) and rcMSE (MrcMSE) have also been introduced. In the coarse-graining step used in MSE, rcMSE, MMSE, and MrcMSE, the mean value is used to derive representations of the original data at different resolutions. A generalization of MSE was recently published, using the computation of different moments in the coarse-graining procedure. However, so far, this generalization only exists for univariate signals. We therefore herein propose an extension of this generalized MSE to multivariate data. The multivariate generalized algorithms of MMSE and MrcMSE presented herein (MGMSE and MGrcMSE, respectively) are first analyzed through the processing of synthetic signals. We reveal that MGrcMSE shows better performance than MGMSE for short multivariate data. We then study the performance of MGrcMSE on two sets of short multivariate electroencephalograms (EEG) available in the public domain. We report that MGrcMSE may show better performance than MrcMSE in distinguishing different types of multivariate EEG data. MGrcMSE could therefore supplement MMSE or MrcMSE in the processing of multivariate datasets.
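
    The coarse-graining step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes non-overlapping windows, uses the population variance as the example of a second moment, and applies the same procedure independently to each channel. The function names and the `moment` parameter are hypothetical. The sample-entropy step that would follow is omitted.

```python
from statistics import mean, pvariance


def coarse_grain(series, scale, moment="mean"):
    """Split `series` into non-overlapping windows of length `scale` and
    summarise each window by its mean or by its population variance.
    Trailing samples that do not fill a whole window are dropped."""
    summarise = mean if moment == "mean" else pvariance
    n_windows = len(series) // scale
    return [summarise(series[i * scale:(i + 1) * scale]) for i in range(n_windows)]


def coarse_grain_multivariate(channels, scale, moment="mean"):
    """Apply the same coarse-graining to every channel of a multichannel signal."""
    return [coarse_grain(channel, scale, moment) for channel in channels]


signal = [1.0, 3.0, 2.0, 6.0, 5.0, 5.0]
print(coarse_grain(signal, 2))              # window means
print(coarse_grain(signal, 2, "variance"))  # window variances
```

    With `moment="mean"` this corresponds to the classical coarse-graining; swapping in the variance is one way to capture fluctuations in volatility rather than in level, which is the idea behind the generalization mentioned in the abstract.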

  4. Multivariate return periods in hydrology: a critical and practical review focusing on synthetic design hydrograph estimation

    Directory of Open Access Journals (Sweden)

    B. Gräler

    2013-04-01

    Most of the hydrological and hydraulic studies refer to the notion of a return period to quantify design variables. When dealing with multiple design variables, the well-known univariate statistical analysis is no longer satisfactory, and several issues challenge the practitioner. How should one incorporate the dependence between variables? How should a multivariate return period be defined and applied in order to yield a proper design event? In this study an overview of the state of the art for estimating multivariate design events is given and the different approaches are compared. The construction of multivariate distribution functions is done through the use of copulas, given their practicality in multivariate frequency analyses and their ability to model numerous types of dependence structures in a flexible way. A synthetic case study is used to generate a large data set of simulated discharges that is used for illustrating the effect of different modelling choices on the design events. Based on different uni- and multivariate approaches, the design hydrograph characteristics of a 3-D phenomenon composed of annual maximum peak discharge, its volume, and duration are derived. These approaches are based on regression analysis, bivariate conditional distributions, bivariate joint distributions and Kendall distribution functions, highlighting theoretical and practical issues of multivariate frequency analysis. Also an ensemble-based approach is presented. For a given design return period, the approach chosen clearly affects the calculated design event, and much attention should be given to the choice of the approach used as this depends on the real-world problem at hand.
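
    To make concrete how the choice of definition changes a design return period, the sketch below compares two common bivariate joint definitions ("OR" and "AND") built on a copula. It is an illustration under stated assumptions, not the procedure of the study: the Gumbel-Hougaard copula, the dependence parameter `theta`, the mean inter-arrival time `mu` and all function names are choices made here, and the example is bivariate rather than 3-D.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 corresponds to independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def return_period_or(u, v, theta, mu=1.0):
    """Return period of the event {X > x OR Y > y}, with u = F_X(x), v = F_Y(y)."""
    return mu / (1.0 - gumbel_copula(u, v, theta))


def return_period_and(u, v, theta, mu=1.0):
    """Return period of the event {X > x AND Y > y}."""
    return mu / (1.0 - u - v + gumbel_copula(u, v, theta))


# Both margins at their univariate 100-year level (non-exceedance 0.99).
u = v = 0.99
for theta in (1.0, 2.0):
    print(theta, return_period_or(u, v, theta), return_period_and(u, v, theta))
```

    For the same marginal quantiles, the "OR" return period is never longer than the univariate one and the "AND" return period is never shorter, which is one reason the definition has to match the real-world failure mechanism.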

  5. Referral determinants in Swiss primary care with a special focus on managed care.

    Directory of Open Access Journals (Sweden)

    Ryan Tandjung

    Studies have shown large variation of referral probabilities in different countries, and many influencing factors have been described. This variation is most likely explained by different healthcare systems, particularly to which extent primary care physicians (PCPs) act as gatekeepers. In Switzerland no mandatory gatekeeping system exists, however insurance companies offer voluntary managed care plans with reduced insurance premiums. We aimed at investigating the role of managed care plans as a potential referral determinant in a non-gatekeeping healthcare system. We conducted a cross-sectional study with 90 PCPs collecting data on consultations and referrals in 2012/2013. During each consultation up to six reasons for encounters (RFE) were documented. For each RFE PCPs indicated whether a referral was initiated. Determinants for referrals were analyzed by hierarchical logistic regression, taking the potential cluster effect of the PCP into account. To further investigate the independent association of the managed care plan with the referral probability, a hierarchical multivariate logistic regression model was applied, taking into account all available data potentially affecting the referring decision. PCPs collected data on 24'774 patients with 42'890 RFE, of which 2427 led to a referral. 37.5% of patients were insured in managed health care plans. Univariate analysis showed significantly higher referral rates of patients with managed care plans (10.7% vs. 8.5%). The difference in referral probability remained significant after controlling for other confounders in the hierarchical multivariate regression model (OR 1.355). Patients in managed care plans were more likely to be referred than patients without such a model. These data contradict the argument that patients in managed care plans have limited healthcare access, but underline the central role of PCPs as coordinator of care.
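
    The two referral rates reported above (10.7% vs. 8.5%) can be converted into a crude odds ratio with a few lines of arithmetic. The sketch below is illustrative only: it ignores the clustering by PCP and the covariates that the hierarchical model adjusts for, so it is not expected to reproduce the reported adjusted OR of 1.355. The function names are hypothetical.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_reference):
    """Crude (unadjusted) odds ratio of two probabilities."""
    return odds(p_exposed) / odds(p_reference)


# Referral rates from the abstract: managed care plan vs. no managed care plan.
crude_or = odds_ratio(0.107, 0.085)
print(round(crude_or, 2))
```

    An adjusted estimate further from 1 than the crude one is not contradictory; it can arise when the adjustment variables are distributed differently between the two groups.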

  6. Analysis hierarchical model for discrete event systems

    Science.gov (United States)

    Ciortea, E. M.

    2015-11-01

    This paper presents a hierarchical model based on a discrete event network for robotic systems. Based on the hierarchical approach, the Petri network is analysed as a network of the highest conceptual level and the lowest level of local control. Extended Petri nets are used for modelling and control of complex robotic systems. Such a system is structured, controlled and analysed in this paper by using the Visual Object Net ++ package, which is relatively simple and easy to use, and the results are shown as representations that are easy to interpret. The hierarchical structure of the robotic system is implemented on computers and analysed using specialized programs. Implementation of the hierarchical model of discrete event systems as a real-time operating system on a computer network connected via a serial bus is possible, where each computer is dedicated to the local Petri model of a subsystem of the global robotic system. Since Petri models are simplified to apply to general computers, analysis, modelling and control of complex manufacturing systems can be achieved using Petri nets. Discrete event systems are a pragmatic tool for modelling industrial systems. Petri nets are used for system modelling because the system considered is a discrete event system. To highlight the auxiliary times, the Petri model of the transport stream is divided into hierarchical levels and the sections are analysed successively. The proposed robotic system simulation using timed Petri nets offers the opportunity to view the robotic time. From the robotic and transmission times obtained by spot measurement, graphics are obtained showing the average time for transport activity, using the parameter sets of finished products individually.

  7. Hierarchical effects on target detection and conflict monitoring

    Science.gov (United States)

    Cao, Bihua; Gao, Feng; Ren, Maofang; Li, Fuhong

    2016-01-01

    Previous neuroimaging studies have demonstrated a hierarchical functional structure of the frontal cortices of the human brain, but the temporal course and the electrophysiological signature of the hierarchical representation remains unaddressed. In the present study, twenty-one volunteers were asked to perform a nested cue-target task, while their scalp potentials were recorded. The results showed that: (1) in comparison with the lower-level hierarchical targets, the higher-level targets elicited a larger N2 component (220–350 ms) at the frontal sites, and a smaller P3 component (350–500 ms) across the frontal and parietal sites; (2) conflict-related negativity (non-target minus target) was greater for the lower-level hierarchy than the higher-level, reflecting a more intensive process of conflict monitoring at the final step of target detection. These results imply that decision making, context updating, and conflict monitoring differ among different hierarchical levels of abstraction. PMID:27561989

  8. Programming with Hierarchical Maps

    DEFF Research Database (Denmark)

    Ørbæk, Peter

    This report describes the hierarchical maps used as a central data structure in the Corundum framework. We describe its most prominent features, argue for its usefulness and briefly describe some of the software prototypes implemented using the technology....

  9. Collaborative regression-based anatomical landmark detection

    International Nuclear Information System (INIS)

    Gao, Yaozong; Shen, Dinggang

    2015-01-01

    Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head and neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods. (paper)

  10. Canonical variate regression.

    Science.gov (United States)

    Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun

    2016-07-01

    In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  11. Multivariate pattern dependence.

    Directory of Open Access Journals (Sweden)

    Stefano Anzellotti

    2017-11-01

    When we perform a cognitive task, multiple brain regions are engaged. Understanding how these regions interact is a fundamental step to uncover the neural bases of behavior. Most research on the interactions between brain regions has focused on the univariate responses in the regions. However, fine grained patterns of response encode important information, as shown by multivariate pattern analysis. In the present article, we introduce and apply multivariate pattern dependence (MVPD): a technique to study the statistical dependence between brain regions in humans in terms of the multivariate relations between their patterns of responses. MVPD characterizes the responses in each brain region as trajectories in region-specific multidimensional spaces, and models the multivariate relationship between these trajectories. We applied MVPD to the posterior superior temporal sulcus (pSTS) and to the fusiform face area (FFA), using a searchlight approach to reveal interactions between these seed regions and the rest of the brain. Across two different experiments, MVPD identified significant statistical dependence not detected by standard functional connectivity. Additionally, MVPD outperformed univariate connectivity in its ability to explain independent variance in the responses of individual voxels. In the end, MVPD uncovered different connectivity profiles associated with different representational subspaces of FFA: the first principal component of FFA shows differential connectivity with occipital and parietal regions implicated in the processing of low-level properties of faces, while the second and third components show differential connectivity with anterior temporal regions implicated in the processing of invariant representations of face identity.

  12. Pre-processing of Fourier transform infrared spectra by means of multivariate analysis implemented in the R environment.

    Science.gov (United States)

    Banas, Krzysztof; Banas, Agnieszka; Gajda, Mariusz; Pawlicki, Bohdan; Kwiatek, Wojciech M; Breese, Mark B H

    2015-04-21

    Pre-processing of Fourier transform infrared (FTIR) spectra is typically the first and crucial step in data analysis. Very often hyperspectral datasets include the regions characterized by the spectra of very low intensity, for example two-dimensional (2D) maps where the areas with only support materials (like mylar foil) are present. In that case segmentation of the complete dataset is required before subsequent evaluation. The method proposed in this contribution is based on a multivariate approach (hierarchical cluster analysis), and shows its superiority when compared to the standard method of cutting-off by using only the mean spectral intensity. Both techniques were implemented and their performance was tested in the R statistical environment - open-source platform - that is a favourable solution if the repeatability and transparency are the key aspects.

  13. Development of methodology for identification the nature of the polyphenolic extracts by FTIR associated with multivariate analysis

    Science.gov (United States)

    Grasel, Fábio dos Santos; Ferrão, Marco Flôres; Wolf, Carlos Rodolfo

    2016-01-01

    Tannins are polyphenolic compounds of complex structures formed by secondary metabolism in several plants. These polyphenolic compounds have different applications, such as drugs, anti-corrosion agents, flocculants, and tanning agents. This study analyses six different types of polyphenolic extracts by Fourier transform infrared spectroscopy (FTIR) combined with multivariate analysis. Through both principal component analysis (PCA) and hierarchical cluster analysis (HCA), we observed well-defined separation between condensed (quebracho and black wattle) and hydrolysable (valonea, chestnut, myrobalan, and tara) tannins. For hydrolysable tannins, it was also possible to observe the formation of two different subgroups between samples of chestnut and valonea and between samples of tara and myrobalan. Among all samples analysed, the chestnut and valonea showed the greatest similarity, indicating that these extracts contain equivalent chemical compositions and structure and, therefore, similar properties.

  14. Hierarchical Porous Structures

    Energy Technology Data Exchange (ETDEWEB)

    Grote, Christopher John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-06-07

    Materials design is often at the forefront of technological innovation. While there has always been a push to generate increasingly low-density materials, such as aerogels or hydrogels, more recently the idea of bicontinuous structures has come more into play. This review will cover some of the methods and applications for generating both porous and hierarchically porous structures.

  15. Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.

    Science.gov (United States)

    Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina

    2014-07-15

    This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, to be used as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, proves to be a perfectly suitable quantitative method for use in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. Multivariate regression applied to the performance optimization of a countercurrent ultracentrifuge - a preliminary study

    International Nuclear Information System (INIS)

    Migliavacca, Elder; Andrade, Delvonei Alves de

    2004-01-01

    In this work, the least-squares methodology with covariance matrix is applied to fit the data and obtain a performance function for the separative power δU of an ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 173 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related with these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process control variables, which significantly influence the δU values, are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow F and cut θ. After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence and mainly the existence of residual heteroscedasticity with any regression model variable. The response curves are made relating the separative power with the control variables F and θ, to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)
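
    A least-squares fit that takes a data covariance matrix as input is commonly written in the generalized least-squares form below. The abstract does not give the model, so the symbols are assumptions introduced here: y is the vector of measured δU values, X is a design matrix built from functions of F and θ, V is the input covariance matrix of the data, and β is the vector of fitted coefficients.

```latex
\hat{\beta} = \left(X^{\top} V^{-1} X\right)^{-1} X^{\top} V^{-1} y,
\qquad
\operatorname{Cov}(\hat{\beta}) = \left(X^{\top} V^{-1} X\right)^{-1},
\qquad
\chi^{2} = \left(y - X\hat{\beta}\right)^{\top} V^{-1} \left(y - X\hat{\beta}\right)
```

    Under these assumptions the χ² value is the usual quantity checked in a goodness-of-fit validation, and the residuals y − Xβ̂ are what a residual analysis would inspect for randomness, independence and heteroscedasticity.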

  17. Multivariate meta-analysis: a robust approach based on the theory of U-statistic.

    Science.gov (United States)

    Ma, Yan; Mazumdar, Madhu

    2011-10-30

    Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analysis with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.

  18. Multivariate Max-Stable Spatial Processes

    KAUST Repository

    Genton, Marc G.

    2014-01-06

    Analysis of spatial extremes is currently based on univariate processes. Max-stable processes allow the spatial dependence of extremes to be modelled and explicitly quantified; they are therefore widely adopted in applications. For a better understanding of extreme events of real processes, such as environmental phenomena, it may be useful to study several spatial variables simultaneously. To this end, we extend some theoretical results and applications of max-stable processes to the multivariate setting to analyze extreme events of several variables observed across space. In particular, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. Then, we define a Poisson process construction in the multivariate setting and introduce multivariate versions of the Smith Gaussian extreme-value, the Schlather extremal-Gaussian and extremal-t, and the Brown-Resnick models. Inferential aspects of those models based on composite likelihoods are developed. We present results of various Monte Carlo simulations and of an application to a dataset of summer daily temperature maxima and minima in Oklahoma, U.S.A., highlighting the utility of working with multivariate models in contrast to the univariate case. Based on joint work with Simone Padoan and Huiyan Sang.

  19. Multivariate Max-Stable Spatial Processes

    KAUST Repository

    Genton, Marc G.

    2014-01-01

    Analysis of spatial extremes is currently based on univariate processes. Max-stable processes allow the spatial dependence of extremes to be modelled and explicitly quantified; they are therefore widely adopted in applications. For a better understanding of extreme events of real processes, such as environmental phenomena, it may be useful to study several spatial variables simultaneously. To this end, we extend some theoretical results and applications of max-stable processes to the multivariate setting to analyze extreme events of several variables observed across space. In particular, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. Then, we define a Poisson process construction in the multivariate setting and introduce multivariate versions of the Smith Gaussian extreme-value, the Schlather extremal-Gaussian and extremal-t, and the Brown-Resnick models. Inferential aspects of those models based on composite likelihoods are developed. We present results of various Monte Carlo simulations and of an application to a dataset of summer daily temperature maxima and minima in Oklahoma, U.S.A., highlighting the utility of working with multivariate models in contrast to the univariate case. Based on joint work with Simone Padoan and Huiyan Sang.

  20. Analyzing security protocols in hierarchical networks

    DEFF Research Database (Denmark)

    Zhang, Ye; Nielson, Hanne Riis

    2006-01-01

    Validating security protocols is a well-known hard problem even in a simple setting of a single global network. But a real network often consists of, besides the public-accessed part, several sub-networks and thereby forms a hierarchical structure. In this paper we first present a process calculus...... capturing the characteristics of hierarchical networks and describe the behavior of protocols on such networks. We then develop a static analysis to automate the validation. Finally we demonstrate how the technique can benefit the protocol development and the design of network systems by presenting a series...

  1. Hierarchical Analysis of the Omega Ontology

    Energy Technology Data Exchange (ETDEWEB)

    Joslyn, Cliff A.; Paulson, Patrick R.

    2009-12-01

    Initial delivery for mathematical analysis of the Omega Ontology. We provide an analysis of the hierarchical structure of a version of the Omega Ontology currently in use within the US Government. After providing an initial statistical analysis of the distribution of all link types in the ontology, we then provide a detailed order theoretical analysis of each of the four main hierarchical links present. This order theoretical analysis includes the distribution of components and their properties, their parent/child and multiple inheritance structure, and the distribution of their vertical ranks.

  2. Use of multivariate analysis to research career advancement of academic librarians

    Directory of Open Access Journals (Sweden)

    Filiberto Felipe Martínez Arellano

    2004-01-01

    Diverse variables dealing with credential factors, bureaucratic factors, organizational and disciplinary achievements, academic culture factors, social ascribed factors, and institutional factors were stated as explanatory elements of promotion, tenure status, and earnings. A survey was the research instrument for collecting data to test diverse variables dealing with academic librarians' rewards and earnings. Since the study attempted to analyze variables in a multivariate context, variable interactions were tested using multiple regression analysis. Findings of this study contribute to a better understanding of those factors influencing career advancement of academic librarians. Likewise, the research methodology of this study could be used in Library and Information Science (LIS) research.

  3. A primer of multivariate statistics

    CERN Document Server

    Harris, Richard J

    2014-01-01

    Drawing upon more than 30 years of experience in working with statistics, Dr. Richard J. Harris has updated A Primer of Multivariate Statistics to provide a model of balance between how-to and why. This classic text covers multivariate techniques with a taste of latent variable approaches. Throughout the book there is a focus on the importance of describing and testing one's interpretations of the emergent variables that are produced by multivariate analysis. This edition retains its conversational writing style while focusing on classical techniques. The book gives the reader a feel for why

  4. Hierarchical composites: Analysis of damage evolution based on fiber bundle model

    DEFF Research Database (Denmark)

    Mishnaevsky, Leon

    2011-01-01

    A computational model of multiscale composites is developed on the basis of the fiber bundle model with the hierarchical load sharing rule, and employed to study the effect of the microstructures of hierarchical composites on their damage resistance. Two types of hierarchical materials were consi...

  5. Hierarchical cellular designs for load-bearing biocomposite beams and plates

    International Nuclear Information System (INIS)

    Burgueno, Rigoberto; Quagliata, Mario J.; Mohanty, Amar K.; Mehta, Geeta; Drzal, Lawrence T.; Misra, Manjusri

    2005-01-01

    Scrutiny into the composition of natural, or biological materials convincingly reveals that high material and structural efficiency can be attained, even with moderate-quality constituents, by hierarchical topologies, i.e., successively organized material levels or layers. The present study demonstrates that biologically inspired hierarchical designs can help improve the moderate properties of natural fiber polymer composites or biocomposites and allow them to compete with conventional materials for load-bearing applications. An overview of the mechanics concepts that allow hierarchical designs to achieve higher performance is presented, followed by observation and results from flexural tests on periodic and hierarchical cellular beams and plates made from industrial hemp fibers and unsaturated polyester resin biocomposites. The experimental data is shown to agree well with performance indices predicted by mechanics models. A procedure for the multi-scale integrated material/structural analysis of hierarchical cellular biocomposite components is presented and its advantages and limitations are discussed.

  6. Fabrication of Superhydrophobic Surface with Controlled Wetting Property by Hierarchical Particles.

    Science.gov (United States)

    Xu, Jianxiong; Liu, Weiwei; Du, Jingjing; Tang, Zengmin; Xu, Lijian; Li, Na

    2015-04-01

    Hierarchical particles were prepared by synthetically joining appropriately functionalized polystyrene spheres of poly[styrene-co-(3-(4-vinylphenyl)pentane-2,4-dione)] (PS-co-PVPD) nanoparticles and poly(styrene-co-chloromethylstyrene) (PS-co-PCMS) microparticles. The coupling reaction of nucleophilic substitution of pendent β-diketone groups with benzyl chloride was used to form the hierarchical particles. Since the polymeric nanoparticles and microparticles were synthesized by dispersion polymerization and emulsion polymerization, respectively, both the core microparticles and the surface nanoparticles can differ in size and chemical composition. By means of changing the size of the PS-co-PVPD surface nanoparticles, a series of hierarchical particles with different scale ratio of the micro/nano surface structure were successfully prepared. Moreover, by employing the PS-co-PVPD microparticles and PS-co-PCMS nanoparticles as building blocks, hierarchical particles with surface nanoparticles of different composition were made. These as-prepared hierarchical particles were subsequently assembled on glass substrates to form particulate films. Contact angle measurement shows that superhydrophobic surfaces can be obtained and the contact angle of water on the hierarchically structured surface can be adjusted by the scale ratio of the micro/nano surface structure and surface chemical component of hierarchical particles.

  7. Discursive Hierarchical Patterning in Law and Management Cases

    Science.gov (United States)

    Lung, Jane

    2008-01-01

    This paper investigates the differences in the discursive patterning of cases in Law and Management. It examines a corpus of 271 Law and Management cases and discusses the kind of information that these two disciplines call for and how discourses are constructed in discursive hierarchical patterns. A discursive hierarchical pattern is a model…

  8. Hierarchical modularity in human brain functional networks

    Directory of Open Access Journals (Sweden)

    David Meunier

    2009-10-01

    The idea that complex systems have a hierarchical modular organization originates in the early 1960s and has recently attracted fresh support from quantitative studies of large-scale, real-life networks. Here we investigate the hierarchical modular (or “modules-within-modules”) decomposition of human brain functional networks, measured using functional magnetic resonance imaging (fMRI) in 18 healthy volunteers under no-task or resting conditions. We used a customized template to extract networks with more than 1800 regional nodes, and we applied a fast algorithm to identify nested modular structure at several hierarchical levels. We used mutual information, 0 < I < 1, to estimate the similarity of community structure of networks in different subjects, and to identify the individual network that is most representative of the group. Results show that human brain functional networks have a hierarchical modular organization with a fair degree of similarity between subjects, I = 0.63. The largest 5 modules at the highest level of the hierarchy were medial occipital, lateral occipital, central, parieto-frontal and fronto-temporal systems; occipital modules demonstrated less sub-modular organization than modules comprising regions of multimodal association cortex. Connector nodes and hubs, with a key role in inter-modular connectivity, were also concentrated in association cortical areas. We conclude that methods are available for hierarchical modular decomposition of large numbers of high resolution brain functional networks using computationally expedient algorithms. This could enable future investigations of Simon's original hypothesis that hierarchy or near-decomposability of physical symbol systems is a critical design feature for their fast adaptivity to changing environmental conditions.
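
    A mutual-information score bounded by 0 and 1 for comparing two module assignments of the same nodes can be sketched as follows. The abstract does not say which normalization was used; this example assumes the symmetric form 2·I(A;B) / (H(A) + H(B)), and the function name and toy partitions are hypothetical.

```python
import math
from collections import Counter


def normalized_mutual_information(labels_a, labels_b):
    """Similarity of two partitions of the same nodes, scaled to [0, 1]."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both partitions must label the same non-empty node set")
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    # Mutual information from the joint and marginal module frequencies.
    mutual = sum(
        (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
        for (a, b), n_ab in count_ab.items()
    )
    entropy_a = -sum((c / n) * math.log(c / n) for c in count_a.values())
    entropy_b = -sum((c / n) * math.log(c / n) for c in count_b.values())
    if entropy_a + entropy_b == 0:
        return 1.0  # both partitions put every node in a single module
    return 2 * mutual / (entropy_a + entropy_b)


# Hypothetical module assignments for six nodes in two subjects.
subject_1 = [0, 0, 1, 1, 2, 2]
subject_2 = [5, 5, 7, 7, 9, 9]  # same grouping, different module names
same_structure = normalized_mutual_information(subject_1, subject_2)
unrelated = normalized_mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

    Because the score depends only on how nodes are grouped, not on the module names, it can be averaged over all subject pairs to find the individual network most similar to the rest of the group.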

  9. Multivariate stochastic simulation with subjective multivariate normal distributions

    Science.gov (United States)

    P. J. Ince; J. Buongiorno

    1991-01-01

    In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
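
    One standard way to draw correlated normal variables in a Monte Carlo simulation is to transform independent standard normal draws with a Cholesky factor of the covariance matrix. The sketch below illustrates that general technique and is not necessarily the procedure of the report; the means, covariance values, and function names are hypothetical stand-ins for subjectively assessed inputs.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular factor L of a positive-definite matrix, so L L^T = matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def sample_mvn(mean, lower, rng):
    """One draw: mean + L z, where z holds independent standard normal values."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [m + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i, m in enumerate(mean)]


# Hypothetical subjective assessment: two variables with standard deviations
# 5 and 4 and correlation 0.6, so the covariance is 0.6 * 5 * 4 = 12.
mean = [100.0, 50.0]
cov = [[25.0, 12.0],
       [12.0, 16.0]]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
lower = cholesky(cov)
draws = [sample_mvn(mean, lower, rng) for _ in range(20000)]
```

    Setting the off-diagonal covariance to zero recovers the independent-variables simplification that the abstract describes as the common default.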

  10. Hierarchical Context Modeling for Video Event Recognition.

    Science.gov (United States)

    Wang, Xiaoyang; Ji, Qiang

    2016-10-11

    Current video event recognition research remains largely target-centered. For real-world surveillance videos, target-centered event recognition faces great challenges due to large intra-class target variation, limited image resolution, and poor detection and tracking results. To mitigate these challenges, we introduced a context-augmented video event recognition approach. Specifically, we explicitly capture different types of contexts from three levels including image level, semantic level, and prior level. At the image level, we introduce two types of contextual features including the appearance context features and interaction context features to capture the appearance of context objects and their interactions with the target objects. At the semantic level, we propose a deep model based on deep Boltzmann machine to learn event object representations and their interactions. At the prior level, we utilize two types of prior-level contexts including scene priming and dynamic cueing. Finally, we introduce a hierarchical context model that systematically integrates the contextual information at different levels. Through the hierarchical context model, contexts at different levels jointly contribute to the event recognition. We evaluate the hierarchical context model for event recognition on benchmark surveillance video datasets. Results show that incorporating contexts in each level can improve event recognition performance, and jointly integrating three levels of contexts through our hierarchical model achieves the best performance.

  11. Model Checking Multivariate State Rewards

    DEFF Research Database (Denmark)

    Nielsen, Bo Friis; Nielson, Flemming; Nielson, Hanne Riis

    2010-01-01

    We consider continuous stochastic logics with state rewards that are interpreted over continuous time Markov chains. We show how results from multivariate phase type distributions can be used to obtain higher-order moments for multivariate state rewards (including covariance). We also generalise...

  12. Hierarchical silica particles by dynamic multicomponent assembly

    DEFF Research Database (Denmark)

    Wu, Z. W.; Hu, Q. Y.; Pang, J. B.

    2005-01-01

    Abstract: Mesoporous silica particles with hierarchically controllable pore structure have been prepared by aerosol-assisted assembly using cetyltrimethylammonium bromide (CTAB) and poly(propylene oxide) (PPO, H[OCH(CH3)CH2]nOH) as co-templates. Addition of the hydrophobic PPO significantly...... influences the delicate hydrophilic-hydrophobic balance in the well-studied CTAB-silicate co-assembling system, resulting in various mesostructures (such as hexagonal, lamellar, and hierarchical structure). The co-assembly of CTAB, silicate clusters, and a low-molecular-weight PPO (average Mn 425) results...... in a uniform lamellar structure, while the use of a high-molecular-weight PPO (average Mn 2000), which is more hydrophobic, leads to the formation of hierarchical pore structure that contains meso-meso or meso-macro pore structure. The role of PPO additives on the mesostructure evolution in the CTAB...

  13. Determinação de misturas de sulfametoxazol e trimetoprima por espectroscopia eletrônica multivariada Determination of sulfamethoxazole and trimethoprim mixtures by multivariate electronic spectroscopy

    Directory of Open Access Journals (Sweden)

    Gilcélia A. Cordeiro

    2008-01-01

    In this work a multivariate spectroscopic methodology is proposed for quantitative determination of sulfamethoxazole and trimethoprim in pharmaceutical associations. The multivariate model was developed by partial least-squares regression, using twenty synthetic mixtures and the spectral region between 190 and 350 nm. In the validation stage, which involved the analysis of five synthetic mixtures, prediction errors lower than 3% were observed. The predictive capacity of the multivariate models is seriously affected by spectral changes induced by pH variations, a fact that acquires a great significance in the analysis of real samples (pharmaceuticals) that contain chemical additives.

  14. Changes in cod muscle proteins during frozen storage revealed by proteome analysis and multivariate data analysis

    DEFF Research Database (Denmark)

    Kjærsgård, Inger Vibeke Holst; Nørrelykke, M.R.; Jessen, Flemming

    2006-01-01

    Multivariate data analysis has been combined with proteomics to enhance the recovery of information from 2-DE of cod muscle proteins during different storage conditions. Proteins were extracted according to 11 different storage conditions and samples were resolved by 2-DE. Data generated by 2-DE...... was subjected to principal component analysis (PCA) and discriminant partial least squares regression (DPLSR). Applying PCA to 2-DE data revealed the samples to form groups according to frozen storage time, whereas differences due to different storage temperatures or chilled storage in modified atmosphere...... light chain 1, 2 and 3, triose-phosphate isomerase, glyceraldehyde-3-phosphate dehydrogenase, aldolase A and two α-actin fragments, and a nuclease diphosphate kinase B fragment to change in concentration, during frozen storage. Application of proteomics, multivariate data analysis and MS/MS to analyse...

  15. Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

    Science.gov (United States)

    Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

    2013-12-03

    We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm⁻¹); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm⁻¹)) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R² = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R² = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R² = 0.998) independent of laser excitation power variations. This work demonstrates that the multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in

  16. Hierarchical classification with a competitive evolutionary neural tree.

    Science.gov (United States)

    Adams, R G.; Butchart, K; Davey, N

    1999-04-01

    A new, dynamic, tree structured network, the Competitive Evolutionary Neural Tree (CENT) is introduced. The network is able to provide a hierarchical classification of unlabelled data sets. The main advantage that the CENT offers over other hierarchical competitive networks is its ability to self determine the number, and structure, of the competitive nodes in the network, without the need for externally set parameters. The network produces stable classificatory structures by halting its growth using locally calculated heuristics. The results of network simulations are presented over a range of data sets, including Anderson's IRIS data set. The CENT network demonstrates its ability to produce a representative hierarchical structure to classify a broad range of data sets.

  17. Dual Regression

    OpenAIRE

    Spady, Richard; Stouli, Sami

    2012-01-01

    We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...

  18. A Poisson-lognormal conditional-autoregressive model for multivariate spatial analysis of pedestrian crash counts across neighborhoods.

    Science.gov (United States)

    Wang, Yiyi; Kockelman, Kara M

    2013-11-01

    This work examines the relationship between 3-year pedestrian crash counts across Census tracts in Austin, Texas, and various land use, network, and demographic attributes, such as land use balance, residents' access to commercial land uses, sidewalk density, lane-mile densities (by roadway class), and population and employment densities (by type). The model specification allows for region-specific heterogeneity, correlation across response types, and spatial autocorrelation via a Poisson-based multivariate conditional auto-regressive (CAR) framework and is estimated using Bayesian Markov chain Monte Carlo methods. Least-squares regression estimates of walk-miles traveled per zone serve as the exposure measure. Here, the Poisson-lognormal multivariate CAR model outperforms an aspatial Poisson-lognormal multivariate model and a spatial model (without cross-severity correlation), both in terms of fit and inference. Positive spatial autocorrelation emerges across neighborhoods, as expected (due to latent heterogeneity or missing variables that trend in space, resulting in spatial clustering of crash counts). In comparison, the positive aspatial, bivariate cross correlation of severe (fatal or incapacitating) and non-severe crash rates reflects latent covariates that have impacts across severity levels but are more local in nature (such as lighting conditions and local sight obstructions), along with spatially lagged cross correlation. Results also suggest greater mixing of residences and commercial land uses is associated with higher pedestrian crash risk across different severity levels, ceteris paribus, presumably since such access produces more potential conflicts between pedestrian and vehicle movements. Interestingly, network densities show variable effects, and sidewalk provision is associated with lower severe-crash rates. Copyright © 2013 Elsevier Ltd. All rights reserved.

  19. Comparing near-infrared conventional diffuse reflectance spectroscopy and hyperspectral imaging for determination of the bulk properties of solid samples by multivariate regression: determination of Mooney viscosity and plasticity indices of natural rubber.

    Science.gov (United States)

    Juliano da Silva, Carlos; Pasquini, Celio

    2015-01-21

    Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm²; 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by partial least squares (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample

  20. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    Science.gov (United States)

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction helps to reduce avascular necrosis.

  1. Direct hierarchical assembly of nanoparticles

    Science.gov (United States)

    Xu, Ting; Zhao, Yue; Thorkelsson, Kari

    2014-07-22

    The present invention provides hierarchical assemblies of a block copolymer, a bifunctional linking compound and a nanoparticle. The block copolymers form one micro-domain and the nanoparticles another micro-domain.

  2. Band structures of two dimensional solid/air hierarchical phononic crystals

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Y.L.; Tian, X.G. [State Key Laboratory for Mechanical Structure Strength and Vibration, Xi'an Jiaotong University, Xi'an 710049 (China)]; Chen, C.Q., E-mail: chencq@tsinghua.edu.cn [Department of Engineering Mechanics, AML and CNMM, Tsinghua University, Beijing 100084 (China)]

    2012-06-15

    The hierarchical phononic crystals considered here show a two-order 'hierarchical' feature: macroscopic periodic unit cells are arranged in a square array, with each unit cell itself including four sub-units. Propagation of acoustic waves in such two dimensional solid/air phononic crystals is investigated by the finite element method (FEM) with the Bloch theory. Their band structure, wave filtering property, and the physical mechanism responsible for the broadened band gap are explored. The corresponding ordinary phononic crystal without hierarchical feature is used for comparison. The results show that the solid/air hierarchical phononic crystals possess tunable outstanding band gap features, which are favorable for applications such as sound insulation and vibration attenuation.

  3. Nearly Cyclic Pursuit and its Hierarchical variant for Multi-agent Systems

    DEFF Research Database (Denmark)

    Iqbal, Muhammad; Leth, John-Josef; Ngo, Trung Dung

    2015-01-01

    The rendezvous problem for multiple agents under nearly cyclic pursuit and hierarchical nearly cyclic pursuit is discussed in this paper. The control law designed under nearly cyclic pursuit strategy enables the agents to converge at a point dictated by a beacon. A hierarchical version...

  4. Automatic Hierarchical Color Image Classification

    Directory of Open Access Journals (Sweden)

    Jing Huang

    2003-02-01

    Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.

  5. Hierarchical clustering using correlation metric and spatial continuity constraint

    Science.gov (United States)

    Stork, Christopher L.; Brewer, Luke N.

    2012-10-02

    Large data sets are analyzed by hierarchical clustering using correlation as a similarity measure. This provides results that are superior to those obtained using a Euclidean distance similarity measure. A spatial continuity constraint may be applied in hierarchical clustering analysis of images.
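
    As a rough illustration of why a correlation-based similarity can behave differently from Euclidean distance, the sketch below runs a small average-linkage agglomerative clustering on 1 - (Pearson correlation). It is not the method of the record above: the linkage rule, the function names (pearson, correlation_distance, agglomerate) and the toy data are all assumptions made for illustration, and the spatial continuity constraint is left out.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Dissimilarity in [0, 2]; 0 means perfectly correlated profiles."""
    return 1.0 - pearson(x, y)


def agglomerate(rows, n_clusters):
    """Average-linkage agglomerative clustering on correlation distance.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(rows))]
    dist = {(i, j): correlation_distance(rows[i], rows[j])
            for i, j in combinations(range(len(rows)), 2)}

    def linkage(a, b):
        # Average pairwise distance between members of clusters a and b.
        total = sum(dist[(min(i, j), max(i, j))] for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find and merge the closest pair of clusters.
        p, q = min(combinations(range(len(clusters)), 2),
                   key=lambda pq: linkage(clusters[pq[0]], clusters[pq[1]]))
        merged = sorted(clusters[p] + clusters[q])
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)]
        clusters.append(merged)
    return sorted(clusters)


# Toy profiles (assumed data): rows 0 and 1 share a shape at different
# amplitudes, and so do rows 2 and 3.
spectra = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 41.0],
    [4.0, 3.0, 2.0, 1.0],
    [40.0, 31.0, 20.0, 10.0],
]
clusters = agglomerate(spectra, n_clusters=2)
```

    On this toy input the correlation measure groups rows by shape (0 with 1, 2 with 3), whereas Euclidean distance would pair the two low-amplitude rows and the two high-amplitude rows.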

  6. Clinical value of regression of electrocardiographic left ventricular hypertrophy after aortic valve replacement.

    Science.gov (United States)

    Yamabe, Sayuri; Dohi, Yoshihiro; Higashi, Akifumi; Kinoshita, Hiroki; Sada, Yoshiharu; Hidaka, Takayuki; Kurisu, Satoshi; Shiode, Nobuo; Kihara, Yasuki

    2016-09-01

    Electrocardiographic left ventricular hypertrophy (ECG-LVH) gradually regresses after aortic valve replacement (AVR) in patients with severe aortic stenosis. Sokolow-Lyon voltage (SV1 + RV5/6) is possibly the most widely used criterion for ECG-LVH. The aim of this study was to determine whether a decrease in Sokolow-Lyon voltage reflects left ventricular reverse remodeling detected by echocardiography after AVR. Of 129 consecutive patients who underwent AVR for severe aortic stenosis, 38 patients with preoperative ECG-LVH, defined by SV1 + RV5/6 of ≥3.5 mV, were enrolled in this study. Electrocardiography and echocardiography were performed preoperatively and 1 year postoperatively. The patients were divided into an ECG-LVH regression group (n = 19) and a non-regression group (n = 19) according to the median value of the absolute regression in SV1 + RV5/6. Multivariate logistic regression analysis was performed to assess determinants of ECG-LVH regression among echocardiographic indices. The ECG-LVH regression group showed a significantly greater decrease in left ventricular mass index and left ventricular dimensions than the non-regression group. ECG-LVH regression was independently determined by decrease in the left ventricular mass index [odds ratio (OR) 1.28, 95 % confidence interval (CI) 1.03-1.69, p = 0.048], left ventricular end-diastolic dimension (OR 1.18, 95 % CI 1.03-1.41, p = 0.014), and left ventricular end-systolic dimension (OR 1.24, 95 % CI 1.06-1.52, p = 0.0047). ECG-LVH regression could be a marker of the effect of AVR on reducing both the left ventricular mass index and left ventricular dimensions. The effect of AVR on reverse remodeling can be estimated, at least in part, by regression of ECG-LVH.

  7. Static and dynamic friction of hierarchical surfaces.

    Science.gov (United States)

    Costagliola, Gianluca; Bosia, Federico; Pugno, Nicola M

    2016-12-01

    Hierarchical structures are very common in nature, but only recently have they been systematically studied in materials science, in order to understand the specific effects they can have on the mechanical properties of various systems. Structural hierarchy provides a way to tune and optimize macroscopic mechanical properties starting from simple base constituents and new materials are nowadays designed exploiting this possibility. This can be true also in the field of tribology. In this paper we study the effect of hierarchical patterned surfaces on the static and dynamic friction coefficients of an elastic material. Our results are obtained by means of numerical simulations using a one-dimensional spring-block model, which has previously been used to investigate various aspects of friction. Despite the simplicity of the model, we highlight some possible mechanisms that explain how hierarchical structures can significantly modify the friction coefficients of a material, providing a means to achieve tunability.

  8. Hierarchical Bayesian Modeling of Fluid-Induced Seismicity

    Science.gov (United States)

    Broccardo, M.; Mignan, A.; Wiemer, S.; Stojadinovic, B.; Giardini, D.

    2017-11-01

    In this study, we present a Bayesian hierarchical framework to model fluid-induced seismicity. The framework is based on a nonhomogeneous Poisson process with a fluid-induced seismicity rate proportional to the rate of injected fluid. The fluid-induced seismicity rate model depends upon a set of physically meaningful parameters and has been validated for six fluid-induced case studies. In line with the vision of hierarchical Bayesian modeling, the rate parameters are considered as random variables. We develop both the Bayesian inference and updating rules, which are used to develop a probabilistic forecasting model. We tested the Basel 2006 fluid-induced seismic case study to prove that the hierarchical Bayesian model offers a suitable framework to coherently encode both epistemic uncertainty and aleatory variability. Moreover, it provides a robust and consistent short-term seismic forecasting model suitable for online risk quantification and mitigation.
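
    The nonhomogeneous Poisson process at the core of such a model can be sketched with a standard thinning simulation. The abstract gives neither the rate function nor its parameters, so the injection profile, the proportionality constant k and every function name below are assumptions for illustration only; the hierarchical Bayesian layer (treating the rate parameters as random variables) is not shown.

```python
import random


def injection_rate(t):
    """Hypothetical injection-rate profile: linear ramp, plateau, then shut-in."""
    if t < 0.0 or t > 10.0:
        return 0.0
    return t if t <= 5.0 else 5.0


def seismicity_rate(t, k=2.0):
    """Event rate proportional to the injection rate; k is an assumed constant."""
    return k * injection_rate(t)


def simulate_nhpp(rate, t_end, rate_max, rng):
    """Simulate event times on [0, t_end] by thinning a homogeneous process.

    rate_max must be an upper bound on rate(t) over the whole interval.
    """
    events, t = [], 0.0
    while True:
        t += rng.expovariate(rate_max)  # next candidate from the bounding process
        if t > t_end:
            return events
        if rng.random() < rate(t) / rate_max:  # keep with probability rate/rate_max
            events.append(t)


rng = random.Random(0)
events = simulate_nhpp(seismicity_rate, t_end=12.0, rate_max=10.0, rng=rng)

# Expected number of events = k * (integral of the injection rate),
# here 2.0 * (12.5 from the ramp + 25.0 from the plateau).
expected_count = 2.0 * (0.5 * 5.0 * 5.0 + 5.0 * 5.0)
```

    Because the assumed rate drops to zero at shut-in, no simulated event falls after t = 10 in this sketch; a model with post-injection decay would need a different rate function.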

  9. Deep hierarchical attention network for video description

    Science.gov (United States)

    Li, Shuohao; Tang, Min; Zhang, Jun

    2018-03-01

    Pairing video to natural language description remains a challenge in computer vision and machine translation. Inspired by image description, which uses an encoder-decoder model for reducing a visual scene into a single sentence, we propose a deep hierarchical attention network for video description. The proposed model uses a convolutional neural network (CNN) and a bidirectional LSTM network as encoders while a hierarchical attention network is used as the decoder. Compared to encoder-decoder models used in video description, the bidirectional LSTM network can capture the temporal structure among video frames. Moreover, the hierarchical attention network has an advantage over a single-layer attention network on global context modeling. To make a fair comparison with other methods, we evaluate the proposed architecture with different types of CNN structures and decoders. Experimental results on the standard datasets show that our model outperforms the state-of-the-art techniques.

  10. Multivariate analysis: models and method

    International Nuclear Information System (INIS)

    Sanz Perucha, J.

    1990-01-01

    Data treatment techniques are increasingly used as computer methods become more widely accessible. Multivariate analysis consists of a group of statistical methods that are applied to study objects or samples characterized by multiple values. A final goal is decision making. The paper describes the models and methods of multivariate analysis.

  11. Hierarchical Sets: Analyzing Pangenome Structure through Scalable Set Visualizations

    DEFF Research Database (Denmark)

    Pedersen, Thomas Lin

    2017-01-01

    of hierarchical sets by applying it to a pangenome based on 113 Escherichia and Shigella genomes and find it provides a powerful addition to pangenome analysis. The described clustering algorithm and visualizations are implemented in the hierarchicalSets R package available from CRAN (https...

  12. Trace element analytics and multivariate statistics for investigation and assessment of the pollution situation in rivers; Elementspurenanalytik und multivariate Statistik zur Untersuchung und Bewertung des Belastungszustandes von Fliessgewaessern

    Energy Technology Data Exchange (ETDEWEB)

    Aulinger, A.M. [GKSS-Forschungszentrum Geesthacht GmbH (Germany). Inst. fuer Kuestenforschung

    2002-07-01

    In order to describe and assess the element distribution and the trend of the pollution in sediments, particulate suspended matter and dissolved matter of the river Elbe, more than 60 elements were determined in several sampling campaigns along the entire river during the nineties. By analyzing the resulting data with two- and multi-way principal components analysis, geogenic and anthropogenically influenced elements were distinguished and typical longitudinal profiles concerning geogenic or anthropogenic characteristics were summarized. Sampling locations having similar element distribution patterns were aggregated to characteristic Elbe sections by means of hierarchical cluster analysis. The temporal trend of the pollution within the different sections was quantified by comparing the mean concentrations of the anthropogenically influenced elements. Two- and three-way PLS regression models were applied to predict element concentrations in one certain river compartment from measured concentrations in one or two different compartments. (orig.)

  13. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  14. Facile synthesis and photocatalytic activity of zinc oxide hierarchical microcrystals

    KAUST Repository

    Xu, Xinjiang

    2013-04-04

    ZnO microcrystals with hierarchical structure have been synthesized by a simple solvothermal approach. The microcrystals were studied by means of X-ray diffraction, transmission electron microscopy, and scanning electron microscopy. Research on the formation mechanism of the hierarchical microstructure shows that the coordination solvent and precursor concentration have considerable influence on the size and morphology of the microstructures. A possible formation mechanism of the hierarchical structure was suggested. Furthermore, the catalytic activity of the ZnO microcrystals was studied by treating low concentration Rhodamine B (RhB) solution under UV light. The results show that the hierarchical microstructures of ZnO display high catalytic activity in photocatalysis, that the catalysis process follows first-order reaction kinetics, and that the apparent rate constant is k = 0.03195 min⁻¹.
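
    To put the reported rate constant in more familiar terms, first-order kinetics imply C(t) = C0 · exp(-k t), so the half-life is ln 2 / k. The short sketch below evaluates this for the value quoted in the abstract; the function names are illustrative, and the 60-minute evaluation point is an arbitrary choice, not a figure from the study.

```python
import math

K_APP = 0.03195  # apparent rate constant quoted in the abstract, in 1/min


def remaining_fraction(t_min, k=K_APP):
    """Fraction C(t)/C0 of dye remaining after t_min minutes (first order)."""
    return math.exp(-k * t_min)


def half_life(k=K_APP):
    """Time in minutes for the concentration to fall to half its initial value."""
    return math.log(2.0) / k


t_half = half_life()                  # roughly 21.7 minutes
frac_60 = remaining_fraction(60.0)    # fraction left after an arbitrary 60 minutes
```

    Under this idealized model, about half of the dye would be degraded in roughly 21.7 minutes, assuming the rate constant holds over the whole run.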

  15. Evaluation of functional outcome of the floating knee injury using multivariate analysis.

    Science.gov (United States)

    Yokoyama, Kazuhiko; Tsukamoto, Tatsuro; Aoki, Shinichi; Wakita, Ryuji; Uchino, Masataka; Noumi, Takashi; Fukushima, Nobuaki; Itoman, Moritoshi

    2002-11-01

    The objective of this study is to evaluate significant contributing factors affecting the functional prognosis of floating knee injuries using multivariate analysis. A total of 68 floating knee injuries (67 patients) were treated at Kitasato University Hospital from 1986 to 1999. Both the femoral fractures and the tibial fractures were managed surgically by various methods. The functional results of these injuries were evaluated using the grading system of Karlström and Olerud. Follow-up periods ranged from 2 to 19 years (mean 50.2 months) after the original injury. We defined satisfactory (S) outcomes as those cases with excellent or good results and unsatisfactory (US) outcomes as those cases with acceptable or poor results. Logistic regression analysis was used as a multivariate analysis, and the dependent variables were defined as a satisfactory outcome or as an unsatisfactory outcome. The explanatory variables were predicting factors influencing the functional outcome such as age at trauma, gender, severity of soft-tissue injury in the femur and the tibia, AO fracture grade in the femur and the tibia, Fraser type (type I or type II), Injury Severity Score (ISS), and fixation time after injury (less than 1 week or more than 1 week) in the femur and the tibia. The final functional results were as follows: 25 cases had excellent results, 15 cases good results, 16 cases acceptable results, and 12 cases poor results. The predictive logistic regression equation was as follows: log[(1 - p)/p] = 3.12 - 1.52 × Fraser type - 1.65 × severity of soft-tissue injury in the tibia - 1.31 × fixation time after injury in the tibia - 0.821 × AO fracture grade in the tibia + 1.025 × fixation time after injury in the femur - 0.687 × AO fracture grade in the femur (p = 0.01). Among the variables, Fraser type and the severity of soft-tissue injury in the tibia were significantly related to the final result. The multivariate analysis showed that both the involvement of the knee joint and
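
    A fitted equation of this kind can be turned into a probability by inverting the logit. The sketch below reads the left-hand side of the reported equation as log((1 - p)/p), which is an interpretation of the printed text; the abstract does not state how each predictor is numerically coded or which outcome p refers to, so the dictionary keys and the all-zero input are placeholders rather than real patient data.

```python
import math

# Coefficients as printed in the abstract. The numeric coding of each
# predictor is not given there, so the key names and inputs are placeholders.
INTERCEPT = 3.12
COEFFS = {
    "fraser_type": -1.52,
    "soft_tissue_tibia": -1.65,
    "fixation_time_tibia": -1.31,
    "ao_grade_tibia": -0.821,
    "fixation_time_femur": 1.025,
    "ao_grade_femur": -0.687,
}


def linear_predictor(x):
    """Right-hand side of the equation, read here as log((1 - p) / p)."""
    return INTERCEPT + sum(COEFFS[name] * x[name] for name in COEFFS)


def probability(x):
    """Solve log((1 - p) / p) = z for p, giving p = 1 / (1 + exp(z))."""
    z = linear_predictor(x)
    return 1.0 / (1.0 + math.exp(z))


baseline = {name: 0 for name in COEFFS}  # hypothetical all-zero coding
p0 = probability(baseline)

# Raising a predictor with a positive coefficient raises z and lowers p.
p_femur = probability({**baseline, "fixation_time_femur": 1})
```

    With this sign convention, predictors with negative coefficients push p up and those with positive coefficients push it down; the direction would flip if the original equation was meant as log(p/(1 - p)).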

  16. Dittrichia graveolens (L.) Greuter Essential Oil: Chemical Composition, Multivariate Analysis, and Antimicrobial Activity.

    Science.gov (United States)

    Mitic, Violeta; Stankov Jovanovic, Vesna; Ilic, Marija; Jovanovic, Olga; Djordjevic, Aleksandra; Stojanovic, Gordana

    2016-01-01

    The chemical composition and in vitro antimicrobial activities of Dittrichia graveolens (L.) Greuter essential oil were studied. Moreover, using agglomerative hierarchical cluster (AHC) and principal component analyses (PCA), the interrelationships of the D. graveolens essential-oil profiles characterized so far (including the sample from this study) were investigated. To evaluate the chemical composition of the essential oil, GC-FID and GC/MS analyses were performed. Altogether, 54 compounds were identified, accounting for 92.9% of the total oil composition. The D. graveolens oil belongs to the monoterpenoid chemotype, with monoterpenoids comprising 87.4% of the total identified compounds. The major components were borneol (43.6%) and bornyl acetate (38.3%). Multivariate analysis showed that the compounds borneol and bornyl acetate exerted the greatest influence on the spatial differences in the composition of the reported oils. The antimicrobial activity against five bacterial and one fungal strain was determined using a disk-diffusion assay. The studied essential oil was active only against Gram-positive bacteria. Copyright © 2016 Verlag Helvetica Chimica Acta AG, Zürich.

  17. Hierarchical decision making for flood risk reduction

    DEFF Research Database (Denmark)

    Custer, Rocco; Nishijima, Kazuyoshi

    2013-01-01

    . In current practice, structures are often optimized individually without considering benefits of having a hierarchy of protection structures. It is here argued, that the joint consideration of hierarchically integrated protection structures is beneficial. A hierarchical decision model is utilized to analyze...... and compare the benefit of large upstream protection structures and local downstream protection structures in regard to epistemic uncertainty parameters. Results suggest that epistemic uncertainty influences the outcome of the decision model and that, depending on the magnitude of epistemic uncertainty...

  18. Ionothermal synthesis of hierarchical BiOBr microspheres for water treatment

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Dieqing [The Education Ministry Key Lab of Resource Chemistry and Shanghai Key Laboratory of Rare Earth Functional Materials, Shanghai Normal University, 100 Guilin Road, Shanghai 200231 (China); Department of Chemistry and Institute of Environment, Energy and Sustainability, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong (China)]; Wen, Meicheng; Jiang, Bo; Li, Guisheng [The Education Ministry Key Lab of Resource Chemistry and Shanghai Key Laboratory of Rare Earth Functional Materials, Shanghai Normal University, 100 Guilin Road, Shanghai 200231 (China)]; Yu, Jimmy C., E-mail: jimyu@cuhk.edu.hk [Department of Chemistry and Institute of Environment, Energy and Sustainability, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong (China)]

    2012-04-15

    Graphical abstract: Hierarchical BiOBr microspheres were prepared from a bromine-containing ionic liquid. The material was found effective for removing heavy metals, degrading organic pollutants and killing bacteria. Highlight: ► Ionothermal synthesis of BiOBr microspheres with hierarchical structure. ► Efficient mass transfer and excellent light-harvesting ability. ► Suitable for removing heavy metals and treatment of organic dyes. ► Remarkable photocatalytic bactericidal property. - Abstract: Bismuth oxybromide (BiOBr) microspheres with hierarchical morphologies have been fabricated via an ionothermal synthesis route. Ionic liquid acts as a unique soft material capable of promoting nucleation and in situ growth of 3D hierarchical BiOBr mesocrystals without the help of surfactants. The as-prepared BiOBr nanomaterials can effectively remove heavy metal ions and organic dyes from wastewater. They can also kill Micrococcus lylae, a Gram-positive bacterium, in water under fluorescent light irradiation. Their high adaptability in water treatment may be ascribed to their hierarchical structure, which gives them a high surface-to-volume ratio, facile species transportation and excellent light-harvesting ability.

  19. Ionothermal synthesis of hierarchical BiOBr microspheres for water treatment

    International Nuclear Information System (INIS)

    Zhang, Dieqing; Wen, Meicheng; Jiang, Bo; Li, Guisheng; Yu, Jimmy C.

    2012-01-01

    Graphical abstract: Hierarchical BiOBr microspheres were prepared from a bromine-containing ionic liquid. The material was found effective for removing heavy metals, degrading organic pollutants and killing bacteria. Highlight: ► Ionothermal synthesis of BiOBr microspheres with hierarchical structure. ► Efficient mass transfer and excellent light-harvesting ability. ► Suitable for removing heavy metals and treatment of organic dyes. ► Remarkable photocatalytic bactericidal property. - Abstract: Bismuth oxybromide (BiOBr) microspheres with hierarchical morphologies have been fabricated via an ionothermal synthesis route. Ionic liquid acts as a unique soft material capable of promoting nucleation and in situ growth of 3D hierarchical BiOBr mesocrystals without the help of surfactants. The as-prepared BiOBr nanomaterials can effectively remove heavy metal ions and organic dyes from wastewater. They can also kill Micrococcus lylae, a Gram-positive bacterium, in water under fluorescent light irradiation. Their high adaptability in water treatment may be ascribed to their hierarchical structure, which gives them a high surface-to-volume ratio, facile species transportation and excellent light-harvesting ability.

  20. BiOCl nanowire with hierarchical structure and its Raman features

    International Nuclear Information System (INIS)

    Tian Ye; Guo Chuanfei; Guo Yanjun; Wang Qi; Liu Qian

    2012-01-01

    BiOCl is a promising V-VI-VII-compound semiconductor with excellent optical and electrical properties, and has great potential applications in photocatalysis, photoelectric devices, etc. We successfully synthesize BiOCl nanowires with a hierarchical structure by combining a wet etch (top-down) process with a liquid phase crystal growth (bottom-up) process, opening a novel route to constructing ordered bismuth-based nanostructures. The morphology and lattice structures of Bi nanowires, β-Bi2O3 nanowires and BiOCl nanowires with the hierarchical structure are investigated by scanning electron microscope (SEM) and transmission electron microscope (TEM). The formation mechanism of such ordered BiOCl hierarchical structure is considered to mainly originate from the highly preferred growth, which is governed by the lattice match between the (1 1 0) facet of BiOCl and the (2 2 0) or (0 0 2) facet of β-Bi2O3. A schematic model is also illustrated to depict the formation process of the ordered BiOCl hierarchical structure. In addition, Raman properties of the BiOCl nanowire with the hierarchical structure are investigated in depth.

  1. Multivariate strategies in functional magnetic resonance imaging

    DEFF Research Database (Denmark)

    Hansen, Lars Kai

    2007-01-01

    We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a `mind reading' predictive multivariate fMRI model....

  2. Applied multivariate statistical analysis

    CERN Document Server

    Härdle, Wolfgang Karl

    2015-01-01

    Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners.  It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added.  All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior.  All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...

  3. On Utmost Multiplicity of Hierarchical Stellar Systems

    Directory of Open Access Journals (Sweden)

    Gebrehiwot Y. M.

    2016-12-01

    According to theoretical considerations, the multiplicity of hierarchical stellar systems can reach, depending on masses and orbital parameters, several hundred, while observational data confirm the existence of at most septuple (seven-component) systems. In this study, we cross-match the stellar systems of very high multiplicity (six and more components) in modern catalogues of visual double and multiple stars to find among them the candidates for hierarchical systems. After cross-matching the catalogues of closer binaries (eclipsing, spectroscopic, etc.), some of their components were found to be binary/multiple themselves, which increases the system's degree of multiplicity. Optical pairs, known from literature or filtered by the authors, were flagged and excluded from the statistics. We compiled a list of hierarchical systems with potentially very high multiplicity that contains ten objects. Their multiplicity does not exceed 12, and we discuss a number of ways to explain the lack of extremely high multiplicity systems.

  4. Hierarchical capillary adhesion of microcantilevers or hairs

    International Nuclear Information System (INIS)

    Liu Jianlin; Feng Xiqiao; Xia Re; Zhao Hongping

    2007-01-01

    As a result of capillary forces, animal hairs, carbon nanotubes or nanowires of a periodically or randomly distributed array often assemble into hierarchical structures. In this paper, the energy method is adopted to analyse the capillary adhesion of microsized hairs, which are modelled as clamped microcantilevers wetted by liquids. The critical conditions for capillary adhesion of two hairs, three hairs or two bundles of hairs are derived in terms of Young's contact angle, elastic modulus and geometric sizes of the beams. Then, the hierarchical capillary adhesion of hairs is addressed. It is found that for multiple hairs or microcantilevers, the system tends to take a hierarchical structure as a result of the minimization of the total potential energy of the system. The level number of structural hierarchy increases with the increase in the number of hairs if they are sufficiently long. Additionally, we performed experiments to verify our theoretical solutions for the adhesion of microbeams

  5. Multivariate Bonferroni-type inequalities theory and applications

    CERN Document Server

    Chen, John

    2014-01-01

    Multivariate Bonferroni-Type Inequalities: Theory and Applications presents a systematic account of research discoveries on multivariate Bonferroni-type inequalities published in the past decade. The emergence of new bounding approaches pushes the conventional definitions of optimal inequalities and demands new insights into linear and Fréchet optimality. The book explores these advances in bounding techniques with corresponding innovative applications. It presents the method of linear programming for multivariate bounds, multivariate hybrid bounds, sub-Markovian bounds, and bounds using Hamil

  6. A kernel version of multivariate alteration detection

    DEFF Research Database (Denmark)

    Nielsen, Allan Aasbjerg; Vestergaard, Jacob Schack

    2013-01-01

    Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations.

  7. A Hierarchical Dispatch Structure for Distribution Network Pricing

    OpenAIRE

    Yuan, Zhao; Hesamzadeh, Mohammad Reza

    2015-01-01

    This paper presents a hierarchical dispatch structure for efficient distribution network pricing. The dispatch coordination problem in the context of hierarchical network operators is addressed. We formulate decentralized generation dispatch as a bilevel optimization problem in which the main network operator and the connected distribution network operator optimize their costs at two levels. By using Karush-Kuhn-Tucker conditions and Fortuny-Amat McCarl linearization, the bilevel optimization ...

  8. Multivariate Matrix-Exponential Distributions

    DEFF Research Database (Denmark)

    Bladt, Mogens; Nielsen, Bo Friis

    2010-01-01

    be written as linear combinations of the elements in the exponential of a matrix. For this reason we shall refer to multivariate distributions with rational Laplace transform as multivariate matrix-exponential distributions (MVME). The marginal distributions of an MVME are univariate matrix-exponential distributions. We prove a characterization that states that a distribution is an MVME distribution if and only if all non-negative, non-null linear combinations of the coordinates have a univariate matrix-exponential distribution. This theorem is analogous to a well-known characterization theorem...

  9. Multivariate analysis methods in physics

    International Nuclear Information System (INIS)

    Wolter, M.

    2007-01-01

    A review of multivariate methods based on statistical training is given. Several multivariate methods useful in high-energy physics analysis are discussed. Selected examples from current research in particle physics are discussed, both from the on-line trigger selection and from the off-line analysis. Statistical training methods are also presented and some new applications are suggested.

  10. Hierarchical materials: Background and perspectives

    DEFF Research Database (Denmark)

    2016-01-01

    Hierarchical design draws inspiration from analysis of biological materials and has opened new possibilities for enhancing performance and enabling new functionalities and extraordinary properties. With the development of nanotechnology, the necessary technological requirements for the manufactur...

  11. Hierarchical Planning Methodology for a Supply Chain Management

    Directory of Open Access Journals (Sweden)

    Virna ORTIZ-ARAYA

    2012-01-01

    Hierarchical production planning is a widely used methodology for real-world capacitated production planning systems, with the aim of establishing different decision-making levels for the planning issues over the time horizon considered. This paper presents a hierarchical approach proposed to a company that produces reusable shopping bags in Chile and Peru, to determine the optimal allocation of resources at the tactical level as well as over the most immediate planning horizon, so as to meet customer demands for the coming weeks. Starting from an aggregated production planning model, the aggregated decisions are disaggregated into refined decisions in two levels, using a pair of optimization models that impose appropriate constraints to keep the plan coherent with the production system. The main features of the hierarchical solution approach are presented.

  12. Uni- and multi-variable modelling of flood losses: experiences gained from the Secchia river inundation event.

    Science.gov (United States)

    Carisi, Francesca; Domeneghetti, Alessio; Kreibich, Heidi; Schröter, Kai; Castellarin, Attilio

    2017-04-01

    Flood risk is a function of flood hazard and vulnerability; its accurate assessment therefore depends on a reliable quantification of both factors. The scientific literature proposes a number of objective and reliable methods for assessing flood hazard, yet it highlights a limited understanding of the fundamental damage processes. Loss modelling is associated with large uncertainty which is, among other factors, due to a lack of standard procedures; for instance, flood losses are often estimated based on damage models derived in completely different contexts (i.e. different countries or geographical regions) without checking their applicability, or by considering only one explanatory variable (typically water depth). We consider the Secchia river flood event of January 2014, when a sudden levee breach caused the inundation of nearly 200 km² in Northern Italy. In the aftermath of this event, local authorities collected flood loss data, together with additional information on affected private households and industrial activities (e.g. building surface area and economic value, number of company employees and others). Based on these data we implemented and compared a quadratic-regression damage function, with water depth as the only explanatory variable, and a multi-variable model that combines multiple regression trees and considers several explanatory variables (i.e. bagging decision trees). Our results show the importance of data collection: (1) a simple quadratic regression damage function based on empirical data from the study area can be significantly more accurate than damage models from the literature derived for a different context, and (2) multi-variable modelling may outperform the uni-variable approach, yet it is more difficult to develop and apply because it demands much more detailed data.
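
    The uni-variable approach described in this abstract, a quadratic damage function of water depth, can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names fit_quadratic and predict_loss are hypothetical, and the depth and loss values are synthetic, noise-free numbers invented for the example rather than data from the Secchia event.

```python
def fit_quadratic(depths, losses):
    """Least-squares fit of loss = a + b*depth + c*depth**2; returns (a, b, c)."""
    # Design matrix rows [1, d, d^2] and the 3x3 normal equations (X^T X) beta = X^T y.
    rows = [[1.0, d, d * d] for d in depths]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, losses)) for i in range(3)]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, 4):
                aug[r][k] -= factor * aug[col][k]

    # Back substitution.
    beta = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(aug[i][k] * beta[k] for k in range(i + 1, 3))
        beta[i] = (aug[i][3] - rest) / aug[i][i]
    return tuple(beta)


def predict_loss(coeffs, depth):
    """Evaluate the fitted quadratic damage function at a given water depth."""
    a, b, c = coeffs
    return a + b * depth + c * depth * depth


# Synthetic illustration data (assumed, not from the study):
# water depth in metres, relative loss as a fraction of building value.
depths = [0.0, 0.5, 1.0, 1.5, 2.0]
losses = [0.05 + 0.10 * d + 0.02 * d * d for d in depths]
coeffs = fit_quadratic(depths, losses)
```

    A multi-variable model such as bagged regression trees would replace the single depth column with several explanatory variables, which is where the higher data demand noted in the abstract comes from.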

  13. Hierarchical Factoring Based On Image Analysis And Orthoblique Rotations.

    Science.gov (United States)

    Stankov, L

    1979-07-01

    The procedure for hierarchical factoring suggested by Schmid and Leiman (1957) is applied within the framework of image analysis and orthoblique rotational procedures. It is shown that this approach necessarily leads to correlated higher order factors. Also, one can obtain a smaller number of factors than produced by typical hierarchical procedures.

  14. Ultrafast Hierarchical OTDM/WDM Network

    Directory of Open Access Journals (Sweden)

    Hideyuki Sotobayashi

    2003-12-01

    An ultrafast hierarchical OTDM/WDM network is proposed for the future core network. We review its enabling technologies: C- and L-wavelength-band generation, OTDM-WDM mutual multiplexing format conversions, and ultrafast OTDM wavelength-band conversions.

  15. Methods for statistical data analysis of multivariate observations

    CERN Document Server

    Gnanadesikan, R

    1997-01-01

    A practical guide for multivariate statistical techniques, now updated and revised. In recent years, innovations in computer technology and statistical methodologies have dramatically altered the landscape of multivariate data analysis. This new edition of Methods for Statistical Data Analysis of Multivariate Observations explores current multivariate concepts and techniques while retaining the same practical focus of its predecessor. It integrates methods and data-based interpretations relevant to multivariate analysis in a way that addresses real-world problems arising in many areas of inte

  16. Multivariate survival analysis and competing risks

    CERN Document Server

    Crowder, Martin J

    2012-01-01

    Multivariate Survival Analysis and Competing Risks introduces univariate survival analysis and extends it to the multivariate case. It covers competing risks and counting processes and provides many real-world examples, exercises, and R code. The text discusses survival data, survival distributions, frailty models, parametric methods, multivariate data and distributions, copulas, continuous failure, parametric likelihood inference, and non- and semi-parametric methods. There are many books covering survival analysis, but very few that cover the multivariate case in any depth. Written for a graduate-level audience in statistics/biostatistics, this book includes practical exercises and R code for the examples. The author is renowned for his clear writing style, and this book continues that trend. It is an excellent reference for graduate students and researchers looking for grounding in this burgeoning field of research.

  17. The value of multivariate model sophistication

    DEFF Research Database (Denmark)

    Rombouts, Jeroen; Stentoft, Lars; Violante, Francesco

    2014-01-01

    We assess the predictive accuracies of a large number of multivariate volatility models in terms of pricing options on the Dow Jones Industrial Average. We measure the value of model sophistication in terms of dollar losses by considering a set of 444 multivariate models that differ in their spec... In addition to investigating the value of model sophistication in terms of dollar losses directly, we also use the model confidence set approach to statistically infer the set of models that delivers the best pricing performances.

  18. Multivariate statistical modelling based on generalized linear models

    CERN Document Server

    Fahrmeir, Ludwig

    1994-01-01

    This book is concerned with the use of generalized linear models for univariate and multivariate regression analysis. Its emphasis is to provide a detailed introductory survey of the subject based on the analysis of real data drawn from a variety of subjects including the biological sciences, economics, and the social sciences. Where possible, technical details and proofs are deferred to an appendix in order to provide an accessible account for non-experts. Topics covered include: models for multi-categorical responses, model checking, time series and longitudinal data, random effects models, and state-space models. Throughout, the authors have taken great pains to discuss the underlying theoretical ideas in ways that relate well to the data at hand. As a result, numerous researchers whose work relies on the use of these models will find this an invaluable account to have on their desks. "The basic aim of the authors is to bring together and review a large part of recent advances in statistical modelling of m...

  19. Multivariate analysis of quantitative traits can effectively classify rapeseed germplasm

    Directory of Open Access Journals (Sweden)

    Jankulovska Mirjana

    2014-01-01

    In this study, the use of different multivariate approaches to classify rapeseed genotypes based on quantitative traits is presented. Tree regression analysis, principal component analysis (PCA) and two-way cluster analysis were applied in order to describe and understand the extent of genetic variability in spring rapeseed genotype-by-trait data. The traits which highly influenced seed and oil yield in rapeseed were successfully identified by the tree regression analysis. The principal predictor for both response variables was the number of pods per plant (NP). NP and 1000-seed weight could help in the selection of high yielding genotypes. High values for both traits and oil content could lead to high oil yielding genotypes. These traits may serve as indirect selection criteria and can lead to improvement of seed and oil yield in rapeseed. Quantitative traits that explained most of the variability in the studied germplasm were classified using principal component analysis. In this data set, five PCs were identified, out of which the first three PCs explained 63% of the total variance. This helped in choosing the variables on which the genotypes’ clustering could be performed. The two-way cluster analysis simultaneously clustered genotypes and quantitative traits. The final number of clusters was determined using a bootstrapping technique. This approach provided a clear overview of the variability of the analyzed genotypes. The genotypes that have similar performance regarding the traits included in this study can be easily detected on the heatmap. Genotypes grouped in clusters 1 and 8 had high values for seed and oil yield and a relatively short vegetative growth period, while those in cluster 9 combined moderate to low values for vegetative growth duration with moderate to high seed and oil yield. These genotypes should be further exploited and implemented in the rapeseed breeding program. The combined application of these multivariate methods
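
    A figure such as "the first three PCs explained 63% of the total variance" is usually obtained by dividing the sum of the leading eigenvalues of the trait correlation (or covariance) matrix by the sum of all eigenvalues. The sketch below shows that arithmetic only; the function name cumulative_explained_variance is hypothetical and the eigenvalues are made-up numbers chosen for easy checking, not values from this study.

```python
def cumulative_explained_variance(eigenvalues):
    """Return the cumulative share of total variance for components
    sorted from largest to smallest eigenvalue.

    `eigenvalues` should hold the eigenvalues of ALL components, since
    the total variance is their sum.
    """
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    shares = []
    running = 0.0
    for value in sorted(eigenvalues, reverse=True):
        running += value
        shares.append(running / total)
    return shares


# Made-up eigenvalues for illustration (assumed, not from the rapeseed data).
eigenvalues = [4.0, 2.0, 1.0, 0.5, 0.5]
shares = cumulative_explained_variance(eigenvalues)
# shares[2] is the fraction of total variance explained by the first three PCs.
```

    With these illustrative numbers the first three components account for 7.0 of a total of 8.0, so shares[2] is 0.875.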

  20. Regression: A Bibliography.

    Science.gov (United States)

    Pedrini, D. T.; Pedrini, Bonnie C.

    Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…