Low rank Multivariate regression
Giraud, Christophe
2010-01-01
We consider in this paper the multivariate regression problem, when the target regression matrix $A$ is close to a low rank matrix. Our primary interest is in the practical case where the variance of the noise is unknown. Our main contribution is to propose in this setting a criterion to select among a family of low rank estimators and prove a non-asymptotic oracle inequality for the resulting estimator. We also investigate the easier case where the variance of the noise is known and point out that the penalties appearing in our criteria are minimal (in some sense). These penalties involve the expected value of the Ky-Fan quasi-norm of some random matrices. These quantities can be evaluated easily in practice and upper-bounds can be derived from recent results in random matrix theory.
Multivariate Adaptive Regression Splines (Preprint)
1990-08-01
characteristics of olive oils as a function of production year by multivariate methods. La Revista Italiana delle Sostanze Grasse, 60, Oct. . Friedman, J...Projection pursuit 76, 817-823. Friedman, J. H. and Wright, M. J. (1981). A nested partitioni: integration. ACM Trans. Math. Software, March. simonious...data. Proc. 1964 ACM Nat. Conf., 517-524. Shumaker, L. L. (1976). Fitting surfaces to scattered data. In Approximation Theory III, G. G. Lorentz, C
Generation of hierarchically correlated multivariate symbolic sequences
Tumminello, Mi; Mantegna, R N
2008-01-01
We introduce an algorithm to generate multivariate series of symbols from a finite alphabet with a given hierarchical structure of similarities. The target hierarchical structure of similarities is arbitrary, for instance the one obtained by some hierarchical clustering procedure as applied to an empirical matrix of Hamming distances. The algorithm can be interpreted as the finite alphabet equivalent of the recently introduced hierarchically nested factor model (M. Tumminello et al. EPL 78 (3) 30006 (2007)). The algorithm is based on a generating mechanism that is different from the one used in the mutation rate approach. We apply the proposed methodology for investigating the relationship between the bootstrap value associated with a node of a phylogeny and the probability of finding that node in the true phylogeny.
Hierarchical linear regression models for conditional quantiles
Institute of Scientific and Technical Information of China (English)
TIAN Maozai; CHEN Gemai
2006-01-01
Quantile regression has several useful features and is therefore gradually developing into a comprehensive approach to the statistical analysis of linear and nonlinear response models, but it cannot deal effectively with data that have a hierarchical structure. In practice, the existence of such data hierarchies is neither accidental nor ignorable; it is a common phenomenon. To ignore this hierarchical data structure risks overlooking the importance of group effects, and may also render many of the traditional statistical analysis techniques used for studying data relationships invalid. On the other hand, hierarchical models take a hierarchical data structure into account and also have many applications in statistics, ranging from overdispersion to constructing min-max estimators. However, hierarchical models are essentially mean regression models; therefore, they cannot be used to characterize the entire conditional distribution of a dependent variable given high-dimensional covariates. Furthermore, the estimated coefficient vector (marginal effects) is sensitive to an outlier observation on the dependent variable. In this article, a new approach, which is based on the Gauss-Seidel iteration and takes full advantage of quantile regression and hierarchical models, is developed. On the theoretical front, we also consider the asymptotic properties of the new method, obtaining simple conditions for $n^{1/2}$-convergence and asymptotic normality. We also illustrate the use of the technique with real educational data, which are hierarchical, and show how the results can be explained.
The Infinite Hierarchical Factor Regression Model
Rai, Piyush
2009-01-01
We propose a nonparametric Bayesian factor regression model that accounts for uncertainty in the number of factors, and the relationship between factors. To accomplish this, we propose a sparse variant of the Indian Buffet Process and couple this with a hierarchical model over factors, based on Kingman's coalescent. We apply this model to two problems (factor analysis and factor regression) in gene-expression data analysis.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu
2014-06-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector.
A Gibbs Sampler for Multivariate Linear Regression
Mantz, Adam B
2015-01-01
Kelly (2007, hereafter K07) described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modeled by a flexible mixture of Gaussians rather than assumed to be uniform. Here I extend the K07 algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Second, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamica...
Li, Yanming; Nan, Bin; Zhu, Ji
2015-06-01
We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functional groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study.
Adaptive Rank Penalized Estimators in Multivariate Regression
Bunea, Florentina; Wegkamp, Marten
2010-01-01
We introduce a new criterion, the Rank Selection Criterion (RSC), for selecting the optimal reduced rank estimator of the coefficient matrix in multivariate response regression models. The corresponding RSC estimator minimizes the Frobenius norm of the fit plus a regularization term proportional to the number of parameters in the reduced rank model. The rank of the RSC estimator provides a consistent estimator of the rank of the coefficient matrix. The consistency results are valid not only in the classic asymptotic regime, when the number of responses $n$ and predictors $p$ stays bounded, and the number of observations $m$ grows, but also when either, or both, $n$ and $p$ grow, possibly much faster than $m$. Our finite sample prediction and estimation performance bounds show that the RSC estimator achieves the optimal balance between the approximation error and the penalty term. Furthermore, our procedure has very low computational complexity, linear in the number of candidate models, making it particularly ...
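As a rough illustration of rank selection by a penalized fit criterion, the following minimal sketch picks the rank of a reduced rank estimator. It is not the authors' implementation: the function name `rank_selection`, the tuning parameter `mu`, and the simplified penalty `mu * k` (standing in for the parameter-count term) are assumptions made for this example. Here `X` holds one row per observation and one column per predictor, and `Y` holds one column per response.

```python
import numpy as np

def rank_selection(X, Y, mu):
    """Pick the rank k minimizing ||Y - X B_k||_F^2 + mu * k (illustrative penalty)."""
    # Ordinary least-squares coefficients and the corresponding fitted values.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    fitted = X @ B_ols
    # Truncating the SVD of the fitted values yields the rank-k fits.
    _, s, Vt = np.linalg.svd(fitted, full_matrices=False)
    # Start from rank 0 (the zero matrix) as the baseline candidate.
    best_k, best_crit, best_B = 0, np.sum(Y ** 2), np.zeros_like(B_ols)
    for k in range(1, len(s) + 1):
        Vk = Vt[:k].T                # top-k right singular vectors
        B_k = B_ols @ Vk @ Vk.T      # reduced rank estimator of rank at most k
        crit = np.sum((Y - X @ B_k) ** 2) + mu * k
        if crit < best_crit:
            best_k, best_crit, best_B = k, crit, B_k
    return best_k, best_B
```

The loop evaluates one candidate per rank, so the cost after the initial least-squares fit and SVD grows linearly with the number of candidate models.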
Entrepreneurial intention modeling using hierarchical multiple regression
Directory of Open Access Journals (Sweden)
Marina Jeger
2014-12-01
The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.
Relationship between Multiple Regression and Selected Multivariable Methods.
Schumacker, Randall E.
The relationship of multiple linear regression to various multivariate statistical techniques is discussed. The importance of the standardized partial regression coefficient (beta weight) in multiple linear regression as it is applied in path, factor, LISREL, and discriminant analyses is emphasized. The multivariate methods discussed in this paper…
Hierarchical multivariate covariance analysis of metabolic connectivity.
Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J
2014-12-01
Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI).
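The point that a correlation coefficient mixes covariance and variance terms can be made concrete with a small sketch. It only separates the three terms for two samples and is not the statistical framework of the paper; the function name `correlation_terms` is an assumption for this example.

```python
import numpy as np

def correlation_terms(x, y):
    """Return (covariance, var_x, var_y, correlation) for two 1-D samples."""
    cov = np.cov(x, y, ddof=1)[0, 1]   # sample covariance between x and y
    var_x = np.var(x, ddof=1)          # sample variance of x
    var_y = np.var(y, ddof=1)          # sample variance of y
    # The correlation is the covariance scaled by both standard deviations.
    return cov, var_x, var_y, cov / np.sqrt(var_x * var_y)
```

Two groups can therefore share the same covariance yet show different correlations whenever their variances differ, which is why inspecting the terms separately can be informative.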
An Efficient Local Algorithm for Distributed Multivariate Regression
National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm is designed for distributed...
A Scalable Local Algorithm for Distributed Multivariate Regression
National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm can be used for distributed...
Hierarchical Neural Regression Models for Customer Churn Prediction
Directory of Open Access Journals (Sweden)
Golshan Mohammadi
2013-01-01
As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and the Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data into churner and nonchurner groups and also filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create the Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model performs significantly better than the two other hierarchical models.
Multivariate Regression with Monotone Missing Observation of the Dependent Variables
Raats, V.M.; van der Genugten, B.B.; Moors, J.J.A.
2002-01-01
Multivariate regression is discussed, where the observations of the dependent variables are (monotone) missing completely at random; the explanatory variables are assumed to be completely observed. We discuss OLS-, GLS- and a certain form of E(stimated) GLS-estimation. It turns out that
Multivariate Local Polynomial Regression with Application to Shenzhen Component Index
Directory of Open Access Journals (Sweden)
Liyun Su
2011-01-01
This study attempts to characterize and predict stock index series in the Shenzhen stock market using the concepts of multivariate local polynomial regression. Based on the nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all of which use the concept of phase space reconstruction according to Takens' Theorem, are considered. To fit the stock index series, the single series is changed into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results obtained on the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate one and is much better than that of the three existing methods. Even if only the last half of the training data is used in the multivariate predictor, the prediction mean squared error is smaller than that of the univariate predictor. The multivariate local polynomial prediction model for nonsingle time series is a useful tool for stock market price prediction.
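Phase space reconstruction in the sense of Takens' Theorem is usually carried out by time-delay embedding. The sketch below shows only that embedding step, not the local polynomial predictor; the function name `delay_embed` and the parameters `dim` (embedding dimension) and `tau` (delay) are assumptions chosen for this example.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Return rows [x(t), x(t - tau), ..., x(t - (dim - 1) * tau)] for each valid t."""
    x = np.asarray(series, dtype=float)
    start = (dim - 1) * tau            # first index with a full history available
    if start >= len(x):
        raise ValueError("series too short for this dim and tau")
    # Column j holds the series lagged by j * tau, aligned on the same time index.
    return np.column_stack(
        [x[start - j * tau : len(x) - j * tau] for j in range(dim)]
    )
```

Each row is one reconstructed state vector; a local predictor would then be fitted on the rows nearest to the current state.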
Production optimisation in the petrochemical industry by hierarchical multivariate modelling
Energy Technology Data Exchange (ETDEWEB)
Andersson, Magnus; Furusjoe, Erik; Jansson, Aasa
2004-06-01
This project demonstrates the advantages of applying hierarchical multivariate modelling in the petrochemical industry in order to increase knowledge of the total process. The models indicate possible ways to optimise the process regarding the use of energy and raw material, which is directly linked to the environmental impact of the process. The refinery of Nynaes Refining AB (Goeteborg, Sweden) has acted as a demonstration site in this project. The models developed for the demonstration site resulted in: Detection of an unknown process disturbance and suggestions of possible causes; Indications on how to increase the yield in combination with energy savings; The possibility to predict product quality from on-line process measurements, making the results available at a higher frequency than customary laboratory analysis; Quantification of the gradually lowered efficiency of heat transfer in the furnace and increased fuel consumption as an effect of soot build-up on the furnace coils; Increased knowledge of the relation between production rate and the efficiency of the heat exchangers. This report is one of two reports from the project. It contains a technical discussion of the result with some degree of detail. A shorter and more easily accessible report is also available, see IVL report B1586-A.
Prediction of longitudinal dispersion coefficient using multivariate adaptive regression splines
Indian Academy of Sciences (India)
Amir Hamzeh Haghiabi
2016-07-01
In this paper, multivariate adaptive regression splines (MARS) was developed as a novel soft-computing technique for predicting longitudinal dispersion coefficient (DL) in rivers. As mentioned in the literature, experimental dataset related to DL was collected and used for preparing MARS model. Results of MARS model were compared with multi-layer neural network model and empirical formulas. To define the most effective parameters on DL, the Gamma test was used. Performance of MARS model was assessed by calculation of standard error indices. Error indices showed that MARS model has suitable performance and is more accurate compared to multi-layer neural network model and empirical formulas. Results of the Gamma test and MARS model showed that flow depth (H) and ratio of the mean velocity to shear velocity (u/u^∗) were the most effective parameters on the DL.
Multivariate Regression Analysis of Gravitational Waves from Rotating Core Collapse
Engels, William J; Ott, Christian D
2014-01-01
We present a new multivariate regression model for analysis and parameter estimation of gravitational waves observed from well but not perfectly modeled sources such as core-collapse supernovae. Our approach is based on a principal component decomposition of simulated waveform catalogs. Instead of reconstructing waveforms by direct linear combination of physically meaningless principal components, we solve via least squares for the relationship that encodes the connection between chosen physical parameters and the principal component basis. Although our approach is linear, the waveforms' parameter dependence may be non-linear. For the case of gravitational waves from rotating core collapse, we show, using statistical hypothesis testing, that our method is capable of identifying the most important physical parameters that govern waveform morphology in the presence of simulated detector noise. We also demonstrate our method's ability to predict waveforms from a principal component basis given a set of physical ...
Multivariate parametric random effect regression models for fecundability studies.
Ecochard, R; Clayton, D G
2000-12-01
Delay until conception is generally described by a mixture of geometric distributions. Weinberg and Gladen (1986, Biometrics 42, 547-560) proposed a regression generalization of the beta-geometric mixture model where covariates effects were expressed in terms of contrasts of marginal hazards. Scheike and Jensen (1997, Biometrics 53, 318-329) developed a frailty model for discrete event times data based on discrete-time analogues of Hougaard's results (1984, Biometrika 71, 75-83). This paper is on a generalization to a three-parameter family distribution and an extension to multivariate cases. The model allows the introduction of explanatory variables, including time-dependent variables at the subject-specific level, together with a choice from a flexible family of random effect distributions. This makes it possible, in the context of medically assisted conception, to include data sources with multiple pregnancies (or attempts at pregnancy) per couple.
Empirical likelihood ratio tests for multivariate regression models
Institute of Scientific and Technical Information of China (English)
WU Jianhong; ZHU Lixing
2007-01-01
This paper proposes some diagnostic tools for checking the adequacy of multivariate regression models including classical regression and time series autoregression. In statistical inference, the empirical likelihood ratio method is well known to be a powerful tool for constructing tests and confidence regions. For model checking, however, the naive empirical likelihood (EL) based tests do not exhibit Wilks' phenomenon. Hence, we make use of bias correction to construct the EL-based score tests and derive a nonparametric version of Wilks' theorem. Moreover, by combining the advantages of both the EL and score test methods, the EL-based score tests share many desirable features: they are self-scale invariant and can detect alternatives that converge to the null at rate $n^{-1/2}$, the fastest possible rate for lack-of-fit testing; they involve weight functions, which provide the flexibility to choose scores for improving power performance, especially under directional alternatives. Furthermore, when the alternatives are not directional, we construct asymptotically distribution-free maximin tests for a large class of possible alternatives. A simulation study is carried out and an application to a real dataset is analyzed.
Multivariate study and regression analysis of gluten-free granola
Directory of Open Access Journals (Sweden)
Lilian Maria Pagamunici
2014-03-01
This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg$^{-1}$ of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which contributed to inhibiting microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.
Modern Multivariate Statistical Techniques Regression, Classification, and Manifold Learning
Izenman, Alan Julian
2006-01-01
Describes the advances in computation and data storage that led to the introduction of many statistical tools for high-dimensional data analysis. Focusing on multivariate analysis, this book discusses nonlinear methods as well as linear methods. It presents an integrated mixture of classical and modern multivariate statistical techniques.
Nonparametric Least Squares Estimation of a Multivariate Convex Regression Function
Seijo, Emilio
2010-01-01
This paper deals with the consistency of the least squares estimator of a convex regression function when the predictor is multidimensional. We characterize and discuss the computation of such an estimator via the solution of certain quadratic and linear programs. Mild sufficient conditions for the consistency of this estimator and its subdifferentials in fixed and stochastic design regression settings are provided. We also consider a regression function which is known to be convex and componentwise nonincreasing and discuss the characterization, computation and consistency of its least squares estimator.
Robust Hierarchical Control for Uncertain Multivariable Hexarotor Systems
Directory of Open Access Journals (Sweden)
Wei Lin
2015-01-01
Multirotor helicopters attract increasing attention due to their increased load capacity and high maneuverability. However, these helicopters are uncertain multivariable systems, which pose a challenge for robust controller design. In this paper, a robust two-loop control scheme is proposed for a hexarotor system. The resulting controller consists of a nominal controller and a robust compensator. The robust compensators are added to restrain the influences of uncertainties such as nonlinear dynamics, coupling, parametric uncertainties, and external disturbances. It is proven that the tracking errors are ultimately bounded with specified boundaries by choosing the parameters of the robust compensators. Simulation results on the hexarotor demonstrate the effectiveness of the proposed control method.
Genton, Marc G.
2017-09-07
We present a hierarchical decomposition scheme for computing the n-dimensional integral of multivariate normal probabilities that appear frequently in statistics. The scheme exploits the fact that the formally dense covariance matrix can be approximated by a matrix with a hierarchical low rank structure. It allows the reduction of the computational complexity per Monte Carlo sample from $O(n^2)$ to $O(mn + kn\log(n/m))$, where k is the numerical rank of off-diagonal matrix blocks and m is the size of small diagonal blocks in the matrix that are not well-approximated by low rank factorizations and treated as dense submatrices. This hierarchical decomposition leads to substantial efficiencies in multivariate normal probability computations and allows integrations in thousands of dimensions to be practical on modern workstations.
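To get a feel for the stated per-sample complexities, the following sketch compares the two leading-order operation counts. It ignores all constant factors, assumes a base-2 logarithm, and says nothing about actual running times; the function names and the example values of n, m, and k are hypothetical.

```python
import math

def dense_cost(n):
    """Leading-order cost per Monte Carlo sample with a dense covariance: n^2."""
    return n ** 2

def hierarchical_cost(n, m, k):
    """Leading-order cost with the hierarchical structure: m*n + k*n*log2(n/m)."""
    return m * n + k * n * math.log2(n / m)

# Hypothetical sizes: n dimensions, diagonal blocks of size m, off-diagonal rank k.
n, m, k = 4096, 64, 8
ratio = dense_cost(n) / hierarchical_cost(n, m, k)
```

With these illustrative sizes the dense count is 16,777,216 and the hierarchical count is 458,752, a reduction of roughly a factor of 37 before constants are taken into account.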
Regional flow duration curves: Geostatistical techniques versus multivariate regression
Pugliese, Alessio; Farmer, William H.; Castellarin, Attilio; Archfield, Stacey A.; Vogel, Richard M.
2016-10-01
A period-of-record flow duration curve (FDC) represents the relationship between the magnitude and frequency of daily streamflows. Prediction of FDCs is of great importance for locations characterized by sparse or missing streamflow observations. We present a detailed comparison of two methods which are capable of predicting an FDC at ungauged basins: (1) an adaptation of the geostatistical method, Top-kriging, employing a linear weighted average of dimensionless empirical FDCs, standardised with a reference streamflow value; and (2) regional multiple linear regression of streamflow quantiles, perhaps the most common method for the prediction of FDCs at ungauged sites. In particular, Top-kriging relies on a metric for expressing the similarity between catchments computed as the negative deviation of the FDC from a reference streamflow value, which we termed total negative deviation (TND). Comparisons of these two methods are made in 182 largely unregulated river catchments in the southeastern U.S. using a three-fold cross-validation algorithm. Our results reveal that the two methods perform similarly throughout flow-regimes, with average Nash-Sutcliffe Efficiencies 0.566 and 0.662, (0.883 and 0.829 on log-transformed quantiles) for the geostatistical and the linear regression models, respectively. The differences between the reproduction of FDC's occurred mostly for low flows with exceedance probability (i.e. duration) above 0.98.
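The Nash-Sutcliffe Efficiency used to score the two methods follows a standard definition: one minus the ratio of the squared prediction error to the variance of the observations around their mean. A minimal sketch is given below; the function name `nash_sutcliffe` is an assumption, and the log-transformed variant simply applies the same formula to the logarithms of positive values.

```python
import numpy as np

def nash_sutcliffe(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / sum of squares around the observed mean."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    sse = np.sum((obs - pred) ** 2)            # squared prediction error
    spread = np.sum((obs - obs.mean()) ** 2)   # spread of observations around their mean
    return 1.0 - sse / spread

def nash_sutcliffe_log(observed, predicted):
    """Same score computed on log-transformed (strictly positive) values."""
    return nash_sutcliffe(np.log(observed), np.log(predicted))
```

A value of 1 means a perfect prediction, while 0 means the prediction is no better than always using the observed mean.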
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
This paper discusses nonparametric kernel regression with the regressor being a $d$-dimensional $\beta$-null recurrent process in the presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate $\sqrt{n(T)h^{d}}$, where $n(T)$ is the number of regenerations for a $\beta$-null recurrent process, and that the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of the Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity in the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than that of the linear model.
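Leave-one-out cross validation for bandwidth selection can be sketched as follows. This is a simplified illustration that assumes a one-dimensional regressor, a Gaussian kernel, and a Nadaraya-Watson estimator; the paper's setting is $d$-dimensional, and the function names are hypothetical.

```python
import math


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel, optionally omitting one sample."""
    num = 0.0
    den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = math.exp(-0.5 * ((x0 - x) / h) ** 2)
        num += w * y
        den += w
    return num / den  # raises ZeroDivisionError if every weight underflows to zero


def loo_cv_score(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    try:
        errors = [(y - nw_estimate(x, xs, ys, h, skip=i)) ** 2
                  for i, (x, y) in enumerate(zip(xs, ys))]
    except ZeroDivisionError:
        return float("inf")  # bandwidth too small to borrow from any neighbour
    return sum(errors) / len(errors)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loo_cv_score(xs, ys, h))
```

Each observation is predicted from all the others, so the score does not reward a bandwidth merely for interpolating the sample.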
Directory of Open Access Journals (Sweden)
Jin-Jia Wang
2014-01-01
We present the hierarchical interactive lasso penalized logistic regression using the coordinate descent algorithm based on the hierarchy theory and variable interactions. We define the interaction model based on the geometric algebra and hierarchical constraint conditions and then use the coordinate descent algorithm to solve for the coefficients of the hierarchical interactive lasso model. We provide the results of some experiments based on UCI datasets, Madelon datasets from NIPS2003, and daily activities of the elderly. The experimental results show that the variable interactions and hierarchy contribute significantly to the classification. The hierarchical interactive lasso has the advantages of the lasso and interactive lasso.
Mapping informative clusters in a hierarchical [corrected] framework of FMRI multivariate analysis.
Directory of Open Access Journals (Sweden)
Rui Xu
Pattern recognition methods, which are powerful in discriminating between multi-voxel patterns of brain activity associated with different mental states, have become increasingly popular in fMRI data analysis. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical multivariate framework that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach was more robust in functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters overlapped substantially for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical multivariate framework is suitable for both pattern classification and brain mapping in fMRI studies.
Simultaneous Determination of Cobalt, Copper, and Nickel by Multivariate Linear Regression.
Dado, Greg; Rosenthal, Jeffrey
1990-01-01
Presented is an experiment where the concentrations of three metal ions in a solution are simultaneously determined by ultraviolet-vis spectroscopy. Availability of the computer program used for statistically analyzing data using a multivariate linear regression is listed. (KR)
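A minimal sketch of how such a simultaneous determination might work, assuming that absorbances add linearly across the three ions (Beer's law) and that the absorptivity of each ion at each wavelength is known from calibration. The function names and any numbers passed in are hypothetical; the program mentioned in the abstract is not reproduced here.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def estimate_concentrations(E, absorbances):
    """Least-squares concentrations from absorbances measured at several wavelengths.

    E[k][m] is the (assumed known) absorptivity of ion m at wavelength k,
    with the path length folded in. Solves the normal equations E^T E c = E^T a.
    """
    n_wavelengths = len(E)
    n_ions = len(E[0])
    EtE = [[sum(E[k][i] * E[k][j] for k in range(n_wavelengths))
            for j in range(n_ions)] for i in range(n_ions)]
    Eta = [sum(E[k][i] * absorbances[k] for k in range(n_wavelengths))
           for i in range(n_ions)]
    return solve_linear(EtE, Eta)
```

Measuring at more wavelengths than there are ions makes the system overdetermined, which is what turns the calculation into a regression rather than a direct solve.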
Reuter, M; Netter, P
2001-01-01
The present study proposes a hierarchical multivariate statistical prediction model that makes it possible to determine the most prominent variables (physiological, biochemical and personality factors) related to nicotine craving and dopaminergic activation. Based on animal studies reporting a reduction of the rewarding effects of psychotropic drugs after blockade or destruction of the mesolimbic dopamine (DA) system, changes in nicotine craving after pharmacological manipulation by means of a DA agonist (lisuride 0.2 mg) and a DA antagonist (fluphenazine 2 mg) were assessed in 36 healthy male heavy smokers. The major aim was the development of a multivariate prediction model which is applicable in samples lacking variance homogeneity or the prerequisite of a multivariate normal distribution. The model proposed is a combination of multivariate parametric and nonparametric methods taking advantage of their individual merits. In particular, personality variables such as sensation seeking, impulsivity, and neuroticism proved to be important predictors of craving in this responder approach.
Hierarchical Least Squares Identification and Its Convergence for Large Scale Multivariable Systems
Institute of Scientific and Technical Information of China (English)
丁锋; 丁韬
2002-01-01
The recursive least squares identification algorithm (RLS) for large scale multivariable systems requires a large amount of computation and is therefore difficult to implement on a computer. The computational load of estimation algorithms can be reduced using the hierarchical least squares identification algorithm (HLS) for large scale multivariable systems. The convergence analysis using the Martingale Convergence Theorem indicates that the parameter estimation error (PEE) given by the HLS algorithm is uniformly bounded without a persistent excitation signal and that the PEE consistently converges to zero under the persistent excitation condition. The HLS algorithm has a much lower computational load than the RLS algorithm.
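For orientation, the baseline RLS recursion that the abstract compares against can be sketched as below. This is the textbook single-output update with no forgetting factor, not the hierarchical HLS algorithm itself; the function names and the initial covariance scale `p0` are assumptions made for the illustration.

```python
def rls_update(theta, P, phi, y):
    """One recursive least squares step for the model y = phi^T theta + noise."""
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = 1.0 + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    # P stays symmetric, so phi^T P equals (P phi)^T and the update is P - gain * (P phi)^T.
    new_P = [[P[i][j] - gain[i] * P_phi[j] for j in range(n)] for i in range(n)]
    return new_theta, new_P


def rls_fit(samples, n_params, p0=1e6):
    """Run the recursion over (phi, y) pairs, starting from theta = 0 and P = p0 * I."""
    theta = [0.0] * n_params
    P = [[p0 if i == j else 0.0 for j in range(n_params)] for i in range(n_params)]
    for phi, y in samples:
        theta, P = rls_update(theta, P, phi, y)
    return theta
```

Each step updates a full n-by-n matrix `P`, so the cost per sample grows quadratically with the number of parameters, which is the load that a decomposition into smaller subsystems aims to reduce.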
Hierarchical Matching and Regression with Application to Photometric Redshift Estimation
Murtagh, Fionn
2017-06-01
This work emphasizes that heterogeneity, diversity, discontinuity, and discreteness in data is to be exploited in classification and regression problems. A global a priori model may not be desirable. For data analytics in cosmology, this is motivated by the variety of cosmological objects such as elliptical, spiral, active, and merging galaxies at a wide range of redshifts. Our aim is matching and similarity-based analytics that takes account of discrete relationships in the data. The information structure of the data is represented by a hierarchy or tree where the branch structure, rather than just the proximity, is important. The representation is related to p-adic number theory. The clustering or binning of the data values, related to the precision of the measurements, has a central role in this methodology. If used for regression, our approach is a method of cluster-wise regression, generalizing nearest neighbour regression. Both to exemplify this analytics approach, and to demonstrate computational benefits, we address the well-known photometric redshift or `photo-z' problem, seeking to match Sloan Digital Sky Survey (SDSS) spectroscopic and photometric redshifts.
Rocconi, Louis M.
2013-01-01
This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…
Macnab, Ying C
2009-04-30
This paper presents Bayesian multivariate disease mapping and ecological regression models that take into account errors in covariates. Bayesian hierarchical formulations of multivariate disease models and covariate measurement models, with related methods of estimation and inference, are developed as an integral part of a Bayesian disability adjusted life years (DALYs) methodology for the analysis of multivariate disease or injury data and associated ecological risk factors and for small area DALYs estimation, inference, and mapping. The methodology facilitates the estimation of multivariate small area disease and injury rates and associated risk effects, evaluation of DALYs and 'preventable' DALYs, and identification of regions to which disease or injury prevention resources may be directed to reduce DALYs. The methodology interfaces and intersects the Bayesian disease mapping methodology and the global burden of disease framework such that the impact of disease, injury, and risk factors on population health may be evaluated to inform community health, health needs, and priority considerations for disease and injury prevention. A burden of injury study on road traffic accidents in local health areas in British Columbia, Canada, is presented as an illustrative example.
Directory of Open Access Journals (Sweden)
Lassi Rieppo
Fourier Transform Infrared (FT-IR) spectroscopic imaging has been earlier applied for the spatial estimation of the collagen and the proteoglycan (PG) contents of articular cartilage (AC). However, earlier studies have been limited to the use of univariate analysis techniques. Current analysis methods lack the needed specificity for collagen and PGs. The aim of the present study was to evaluate the suitability of partial least squares regression (PLSR) and principal component regression (PCR) methods for the analysis of the PG content of AC. Multivariate regression models were compared with earlier used univariate methods and tested with a sample material consisting of healthy and enzymatically degraded steer AC. Chondroitinase ABC enzyme was used to increase the variation in PG content levels as compared to intact AC. Digital densitometric measurements of Safranin O-stained sections provided the reference for PG content. The results showed that multivariate regression models predict PG content of AC significantly better than earlier used absorbance spectrum (i.e. the area of carbohydrate region with or without amide I normalization) or second derivative spectrum univariate parameters. Increased molecular specificity favours the use of multivariate regression models, but they require more knowledge of chemometric analysis and extended laboratory resources for gathering reference data for establishing the models. When true molecular specificity is required, the multivariate models should be used.
Khoshravesh, Mojtaba; Sefidkouhi, Mohammad Ali Gholami; Valipour, Mohammad
2017-07-01
The proper evaluation of evapotranspiration is essential in food security investigation, farm management, pollution detection, irrigation scheduling, nutrient flows, carbon balance as well as hydrologic modeling, especially in arid environments. To achieve sustainable development and to ensure water supply, especially in arid environments, irrigation experts need tools to estimate reference evapotranspiration on a large scale. In this study, the monthly reference evapotranspiration was estimated by three different regression models including the multivariate fractional polynomial (MFP), robust regression, and Bayesian regression in Ardestan, Esfahan, and Kashan. The results were compared with Food and Agriculture Organization (FAO)-Penman-Monteith (FAO-PM) to select the best model. The results show that at a monthly scale, all models provided close agreement with the calculated values for FAO-PM ($R^2 > 0.95$ and RMSE $< 12.07$ mm month$^{-1}$). However, the MFP model gives better estimates than the other two models for estimating reference evapotranspiration at all stations.
Multivariable Linear Regression Model for Promotional Forecasting: The Coca Cola - Morrisons Case
Zheng, Yiwei/Y
2009-01-01
This paper describes a promotional forecasting model built with the linear regression module in Microsoft Excel. It intends to provide quick and reliable forecasts with a moderate credit and to assist the CPFR between the Coca Cola Enterprises (CCE) and the Morrisons. The model is derived from previous research and a literature review on CPFR, promotion, forecasting and modelling. It is designed as a multivariable linear regression model, which involves several promotional mix as variables includi...
Hierarchical Multiple Regression in Counseling Research: Common Problems and Possible Remedies.
Petrocelli, John V.
2003-01-01
A brief content analysis was conducted on the use of hierarchical regression in counseling research published in the "Journal of Counseling Psychology" and the "Journal of Counseling & Development" during the years 1997-2001. Common problems are cited and possible remedies are described. (Contains 43 references and 3 tables.) (Author)
Martens, Edwin P; de Boer, Anthonius; Pestman, Wiebe R; Belitser, Svetlana V; Stricker, Bruno H Ch; Klungel, Olaf H
PURPOSE: To compare adjusted effects of drug treatment for hypertension on the risk of stroke from propensity score (PS) methods with a multivariable Cox proportional hazards (Cox PH) regression in an observational study with censored data. METHODS: From two prospective population-based cohort
Depth-weighted robust multivariate regression with application to sparse data
Dutta, Subhajit
2017-04-05
A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of t...
Hallin, Marc; Šiman, Miroslav (doi:10.1214/09-AOS723)
2010-01-01
A new multivariate concept of quantile, based on a directional version of Koenker and Bassett's traditional regression quantiles, is introduced for multivariate location and multiple-output regression problems. In their empirical version, those quantiles can be computed efficiently via linear programming techniques. Consistency, Bahadur representation and asymptotic normality results are established. Most importantly, the contours generated by those quantiles are shown to coincide with the classical halfspace depth contours associated with the name of Tukey. This relation does not only allow for efficient depth contour computations by means of parametric linear programming, but also for transferring from the quantile to the depth universe such asymptotic results as Bahadur representations. Finally, linear programming duality opens the way to promising developments in depth-related multivariate rank-based inference.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
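The "repeated partitioning of a sample into subgroups based on a certain criterion" can be illustrated by a single regression-tree split. The sketch below assumes one continuous predictor and uses the within-group sum of squared errors as the criterion; the function names are hypothetical, and a full CART implementation would apply the same search recursively to each subgroup and then prune.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty group)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Find the threshold t that minimises SSE(y | x <= t) + SSE(y | x > t).

    Returns (threshold, cost), or None if all x values are identical.
    """
    best = None
    for t in sorted(set(xs))[:-1]:  # the largest x would leave the right group empty
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (t, cost)
    return best
```

For a categorical outcome, a classification tree would swap the squared-error criterion for an impurity measure such as the Gini index while keeping the same search.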
Su, Liyun; Zhao, Yanyong; Yan, Tianshun; Li, Fenglan
2012-01-01
Multivariate local polynomial fitting is applied to the multivariate linear heteroscedastic regression model. First, local polynomial fitting is applied to estimate the heteroscedastic function; then the coefficients of the regression model are obtained by the generalized least squares method. One noteworthy feature of our approach is that we avoid testing for heteroscedasticity by improving the traditional two-stage method. Because local polynomial estimation is a non-parametric technique, it is unnecessary to know the form of the heteroscedastic function. Therefore, we can improve the estimation precision when the heteroscedastic function is unknown. Furthermore, we verify that the regression coefficients are asymptotically normal based on numerical simulations and normal Q-Q plots of residuals. Finally, the simulation results and the local polynomial estimation of real data indicate that our approach is effective in finite-sample situations.
DEFF Research Database (Denmark)
Ussery, David; Bohlin, Jon; Skjerve, Eystein
2009-01-01
Recently there has been an explosion in the availability of bacterial genomic sequences, making possible an analysis of genomic signatures across more than 800 different bacterial chromosomes, from a wide variety of environments. Using genomic signatures, we pair-wise compared 867...... different genomic DNA sequences, taken from chromosomes and plasmids more than 100,000 base-pairs in length. Hierarchical clustering was performed on the outcome of the comparisons before a multinomial regression model was fitted. The regression model included the cluster groups as the response variable...... AT content. Small improvements to the regression model, although significant, were also obtained by factors such as sequence size, habitat, growth temperature, selective pressure measured as oligonucleotide usage variance, and oxygen requirement. The statistics obtained using hierarchical clustering...
Takagi, Daisuke; Ikeda, Ken'ichi; Kawachi, Ichiro
2012-11-01
Crime is an important determinant of public health outcomes, including quality of life, mental well-being, and health behavior. A body of research has documented the association between community social capital and crime victimization. The association between social capital and crime victimization has been examined at multiple levels of spatial aggregation, ranging from entire countries, to states, metropolitan areas, counties, and neighborhoods. In multilevel analysis, the spatial boundaries at level 2 are most often drawn from administrative boundaries (e.g., Census tracts in the U.S.). One problem with adopting administrative definitions of neighborhoods is that it ignores spatial spillover. We conducted a study of social capital and crime victimization in one ward of Tokyo city, using a spatial Durbin model with an inverse-distance weighting matrix that assigned each respondent a unique level of "exposure" to social capital based on all other residents' perceptions. The study is based on a postal questionnaire sent to 20-69 years old residents of Arakawa Ward, Tokyo. The response rate was 43.7%. We examined the contextual influence of generalized trust, perceptions of reciprocity, two types of social network variables, as well as two principal components of social capital (constructed from the above four variables). Our outcome measure was self-reported crime victimization in the last five years. In the spatial Durbin model, we found that neighborhood generalized trust, reciprocity, supportive networks and two principal components of social capital were each inversely associated with crime victimization. By contrast, a multilevel regression performed with the same data (using administrative neighborhood boundaries) found generally null associations between neighborhood social capital and crime. Spatial regression methods may be more appropriate for investigating the contextual influence of social capital in homogeneous cultural settings such as Japan.
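An inverse-distance weighting matrix of the kind described here can be sketched as follows. The abstract does not say how the weights were scaled, so the row normalisation below is an assumption, as are the function names; respondents are assumed to sit at distinct coordinates.

```python
import math


def inverse_distance_weights(coords):
    """Row-normalised inverse-distance weights; each row sums to 1 and the diagonal is 0."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                # Assumes no two respondents share a location (distance > 0).
                W[i][j] = 1.0 / math.dist(coords[i], coords[j])
        total = sum(W[i])
        if total > 0:
            W[i] = [w / total for w in W[i]]
    return W


def spatial_lag(W, values):
    """Each respondent's 'exposure': the weighted average of everyone else's values."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]
```

Applied to individual trust or reciprocity scores, `spatial_lag` gives every respondent a unique exposure value in which nearby residents count more than distant ones, with no administrative boundary involved.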
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
McKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
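Read literally, the model forms described in this abstract could be written as below. The notation is introduced here for illustration only and is not taken from the report: $x_i$ denotes the $i$-th of $k$ regression variables, $p_i$ and $q_i$ the fixed powers, and $\beta_i$ and $\gamma_i$ the fitted coefficients of the day and night models.

```latex
\mathrm{EDR}_{\text{day}} = \beta_0 + \sum_{i=1}^{k} \beta_i \, x_i^{p_i},
\qquad
\ln \mathrm{EDR}_{\text{night}} = \gamma_0 + \sum_{i=1}^{k} \gamma_i \, x_i^{q_i}
```

With the powers held fixed, both forms remain linear in their coefficients, so they can be fitted by ordinary least squares on the transformed variables.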
A multivariate linear regression model for the Jordanian industrial electric energy consumption
Energy Technology Data Exchange (ETDEWEB)
Al-Ghandoor, A.; Nahleh, Y.A.; Sandouqa, Y.; Al-Salaymeh, M. [Hashemite Univ., Zarqa (Jordan). Dept. of Industrial Engineering
2007-08-09
The amount of electricity used by the industrial sector in Jordan is an important driver for determining the future energy needs of the country. This paper proposed a model to simulate electricity and energy consumption by industry. The general model approach was based on multivariate regression analysis to provide valuable information regarding energy demands and analysis, and to identify the various factors that influence Jordanian industrial electricity consumption. It was determined that industrial gross output and capacity utilization are the most important variables that drive electricity consumption. The results revealed that the multivariate linear regression model can be used to adequately model the Jordanian industrial electricity consumption with coefficient of determination ($R^2$) and adjusted $R^2$ values of 99.3 and 99.2 per cent, respectively. 19 refs., 4 tabs., 2 figs.
Multivariate nonparametric regression and visualization with R and applications to finance
Klemelä, Jussi
2014-01-01
A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio
A special hierarchical fuzzy neural-networks based reinforcement learning for multi-variables system
Institute of Scientific and Technical Information of China (English)
ZHANG Wen-zhi; LU Tian-sheng
2005-01-01
Proposes a reinforcement learning scheme based on a special Hierarchical Fuzzy Neural-Networks (HFNN) for solving complicated learning tasks in a continuous multi-variables environment. The output of the previous layer in the HFNN is no longer used as if-part of the next layer, but used only in then-part. Thus it can deal with the difficulty when the output of the previous layer is meaningless or its meaning is uncertain. The proposed HFNN has a minimal number of fuzzy rules and can successfully solve the problem of rules combination explosion and decrease the quantity of computation and memory requirement. In the learning process, two HFNN with the same structure perform fuzzy action composition and evaluation function approximation simultaneously where the parameters of neural-networks are tuned and updated on line by using gradient descent algorithm. The reinforcement learning method is proved to be correct and feasible by simulation of a double inverted pendulum system.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
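What makes a polynomial regression "higher-order" and "multivariable" is its set of regressors: every product of the input variables up to a chosen degree. The sketch below generates those terms for one observation; it is a generic construction with a hypothetical function name, not the authors' feature set.

```python
from itertools import combinations_with_replacement


def polynomial_features(x, degree):
    """All monomials of the entries of x up to the given total degree, constant term first."""
    features = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            term = 1.0
            for index in combo:
                term *= x[index]
            features.append(term)
    return features
```

For two inputs and degree 2 the terms are 1, x1, x2, x1², x1·x2, x2². The model stays linear in its coefficients, so it can be fitted by ordinary least squares on the expanded features even though it is nonlinear in the inputs.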
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States.
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-21
Estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals through computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks has been proposed in the past decade. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects' affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
A note on constrained M-estimation and its recursive analog in multivariate linear regression models
Institute of Scientific and Technical Information of China (English)
Rao, Calyampudi R.
2009-01-01
In this paper, the constrained M-estimation of the regression coefficients and scatter parameters in a general multivariate linear regression model is considered. Since the constrained M-estimation is not easy to compute, an updating recursion procedure is proposed to simplify the computation of the estimators when a new observation is obtained. We show that, under mild conditions, the recursion estimates are strongly consistent. In addition, the asymptotic normality of the recursive constrained M-estimators of regression coefficients is established. A Monte Carlo simulation study of the recursion estimates is also provided. Besides, robustness and asymptotic behavior of constrained M-estimators are briefly discussed.
Melo, Tatiane F N; Patriota, Alexandre G
2012-01-01
In this paper, we develop a modified version of the likelihood ratio test for multivariate heteroskedastic errors-in-variables regression models. The error terms are allowed to follow a multivariate distribution in the elliptical class of distributions, which has the normal distribution as a special case. We derive the Skovgaard adjusted likelihood ratio statistic, which follows a chi-squared distribution with a high degree of accuracy. We conduct a simulation study and show that the proposed test displays superior finite sample behavior as compared to the standard likelihood ratio test. We illustrate the usefulness of our results in applied settings using a data set from the WHO MONICA Project on cardiovascular disease.
Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza
2014-10-01
The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of
Using multivariate adaptive regression splines to estimate subadult age from diaphyseal dimensions.
Stull, Kyra E; L'Abbé, Ericka N; Ousley, Stephen D
2014-07-01
Subadult age estimation is considered the most accurate parameter estimated in a subadult biological profile, even though the methods are deficient and the samples from which they are based are inappropriate. The current study addresses the problems that plague subadult age estimation and creates age estimation models from diaphyseal dimensions of modern children. The sample included 1,310 males and females between the ages of birth and 12 years. Eighteen diaphyseal length and breadth measurements were obtained from Lodox Statscan radiographic images generated at two institutions in Cape Town, South Africa, between 2007 and 2012. Univariate and multivariate age estimation models were created using multivariate adaptive regression splines. k-fold cross-validated 95% prediction intervals (PIs) were created for each model, and the precision of each model was assessed. The diaphyseal length models generated the narrowest PIs (2 months to 6 years) for all univariate models. The majority of multivariate models had PIs that ranged from 3 months to 5 and 6 years. Mean bias approximated 0 for each model, but most models lost precision after 10 years of age. Univariate diaphyseal length models are recommended for younger children, whereas multivariate models are recommended for older children where the inclusion of more variables minimized the size of the PIs. If diaphyseal lengths are not available, multivariate breadth models are recommended. The present study provides applicable age estimation formulae and explores the advantages and disadvantages of different subadult age estimation models using diaphyseal dimensions. Am J Phys Anthropol 154:376-386, 2014. © 2014 Wiley Periodicals, Inc.
An empirical approach to update multivariate regression models intended for routine industrial use
Energy Technology Data Exchange (ETDEWEB)
Garcia-Mencia, M.V.; Andrade, J.M.; Lopez-Mahia, P.; Prada, D. [University of La Coruna, La Coruna (Spain). Dept. of Analytical Chemistry
2000-11-01
Many problems currently tackled by analysts are highly complex and, accordingly, multivariate regression models need to be developed. Two intertwined topics are important when such models are to be applied within industrial routines: (1) Did the model account for the 'natural' variance of the production samples? (2) Is the model stable over time? This paper focuses on the second topic and presents an empirical approach in which predictive models developed using Mid-FTIR with PLS and PCR held their utility for about nine months when used to predict the octane number of platforming naphthas in a petrochemical refinery. 41 refs., 10 figs., 1 tab.
Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C
2015-09-01
This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data was utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R^2) and mean absolute difference (MAD). From this, 20 patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R^2 = 0.452; MAD = 0.0887; p = 1.17 × 10^-3). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R^2 = 0.932; MAD = 0.0283; p = 5.73 × 10^-12). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Shetty, Rahul; Bigiel, Frank
2012-01-01
We develop a Bayesian linear regression method which rigorously treats measurement uncertainties, and accounts for hierarchical data structure for investigating the relationship between the star formation rate and gas surface density. The method simultaneously estimates the intercept, slope, and scatter about the regression line of each individual subject (e.g. a galaxy) and the population (e.g. an ensemble of galaxies). Using synthetic datasets, we demonstrate that the Bayesian method accurately recovers the parameters of both the individuals and the population, especially when compared to commonly employed least squares methods, such as the bisector. We apply the Bayesian method to estimate the Kennicutt-Schmidt (KS) parameters of a sample of spiral galaxies compiled by Bigiel et al. (2008). We find significant variation in the KS parameters, indicating that no single KS relationship holds for all galaxies. This suggests that the relationship between molecular gas and star formation differs between galaxies...
Multivariate linear regression of high-dimensional fMRI data with multiple target variables.
Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia
2014-05-01
Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets.
Wilderjans, Tom Frans; Vande Gaer, Eva; Kiers, Henk A L; Van Mechelen, Iven; Ceulemans, Eva
2017-03-01
In the behavioral sciences, many research questions pertain to a regression problem in that one wants to predict a criterion on the basis of a number of predictors. Although in many cases ordinary least squares regression will suffice, sometimes the prediction problem is more challenging, for three reasons. First, multiple highly collinear predictors can be available, making it difficult to grasp their mutual relations as well as their relations to the criterion. In that case, it may be very useful to reduce the predictors to a few summary variables, on which one regresses the criterion and which at the same time yield insight into the predictor structure. Second, the population under study may consist of a few unknown subgroups that are characterized by different regression models. Third, the obtained data are often hierarchically structured, with, for instance, observations being nested into persons or participants within groups or countries. Although some methods have been developed that partially meet these challenges (i.e., principal covariates regression (PCovR), clusterwise regression (CR), and structural equation models), none of these methods adequately deals with all of them simultaneously. To fill this gap, we propose the principal covariates clusterwise regression (PCCR) method, which combines the key ideas behind PCovR (de Jong & Kiers in Chemom Intell Lab Syst 14(1-3):155-164, 1992) and CR (Späth in Computing 22(4):367-373, 1979). The PCCR method is validated by means of a simulation study and by applying it to cross-cultural data regarding satisfaction with life.
Ben Alaya, M. A.; Chebana, F.; Ouarda, T. B. M. J.
2016-09-01
Statistical downscaling techniques are required to refine atmosphere-ocean global climate data and provide reliable meteorological information such as a realistic temporal variability and relationships between sites and variables in a changing climate. To this end, the present paper introduces a modular structure combining two statistical tools of increasing interest in recent years: (1) Gaussian copula and (2) quantile regression. The quantile regression tool is employed to specify the entire conditional distribution of downscaled variables and to address the limitations of traditional regression-based approaches, whereas the Gaussian copula is used to describe and preserve the dependence between both variables and sites. A case study based on precipitation and maximum and minimum temperatures from the province of Quebec, Canada, is used to evaluate the performance of the proposed model. The results suggest that this approach is capable of generating series with realistic correlation structures and temporal variability. Furthermore, the proposed model performed better than a classical multisite multivariate statistical downscaling model for most evaluation criteria.
Nieto, P J García; Antón, J C Álvarez; Vilán, J A Vilán; García-Gonzalo, E
2015-05-01
The aim of this research work is to build a regression model of air quality by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (northern Spain) at a local scale. To accomplish the objective of this study, the experimental data set made up of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), and dust (PM10) was collected over 3 years (2006-2008). The US National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the MARS technique, the conclusions of this research work are presented.
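MARS models are built from hinge functions of the form max(0, x - t) and max(0, t - x), which is why the fitted models are easy to read. The sketch below shows only that building block: a single mirrored hinge pair whose knot is chosen from a small candidate list by residual sum of squares. It is not the full MARS algorithm (no multi-term forward pass, no backward pruning, no interactions), and the helper names and synthetic data are assumptions for illustration.

```python
import numpy as np


def hinge_design(x, knot):
    """Design matrix with an intercept and the mirrored hinge pair at `knot`."""
    return np.column_stack([
        np.ones_like(x),
        np.maximum(0.0, x - knot),  # active to the right of the knot
        np.maximum(0.0, knot - x),  # active to the left of the knot
    ])


def fit_best_knot(x, y, candidate_knots):
    """Try each candidate knot and keep the one with the smallest residual sum of squares."""
    best = None
    for knot in candidate_knots:
        H = hinge_design(x, knot)
        coef, *_ = np.linalg.lstsq(H, y, rcond=None)
        rss = float(np.sum((y - H @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, float(knot), coef)
    return best


# Synthetic piecewise-linear response with a kink at x = 0.5 (hypothetical data).
x = np.linspace(0.0, 1.0, 101)
y = 3.0 + 2.0 * np.maximum(0.0, x - 0.5) - 1.0 * np.maximum(0.0, 0.5 - x)

rss, knot, coef = fit_best_knot(x, y, np.linspace(0.1, 0.9, 9))
print(knot, np.round(coef, 6))
```

The fitted coefficients read directly as an intercept plus one slope on each side of the knot, which is the interpretability the abstract refers to.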
Directory of Open Access Journals (Sweden)
Yoonsu Shin
2016-01-01
In the 5G era, the operational cost of mobile wireless networks will significantly increase. Further, massive network capacity and zero latency will be needed because everything will be connected to mobile networks. Thus, self-organizing networks (SON) are needed, which expedite automatic operation of mobile wireless networks, but face challenges in satisfying the 5G requirements. Therefore, researchers have proposed a framework to empower SON using big data. The recent framework of a big data-empowered SON analyzes the relationship between key performance indicators (KPIs) and related network parameters (NPs) using machine-learning tools, and it develops regression models using a Gaussian process with those parameters. The problem, however, is that the methods of finding the NPs related to the KPIs differ individually. Moreover, the Gaussian process regression model cannot determine the relationship between a KPI and its various related NPs. In this paper, to solve these problems, we propose multivariate multiple regression models to determine the relationship between various KPIs and NPs. If we treat one KPI and multiple NPs as one set, the proposed models help us process multiple sets at one time. We can also find out whether some KPIs are conflicting or not. We implement the proposed models using MapReduce.
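A multivariate multiple regression fits several responses on several predictors in one step. The following minimal sketch shows this on a single machine with NumPy rather than MapReduce; the variable names, the numbers of KPIs and NPs, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

# Synthetic data (hypothetical): 3 network parameters (NPs), 2 KPIs.
rng = np.random.default_rng(42)
n_obs, n_params, n_kpis = 50, 3, 2
params = rng.normal(size=(n_obs, n_params))
X = np.column_stack([np.ones(n_obs), params])  # add an intercept column

B_true = np.array([
    [1.0, -2.0],   # intercepts
    [0.5, 0.0],    # effect of NP 1 on each KPI
    [0.0, 1.5],    # effect of NP 2 on each KPI
    [2.0, -1.0],   # effect of NP 3 on each KPI
])
kpis = X @ B_true + 0.01 * rng.normal(size=(n_obs, n_kpis))

# One least-squares solve handles every KPI column at once.
B_hat, *_ = np.linalg.lstsq(X, kpis, rcond=None)

# The same coefficients result from fitting each KPI separately.
B_sep = np.column_stack([
    np.linalg.lstsq(X, kpis[:, j], rcond=None)[0] for j in range(n_kpis)
])
print(np.round(B_hat, 3))
```

The coefficient matrix has one column per KPI, so the effects of a given NP on different KPIs can be compared row by row.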
DEFF Research Database (Denmark)
Sørensen, Jens Benn; Badsberg, Jens Henrik; Olsen, Jens
1989-01-01
The prognostic factors for survival in advanced adenocarcinoma of the lung were investigated in a consecutive series of 259 patients treated with chemotherapy. Twenty-eight pretreatment variables were investigated by use of Cox's multivariate regression model, including histological subtypes...... and degree of differentiation, the new international staging system for lung cancer, and seven laboratory parameters. Staging of the patients included bone marrow examination but was otherwise nonextensive, without routine bone, liver, and brain scans. Factors predicting poor survival were low performance...... status, stage IV disease, no prior nonradical resection, liver metastases, high values of white blood cell count and lactate dehydrogenase, and low values of aspartate aminotransaminase. The nonradical resection may not be a prognostic factor because of the resection itself but may rather serve...
Institute of Scientific and Technical Information of China (English)
管军; 杨兴易; 赵良; 林兆奋; 郭昌星; 李文放
2003-01-01
Objective To investigate the incidence, crude mortality and independent risk factors of ventilator-associated pneumonia (VAP) in a comprehensive ICU in China. Methods The clinical and microbiological data of all 97 patients who received mechanical ventilation (>48 hr) in our comprehensive ICU from January 1999 to December 2000 were retrospectively collected and analysed. First, statistically significant risk factors were screened out with univariate analysis; independent risk factors were then determined with multivariate stepwise logistic regression analysis. Results The incidence of VAP was 54.64% (15.60 cases per 1000 ventilation days) and the crude mortality was 47.42%. The interval between the establishment of an artificial airway and diagnosis of VAP was 6.9 ± 4.3 d. Univariate analysis suggested that indwelling naso-gastric tube, corticosteroid, acid inhibitor, third-generation cephalosporin/imipenem, non-infection lung disease, and extrapulmonary infection were the statistically significant risk factors of
A frailty model approach for regression analysis of multivariate current status data.
Chen, Man-Hua; Tong, Xingwei; Sun, Jianguo
2009-11-30
This paper discusses regression analysis of multivariate current status failure time data (The Statistical Analysis of Interval-censoring Failure Time Data. Springer: New York, 2006), which occur quite often in, for example, tumorigenicity experiments and epidemiologic investigations of the natural history of a disease. For the problem, several marginal approaches have been proposed that model each failure time of interest individually (Biometrics 2000; 56:940-943; Statist. Med. 2002; 21:3715-3726). In this paper, we present a full likelihood approach based on the proportional hazards frailty model. For estimation, an Expectation Maximization (EM) algorithm is developed and simulation studies suggest that the presented approach performs well for practical situations. The approach is applied to a set of bivariate current status data arising from a tumorigenicity experiment.
Directory of Open Access Journals (Sweden)
Marco Flôres Ferrão
2010-11-01
In the present work, multivariate regression models using interval partial least squares (iPLS) and backward interval partial least squares (biPLS) were analyzed and compared. iPLS and biPLS models were developed to determine the concentration of biodiesel in biodiesel/diesel blends using infrared spectroscopy signals. Forty-five samples with biodiesel concentrations in the range 8-30% and two distinct spectrophotometers were used. Both techniques (iPLS and biPLS), using the data obtained by HATR-FTIR, proved promising for developing simpler, faster and non-destructive methodologies for biodiesel determination in commercial blends.
Giacomo, Della Riccia; Stefania, Del Zotto
2013-12-15
Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas the fungi damage plants, fumonisins cause disease in both livestock and human beings. Legal limits set the tolerable daily intake of fumonisins for several maize-based feeds and foods. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on near-infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply partial least squares with full cross-validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. The description of the observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize against the European legal limit of 4 mg kg^-1 should thus be assured.
Wilms, M.; Werner, R.; Ehrhardt, J.; Schmidt-Richberg, A.; Schlemmer, H.-P.; Handels, H.
2014-03-01
Breathing-induced location uncertainties of internal structures are still a relevant issue in the radiation therapy of thoracic and abdominal tumours. Motion compensation approaches like gating or tumour tracking are usually driven by low-dimensional breathing signals, which are acquired in real-time during the treatment. These signals are only surrogates of the internal motion of target structures and organs at risk, and, consequently, appropriate models are needed to establish correspondence between the acquired signals and the sought internal motion patterns. In this work, we present a diffeomorphic framework for correspondence modelling based on the Log-Euclidean framework and multivariate regression. Within the framework, we systematically compare standard and subspace regression approaches (principal component regression, partial least squares, canonical correlation analysis) for different types of common breathing signals (1D: spirometry, abdominal belt, diaphragm tracking; multi-dimensional: skin surface tracking). Experiments are based on 4D CT and 4D MRI data sets and cover intra- and inter-cycle as well as intra- and inter-session motion variations. Only small differences in internal motion estimation accuracy are observed between the 1D surrogates. Increasing the surrogate dimensionality, however, improved the accuracy significantly; this is shown for both 2D signals, which consist of a common 1D signal and its time derivative, and high-dimensional signals containing the motion of many skin surface points. Eventually, comparing the standard and subspace regression variants when applied to the high-dimensional breathing signals, only small differences in terms of motion estimation accuracy are found.
Zare Abyaneh, Hamid
2014-01-01
This paper examined the efficiency of multivariate linear regression (MLR) and artificial neural network (ANN) models in predicting two major water quality parameters in a wastewater treatment plant. Biochemical oxygen demand (BOD) and chemical oxygen demand (COD), as indirect indicators of organic matter, are representative parameters for sewer water quality. Performance of the ANN models was evaluated using the coefficient of correlation (r), root mean square error (RMSE) and bias values. The values of BOD and COD computed by the ANN method and by regression analysis were in close agreement with their respective measured values. Results showed that the ANN model performed better than the MLR model. The comparative indices of the optimized ANN with input values of temperature (T), pH, total suspended solid (TSS) and total suspended (TS) were RMSE = 25.1 mg/L and r = 0.83 for prediction of BOD, and RMSE = 49.4 mg/L and r = 0.81 for prediction of COD. It was found that the ANN model could be employed successfully in estimating the BOD and COD at the inlet of wastewater biochemical treatment plants. Moreover, sensitivity analysis showed that pH had more effect on BOD and COD prediction than the other parameters. Also, both implemented models predicted BOD better than COD.
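The three evaluation indices named in the abstract can be computed as in the sketch below. The abstract does not define bias, so this example assumes it is the mean of predicted minus observed values. The function name and the sample BOD values are hypothetical.

```python
import numpy as np


def evaluation_indices(observed, predicted):
    """Return RMSE, Pearson r, and bias (assumed here to be mean of predicted - observed)."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    error = predicted - observed
    return {
        "rmse": float(np.sqrt(np.mean(error ** 2))),
        "r": float(np.corrcoef(observed, predicted)[0, 1]),
        "bias": float(np.mean(error)),
    }


# Hypothetical BOD values in mg/L, for illustration only.
observed_bod = [200.0, 250.0, 300.0, 350.0]
predicted_bod = [210.0, 240.0, 310.0, 340.0]
scores = evaluation_indices(observed_bod, predicted_bod)
print(scores)
```

RMSE and bias carry the units of the measured quantity (mg/L here), while r is dimensionless, so the indices are best reported together.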
Directory of Open Access Journals (Sweden)
Marcia Werlang
2008-08-01
The Daubechies discrete wavelet transform (DWT) was used to compress the dimension of infrared spectral data for determination of the hydroxyl value (OHV) of soybean polyol samples. Spectral data were recorded between 650 and 4000 cm^-1 with a 4 cm^-1 resolution by Fourier transform infrared spectroscopy (FTIR) coupled with an attenuated total reflection (ATR) accessory. Regression models were built using partial least squares (PLS) and interval partial least squares (iPLS) methods, and the performance of each was compared with that of the original data and with each other. The spectral data set compressed to 1/4 of its original dimension gave the best result, with a lower RMSEP than the model built on the uncompressed signal and a similar correlation. A model of lower dimension but the same predictive capacity was thus obtained, making DWT a robust method for reducing the dimension of spectral data sets when multivariate regression models are to be built.
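Compressing a spectrum to 1/4 of its length corresponds to keeping the approximation coefficients after two levels of a discrete wavelet transform. The sketch below uses the Haar wavelet, the simplest member of the Daubechies family. The abstract does not state which Daubechies order was used, so this choice, the function name, and the synthetic spectrum are assumptions.

```python
import numpy as np


def haar_approximation(signal, levels):
    """Keep only the Haar approximation coefficients after `levels` decompositions.

    Each level halves the length; the signal length must be divisible by 2**levels.
    """
    approx = np.asarray(signal, dtype=float)
    for _ in range(levels):
        if approx.size % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        # Orthonormal Haar low-pass step: scaled sum of adjacent pairs.
        approx = (approx[0::2] + approx[1::2]) / np.sqrt(2.0)
    return approx


# Synthetic 64-point "spectrum" with one absorption band (hypothetical data).
spectrum = np.exp(-0.5 * ((np.arange(64) - 32) / 6.0) ** 2)
compressed = haar_approximation(spectrum, levels=2)
print(spectrum.size, "->", compressed.size)
```

The compressed vector would then replace the full spectrum as the predictor block in a PLS or iPLS model.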
B Gadžurić, Slobodan; O Podunavac Kuzmanović, Sanja; B Vraneš, Milan; Petrin, Marija; Bugarski, Tatjana; Kovačević, Strahinja Z
2016-01-01
The purpose of this work is to promote and facilitate forensic profiling and chemical analysis of illicit drug samples in order to determine their origin, methods of production and transfer through the country. The article is based on the gas chromatography analysis of heroin samples seized from three different locations in Serbia. A chemometric approach with appropriate statistical tools (multiple linear regression (MLR), hierarchical cluster analysis (HCA) and the Wald-Wolfowitz run (WWR) test) was applied to the chromatographic data of heroin samples in order to correlate and examine the geographic origin of the seized samples. The best MLR models were further validated by the leave-one-out technique as well as by the calculation of basic statistical parameters for the established models. To confirm the predictive power of the models, an external set of heroin samples was used. High agreement between experimental and predicted values of the acetyl thebaol and diacetyl morphine peak ratio, obtained in the validation procedure, indicated the good quality of the derived MLR models. The WWR test showed which of the examined heroin samples come from the same population, and HCA was applied in order to give an overview of the similarities among the studied heroin samples.
Remotely mapping river water quality using multivariate regression with prediction validation
Stork, Chris L.; Autrey, Bradley C.
2005-09-01
Remote spectral sensing offers an attractive means of mapping river water quality over wide spatial regions. While previous research has focused on development of spectral indices and models to predict river water quality based on remote images, little attention has been paid to subsequent validation of these predictions. To address this oversight, we describe a retrospective analysis of remote, multispectral Compact Airborne Spectrographic Imager (CASI) images of the Ohio River and its Licking River and Little Miami River tributaries. In conjunction with the CASI acquisitions, ground truth measurements of chlorophyll-a concentration and turbidity were made for a small set of locations in the Ohio River. Partial least squares regression models relating the remote river images to ground truth measurements of chlorophyll-a concentration and turbidity for the Ohio River were developed. Employing these multivariate models, chlorophyll-a concentrations and turbidity levels were predicted in river pixels lacking ground truth measurements, generating detailed estimated water quality maps. An important but often neglected step in the regression process is to validate prediction results using a spectral residual statistic. For both the chlorophyll-a and turbidity regression models, a spectral residual value was calculated for each river pixel and compared to the associated statistical confidence limit for the model. These spectral residual statistic results revealed that while the chlorophyll-a and turbidity models could validly be applied to a vast majority of Ohio River and Licking River pixels, application of these models to Little Miami River pixels was inappropriate due to an unmodeled source of spectral variation.
Directory of Open Access Journals (Sweden)
A. J. Cannon
2011-03-01
A global climate classification is defined using a multivariate regression tree (MRT). The MRT algorithm is automated, which removes the need for a practitioner to manually define the classes; it is hierarchical, which allows a series of nested classes to be defined; and it is rule-based, which allows climate classes to be unambiguously defined and easily interpreted. Climate variables used in the MRT are restricted to those from the Köppen-Geiger climate classification. The result is a hierarchical, rule-based climate classification that can be directly compared against the traditional system. An objective comparison between the two climate classifications at their 5, 13, and 30 class hierarchical levels indicates that both perform well in terms of identifying regions of homogeneous temperature variability, although the MRT still generally outperforms the Köppen-Geiger system. In terms of precipitation discrimination, the Köppen-Geiger classification performs poorly relative to the MRT. The data and algorithm implementation used in this study are freely available. Thus, the MRT climate classification offers instructors and students in the geosciences a simple instrument for exploring modern, computer-based climatological methods.
Cannon, Alex
2017-04-01
univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
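The quantile regression error function mentioned above is commonly written as the pinball (check) loss. As a minimal sketch of that loss only, without the neural network or the monotonicity constraints, the example below shows that the constant minimising it over a sample is the corresponding sample quantile. The function names and the sample are assumptions for illustration.

```python
import numpy as np


def pinball_loss(y, q, tau):
    """Mean quantile-regression (pinball) loss of prediction q at quantile level tau."""
    diff = np.asarray(y, dtype=float) - q
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


def best_constant(y, tau):
    """Search the sample values for the constant with the smallest pinball loss."""
    losses = [pinball_loss(y, candidate, tau) for candidate in y]
    return float(y[int(np.argmin(losses))])


sample = np.arange(1.0, 102.0)  # the values 1, 2, ..., 101
q90 = best_constant(sample, tau=0.9)
q50 = best_constant(sample, tau=0.5)
print(q50, q90)
```

Replacing the constant with the output of a regression model and minimising the same loss gives conditional quantile estimates.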
Institute of Scientific and Technical Information of China (English)
Wengang Zhang; Anthony T.C. Goh
2016-01-01
Piles are long, slender structural elements used to transfer the loads from the superstructure through weak strata onto stiffer soils or rocks. For driven piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to check that the strength of the pile is sufficient to resist the stresses caused by the impact of the pile hammer. Due to its complexity, pile drivability lacks a precise analytical solution with regard to the phenomena involved. In situations where measured data or numerical hypothetical results are available, neural networks stand out in mapping the nonlinear interactions and relationships between the system’s predictors and dependent responses. In addition, unlike most computational tools, no mathematical relationship assumption between the dependent and independent variables has to be made. Nevertheless, neural networks have been criticized for their long trial-and-error training process since the optimal configuration is not known a priori. This paper investigates the use of a fairly simple nonparametric regression algorithm known as multivariate adaptive regression splines (MARS), as an alternative to neural networks, to approximate the relationship between the inputs and dependent response, and to mathematically interpret the relationship between the various parameters. In this paper, the Back propagation neural network (BPNN) and MARS models are developed for assessing pile drivability in relation to the prediction of the Maximum compressive stresses (MCS), Maximum tensile stresses (MTS), and Blow per foot (BPF). A database of more than four thousand piles is utilized for model development and comparative performance between BPNN and MARS predictions.
Forghani, Ali; Peralta, Richard C.
2017-10-01
The study presents a procedure using solute transport and statistical models to evaluate the performance of aquifer storage and recovery (ASR) systems designed to earn additional water rights in freshwater aquifers. The recovery effectiveness (REN) index quantifies the performance of these ASR systems. REN is the proportion of the injected water that the same ASR well can recapture during subsequent extraction periods. To estimate REN for individual ASR wells, the presented procedure uses finely discretized groundwater flow and contaminant transport modeling. Then, the procedure uses multivariate adaptive regression splines (MARS) analysis to identify the significant variables affecting REN, and to identify the most recovery-effective wells. Achieving REN values close to 100% is the desire of the studied 14-well ASR system operator. This recovery is feasible for most of the ASR wells by extracting three times the injectate volume during the same year as injection. Most of the wells would achieve RENs below 75% if extracting merely the same volume as they injected. In other words, recovering almost all the same water molecules that are injected requires having a pre-existing water right to extract groundwater annually. MARS shows that REN most significantly correlates with groundwater flow velocity, or hydraulic conductivity and hydraulic gradient. MARS results also demonstrate that maximizing REN requires utilizing the wells located in areas with background Darcian groundwater velocities less than 0.03 m/d. The study also highlights the superiority of MARS over regular multiple linear regressions to identify the wells that can provide the maximum REN. This is the first reported application of MARS for evaluating performance of an ASR system in fresh water aquifers.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set of 20 binary mixtures of PYR and ISO in different combinations was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and the absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recoveries obtained by applying the PCR and PLS calibrations to the artificial mixtures were between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. These results strongly encourage the use of both methods for the quality control and routine analysis of marketed tablets containing PYR and ISO.
Expert Involvement Predicts mHealth App Downloads: Multivariate Regression Analysis of Urology Apps.
Pereira-Azevedo, Nuno; Osório, Luís; Cavadas, Vitor; Fraga, Avelino; Carrasquinho, Eduardo; Cardoso de Oliveira, Eduardo; Castelo-Branco, Miguel; Roobol, Monique J
2016-07-15
Urological mobile medical (mHealth) apps are gaining popularity with both clinicians and patients. mHealth is a rapidly evolving and heterogeneous field, with some urology apps being downloaded over 10,000 times and others not at all. The factors that contribute to medical app downloads have yet to be identified, including the hypothetical influence of expert involvement in app development. The objective of our study was to identify predictors of the number of urology app downloads. We reviewed urology apps available in the Google Play Store and collected publicly available data. Multivariate ordinal logistic regression evaluated the effect of publicly available app variables on the number of apps being downloaded. Of 129 urology apps eligible for study, only 2 (1.6%) had >10,000 downloads, with half having ≤100 downloads and 4 (3.1%) having none at all. Apps developed with expert urologist involvement (P=.003), optional in-app purchases (P=.01), higher user rating (P<.001), and more user reviews (P<.001) were more likely to be installed. App cost was inversely related to the number of downloads (P<.001). Only data from the Google Play Store and the developers’ websites, but not other platforms, were publicly available for analysis, and the level and nature of expert involvement was not documented. The explicit participation of urologists in app development is likely to enhance its chances to have a higher number of downloads. This finding should help in the design of better apps and further promote urologist involvement in mHealth. Official certification processes are required to ensure app quality and user safety.
Review Content Analytics for the Prediction of Learner’s Feedback with Multivariate Regression Model
Directory of Open Access Journals (Sweden)
T. Chellatamilan
2015-06-01
E-learning facilitates both synchronous and asynchronous learning and plays a very important role in the teaching-learning process. A large group of learners exchange ideas independently by interacting with the members present in the learning management system. To generate meaningful learning outcomes for individual peer learners, feedback review is essential for extracting the conceptual content that reflects the learner’s instantaneous behavior, emotions, capabilities, interests and difficulties, and for fitting it effectively. Giving feedback on a numeric scale is difficult for both learners and facilitators when specifying the rating, whereas it is easy for learners to provide feedback as text messages. The key challenge for analyzers is to extract the meaningful feedback content and a dynamic rating of the learner’s feedback related to various conceptual contexts. We propose a novel method that uses a multivariate predictive model for conceptual content analytics based on e-learners’ reviews, using the standard statistical model of inverse regression. Finally, the analysis is used in prediction studies to illustrate its effectiveness against the learner’s feedback.
Sih, A M; Kopp, D M; Tang, J H; Rosenberg, N E; Chipungu, E; Harfouche, M; Moyo, M; Mwale, M; Wilkinson, J P
2016-04-01
To compare primiparous and multiparous women who develop obstetric fistula (OF) and to assess predictors of fistula location. Cross-sectional study. Fistula Care Centre at Bwaila Hospital, Lilongwe, Malawi. Women with OF who presented between September 2011 and July 2014 with a complete obstetric history were eligible for the study. Women with OF were surveyed for their obstetric history. Women were classified as multiparous if prior vaginal or caesarean delivery was reported. The location of the fistula was determined at operation: OF involving the urethra, bladder neck, and midvagina were classified as low; OF involving the vaginal apex, cervix, uterus, and ureters were classified as high. Demographic information was compared between primiparous and multiparous women using chi-squared and Mann-Whitney U-tests. Multivariate logistic regression models were implemented to assess the relationship between variables of interest and fistula location. During the study period, 533 women presented for repair, of which 452 (84.8%) were included in the analysis. The majority (56.6%) were multiparous when the fistula formed. Multiparous women were more likely to have laboured less than a day (62.4 versus 44.5%, P […]) […] fistula location (37.5 versus 11.2%, P […]) […] fistula. Multiparity was common in our cohort, and these women were more likely to have a high fistula. Additional research is needed to understand the aetiology of high fistula including potential iatrogenic causes. Multiparity and caesarean delivery were associated with a high tract fistula in our Malawian cohort. © 2016 Royal College of Obstetricians and Gynaecologists.
Sih, Allison M.; Kopp, Dawn M.; Tang, Jennifer H.; Rosenberg, Nora E.; Chipungu, Ennet; Harfouche, Melike; Moyo, Margaret; Mwale, Mwawi; Wilkinson, Jeffrey P.
2016-01-01
Objective: To compare primiparous and multiparous women who develop obstetric fistula (OF) and to assess predictors of fistula location. Design: Cross-sectional study. Setting: Fistula Care Center at Bwaila Hospital, Lilongwe, Malawi. Population: Women with OF who presented between September 2011 and July 2014 with a complete obstetric history were eligible for the study. Methods: Women with OF were surveyed for their obstetric history. Women were classified as multiparous if prior vaginal or cesarean delivery was reported. Location of fistula was determined at operation. OF involving the urethra, bladder neck, and midvagina were classified as low; OF involving the vaginal apex, cervix, uterus, and ureters were classified as high. Main Outcome Measures: Demographic information was compared between primiparous and multiparous women using Chi-squared and Mann-Whitney U tests. Multivariate logistic regression models were implemented to assess the relationship between variables of interest and fistula location. Results: During the study period, 533 women presented for repair, of which 452 (84.8%) were included in the analysis. The majority (56.6%) were multiparous when the fistula formed. Multiparous women were more likely to have labored less than a day (62.4% vs 44.5%, p […]) […] fistula location (37.5% vs 11.2%, p […]) […] fistula. Conclusions: Multiparity was common in our cohort, and these women were more likely to have a high fistula. Additional research is needed to understand the etiology of high fistula including potential iatrogenic causes. PMID:26853525
Directory of Open Access Journals (Sweden)
Omholt Stig W
2011-06-01
Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management, especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information from multivariate datasets. Principal component analysis (PCA) has been mainly used; however, supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least squares regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data were derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean-centered and variance-scaled data of predictor (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP
Expert Involvement Predicts mHealth App Downloads: Multivariate Regression Analysis of Urology Apps
Pereira-Azevedo, Nuno; Osório, Luís; Cavadas, Vitor; Fraga, Avelino; Carrasquinho, Eduardo; Cardoso de Oliveira, Eduardo; Castelo-Branco, Miguel; Roobol, Monique J
2016-01-01
Background: Urological mobile medical (mHealth) apps are gaining popularity with both clinicians and patients. mHealth is a rapidly evolving and heterogeneous field, with some urology apps being downloaded over 10,000 times and others not at all. The factors that contribute to medical app downloads have yet to be identified, including the hypothetical influence of expert involvement in app development. Objective: The objective of our study was to identify predictors of the number of urology app downloads. Methods: We reviewed urology apps available in the Google Play Store and collected publicly available data. Multivariate ordinal logistic regression evaluated the effect of publicly available app variables on the number of apps being downloaded. Results: Of 129 urology apps eligible for study, only 2 (1.6%) had >10,000 downloads, with half having ≤100 downloads and 4 (3.1%) having none at all. Apps developed with expert urologist involvement (P=.003), optional in-app purchases (P=.01), higher user rating (P<.001), and more user reviews (P<.001) were more likely to be installed. App cost was inversely related to the number of downloads (P<.001). Only data from the Google Play Store and the developers’ websites, but not other platforms, were publicly available for analysis, and the level and nature of expert involvement was not documented. Conclusions: The explicit participation of urologists in app development is likely to enhance its chances to have a higher number of downloads. This finding should help in the design of better apps and further promote urologist involvement in mHealth. Official certification processes are required to ensure app quality and user safety. PMID:27421338
Directory of Open Access Journals (Sweden)
M. Ahmadlou
2015-12-01
Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure or model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS to simulate urban areas in Mumbai, India.
Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.
2015-12-01
Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure or model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhood, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS to simulate urban areas in Mumbai, India.
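The area under the ROC curve used to compare the two models can be computed directly from predicted scores and observed labels. The sketch below uses the rank-based (Mann-Whitney) formulation; it is an illustration only, the function name `roc_auc` and the toy labels and scores are assumptions, and it is not the evaluation code used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 values (1 = cell became urban, 0 = it did not).
    scores: model outputs where a larger value means "more likely urban".
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported values of about 0.95 for both models indicate that each ranks urbanized cells above non-urbanized ones in roughly 95% of pairs.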
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-07-01
Soil temperature (T_s) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to the scarcity of T_s data, estimation of soil temperature is an important issue in different fields of science. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating T_s. To this end, the monthly mean data of T_s (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (T_min, T_max, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (R_s, R_n, R_a); precipitation (P); relative humidity (RH); wind speed at 2 m height (u_2); and water vapor pressure (V_p) were used as input variables. Three error statistics including root-mean-square error (RMSE), mean absolute error (MAE), and determination coefficient (R^2) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for T_max, T_min, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R^2 = 0.995) and for RH, V_p, P, and u_2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R^2 = 0.996), respectively.
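The three error statistics named in this abstract can be computed as in the sketch below. The function name `error_statistics` is hypothetical, and the determination coefficient is taken here as 1 - SSE/SST, which is an assumption: some studies instead report the squared correlation between observed and estimated values.

```python
import math


def error_statistics(observed, estimated):
    """Return (RMSE, MAE, R^2) for paired observed and estimated values."""
    n = len(observed)
    if n == 0 or n != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [o - e for o, e in zip(observed, estimated)]
    sse = sum(err * err for err in errors)          # sum of squared errors
    rmse = math.sqrt(sse / n)
    mae = sum(abs(err) for err in errors) / n
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - sse / sst                              # assumes sst > 0
    return rmse, mae, r2
```

RMSE and MAE carry the units of the target (°C for soil temperature), while R^2 is unitless, which is why the abstract reports the first two in degrees and the third as a bare number.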
Directory of Open Access Journals (Sweden)
Cristina Gorrostieta
2013-11-01
Vector auto-regressive (VAR) models typically form the basis for constructing directed graphical models for investigating connectivity in a brain network with brain regions of interest (ROIs) as nodes. There are limitations in the standard VAR models. The number of parameters in the VAR model increases quadratically with the number of ROIs and linearly with the order of the model and thus, due to the large number of parameters, the model could pose serious estimation problems. Moreover, when applied to imaging data, the standard VAR model does not account for variability in the connectivity structure across all subjects. In this paper, we develop a novel generalization of the VAR model that overcomes these limitations. To deal with the high dimensionality of the parameter space, we propose a Bayesian hierarchical framework for the VAR model that will account for both temporal correlation within a subject and between-subject variation. Our approach uses prior distributions that give rise to estimates that correspond to a penalized least squares criterion with the elastic net penalty. We apply the proposed model to investigate differences in effective connectivity during a hand grasp experiment between healthy controls and patients with residual motor deficit following a stroke.
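The growth in parameter count described here follows from the structure of an unrestricted VAR model: each of the model's lags has its own full coefficient matrix over all pairs of ROIs. The sketch below counts those coefficients; the function name `var_coefficient_count` and the optional intercept term are illustrative assumptions and say nothing about the specific model in the paper.

```python
def var_coefficient_count(n_rois, order, include_intercept=False):
    """Count mean-model coefficients in an unrestricted VAR(order) with n_rois series.

    Each lag contributes an n_rois x n_rois coefficient matrix, so the total is
    quadratic in the number of ROIs and linear in the model order.
    """
    if n_rois < 1 or order < 1:
        raise ValueError("n_rois and order must be positive integers")
    count = n_rois * n_rois * order
    if include_intercept:
        count += n_rois  # one intercept per series
    return count
```

For instance, doubling the number of ROIs from 10 to 20 at a fixed order quadruples the count, which is the scaling that motivates shrinkage priors such as the elastic net penalty mentioned in the abstract.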
Snyder, Carolyn W.
2016-09-01
Statistical challenges often preclude comparisons among different sea surface temperature (SST) reconstructions over the past million years. Inadequate consideration of uncertainty can result in misinterpretation, overconfidence, and biased conclusions. Here I apply Bayesian hierarchical regressions to analyze local SST responsiveness to climate changes for 54 SST reconstructions from across the globe over the past million years. I develop methods to account for multiple sources of uncertainty, including the quantification of uncertainty introduced from absolute dating into interrecord comparisons. The estimates of local SST responsiveness explain 64% (62% to 77%, 95% interval) of the total variation within each SST reconstruction with a single number. There is remarkable agreement between SST proxy methods, with the exception of Mg/Ca proxy methods estimating muted responses at high latitudes. The Indian Ocean exhibits a muted response in comparison to other oceans. I find a stable estimate of the proposed "universal curve" of change in local SST responsiveness to climate changes as a function of sin^2(latitude) over the past 400,000 years: SST change at 45°N/S is larger than the average tropical response by a factor of 1.9 (1.5 to 2.6, 95% interval) and explains 50% (35% to 58%, 95% interval) of the total variation between each SST reconstruction. These uncertainty and statistical methods are well suited for application across paleoclimate and environmental data series intercomparisons.
Mandel, Kaisey S; Kirshner, Robert P
2014-01-01
We investigate the correlations between the peak intrinsic colors of Type Ia supernovae (SN Ia) and their expansion velocities at maximum light, measured from the Si II 6355 Å spectral feature. We construct a new hierarchical Bayesian regression model and Gibbs sampler to estimate the dependence of the intrinsic colors of a SN Ia on its ejecta velocity, while accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust. The method is applied to the apparent color data from BVRI light curves and Si II velocity data for 79 nearby SN Ia. Comparison of the apparent color distributions of high velocity (HV) and normal velocity (NV) supernovae reveals significant discrepancies in B-V and B-R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B-band, rather than dust reddening. The mean intrinsic B-V and B-R color differences between HV and NV groups are 0.06 +/- 0.02 and 0.09 +/- 0.02 mag, respectively. Under a linear m...
Rogers, David
1991-01-01
G/SPLINES are a hybrid of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm with Holland's Genetic Algorithm. In this hybrid, the incremental search is replaced by a genetic search. The G/SPLINE algorithm exhibits performance comparable to that of the MARS algorithm, requires fewer least squares computations, and allows significantly larger problems to be considered.
Torabi, Mahmoud
2016-09-01
Disease mapping of a single disease has been widely studied in the public health setup. Simultaneous modeling of related diseases can also be a valuable tool both from the epidemiological and from the statistical point of view. In particular, when we have several measurements recorded at each spatial location, we need to consider multivariate models in order to handle the dependence among the multivariate components as well as the spatial dependence between locations. It is then customary to use multivariate spatial models assuming the same distribution through the entire population density. However, in many circumstances, it is a very strong assumption to have the same distribution for all the areas of population density. To overcome this issue, we propose a hierarchical multivariate mixture generalized linear model to simultaneously analyze spatial Normal and non-Normal outcomes. As an application of our proposed approach, esophageal and lung cancer deaths in Minnesota are used to show the outperformance of assuming different distributions for different counties of Minnesota rather than assuming a single distribution for the population density. Performance of the proposed approach is also evaluated through a simulation study. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Significant drivers of the virtual water trade evaluated with a multivariate regression analysis
Tamea, Stefania; Laio, Francesco; Ridolfi, Luca
2014-05-01
International trade of food is vital for the food security of many countries, which rely on trade to compensate for an agricultural production insufficient to feed the population. At the same time, food trade has implications for the distribution and use of water resources, because through the international trade of food commodities, countries virtually displace the water used for food production, known as "virtual water". Trade thus implies a network of virtual water fluxes from exporting to importing countries, which has been estimated to displace more than 2 billions of m^3 of water per year, or about 2% of the annual global precipitation above land. It is thus important to adequately identify the dynamics and the controlling factors of the virtual water trade, in that it supports and enables world food security. Using the FAOSTAT database of international trade and the virtual water content available from the Water Footprint Network, we reconstructed 25 years (1986-2010) of virtual water fluxes. We then analyzed the dependence of exchanged fluxes on a set of major relevant factors that includes population, gross domestic product, arable land, virtual water embedded in agricultural production and dietary consumption, and geographical distance between countries. Significant drivers have been identified by means of a multivariate regression analysis, applied separately to the export and import fluxes of each country; temporal trends are outlined and the relative importance of drivers is assessed by a commonality analysis. Results indicate that population, gross domestic product and geographical distance are the major drivers of virtual water fluxes, with a minor (but non-negligible) contribution given by the agricultural production of exporting countries. Such drivers have become relevant for an increasing number of countries throughout the years, with an increasing variance explained by the distance between countries and a decreasing role of the gross
Directory of Open Access Journals (Sweden)
A. J. Cannon
2012-01-01
A global climate classification is defined using a multivariate regression tree (MRT). The MRT algorithm is automated, hierarchical, and rule-based, thus allowing a system of climate classes to be quickly defined and easily interpreted. Climate variables used in the MRT are restricted to those from the Köppen-Geiger classification system. The result is a set of classes that can be directly compared against those from the traditional system. The two climate classifications are compared at their 5, 13, and 30 class hierarchical levels in terms of climate homogeneity. Results indicate that both perform well in terms of identifying regions of homogeneous temperature variability, although the MRT still generally outperforms the Köppen-Geiger system. In terms of precipitation discrimination, the Köppen-Geiger classification performs poorly relative to the MRT. The data and algorithm implementation used in this study are freely available. Thus, the MRT climate classification offers instructors and students in the geosciences a simple instrument for exploring modern, computer-based climatological methods.
Energy Technology Data Exchange (ETDEWEB)
Mandel, Kaisey S.; Kirshner, Robert P. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Foley, Ryan J., E-mail: kmandel@cfa.harvard.edu [Astronomy Department, University of Illinois at Urbana-Champaign, 1002 West Green Street, Urbana, IL 61801 (United States)
2014-12-20
We investigate the statistical dependence of the peak intrinsic colors of Type Ia supernovae (SNe Ia) on their expansion velocities at maximum light, measured from the Si II λ6355 spectral feature. We construct a new hierarchical Bayesian regression model, accounting for the random effects of intrinsic scatter, measurement error, and reddening by host galaxy dust, and implement a Gibbs sampler and deviance information criteria to estimate the correlation. The method is applied to the apparent colors from BVRI light curves and Si II velocity data for 79 nearby SNe Ia. The apparent color distributions of high-velocity (HV) and normal velocity (NV) supernovae exhibit significant discrepancies for B - V and B - R, but not other colors. Hence, they are likely due to intrinsic color differences originating in the B band, rather than dust reddening. The mean intrinsic B - V and B - R color differences between HV and NV groups are 0.06 ± 0.02 and 0.09 ± 0.02 mag, respectively. A linear model finds significant slopes of -0.021 ± 0.006 and -0.030 ± 0.009 mag (10^3 km s^-1)^-1 for intrinsic B - V and B - R colors versus velocity, respectively. Because the ejecta velocity distribution is skewed toward high velocities, these effects imply non-Gaussian intrinsic color distributions with skewness up to +0.3. Accounting for the intrinsic-color-velocity correlation results in corrections to A_V extinction estimates as large as -0.12 mag for HV SNe Ia and +0.06 mag for NV events. Velocity measurements from SN Ia spectra have the potential to diminish systematic errors from the confounding of intrinsic colors and dust reddening affecting supernova distances.
Directory of Open Access Journals (Sweden)
Giuseppe Palermo
2009-05-01
Giuseppe Palermo (1), Paolo Piraino (2), Hans-Dieter Zucht (3). (1) Digilab BioVision GmbH, Hannover, Germany; (2) Dr Paolo Piraino Statistical Consulting, Rende (CS), Italy; (3) Proteome Sciences R&D GmbH and Co. KG, Frankfurt am Main, Germany. Abstract: Multivariate partial least squares (PLS) regression allows the modeling of complex biological events by considering different factors at the same time. It is unaffected by data collinearity, making it a valuable method for modeling high-dimensional biological data (as derived from genomics, proteomics and peptidomics). In the presence of multiple responses, it is of particular interest how to appropriately "dissect" the model to reveal the importance of single attributes with regard to individual responses (for example, variable selection). In this paper, the performance of multivariate PLS regression coefficients in selecting relevant predictors for different responses in omics-type data was investigated by means of a receiver operating characteristic (ROC) analysis. For this purpose, simulated data, mimicking the covariance structures of microarray and liquid chromatography mass spectrometric data, were used to generate matrices of predictors and responses. The relevant predictors were set a priori. The influences of noise, the source of data with different covariance structure, and the number of relevant predictors were investigated. Results demonstrate the applicability of PLS regression coefficients in selecting variables for each response of a multivariate PLS in omics-type data. Comparisons with other feature selection methods, such as variable importance in the projection scores, principal component regression, and least absolute shrinkage and selection operator regression, were also provided. Keywords: partial least squares regression, regression coefficients, variable selection, biomarker discovery, omics-data
A comparison of various methods for multivariate regression with highly collinear variables
Kiers, Henk A.L.; Smilde, Age K.
2007-01-01
Regression tends to give very unstable and unreliable regression weights when predictors are highly collinear. Several methods have been proposed to counter this problem. A subset of these do so by finding components that summarize the information in the predictors and the criterion variables. The p
RF Calibration of On-Chip DfT Chain by DC Stimuli and Statistical Multivariate Regression Technique
Ramzan, Rashad; Dabrowski, Jerzy
2015-01-01
The problem of parameter variability in RF and analog circuits is escalating with CMOS scaling. Consequently every RF chip produced in nano-meter CMOS technologies needs to be tested. On-chip Design for Testability (DfT) features, which are meant to reduce test time and cost also suffer from parameter variability. Therefore, RF calibration of all on-chip test structures is mandatory. In this paper, Artificial Neural Networks (ANN) are employed as a multivariate regression technique to archite...
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature.
Jafari, Masoumeh; Salimifard, Maryam; Dehghani, Maryam
2014-07-01
This paper presents an efficient method for identification of nonlinear Multi-Input Multi-Output (MIMO) systems in the presence of colored noises. The method studies the multivariable nonlinear Hammerstein and Wiener models, in which, the nonlinear memory-less block is approximated based on arbitrary vector-based basis functions. The linear time-invariant (LTI) block is modeled by an autoregressive moving average with exogenous (ARMAX) model which can effectively describe the moving average noises as well as the autoregressive and the exogenous dynamics. According to the multivariable nature of the system, a pseudo-linear-in-the-parameter model is obtained which includes two different kinds of unknown parameters, a vector and a matrix. Therefore, the standard least squares algorithm cannot be applied directly. To overcome this problem, a Hierarchical Least Squares Iterative (HLSI) algorithm is used to simultaneously estimate the vector and the matrix of unknown parameters as well as the noises. The efficiency of the proposed identification approaches are investigated through three nonlinear MIMO case studies.
Prediction of lip response to orthodontic treatment using a multivariable regression model
Directory of Open Access Journals (Sweden)
Amin Shirvani
2016-01-01
Conclusion: Within the limitations of this study, the multiple regression technique was slightly more accurate than the ratio of mean prediction (Viewbox4.0 software) and appears to be useful in the prediction of soft tissue changes. As the variability of the predicted individual outcome seems to be relatively high, caution should be taken in predicting hard and soft tissue positional changes.
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
Gibbons, A.; Thomas, B. F.; Famiglietti, J. S.
2014-12-01
Global groundwater dependence is likely to increase with continued population growth and climate-driven freshwater redistribution. Recent groundwater quantity studies have estimated large-scale aquifer depletion rates using monthly water storage variations from NASA's Gravity Recovery and Climate Experiment (GRACE) mission. These innovative approaches currently fail to evaluate groundwater quality, integral to assess the availability of potable groundwater resources. We present multivariate relationships to predict total dissolved solid (TDS) concentrations as a function of GRACE-derived variations in water table depth, dominant land use, and other physical parameters in two important aquifer systems in the United States: the High Plains aquifer and the Central Valley aquifer. Model evaluations were performed using goodness of fit procedures and cross validation to identify general model forms. Results of this work demonstrate the potential to characterize global groundwater potability using remote sensing.
Feng, Xin; Winters, Jack M
2011-01-01
Individualizing a neurorehabilitation training protocol requires understanding the performance of subjects with various capabilities under different task settings. We use multivariate regression to evaluate the performance of subjects with stroke-induced hemiparesis in trajectory tracking tasks using a force-reflecting joystick. A nonlinear effect was consistently shown in both dimensions of force field strength and impairment level for selected kinematic performance measures, with greatest sensitivity at lower force fields. This suggests that the form of a force field may play a different "role" for subjects with various impairment levels, and confirms that to achieve optimized therapeutic benefit, it is necessary to personalize interfaces.
The analysis of internet addiction scale using multivariate adaptive regression splines.
Kayri, M
2010-01-01
Determining the real effects on internet dependency with an unbiased and robust statistical method is crucial. MARS is a new non-parametric method in use in the literature for parameter estimations of cause and effect based research. MARS can both obtain legible model curves and make unbiased parametric predictions. In order to examine the performance of MARS, MARS findings will be compared to Classification and Regression Tree (C&RT) findings, which are considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female, 443 male students with 10 missing data). The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done in SPSS. MARS obtained six basis functions of the model. As a common result of these six functions, the regression equation of the model was found. Over the predicted variable, MARS showed that the predictors of average daily Internet-use time, the purpose of Internet use, grade of students and occupations of mothers had a significant effect (P < 0.05). In this comparative study, MARS obtained different findings from C&RT in dependency level prediction. This study observed that MARS revealed the extent to which each variable considered significant changes the character of the model.
Monakhova, Yulia B; Diehl, Bernd W K
2015-11-10
¹H NMR spectroscopy was used to distinguish pure porcine heparin from porcine heparin blended with bovine species and to quantify the degree of such adulteration. For multivariate modelling, several statistical methods, namely partial least squares regression (PLS), ridge regression (RR), stepwise regression with variable selection (SR), and stepwise principal component regression (SPCR), were applied to NMR data of in-house prepared blends (n=80). The models were exhaustively validated using independent test and prediction sets. PLS and RR showed the best performance for estimating heparin falsification regarding its animal origin, with the limit of detection (LOD) and root mean square error of validation (RMSEV) below 2% w/w and 1% w/w, respectively. Reproducibility expressed in coefficients of variation was estimated to be below 10% starting from approximately 5% w/w of bovine adulteration. An acceptable calibration model was obtained by SPCR, but its application range was limited, whereas SR is least recommended for the heparin matrix. The developed method was found to be applicable also to the heparinoid matrix (not purified heparin). In this case the root mean square error of prediction (RMSEP) and LOD were approximately 7% w/w and 8% w/w, respectively. The simple and cheap NMR method is recommended for screening of heparin animal origin in parallel with the official NMR test of heparin authenticity and purity.
Ghasemi, Jahan B; Zolfonoun, Ehsan
2013-11-01
A new multicomponent analysis method, based on principal component analysis-multivariate adaptive regression splines (PC-MARS), is proposed for the determination of dialkyltin compounds. In Tween-20 micellar media, dimethyltin and dibutyltin react with morin to give fluorescent complexes with maximum emission peaks at 527 and 520 nm, respectively. Before building the MARS models, the spectrofluorimetric matrix data were subjected to principal component analysis and decomposed to PC scores as starting points for the MARS algorithm. The algorithm classifies the calibration data into several groups, in each of which a regression line or hyperplane is fitted. The performance of the proposed method was tested in terms of root mean square errors of prediction (RMSEP), using synthetic solutions. The results show the strong potential of PC-MARS, as a multivariate calibration method, to be applied to spectral data for multicomponent determinations. The effects of different experimental parameters on the performance of the method were studied and discussed. The prediction capability of the proposed method was compared with that of the GC-MS method for the determination of dimethyltin and/or dibutyltin.
Directory of Open Access Journals (Sweden)
Tao Gao
2014-01-01
Extreme precipitation is likely to be one of the most severe meteorological disasters in China; however, studies on the physical factors affecting precipitation extremes and on corresponding prediction models remain scarce. From a new point of view, the sensible heat flux (SHF) and latent heat flux (LHF), which have significant impacts on summer extreme rainfall in the Yangtze River basin (YRB), have been quantified, and the impact factors are then selected. Firstly, a regional extreme precipitation index was applied to determine Regions of Significant Correlation (RSC) by analyzing the spatial distribution of correlation coefficients between this index and SHF, LHF, and sea surface temperature (SST) on the global ocean scale; then the time series of SHF, LHF, and SST in RSCs during 1967–2010 were selected. Furthermore, other factors that significantly affect variations in precipitation extremes over the YRB were also selected. Multiple stepwise regression and leave-one-out cross-validation (LOOCV) were used to analyze and test the influencing factors and the statistical prediction model. The correlation coefficient between the observed regional extreme index and the model simulation result is 0.85, significant at the 99% level. This suggests that the forecast skill was acceptable, although many aspects of the prediction model should be improved.
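The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single predictor and ordinary least squares rather than the multiple stepwise regression of the study, and the names `flux` and `index` and their values are hypothetical, not data from the paper.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx

def loocv_predictions(x, y):
    """Predict each observation from a model fitted to all the others."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return preds

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical predictor (e.g. a heat-flux series) and extreme-precipitation index.
flux = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
index = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]

preds = loocv_predictions(flux, index)
skill = pearson(index, preds)  # correlation between observed and held-out predictions
```

The correlation between the observed values and the held-out predictions plays the same role as the 0.85 reported above: it measures skill on data the model did not see during fitting.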
Energy Technology Data Exchange (ETDEWEB)
Park, Jinyong [Univ. of Arizona, Tucson, AZ (United States); Balasingham, P [Univ. of Arizona, Tucson, AZ (United States); McKenna, Sean Andrew [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kulatilake, Pinnaduwa H.S.W. [Univ. of Arizona, Tucson, AZ (United States)
2004-09-01
Sandia National Laboratories, under contract to Nuclear Waste Management Organization of Japan (NUMO), is performing research on regional classification of given sites in Japan with respect to potential volcanic disruption using multivariate statistics and geo-statistical interpolation techniques. This report provides results obtained for hierarchical probabilistic regionalization of volcanism for the Sengan region in Japan by applying multivariate statistical techniques and geostatistical interpolation techniques on the geologic data provided by NUMO. A workshop report produced in September 2003 by Sandia National Laboratories (Arnold et al., 2003) on volcanism lists a set of most important geologic variables as well as some secondary information related to volcanism. Geologic data extracted for the Sengan region in Japan from the data provided by NUMO revealed that data are not available at the same locations for all the important geologic variables. In other words, the geologic variable vectors were found to be incomplete spatially. However, it is necessary to have complete geologic variable vectors to perform multivariate statistical analyses. As a first step towards constructing complete geologic variable vectors, the Universal Transverse Mercator (UTM) zone 54 projected coordinate system and a 1 km square regular grid system were selected. The data available for each geologic variable on a geographic coordinate system were transferred to the aforementioned grid system. Also the recorded data on volcanic activity for Sengan region were produced on the same grid system. Each geologic variable map was compared with the recorded volcanic activity map to determine the geologic variables that are most important for volcanism. In the regionalized classification procedure, this step is known as the variable selection step. The following variables were determined as most important for volcanism: geothermal gradient, groundwater temperature, heat discharge, groundwater
Directory of Open Access Journals (Sweden)
Chong Wei
2015-01-01
Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.
Directory of Open Access Journals (Sweden)
Guo Junqiao
2008-09-01
Background: The effects of climate variations on bacillary dysentery incidence have attracted growing concern. However, the multi-collinearity among meteorological factors affects the accuracy of their correlation with bacillary dysentery incidence. Methods: As a remedy, a modified method combining ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results: Temperatures, precipitation, evaporation and relative humidity showed a positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion: Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.
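To illustrate why ridge regression is a remedy for multi-collinearity, the following minimal sketch compares ordinary least squares (penalty 0) with a ridge fit on two nearly collinear, centered predictors. It does not reproduce the modified method of the study (no hierarchical clustering), and the data and the function name `ridge_2d` are hypothetical.

```python
def ridge_2d(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two centered predictors, no intercept.

    lam = 0 gives ordinary least squares; lam > 0 gives ridge regression.
    """
    a = sum(u * u for u in x1) + lam
    b = sum(u * v for u, v in zip(x1, x2))
    d = sum(v * v for v in x2) + lam
    p = sum(u * w for u, w in zip(x1, y))
    q = sum(v * w for v, w in zip(x2, y))
    det = a * d - b * b  # close to zero when x1 and x2 are collinear and lam = 0
    return (d * p - b * q) / det, (a * q - b * p) / det

# Hypothetical, nearly collinear predictors (e.g. two temperature indexes).
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.01, -0.98, 0.0, 1.01, 1.98]
# Response that is roughly x1 + x2 plus a small disturbance.
y = [-4.3, -1.7, 0.2, 2.2, 3.6]

ols = ridge_2d(x1, x2, y, lam=0.0)    # large coefficients of opposite sign
ridge = ridge_2d(x1, x2, y, lam=1.0)  # both coefficients close to 1
```

With these values the unpenalized coefficients are large and of opposite sign, while the penalized ones both stay near 1, which is the stabilizing effect the abstract relies on.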
The Analysis of Internet Addiction Scale Using Multivariate Adaptive Regression Splines
Directory of Open Access Journals (Sweden)
M Kayri
2010-12-01
Background: Determining the real effects on internet dependency with an unbiased and robust statistical method is crucial. MARS is a new non-parametric method in use in the literature for parameter estimations of cause and effect based research. MARS can both obtain legible model curves and make unbiased parametric predictions. Methods: In order to examine the performance of MARS, MARS findings will be compared to Classification and Regression Tree (C&RT) findings, which are considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female, 443 male students with 10 missing data). The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done in SPSS. Results: MARS obtained six basis functions of the model. As a common result of these six functions, the regression equation of the model was found. Over the predicted variable, MARS showed that the predictors of average daily Internet-use time, the purpose of Internet use, grade of students and occupations of mothers had a significant effect (P < 0.05). In this comparative study, MARS obtained different findings from C&RT in dependency level prediction. Conclusion: This study observed that MARS revealed the extent to which each variable considered significant changes the character of the model.
Hierarchical design of a polymeric nanovehicle for efficient tumor regression and imaging
An, Jinxia; Guo, Qianqian; Zhang, Peng; Sinclair, Andrew; Zhao, Yu; Zhang, Xinge; Wu, Kan; Sun, Fang; Hung, Hsiang-Chieh; Li, Chaoxing; Jiang, Shaoyi
2016-04-01
Effective delivery of therapeutics to disease sites significantly contributes to drug efficacy, toxicity and clearance. Here we designed a hierarchical polymeric nanoparticle structure for anti-cancer chemotherapy delivery by utilizing state-of-the-art polymer chemistry and co-assembly techniques. This novel structural design combines the most desired merits for drug delivery in a single particle, including a long in vivo circulation time, inhibited non-specific cell uptake, enhanced tumor cell internalization, pH-controlled drug release and simultaneous imaging. This co-assembled nanoparticle showed exceptional stability in complex biological media. Benefiting from the synergistic effects of zwitterionic and multivalent galactose polymers, drug-loaded nanoparticles were selectively internalized by cancer cells rather than normal tissue cells. In addition, the pH-responsive core retained their cargo within their polymeric coating through hydrophobic interaction and released it under slightly acidic conditions. In vivo pharmacokinetic studies in mice showed minimal uptake of nanoparticles by the mononuclear phagocyte system and excellent blood circulation half-lives of 14.4 h. As a result, tumor growth was completely inhibited and no damage was observed for normal organ tissues. This newly developed drug nanovehicle has great potential in cancer therapy, and the hierarchical design principle should provide valuable information for the development of the next generation of drug delivery systems.
Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.
2017-07-01
During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor-intensive and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
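A regression output becomes a classifier once a threshold is chosen, and an ROC analysis compares the true and false positive rates at each candidate threshold. The sketch below assumes Youden's J (TPR minus FPR) as the selection criterion; the abstract does not state which criterion the authors used, and the scores and labels are hypothetical.

```python
def rates(scores, labels, threshold):
    """TPR and FPR when a score >= threshold is classed as contaminated (label 1)."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

def best_threshold(scores, labels):
    """Return the candidate threshold with the largest Youden's J = TPR - FPR."""
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tpr, fpr = rates(scores, labels, t)
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return best_t, best_j

# Hypothetical regression outputs for clean (0) and contaminated (1) pixels.
scores = [0.2, 0.4, 0.7, 0.9, 1.1, 1.3, 1.6, 2.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

t, j = best_threshold(scores, labels)
tpr, fpr = rates(scores, labels, t)
```

In this toy case the two classes are perfectly separable, so the selected threshold gives a TPR of 1 and an FPR of 0; real data would trade one rate off against the other.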
Zhang, M.; Zhang, Y.; Lichtner, P. C.
2013-12-01
/tailing behavior of the FHM can generally be captured by the HSMs. At all the variances tested, the 8-unit upscaled model is always the most accurate. When the variance is low to moderate, this model can provide accurate to adequate predictions of all the FHM plume moments. In addition, upscaled dispersivities computed with the stochastic versus deterministic techniques yield similar solute predictions, which suggest that in this analysis, an ergodic transport regime has emerged. However, when the variance of ln(k) increases to 4.5, the upscaled dispersivities predicted by the stochastic methods result in significant upstream dispersion that is nonphysical. In this case, the HSMs cannot capture the FHM plume moments for the given ln(K) variance. In summary, simulation results suggest that the upscaling dispersivity can be used to accurately capture solute transport in low ln(K) variance systems but fails to describe the solute motion if system variance is high. Reference: Mingkan Zhang, and Ye Zhang, Multiscale, Multi-variance Dispersivity Upscaling for A Three-Dimensional Hierarchical Aquifer: Developing and Testing a Parallel Random Walk Method with a Drift Term in the Dispersion Tensor, Water Resources Research, in preparation.
Energy Technology Data Exchange (ETDEWEB)
Sander, R.K.; Quagliano, J.R.; Fry, H. [and others
1997-08-01
Until recently, the use of lasers for long-path absorption measurements relied on differential absorption at two wavelengths to look for one species at a time in the atmosphere. With the advent of multi-line CO2 lasers it is now feasible to generate 30 to 40 lines in a rapid burst to look for spectra of all the chemical species that may be present. Measurements have been made under relatively constant meteorological conditions in a summertime desert environment with a multi-line tunable laser. Multivariate regression analysis of these data shows that the spectra can be accurately fit using a small number of spectral factors or eigenvectors of the time-dependent spectral data matrix. The factors can be rationalized in terms of lidar system effects and atmospheric composition changes.
Hussain, Mirza Zahid; Li, Fuguo; Wang, Jing; Yuan, Zhanwei; Li, Pan; Wu, Tao
2015-07-01
The present study comprises the determination of the constitutive relationship for thermo-mechanical processing of INCONEL 718 through double multivariate nonlinear regression, a newly developed approach which not only considers the effect of strain, strain rate, and temperature on flow stress but also explains the interaction effect of these thermo-mechanical parameters on the flow behavior of the alloy. Hot isothermal compression experiments were performed on a Gleeble-3500 thermo-mechanical testing machine in the temperature range of 1153 to 1333 K within the strain rate range of 0.001 to 10 s^-1. The deformation behavior of INCONEL 718 is analyzed and summarized by establishing the high temperature deformation constitutive equation. The calculated correlation coefficient (R) and average absolute relative error (AARE) underline the precision of the proposed constitutive model.
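For reference, the two accuracy measures named in this abstract are commonly defined as follows. The abstract itself does not give the formulas, so the symbols are assumptions: E_i is the experimental flow stress, P_i the value predicted by the constitutive model, and N the number of data points.

```latex
R = \frac{\sum_{i=1}^{N} (E_i - \bar{E})(P_i - \bar{P})}
         {\sqrt{\sum_{i=1}^{N} (E_i - \bar{E})^2 \, \sum_{i=1}^{N} (P_i - \bar{P})^2}},
\qquad
\mathrm{AARE} = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{E_i - P_i}{E_i} \right|
```

R measures the strength of the linear relationship between experimental and predicted values, while AARE compares them term by term and so is not flattered by a fit that is biased but well correlated.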
Energy Technology Data Exchange (ETDEWEB)
Dey, Prasenjit; Dad, Ajoy K. [Mechanical Engineering Department, National Institute of Technology, Agartala (India)
2016-12-15
The present study aims to predict the heat transfer characteristics around a square cylinder with different corner radii using multivariate adaptive regression splines (MARS). Further, the MARS-generated objective function is optimized by particle swarm optimization. The data for the prediction are taken from the recently published article by the present authors [P. Dey, A. Sarkar, A.K. Das, Development of GEP and ANN model to predict the unsteady forced convection over a cylinder, Neural Comput. Appl. (2015)]. Further, the MARS model is compared with artificial neural network and gene expression programming models. It has been found that the MARS model is very efficient in predicting the heat transfer characteristics. It has also been found that MARS is more efficient than artificial neural network and gene expression programming in predicting the forced convection data, and that particle swarm optimization can efficiently optimize the heat transfer rate.
Balabin, Roman M; Smirnov, Sergey V
2012-04-07
Modern analytical chemistry of industrial products is in need of rapid, robust, and cheap analytical methods to continuously monitor product quality parameters. For this reason, spectroscopic methods are often used to control the quality of industrial products in an on-line/in-line regime. Vibrational spectroscopy, including mid-infrared (MIR), Raman, and near-infrared (NIR), is one of the best ways to obtain information about the chemical structures and the quality coefficients of multicomponent mixtures. Together with chemometric algorithms and multivariate data analysis (MDA) methods, which were especially created for the analysis of complicated, noisy, and overlapping signals, NIR spectroscopy shows great results in terms of its accuracy, including classical prediction error, RMSEP. However, it is unclear whether the combined NIR + MDA methods are capable of dealing with much more complex interpolation or extrapolation problems that are inevitably present in real-world applications. In the current study, we try to make a rather general comparison of linear, such as partial least squares or projection to latent structures (PLS); "quasi-nonlinear", such as the polynomial version of PLS (Poly-PLS); and intrinsically non-linear, such as artificial neural networks (ANNs), support vector regression (SVR), and least-squares support vector machines (LS-SVM/LSSVM), regression methods in terms of their robustness. As a measure of robustness, we will try to estimate their accuracy when solving interpolation and extrapolation problems. Petroleum and biofuel (biodiesel) systems were chosen as representative examples of real-world samples. Six very different chemical systems that differed in complexity, composition, structure, and properties were studied; these systems were gasoline, ethanol-gasoline biofuel, diesel fuel, aromatic solutions of petroleum macromolecules, petroleum resins in benzene, and biodiesel. Eighteen different sample sets were used in total. General
Institute of Scientific and Technical Information of China (English)
(no author listed)
2001-01-01
It is well known that Landsat TM images are the most widely used remote sensing data in various fields. A TM image usually has 7 different electromagnetic spectrum bands, among which the sixth has a much lower ground resolution than the other six. Nevertheless, it is useful in the study of rock spectral reflection, geothermal resource exploration, etc. Improving the ground resolution of TM6 to the level of the other six bands remains a problem. This paper presents an algorithm, based on the combination of a multivariate regression model with the semi-variogram function, which can improve the ground resolution of TM6 by "fusing" the data of the other six bands. It includes the following main steps: (1) testing the correlation between TM6 and each of TM1-5 and TM7; if the correlation coefficient between TM6 and another band is greater than a given threshold value, that band is selected as an independent variable for the regression analysis; (2) calculating the size of the template window within which the parameters needed by the regression model will be calculated; (3) replacing the original pixel values of TM6 with those obtained by regression analysis; (4) using image entropy as a measure to evaluate the quality of the fused TM6 image. The basic mechanism of the algorithm is discussed and the VC++ program implementing it is also presented. A simple application example is given in the last part of the paper, showing the effectiveness of the algorithm.
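Steps (1) and (4) of the procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: the band values are made-up flattened pixel lists on a common grid, the function names `select_bands` and `image_entropy` are hypothetical, and the windowed regression of steps (2) and (3) is not reproduced.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def select_bands(tm6, bands, threshold):
    """Step (1): keep the bands whose correlation with TM6 exceeds the threshold."""
    return [name for name, pixels in bands.items()
            if pearson(tm6, pixels) > threshold]

def image_entropy(pixels):
    """Step (4): Shannon entropy (in bits) of the grey-level histogram."""
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())

# Hypothetical pixel values, for illustration only.
tm6 = [10, 12, 14, 16, 18, 20]
bands = {"TM4": [1, 2, 3, 4, 5, 6], "TM1": [5, 3, 6, 2, 7, 4]}
print(select_bands(tm6, bands, threshold=0.8))
print(image_entropy([0, 0, 1, 1]))
```

A higher entropy after fusion is read here as more grey-level detail in the image; whether the paper normalises the histogram in the same way is not stated in the abstract.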
Djimadoumngar, K. N.; Lee, J.; Bila, M. D.; Djoret, D.; Ichoku, C. M.
2016-12-01
Food security and water shortage from frequent droughts have been major issues in northern sub-Saharan Africa (NSSA). The shrinking Lake Chad is one example of a region experiencing severe droughts and insecure food production. One of the major challenges in the NSSA is the lack of data collection and monitoring systems to support decision-making in agriculture and water resources management. The present study aims to help better understand the hydrologic system of Lake Chad using multivariate regression models and to enhance the models to forecast the river discharge along the Chari-Logone river system, which contributes over 90% of the water flowing into the lake. As regressands, river discharge data from two monitoring stations at Bongor and Logone-Gana were collected for 2001-2007. The regressors include precipitation, soil moisture, soil and air temperature, specific humidity, evapotranspiration and surface runoff. Tropical Rainfall Measuring Mission (TRMM) data were used for precipitation, and all other regressor parameters were obtained from the Global Land Data Assimilation System (GLDAS). We performed cross-correlation analysis between the river discharge and each regressor to find the time lag giving the best correlation, which indicates the response time of the river discharge to changes in the other hydrological parameters. The estimated time lags were integrated into the multivariate regression model. The results show that, for the river discharge data, precipitation, soil moisture, and surface runoff have linear relationships, while evapotranspiration, soil and air temperature, and specific humidity have non-linear relationships. The observed river discharge and the predicted one, which is a function of precipitation and soil moisture, show a good match, with a correlation of 93%.
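The lag-selection step described above can be illustrated with a minimal sketch. It assumes evenly spaced series of equal length and uses synthetic numbers; `best_lag` is a hypothetical helper, not a function from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def best_lag(driver, response, max_lag):
    """Return (lag, r) maximising the correlation of driver[t] with response[t + lag]."""
    best = None
    for lag in range(max_lag + 1):
        x = driver[:len(driver) - lag]   # driver values that have a lagged partner
        y = response[lag:]               # response shifted back by `lag` steps
        r = pearson(x, y)
        if best is None or r > best[1]:
            best = (lag, r)
    return best

# Synthetic example: discharge responds to rainfall two time steps later.
rain = [0, 5, 1, 0, 8, 2, 0, 3, 9, 1, 0, 4]
discharge = [1, 1] + [2 * r + 1 for r in rain[:-2]]
lag, r = best_lag(rain, discharge, max_lag=4)
print(lag, round(r, 3))
```

Once a lag is chosen for each regressor, the shifted series would enter an ordinary multivariate regression as separate columns.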
Directory of Open Access Journals (Sweden)
Pao-Shin Chu
2007-01-01
In this study, a multivariate linear regression model is applied to predict the seasonal tropical cyclone (TC) count in the vicinity of Taiwan using large-scale climate variables available from the preceding May. Here the season encompasses the five-month period from June through October, when typhoons are most active in the study domain. The model is based on the least absolute deviation, so that regression estimates are more resistant (i.e., not unduly influenced by outliers) than those derived from the ordinary least squares method. Through lagged correlation analysis, five parameters (sea surface temperature, sea level pressure, precipitable water, low-level relative vorticity, and vertical wind shear) in key locations of the tropical western North Pacific are identified as predictor datasets. Results from cross-validation suggest that the statistical model is skillful in predicting TC activity, with a correlation coefficient of 0.63 for 1970 - 2003. If more recent data are included, the correlation coefficient reaches 0.69 for 1970 - 2006. The relative importance of each predictor variable is evaluated. For predicting higher than normal seasonal TC activity, warmer sea surface temperatures, a moist troposphere, and the presence of a low-level cyclonic circulation coupled with low-latitude westerlies in the Philippine Sea in the antecedent May appear to be important.
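To see why a least absolute deviation (LAD) fit resists outliers better than ordinary least squares, consider the following sketch with a single predictor. It is not the study's five-predictor model: it relies on the known property that, for a straight-line fit, some LAD-optimal line passes through two data points, so a brute-force search over pairs is exact for small samples. The data are synthetic and the function names are hypothetical.

```python
from itertools import combinations

def ols_line(xs, ys):
    """Ordinary least squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def lad_line(xs, ys):
    """Exact LAD line for one predictor, found by trying every pair of points."""
    best = None
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        if x1 == x2:
            continue  # a vertical line cannot be written as y = a*x + b
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        cost = sum(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
        if best is None or cost < best[0]:
            best = (cost, slope, intercept)
    return best[1], best[2]

# Synthetic data on the line y = 2x + 1, with one gross outlier at the end.
xs = list(range(10))
ys = [2.0 * x + 1.0 for x in xs]
ys[-1] = 100.0
print("OLS:", ols_line(xs, ys))
print("LAD:", lad_line(xs, ys))
```

With several predictors, LAD is usually solved as a linear programme or by iteratively reweighted least squares rather than by enumeration.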
Directory of Open Access Journals (Sweden)
Paulino José García Nieto
2015-06-01
The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. In the first place, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed and coefficients of determination equal to 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset, respectively. The agreement between experimental data and the model confirmed the good performance of the latter.
García Nieto, Paulino José; González Suárez, Victor Manuel; Álvarez Antón, Juan Carlos; Mayo Bayón, Ricardo; Sirgo Blanco, José Ángel; Díaz Fernández, Ana María
2015-01-01
The aim of this study was to obtain a predictive model able to perform an early detection of central segregation severity in continuous cast steel slabs. Segregation in steel cast products is an internal defect that can be very harmful when slabs are rolled in heavy plate mills. In this research work, the central segregation was studied with success using the data mining methodology based on multivariate adaptive regression splines (MARS) technique. For this purpose, the most important physical-chemical parameters are considered. The results of the present study are two-fold. In the first place, the significance of each physical-chemical variable on the segregation is presented through the model. Second, a model for forecasting segregation is obtained. Regression with optimal hyperparameters was performed and coefficients of determination equal to 0.93 for continuity factor estimation and 0.95 for average width were obtained when the MARS technique was applied to the experimental dataset, respectively. The agreement between experimental data and the model confirmed the good performance of the latter.
Directory of Open Access Journals (Sweden)
Kehinde Anthony Mogaji
2016-07-01
This study developed a GIS-based multivariate regression (MVR) yield rate prediction model of groundwater resource sustainability in the hard-rock geology terrain of southwestern Nigeria. This model can economically manage the aquifer yield rate potential predictions that are often overlooked in groundwater resources development. The proposed model relates the borehole yield rate inventory of the area to geoelectrically derived parameters. Three sets of borehole yield rate conditioning geoelectrically derived parameters, namely aquifer unit resistivity (ρ), aquifer unit thickness (D) and coefficient of anisotropy (λ), were determined from the acquired and interpreted geophysical data. The extracted borehole yield rate values and the geoelectrically derived parameter values were regressed to develop the MVR relationship model by applying linear regression and GIS techniques. The sensitivity analysis results of the MVR model evaluated at P ⩽ 0.05 for the predictors ρ, D and λ provided values of 2.68 × 10^-5, 2 × 10^-2 and 2.09 × 10^-6, respectively. The accuracy and predictive power tests conducted on the MVR model using the Theil inequality coefficient measurement approach, coupled with the sensitivity analysis results, confirmed the model yield rate estimation and prediction capability. The MVR borehole yield prediction model estimates were processed in a GIS environment to model an aquifer yield potential prediction map of the area. The information on the prediction map can serve as a scientific basis for predicting aquifer yield potential rates relevant in groundwater resources sustainability management. The developed MVR borehole yield rate prediction model provides a good alternative to other methods used for this purpose.
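The Theil inequality coefficient mentioned above can be computed as in the sketch below. The abstract does not say which variant was used, so this assumes the common U1 form, which ranges from 0 (perfect prediction) to 1; the yield values are invented for illustration.

```python
import math

def theil_u1(actual, predicted):
    """Theil inequality coefficient (U1 form): RMSE scaled by the RMS of both series."""
    n = len(actual)
    rmse = math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / n)
    scale = (math.sqrt(sum(p * p for p in predicted) / n)
             + math.sqrt(sum(a * a for a in actual) / n))
    return rmse / scale

# Hypothetical observed and predicted borehole yield rates.
observed = [1.2, 0.8, 2.5, 1.9]
predicted = [1.0, 0.9, 2.2, 2.1]
print(round(theil_u1(observed, predicted), 4))
```

Values close to 0 indicate that predictions track observations closely relative to their overall magnitude.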
Qi, Danyi; Roe, Brian E
2016-01-01
We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.
Directory of Open Access Journals (Sweden)
Shahab Karimi
2014-01-01
In this study, the effects of ratios of dolomite, base/acid, silica, SiO2/Al2O3, and Fe2O3/CaO, base and acid oxides, and 11 oxides (SiO2, Al2O3, CaO, MgO, MnO, Na2O, K2O, Fe2O3, TiO2, P2O5, and SO3) on ash fusion temperatures for 1040 US coal samples from 12 states were evaluated using regression and adaptive neurofuzzy inference system (ANFIS) methods. Different combinations of independent variables were examined to predict ash fusion temperatures in the multivariable procedure. The combination of the “11 oxides + (Base/Acid) + Silica ratio” was the best predictor. Correlation coefficients (R2) of 0.891, 0.917, and 0.94 were achieved using nonlinear equations for the prediction of initial deformation temperature (IDT), softening temperature (ST), and fluid temperature (FT), respectively. The mentioned “best predictor” was used as input to the ANFIS system as well, and the correlation coefficients (R2) of the prediction were enhanced to 0.97, 0.98, and 0.99 for IDT, ST, and FT, respectively. The prediction precision that was achieved in this work exceeded that reported in previously published works.
Berg, Gregory D; Donnelly, Shawn; Warnick, Kathleen; Medina, Wendie; Miller, Mary
2014-07-03
The prevalence of schizophrenia and depression in the United States is far higher among Medicaid recipients than in the general population. Individuals suffering from mental illness, including schizophrenia and depression, also have higher rates of emergency department utilization, which is costly and may not generate the positive health outcomes desired. Disease management programs strive to help individuals suffering from chronic illnesses better manage their condition(s) and seek health care in the appropriate settings. The objective of this manuscript is to estimate a dose-response impact on hospital inpatient and emergency room utilizations for any reason by Medicaid recipients with depression or schizophrenia who received disease management contacts. Multivariate regression analysis of panel data taken from administrative claims was conducted to test the hypothesis that increased contacts lower the likelihood of all-cause inpatient admissions and emergency room visits. Subjects included 6,274 members of Illinois' non-institutionalized Medicaid-only aged, blind or disabled population diagnosed with depression or schizophrenia. The statistical measure is the odds ratio. The odds ratio association is between the monthly utilization indicators and the number of contacts (doses) a member had for each particular disease management intervention. Higher numbers of intervention contacts for Medicaid recipients diagnosed with depression or schizophrenia were associated with statistically significant reductions in all-cause inpatient admissions and emergency room utilizations. There is a high correlation between depression and schizophrenia disease management contacts and lowered all-cause hospital inpatient and emergency room utilizations.
Grinn-Gofroń, Agnieszka; Strzelczak, Agnieszka
2009-11-01
A study was made of the link between time of day, weather variables and the hourly content of certain fungal spores in the atmosphere of the city of Szczecin, Poland, in 2004-2007. Sampling was carried out with a Lanzoni 7-day-recording spore trap. The spores analysed belonged to the taxa Alternaria and Cladosporium. These spores were selected both for their allergenic capacity and for their high level presence in the atmosphere, particularly during summer. Spearman correlation coefficients between spore concentrations, meteorological parameters and time of day showed different indices depending on the taxon being analysed. Relative humidity (RH), air temperature, air pressure and clouds most strongly and significantly influenced the concentration of Alternaria spores. Cladosporium spores correlated less strongly and significantly than Alternaria. Multivariate regression tree analysis revealed that, at air pressures lower than 1,011 hPa the concentration of Alternaria spores was low. Under higher air pressure spore concentrations were higher, particularly when RH was lower than 36.5%. In the case of Cladosporium, under higher air pressure (>1,008 hPa), the spores analysed were more abundant, particularly after 0330 hours. In artificial neural networks, RH, air pressure and air temperature were the most important variables in the model for Alternaria spore concentration. For Cladosporium, clouds, time of day, air pressure, wind speed and dew point temperature were highly significant factors influencing spore concentration. The maximum abundance of Cladosporium spores in air fell between 1200 and 1700 hours.
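The Spearman coefficients used above are Pearson correlations computed on ranks, which makes them sensitive to any monotone relationship rather than only a linear one. A minimal sketch follows, with tied values given their average rank; the humidity and spore values are invented and the helper names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical hourly readings: spore counts fall as relative humidity rises.
humidity = [35, 42, 50, 61, 70, 88]
spores = [900, 640, 410, 300, 120, 15]
print(spearman(humidity, spores))
```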
Alamdari, R F; Mani-Varnosfaderani, A; Asadollahi-Baboli, M; Khalafi-Nezhad, A
2012-10-01
The present work focuses on the development of an interpretable quantitative structure-activity relationship (QSAR) model for predicting the anti-HIV activities of 67 thiazolylthiourea derivatives. This set of molecules has been proposed as potent HIV-1 reverse transcriptase inhibitors (RT-INs). The molecules were encoded to a diverse set of molecular descriptors, spanning different physical and chemical properties. Monte Carlo (MC) sampling and multivariate adaptive regression spline (MARS) techniques were used to select the most important descriptors and to predict the activity of the molecules. The most important descriptor was found to be the asphericity index. The analysis of variance (ANOVA) and interpretable spline equations showed that the geometrical shape of the molecules has a considerable effect on their activities. It seems that the linear molecules are more active than symmetric top compounds. The final MARS model displayed good predictive ability, judging from the determination coefficient corresponding to the leave-multiple-out (LMO) cross-validation technique, i.e. r² = 0.828 (M = 12) and r² = 0.813 (M = 20). The results of this work showed that the developed spline model is robust, has good predictive power, and can therefore be used as a reliable tool for designing novel HIV-1 RT-INs.
2016-01-01
We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents’ food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits. PMID:27441687
Directory of Open Access Journals (Sweden)
Jairo Vanegas
2017-05-01
Multivariate Adaptive Regression Splines (MARS) is a non-parametric modelling method that extends the linear model by incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of predictive models: it selects relevant variables, transforms the predictor variables, handles missing values and prevents overfitting through a self-test. It can also make predictions that take into account structural factors that might influence the response variable, generating hypothetical models. The end result can be used to identify relevant cut-off points in data series. It is rarely used in the health field, so it is proposed here as one more tool for evaluating relevant public health indicators. For demonstration purposes, data series on mortality of children under 5 years of age in Costa Rica for the period 1978-2008 were used.
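The cut-off points mentioned above arise because a MARS model is a sum of hinge functions, each of which is zero on one side of a knot and linear on the other. The sketch below only evaluates such a model for one predictor; the knot and coefficients are invented, and the forward and backward passes that a real MARS fit uses to choose them are not shown.

```python
def hinge(x, knot, direction=1):
    """MARS basis function: max(0, x - knot) for direction=1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate a one-predictor MARS-style model; terms are (coefficient, knot, direction)."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical model with a single knot at x = 5: the slope changes there.
terms = [(2.0, 5.0, 1), (-1.0, 5.0, -1)]
for x in (3.0, 5.0, 7.0):
    print(x, mars_predict(x, intercept=10.0, terms=terms))
```

In a time series setting, a fitted knot location is what would be read as a candidate cut-off point.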
Buscot, Marie-Jeanne; Wotherspoon, Simon S; Magnussen, Costan G; Juonala, Markus; Sabin, Matthew A; Burgner, David P; Lehtimäki, Terho; Viikari, Jorma S A; Hutri-Kähönen, Nina; Raitakari, Olli T; Thomson, Russell J
2017-06-06
Bayesian hierarchical piecewise regression (BHPR) modeling has not been previously formulated to detect and characterise the mechanism of trajectory divergence between groups of participants that have longitudinal responses with distinct developmental phases. These models are useful when participants in a prospective cohort study are grouped according to a distal dichotomous health outcome. Indeed, a refined understanding of how deleterious risk factor profiles develop across the life-course may help inform early-life interventions. Previous techniques to determine between-group differences in risk factors at each age may result in biased estimate of the age at divergence. We demonstrate the use of Bayesian hierarchical piecewise regression (BHPR) to generate a point estimate and credible interval for the age at which trajectories diverge between groups for continuous outcome measures that exhibit non-linear within-person response profiles over time. We illustrate our approach by modeling the divergence in childhood-to-adulthood body mass index (BMI) trajectories between two groups of adults with/without type 2 diabetes mellitus (T2DM) in the Cardiovascular Risk in Young Finns Study (YFS). Using the proposed BHPR approach, we estimated the BMI profiles of participants with T2DM diverged from healthy participants at age 16 years for males (95% credible interval (CI):13.5-18 years) and 21 years for females (95% CI: 19.5-23 years). These data suggest that a critical window for weight management intervention in preventing T2DM might exist before the age when BMI growth rate is naturally expected to decrease. Simulation showed that when using pairwise comparison of least-square means from categorical mixed models, smaller sample sizes tended to conclude a later age of divergence. In contrast, the point estimate of the divergence time is not biased by sample size when using the proposed BHPR method. BHPR is a powerful analytic tool to model long-term non
Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele
2015-04-01
first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, using the Logistic Regression and Random Forests techniques, which gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.
Directory of Open Access Journals (Sweden)
Roberts MH
2012-03-01
Melissa H Roberts (1), Anand A Dalal (2). (1) Lovelace Clinic Foundation (Lovelace Respiratory Research Institute at the time of the study), Albuquerque, NM; (2) US Health Outcomes, GlaxoSmithKline, Durham, NC, USA. Purpose: To investigate equivalency of results from multivariable regression (MR) and propensity score matching (PSM) models, observational research methods used to mitigate bias stemming from non-randomization (and consequently unbalanced groups at baseline), using, as an example, a large study of chronic obstructive pulmonary disease (COPD) initial maintenance therapy. Methods: Patients were 32,338 health plan members, age ≥40 years, with COPD initially treated with fluticasone propionate/salmeterol combination (FSC), tiotropium (TIO), or ipratropium (IPR) alone or in combination with albuterol. Using MR and PSM methods, the proportion of patients with COPD-related health care utilization, mean costs, odds ratios (ORs), and incidence rate ratios (IRRs) for utilization events were calculated for the 12 months following therapy initiation. Results: Of 12,595 FSC, 9126 TIO, and 10,617 IPR patients meeting MR inclusion criteria, 89.1% (8135) of TIO and 80.2% (8514) of IPR patients were matched to FSC patients for the PSM analysis. Methods produced substantially similar findings for mean cost comparisons, ORs, and IRRs for most utilization events. In contrast to MR, for TIO compared to FSC, PSM did not produce statistically significant ORs for hospitalization or outpatient visit with antibiotic or significant IRRs for hospitalization or outpatient visit with oral corticosteroid. As in the MR analysis, compared to FSC, ORs and IRRs for all other utilization events, as well as mean costs, were less favorable for IPR and TIO. Conclusion: In this example of an observational study of maintenance therapy for COPD, more than 80% of the original treatment groups used in the MR analysis were matched to comparison treatment groups for the PSM analysis. While some sample
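The matching step in a PSM analysis explains why only part of each treatment group survives into the matched sample. The sketch below shows one simple variant, greedy 1:1 nearest-neighbour matching within a caliper on already estimated propensity scores. The abstract does not state which matching algorithm the study used, so this is an assumption, and the patient identifiers and scores are invented.

```python
def greedy_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbour matching on propensity scores.

    treated and controls map a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs; each control is used once.
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs

# Hypothetical propensity scores.
treated = {"t1": 0.30, "t2": 0.70, "t3": 0.95}
controls = {"c1": 0.32, "c2": 0.68, "c3": 0.10}
pairs = greedy_match(treated, controls, caliper=0.05)
print(pairs, f"{len(pairs) / len(treated):.0%} of treated matched")
```

Patients with no sufficiently close counterpart are dropped, which is why a matched analysis can lose statistical power relative to regression on the full sample.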
Institute of Scientific and Technical Information of China (English)
袁平; 丁峰
2008-01-01
By using the Kronecker product, an identification model for multivariable ARX-like stochastic systems is derived, and a hierarchical least squares parameter estimation algorithm is developed using the hierarchical identification principle. The proposed hierarchical least squares algorithm requires less computational effort than the existing recursive least squares algorithm. A simulation example is included.
Keegan, John P.; Chan, Fong; Ditchman, Nicole; Chiu, Chung-Yi
2012-01-01
The main objective of this study was to validate Pender's Health Promotion Model (HPM) as a motivational model for exercise/physical activity self-management for people with spinal cord injuries (SCIs). Quantitative descriptive research design using hierarchical regression analysis (HRA) was used. A total of 126 individuals with SCI were recruited…
Directory of Open Access Journals (Sweden)
Sepedeh Gholizadeh
2016-07-01
Background: Obesity and hypertension are among the most important non-communicable diseases; in many studies, their prevalence and risk factors have been analysed univariately within each geographic region. Studying the factors that affect obesity and hypertension jointly may play an important role, and this is addressed in this study. Materials & Methods: This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times and the average was considered as one of the response variables. Hypertension was defined as systolic blood pressure ≥140 and/or diastolic blood pressure ≥90, and obesity was defined as body mass index ≥25. Data were analyzed using a multilevel, multivariate logistic regression model in MLwiN software. Results: The intraclass correlations at the cluster level were 33% for high blood pressure and 37% for obesity, so a two-level model was fitted to the data. The prevalence of obesity and hypertension was 43.6% (95% CI: 40.6-46.5) and 29.4% (95% CI: 26.6-32.1), respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption and physical activity were the factors affecting blood pressure (p≤0.05). Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity and place of residence affected obesity (p≤0.05). Conclusion: Multilevel models, by taking the distribution across levels into account, provide more precise estimates. Since obesity and hypertension are major risk factors for cardiovascular disease, identifying the high-risk groups allows careful planning for the prevention of non-communicable diseases and the promotion of public health.
Bonsu, Bema K; Harper, Marvin B
2004-06-01
Although accurate models for predicting acute bacterial meningitis exist, most have narrow application because of the specific variables selected for them. In this study, we estimate the accuracy of a simple new model with potentially broader applicability. On the basis of previous reports, we created a reduced multivariable logistic regression model for predicting bacterial meningitis that relies on age (years) (AGE), cerebrospinal fluid (CSF) total protein (TP) and total neutrophil count (TNC) alone. Data were from children ages 1 month-18 years diagnosed with acute enteroviral or bacterial meningitis whose initial CSF revealed >7 white blood cells/mm³. A fractional polynomial model was specified and validated internally by the bootstrap procedure. The area under the receiver operating characteristic curve (discrimination: criterion standard, >0.7), the Hosmer-Lemeshow deciles-of-risk statistic (calibration: criterion standard, P > 0.05) and sensitivity-specificity pairs at prespecified probability thresholds of the model were computed. We identified 60 children with bacterial meningitis and 82 with enteroviral meningitis. At an area under the receiver operating characteristic curve of 0.97, our model, represented by the equation: log odds of bacterial meningitis = 0.343 - 0.003 TNC - 34.802 TP + 21.991 TP - 0.345 AGE, was highly accurate when differentiating between bacterial and enteroviral meningitis. The model fit the data well (Hosmer-Lemeshow statistic; P = 0.53). At probability cutoffs between 0.1 and 0.4, the model had sensitivity values between 98 and 92% and specificity values between 62 and 94%. Among children with CSF pleocytosis, a prediction model based exclusively on age, CSF total protein and CSF neutrophils differentiates accurately between acute bacterial and viral meningitis.
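The trade-off between sensitivity and specificity at different probability cutoffs, as reported above, follows from how a logistic model's log odds are turned into a classification. The sketch below shows that conversion and the two measures on invented probabilities and labels. It does not implement the published equation, whose fractional polynomial terms are not fully legible in the abstract.

```python
import math

def probability_from_log_odds(log_odds):
    """Inverse logit: convert log odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def sensitivity_specificity(probs, labels, threshold):
    """Classify as positive when prob >= threshold; labels are 1 (disease) or 0."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= threshold)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < threshold)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < threshold)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical predicted probabilities and true labels.
probs = [0.9, 0.3, 0.05, 0.2, 0.6]
labels = [1, 1, 0, 0, 0]
for cutoff in (0.1, 0.4):
    print(cutoff, sensitivity_specificity(probs, labels, cutoff))
```

Raising the cutoff can only lower or keep sensitivity and raise or keep specificity, which matches the direction of the ranges quoted in the abstract.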
Accurate, nonintrusive, and inexpensive techniques are needed to measure energy expenditure (EE) in free-living populations. Our primary aim in this study was to validate cross-sectional time series (CSTS) and multivariate adaptive regression splines (MARS) models based on observable participant cha...
Jackson, Dan; White, Ian R; Riley, Richard D
2013-03-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
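For reference, the univariate DerSimonian and Laird estimator that the proposed method reduces to in one dimension can be sketched as follows. This is the standard textbook form, not the paper's multivariate generalisation, and the effect sizes and variances are invented.

```python
def dersimonian_laird(effects, variances):
    """Univariate method-of-moments estimate of between-study variance tau^2.

    Returns (tau2, pooled), where pooled is the random-effects weighted mean.
    """
    w = [1.0 / v for v in variances]                 # fixed-effect weights
    sw = sum(w)
    mean_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - mean_fixed) ** 2 for wi, yi in zip(w, effects))
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)               # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]   # random-effects weights
    pooled = sum(ws * yi for ws, yi in zip(w_star, effects)) / sum(w_star)
    return tau2, pooled

# Hypothetical study effects and within-study variances.
print(dersimonian_laird([0.0, 2.0], [1.0, 1.0]))
print(dersimonian_laird([0.2, 0.5, 0.9], [0.04, 0.09, 0.05]))
```

The truncation at zero reflects that a variance cannot be negative even when the heterogeneity statistic Q falls below its expected value.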
Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.
2017-02-01
Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. The Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for the different time-scales at which drought occurs and, due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. The MARS model incorporated rainfall as a mandatory predictor, with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. Performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r²). The best MARS model required different input combinations: rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r² value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS outperformed the other models for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded a more accurate result. For droughts identified by SPI ≤ -0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the
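The three evaluation metrics named in this abstract can be computed as in the sketch below. It assumes that the coefficient of determination is the squared Pearson correlation between observed and forecast values, which the abstract does not state; the SPI values are invented.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation (an assumed definition of r^2)."""
    mo, mp = sum(obs) / len(obs), sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


# Invented observed and forecast index values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.5, 2.5, 3.5]
print(rmse(observed, forecast), mae(observed, forecast), r_squared(observed, forecast))
```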
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross
Institute of Scientific and Technical Information of China (English)
丁锋; 王艳娇
2014-01-01
According to the hierarchical identification principle, this paper presents the hierarchical stochastic gradient algorithms and the hierarchical gradient based iterative algorithms, the hierarchical least squares algorithms and the hierarchical least squares based iterative algorithms for multivariable equation-error-like systems and multivariable equation-error ARMA-like systems, and further derives the hierarchical multi-innovation gradient algorithms and the hierarchical multi-innovation least squares algorithms. In order to reduce computational burdens, this paper derives the filtering based hierarchical identification algorithms and the filtering based hierarchical multi-innovation identification algorithms for multivariable equation-error ARMA-like systems using the filtering technique. Finally, the computational efficiency and the computational steps of some typical identification algorithms are discussed.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reports of surgical complications are common, but few provide information about severity or estimate risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis; 11 significant variables entered into the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation $p = \exp(\sum B_i X_i) / (1 + \exp(\sum B_i X_i))$, a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: $p = 1/(1 + e^{4.810 - 1.287X_1 - 0.504X_2 - 0.500X_3 - 0.474X_4 - 0.405X_5 - 0.318X_6 - 0.316X_7 - 0.305X_8 - 0.278X_9 - 0.255X_{10} - 0.138X_{11}})$. The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
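The reported risk equation can be evaluated directly, as in the sketch below. The intercept and the eleven coefficients are copied from the abstract; how each variable X1..X11 is coded is not given there, so the 0/1 inputs used here are purely hypothetical.

```python
import math

# Intercept and coefficients as reported in the abstract.
INTERCEPT = 4.810
COEFFS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
          0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """p = 1 / (1 + exp(4.810 - sum(b_i * x_i))) for an 11-element input x."""
    if len(x) != len(COEFFS):
        raise ValueError("expected 11 predictor values")
    linear = sum(b * xi for b, xi in zip(COEFFS, x))
    return 1.0 / (1.0 + math.exp(INTERCEPT - linear))


def complication_risk_alt(x):
    """Same model written as exp(z) / (1 + exp(z)) with z = -4.810 + sum(b_i * x_i)."""
    z = -INTERCEPT + sum(b * xi for b, xi in zip(COEFFS, x))
    return math.exp(z) / (1.0 + math.exp(z))


# Hypothetical codings: no risk factor present vs. every factor coded as 1.
baseline = complication_risk([0] * 11)
all_present = complication_risk([1] * 11)
print(round(baseline, 4), round(all_present, 4))
```

The two functions agree, which confirms that the general logistic form and the fitted equation in the abstract use consistent signs.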
Yamakoshi, Yasuhiro; Ogawa, Mitsuhiro; Yamakoshi, Takehiro; Tamura, Toshiyo; Yamakoshi, Ken-ichi
2009-01-01
A novel optical non-invasive in vivo blood glucose concentration (BGL) measurement technique, named "Pulse Glucometry", was combined with a kernel method, support vector machines. The total transmitted radiation intensity (I(λ)) and the cardiac-related pulsatile changes superimposed on I(λ) in human adult fingertips were measured over the wavelength range from 900 to 1700 nm using a very fast spectrophotometer, obtaining a differential optical density (ΔOD(λ)) related to the blood component in the finger tissues. Subsequently, a calibration model using paired data of a family of ΔOD(λ)s and the corresponding known BGLs was constructed with support vector machine (SVM) regression instead of calibration by conventional principal component regression (PCR) and partial least squares regression (PLS). Second, the SVM method was applied to make a nonlinear discriminant calibration model for "Pulse Glucometry". Our results show that the regression calibration model based on support vector machines can provide a good regression for the 101 paired data, in which the BGLs ranged from 89.0-219 mg/dl (4.94-12.2 mmol/l). The resultant regression was evaluated by Clarke error grid analysis and all data points fell within the clinically acceptable regions (region A: 93%, region B: 7%). The discriminant calibration model using SVMs also provided a good result for classification (accuracy rate 84% in the best case).
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially with regard to safety. Installing efficient, safe brakes on modern railway vehicles is therefore essential. Attention is currently devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the subjects investigated are cast-iron brake shoes, which are still widely used on freight wagons. Cast-iron brake shoes with lamellar graphite and a high phosphorus content (0.8-1.1%) therefore need special investigation. To establish the optimal conditions for cast-iron brake shoes, we propose a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. To establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models are proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for
Rebechi, S R; Vélez, M A; Vaira, S; Perotti, M C
2016-02-01
The aims of the present study were to test the accuracy of the fatty acid ratios established by the Argentinean Legislation to detect adulterations of milk fat with animal fats and to propose a regression model suitable to evaluate these adulterations. For this purpose, 70 milk fat, 10 tallow and 7 lard fat samples were collected and analyzed by gas chromatography. Data was utilized to simulate arithmetically adulterated milk fat samples at 0%, 2%, 5%, 10% and 15%, for both animal fats. The fatty acids ratios failed to distinguish adulterated milk fats containing less than 15% of tallow or lard. For each adulterant, Multiple Linear Regression (MLR) was applied, and a model was chosen and validated. For that, calibration and validation matrices were constructed employing genuine and adulterated milk fat samples. The models were able to detect adulterations of milk fat at levels greater than 10% for tallow and 5% for lard. Copyright © 2015 Elsevier Ltd. All rights reserved.
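The arithmetic simulation of adulterated samples described here amounts to linear mixing of fatty acid profiles. The sketch below shows that mixing step and a one-predictor least-squares calibration; the study used multiple linear regression on measured profiles, whereas the two-component profiles and all numeric values here are invented for illustration.

```python
def mix(milk, adulterant, fraction):
    """Fatty acid profile of milk fat containing `fraction` (0..1) of adulterant."""
    return {fa: (1.0 - fraction) * milk[fa] + fraction * adulterant[fa] for fa in milk}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


# Hypothetical profiles (g per 100 g fat); not measured values.
milk_fat = {"C14:0": 11.0, "C18:0": 10.0}
tallow = {"C14:0": 3.0, "C18:0": 20.0}

levels = [0.0, 0.02, 0.05, 0.10, 0.15]          # adulteration levels from the abstract
stearic = [mix(milk_fat, tallow, f)["C18:0"] for f in levels]

# Calibrate adulteration level against a single fatty acid (simplification).
slope, intercept = fit_line(stearic, levels)
predicted = slope * mix(milk_fat, tallow, 0.10)["C18:0"] + intercept
print(round(predicted, 3))
```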
Directory of Open Access Journals (Sweden)
Fontez B.
2014-04-01
Back-calculation makes it possible to increase the available data on fish growth. The accuracy of back-calculation models is of paramount importance for growth analysis. Frequentist and Bayesian hierarchical approaches were used for regression between fish body size and scale size for the rare fish species Zingel asper. The Bayesian approach permits more reliable estimation of back-calculated size, taking into account biological information and cohort variability. This method greatly improves estimation of back-calculated length when sampling is uneven and/or small.
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
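The idea of training a logistic regression classifier on the ground and then evaluating it as a deployment trigger can be sketched as follows. This is a generic one-feature illustration under assumed inputs, not the flight design: the standardized velocity values, the labels and the 0.5 threshold are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.5, steps=2000):
    """Fit p(y=1|x) = sigmoid(w*x + b) by batch gradient descent on the log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= lr * sum(e * x for e, x in zip(errors, xs)) / n
        b -= lr * sum(errors) / n
    return w, b


# "Pre-flight" training data: standardized velocity, label 1 = deploy condition met.
velocity = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
deploy = [1, 1, 1, 0, 0, 0]
w, b = fit_logistic(velocity, deploy)


def trigger(x, threshold=0.5):
    """'In-flight' evaluation: cheap, since only w and b need to be stored."""
    return sigmoid(w * x + b) >= threshold


print(trigger(-1.0), trigger(1.0))
```

The split mirrors the abstract: fitting happens offline, while the onboard step is a single dot product and comparison.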
Formisano, Elia; De Martino, Federico; Valente, Giancarlo
2008-09-01
Machine learning and pattern recognition techniques are being increasingly employed in functional magnetic resonance imaging (fMRI) data analysis. By taking into account the full spatial pattern of brain activity measured simultaneously at many locations, these methods allow detecting subtle, non-strictly localized effects that may remain invisible to the conventional analysis with univariate statistical methods. In typical fMRI applications, pattern recognition algorithms "learn" a functional relationship between brain response patterns and a perceptual, cognitive or behavioral state of a subject expressed in terms of a label, which may assume discrete (classification) or continuous (regression) values. This learned functional relationship is then used to predict the unseen labels from a new data set ("brain reading"). In this article, we describe the mathematical foundations of machine learning applications in fMRI. We focus on two methods, support vector machines and relevance vector machines, which are respectively suited for the classification and regression of fMRI patterns. Furthermore, by means of several examples and applications, we illustrate and discuss the methodological challenges of using machine learning algorithms in the context of fMRI data analysis.
Kovačević, Strahinja Z; Podunavac Kuzmanović, Sanja O; Jevrić, Lidija R
2013-01-01
In the present study, principal component analysis (PCA) followed by principal component regression (PCR) and partial least squares (PLS) method was applied in order to identify the most important in silico molecular descriptors and quantify their influence on antifungal activity (expressed as minimal inhibitory concentration) of selected benzoxazole and oxazolo[4,5-b]pyridine derivatives against Candida albicans. PLS regression showed the best statistical performance, according to the lowest value of the standard error (root mean square errors of calibration of 0.02526 and cross-validation of 0.04533), while PCR model was characterized by root mean square errors of calibration of 0.03176 and cross-validation of 0.05661. The most important descriptors in both PLS and PCR model are solubility in water, expressed as AClogS and ABlogS, and lipophilicity, expressed as XlogP2 and ABlogP. Very good predictive ability of the established models, confirmed by corresponding statistical parameters, allows us to estimate antifungal activity of structurally similar compounds.
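The two error measures compared in this abstract, root mean square error of calibration and of cross-validation, can be illustrated with a minimal sketch. It uses a one-predictor least-squares line and leave-one-out cross-validation rather than PCR or PLS, divides by the number of samples (some chemometric conventions subtract degrees of freedom), and runs on invented data.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def rms(residuals):
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


def rmsec(xs, ys):
    """Calibration error: the model is fitted and evaluated on the same samples."""
    slope, intercept = fit_line(xs, ys)
    return rms([y - (slope * x + intercept) for x, y in zip(xs, ys)])


def rmsecv(xs, ys):
    """Leave-one-out cross-validation error: each sample is predicted by a
    model fitted without it."""
    residuals = []
    for k in range(len(xs)):
        slope, intercept = fit_line(xs[:k] + xs[k + 1:], ys[:k] + ys[k + 1:])
        residuals.append(ys[k] - (slope * xs[k] + intercept))
    return rms(residuals)


# Invented descriptor/activity pairs, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]
print(rmsec(x, y), rmsecv(x, y))
```

The cross-validation error is never smaller than the calibration error for a least-squares fit, which matches the ordering of the values reported for both models.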
Garrido, M; Larrechi, M S; Rius, F X
2004-12-01
The present study investigates the relationship between the changes in complex viscosity and near-infrared spectra. Principal component regression analysis is applied to a near-infrared data set obtained from the in situ monitoring of the curing of diglycidyl ether of bisphenol A with the diamine 4,4'-diaminodiphenylmethane. The values of complex viscosity obtained by dynamic mechanical analysis during the cure process were used as a reference. The near-infrared spectra recorded throughout the reaction, unlike the univariate data analysis at some wavelengths of the spectra, contain a sufficient amount of information to estimate the complex viscosity. The relationship found was high and the results demonstrate the quality of the fitted model. Also, a simple user-friendly procedure for applying the model, focused on the user, is shown.
Directory of Open Access Journals (Sweden)
Regis Wendpouire Oubida
2015-03-01
Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13 to 0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.
Directory of Open Access Journals (Sweden)
Gardênia Abbad
2002-01-01
This article discusses applications of stepwise and hierarchical multiple regression analyses to research in organizational psychology. Strategies for identifying type I and II errors, and solutions to potential problems that may arise from such errors are proposed. In addition, phenomena such as suppression, complementarity, and redundancy are reviewed. The article presents examples of research where these phenomena occurred, and the manner in which they were explained by researchers. Some applications of multiple regression analyses to studies involving between-variable interactions are presented, along with tests used to analyze the presence of linearity among variables. Finally, some suggestions are provided for dealing with limitations implicit in multiple regression analyses (stepwise and hierarchical).
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in the emission characteristics of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R² > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, low-cost with no prior analyte extraction or separation
Rounaghi, Mohammad Mahdi; Abbaszadeh, Mohammad Reza; Arashi, Mohammad
2015-11-01
One of the most important topics of interest to investors is stock price changes. Investors whose goals are long term are sensitive to stock price and its changes and react to them. In this regard, we used a multivariate adaptive regression splines (MARS) model and a semi-parametric splines technique for predicting stock price in this study. The MARS model, as a nonparametric method, is an adaptive method for regression that suits problems with high dimensions and several variables. A semi-parametric splines technique was used in this study; smoothing splines is a nonparametric regression method. In this study, we used 40 variables (30 accounting variables and 10 economic variables) for predicting stock price using the MARS model and the semi-parametric splines technique. After investigating the models, we selected 4 accounting variables (book value per share, predicted earnings per share, P/E ratio and risk) as variables influencing the prediction of stock price using the MARS model. After fitting the semi-parametric splines technique, only 4 accounting variables (dividends, net EPS, EPS forecast and P/E ratio) were selected as variables effective in forecasting stock prices.
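MARS models are built from piecewise-linear hinge functions of the form max(0, x - t) and max(0, t - x), where t is a knot. The sketch below only evaluates a small fitted-looking model of that form; the knot, the coefficients and the single predictor are hypothetical, and the adaptive forward/backward term selection that MARS performs is not shown.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) for direction=+1,
    max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum(coef * hinge(x, knot, direction)).

    terms is a list of (coef, knot, direction) tuples.
    """
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)


# Hypothetical model with a single knot at x = 1.0 and a mirrored pair of hinges.
terms = [(2.0, 1.0, +1), (-0.5, 1.0, -1)]

print(mars_predict(3.0, 1.0, terms))  # right of the knot: slope 2.0 applies
print(mars_predict(0.0, 1.0, terms))  # left of the knot: slope from the mirrored hinge
```

Because each hinge is zero on one side of its knot, the fitted function can change slope at the knot, which is how the method adapts to nonlinear relationships.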
Directory of Open Access Journals (Sweden)
Paulino José García Nieto
2016-05-01
Remaining useful life (RUL) estimation is considered one of the most central points in prognostics and health management (PHM). The present paper describes a nonlinear hybrid ABC–MARS-based model for the prediction of the remaining useful life of aircraft engines. Indeed, it is well-known that an accurate RUL estimation allows failure prevention in a more controllable way so that effective maintenance can be carried out at the appropriate time to correct impending faults. The proposed hybrid model combines multivariate adaptive regression splines (MARS), which have been successfully adopted for regression problems, with the artificial bee colony (ABC) technique. This optimization technique involves parameter setting in the MARS training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been successfully predicted here by using the hybrid ABC–MARS-based model from the remaining measured parameters (input variables) for aircraft engines. A correlation coefficient equal to 0.92 was obtained when this hybrid ABC–MARS-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. The main advantage of this predictive model is that it does not require information about the previous operation states of the aircraft engine.
Gallidabino, M; Romolo, F S; Weyermann, C
2017-03-01
Estimating the time since discharge of spent cartridges can be a valuable tool in the forensic investigation of firearm-related crimes. To reach this aim, it was previously proposed that the decrease of volatile organic compounds released during discharge is monitored over time using non-destructive headspace extraction techniques. While promising results were obtained for large-calibre cartridges (e.g., shotgun shells), handgun calibres yielded unsatisfying results. In addition to the natural complexity of the specimen itself, these can also be attributed to some selective choices in the methods development. Thus, the present series of papers aimed to systematically evaluate the potential of headspace analysis to estimate the time since discharge of cartridges through the use of more comprehensive analytical and interpretative techniques. Following the comprehensive optimisation and validation of an exhaustive headspace sorptive extraction (HSSE) method in the first part of this work, the present paper addresses the application of chemometric tools in order to systematically evaluate the potential of applying headspace analysis to estimate the time since discharge of 9 mm Geco cartridges. Several multivariate regression and pre-treatment methods were tested and compared to univariate models based on non-linear regression. Random forests (RF) and partial least squares (PLS) preceded by pairwise log-ratios normalisation (PLR) showed the best results, and allowed estimation of time since discharge up to 48 h of ageing and differentiation of recently fired from older cartridges (e.g., less than 5 h compared to more than 1-2 days). The proposed multivariate approaches showed significant improvement compared to univariate models. The effects of storage conditions were also tested and results demonstrated that temperature, humidity and cartridge position should be taken into account when estimating the time since discharge.
Directory of Open Access Journals (Sweden)
Tsair-Fwu Lee
PURPOSE: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. METHODS AND MATERIALS: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. RESULTS: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values
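The way LASSO discards prognostic factors can be illustrated with the soft-thresholding operator that many LASSO solvers apply to each coefficient. The sketch below is a simplification: it covers the special case of a linear model with orthonormal predictors, where the LASSO solution is the soft-thresholded least-squares coefficient, not the penalized logistic model with bootstrapping used in the study. Factor names, coefficient values and the penalty are invented.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam and set it to exactly zero if |z| <= lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """LASSO coefficients for orthonormal predictors (objective
    0.5 * ||y - X b||^2 + lam * ||b||_1): soft-threshold each OLS coefficient."""
    return {name: soft_threshold(b, lam) for name, b in ols_coefs.items()}


# Hypothetical unpenalized coefficients for four generic factors.
ols = {"factor_a": 0.9, "factor_b": -0.6, "factor_c": 0.2, "factor_d": -0.1}
penalized = lasso_orthonormal(ols, lam=0.3)
selected = [name for name, b in penalized.items() if b != 0.0]
print(selected)
```

Raising the penalty zeroes out more coefficients, which is the mechanism behind selecting a smaller ("suboptimal") set of factors.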
Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi
2013-07-01
What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly and positively associated with the likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and freezing because of OHSS risk to be positively, and damaged blastomeres negatively, associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors affecting the clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experience and the literature. With thresholds of α entry = α removal = 0.05 for variable entry and removal, eight variables were chosen for the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under receiver operating characteristic
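The odds ratios and intervals quoted in this abstract can be sanity-checked on the log scale. The sketch below assumes the intervals are 95% Wald intervals built as exp(ln OR ± 1.96 × SE), which the abstract does not state; under that assumption the geometric midpoint of each interval should reproduce the reported OR, and the standard error of ln OR can be backed out.

```python
import math


def log_scale_check(lower, upper, z=1.96):
    """Return (geometric midpoint of the CI, implied standard error of ln OR)."""
    midpoint = math.sqrt(lower * upper)
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * z)
    return midpoint, se_log_or


# (reported OR, CI lower, CI upper) as quoted in the abstract.
reported = {
    "OHSS versus Other": (2.145, 1.4, 3.286),
    "Surplus embryos versus Other": (1.152, 0.761, 1.743),
}

for label, (odds_ratio, lower, upper) in reported.items():
    midpoint, se = log_scale_check(lower, upper)
    print(f"{label}: reported OR={odds_ratio}, CI midpoint={midpoint:.3f}, SE(ln OR)={se:.3f}")
```

Both reported odds ratios agree with the midpoints of their intervals to rounding, which is consistent with the Wald assumption.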
Energy Technology Data Exchange (ETDEWEB)
Lewin, M.D.; Sarasua, S.; Jones, P.A. (Agency for Toxic Substances and Disease Registry, Atlanta, GA (United States). Div. of Health Studies)
1999-07-01
For the purpose of examining the association between blood lead levels and household-specific soil lead levels, the authors used a multivariate linear regression model to find a slope factor relating soil lead levels to blood lead levels. They used previously collected data from the Agency for Toxic Substances and Disease Registry's (ATSDR's) multisite lead and cadmium study. The data included the blood lead measurements of 1,015 children aged 6-71 months, and corresponding household-specific environmental samples. The environmental samples included lead in soil, house dust, interior paint, and tap water. After adjusting for income, education of the parents, presence of a smoker in the household, sex, and dust lead, and using a double log transformation, they found a slope factor of 0.1388 with a 95% confidence interval of 0.09-0.19 for the dose-response relationship between the natural log of the soil lead level and the natural log of the blood lead level. The predicted blood lead level corresponding to a soil lead level of 500 mg/kg was 5.99 µg/kg with a 95% prediction interval of 2.08-17.29. Predicted values and their corresponding prediction intervals varied by covariate level. The model shows that increased soil lead level is associated with elevated blood leads in children, but that predictions based on this regression model are subject to high levels of uncertainty and variability.
Sridharan, S.; Sandhya, M.
2016-09-01
Long-term variabilities and tendencies in the tropical (30°N-30°S) monthly averaged zonal mean water vapor mixing ratio (WVMR) and temperature in the upper troposphere and lower stratosphere (UTLS), obtained from the Microwave Limb Sounder (MLS) instrument onboard the Earth Observing System (EOS) satellite for the period October 2004-September 2015, are studied using multivariate regression analysis. It is found that the WVMR shows a decreasing trend of 0.02-0.1 ppmv/year below 100 hPa while the trend is positive (0.02-0.035 ppmv/year) above 100 hPa. There is no significant trend at 121 hPa. The WVMR response to solar cycle (SC) is negative below 21 hPa. However, the magnitude decreases with height from 0.13 ppmv/100 sfu (solar flux unit) at 178 hPa to 0.07 ppmv/100 sfu at 26 hPa. The response of WVMR to multivariate El Niño index (MEI), which is a proxy for El Niño Southern Oscillation (ENSO), is positive at and below 100 hPa and negative above 100 hPa. It is negative at 56-46 hPa with maximum value of 0.1 ppmv/MEI at 56 hPa. Large positive (negative) quasi-biennial oscillation (QBO) in WVMR at 56-68 hPa reconstructed from the regression analysis coincide with eastward (westward) to westward (eastward) transition of QBO winds at that level. The trend in zonal mean tropical temperature is negative above 56 hPa with magnitude increasing with height. The maximum negative trend of 0.05 K/year is observed at 21-17 hPa and the trend is insignificant around the tropopause. The response of temperature to SC is negative in the UTLS region and to ENSO is positive below 100 hPa and mostly negative above 100 hPa. The negative response of WVMR to MEI in the stratosphere is suggested to be due to the extended cold trap of tropopause temperature during El Niño years that might have controlled the water vapor entry into the stratosphere. The WVMR response to residual vertical velocity at 70 hPa is positive in the stratosphere, whereas the temperature response is positive in the
Directory of Open Access Journals (Sweden)
Tayseer Elamin Mohamed Elfaki
2016-05-01
In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status was linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers
Lewin, M D; Sarasua, S; Jones, P A
1999-07-01
For the purpose of examining the association between blood lead levels and household-specific soil lead levels, we used a multivariate linear regression model to find a slope factor relating soil lead levels to blood lead levels. We used previously collected data from the Agency for Toxic Substances and Disease Registry's (ATSDR's) multisite lead and cadmium study. The data included the blood lead measurements (0.5 to 40.2 µg/dL) of 1015 children aged 6-71 months, and corresponding household-specific environmental samples. The environmental samples included lead in soil (18.1-9980 mg/kg), house dust (5.2-71,000 mg/kg), interior paint (0-16.5 mg/cm²), and tap water (0.3-103 µg/L). After adjusting for income, education of the parents, presence of a smoker in the household, sex, and dust lead, and using a double log transformation, we found a slope factor of 0.1388 with a 95% confidence interval of 0.09-0.19 for the dose-response relationship between the natural log of the soil lead level and the natural log of the blood lead level. The predicted blood lead level corresponding to a soil lead level of 500 mg/kg was 5.99 µg/kg with a 95% prediction interval of 2.08-17.29. Predicted values and their corresponding prediction intervals varied by covariate level. The model shows that increased soil lead level is associated with elevated blood leads in children, but that predictions based on this regression model are subject to high levels of uncertainty and variability.
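Because the model is fitted on a double log scale, the slope acts as an elasticity: multiplying soil lead by a factor k multiplies the predicted blood lead by k raised to the slope, with the other covariates held fixed. The sketch below works through that arithmetic for the reported slope and its confidence limits; the function name is illustrative and not taken from the study.

```python
SLOPE = 0.1388            # reported slope on the ln-ln scale
SLOPE_CI = (0.09, 0.19)   # reported 95% confidence interval for the slope


def blood_lead_ratio(soil_ratio, slope=SLOPE):
    """Multiplicative change in predicted blood lead when soil lead is
    multiplied by soil_ratio, all other covariates unchanged."""
    return soil_ratio ** slope


for label, slope in (("point estimate", SLOPE),
                     ("lower limit", SLOPE_CI[0]),
                     ("upper limit", SLOPE_CI[1])):
    change = (blood_lead_ratio(2.0, slope) - 1) * 100
    print(f"doubling soil lead, {label}: {change:.1f}% higher predicted blood lead")
```

Under these assumptions, doubling soil lead corresponds to roughly a 10% rise in predicted blood lead (about 6% to 14% at the confidence limits), which illustrates how shallow the reported dose-response relationship is.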
Directory of Open Access Journals (Sweden)
Goyal Neeraj
2010-01-01
To compare the accuracy of artificial neural network (ANN) analysis and multi-variate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of the trained ANN was tested on 80 subsequent patients. The input data include age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. For these 80 patients, the input was also analyzed and the output calculated by MVRA. The output values (predicted values) from both methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using a 1:1 slope line. The results were calculated as coefficient of correlation (COC; r²). For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL.
Villanueva, Lidón; Montoya-Castilla, Inmaculada; Prado-Gascó, Vicente
2017-07-01
The purpose of this study is to analyze the combined effects of trait emotional intelligence (EI) and feelings on healthy adolescents' stress. Identifying the extent to which adolescent stress varies with trait emotional differences and the feelings of adolescents is of considerable interest in the development of intervention programs for fostering youth well-being. To attain this goal, self-reported questionnaires (perceived stress, trait EI, and positive/negative feelings) and biological measures of stress (hair cortisol concentrations, HCC) were collected from 170 adolescents (12-14 years old). Two different methodologies were conducted, which included hierarchical regression models and a fuzzy-set qualitative comparative analysis (fsQCA). The results support trait EI as a protective factor against stress in healthy adolescents and suggest that feelings reinforce this relation. However, the debate continues regarding the possibility of optimal levels of trait EI for effective and adaptive emotional management, particularly in the emotional attention and clarity dimensions and for female adolescents.
Callén, M S; López, J M; Mastral, A M
2010-08-15
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of Directive 2004/107/EC and because of the carcinogenic character of this pollutant. Particulate matter less than or equal to 10 microns (PM10) was sampled during 2008-2009 at four locations in Spain to determine BaP concentrations experimentally by gas chromatography tandem mass spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations at two sampling sites, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R² = 0.817, PRESS/SSY = 0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the second with BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²(CV) = 0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²(ext) = 0.679, PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q²(ext) = 0.551, PRESS/SSY = 0.449).
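Each validation statistic in this abstract is reported as a pair that sums to one (for example 0.813 and 0.187), which matches the usual definition Q² = 1 - PRESS/SSY. The sketch below computes a leave-one-out Q² for a one-predictor linear model under that assumed definition. The PM10 and BaP values are hypothetical, the published model uses several predictors, and SSY is taken about the full-sample mean, which is one of several conventions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ssy(ys):
    """Total sum of squares of the response about its mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r2(xs, ys):
    """Fitted R^2 = 1 - RSS/SSY using all observations."""
    a, b = fit_line(xs, ys)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1 - rss / ssy(ys)


def q2_loo(xs, ys):
    """Leave-one-out Q^2 = 1 - PRESS/SSY: each point is predicted by a
    model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1 - press / ssy(ys)


# Hypothetical PM10 (ug/m3) and BaP (ng/m3) values, for illustration only.
pm10 = [12.0, 18.0, 25.0, 31.0, 40.0, 47.0, 55.0, 63.0]
bap = [0.05, 0.09, 0.10, 0.16, 0.18, 0.25, 0.24, 0.33]
print(f"R2 = {r2(pm10, bap):.3f}, Q2(LOO) = {q2_loo(pm10, bap):.3f}")
```

Q² can never exceed the fitted R² for the same data, so a small gap between the two, as between 0.817 and 0.813 here, suggests little overfitting.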
Institute of Scientific and Technical Information of China (English)
RAO Calyampudi R; WU YueHua
2009-01-01
In this paper, the constrained M-estimation of the regression coefficients and scatter parameters in a general multivariate linear regression model is considered. Since the constrained M-estimation is not easy to compute, an updating recursion procedure is proposed to simplify the computation of the estimators when a new observation is obtained. We show that, under mild conditions, the recursion estimates are strongly consistent. In addition, the asymptotic normality of the recursive constrained M-estimators of regression coefficients is established. A Monte Carlo simulation study of the recursion estimates is also provided. Besides, robustness and asymptotic behavior of constrained M-estimators are briefly discussed.
Balshi, M. S.; McGuire, A.D.; Duffy, P.; Flannigan, M.; Walsh, J.; Melillo, J.
2009-01-01
Fire is a common disturbance in the North American boreal forest that influences ecosystem structure and function. The temporal and spatial dynamics of fire are likely to be altered as climate continues to change. In this study, we ask the question: how will area burned in boreal North America by wildfire respond to future changes in climate? To evaluate this question, we developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude × longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was substantially more predictable in the western portion of boreal North America than in eastern Canada. Burned area was also not very predictable in areas of substantial topographic relief and in areas along the transition between boreal forest and tundra. At the scale of Alaska and western Canada, the empirical fire models explain on the order of 82% of the variation in annual area burned for the period 1960-2002. July temperature was the most frequently occurring predictor across all models, but the fuel moisture codes for the months June through August (as a group) entered the models as the most important predictors of annual area burned. To predict changes in the temporal and spatial dynamics of fire under future climate, the empirical fire models used output from the Canadian Climate Center CGCM2 global climate model to predict annual area burned through the year 2100 across Alaska and western Canada. Relative to 1991-2000, the results suggest that average area burned per decade will double by 2041-2050 and will increase on the order of 3.5-5.5 times by the last decade of the 21st century. To better predict wildfire across Alaska and Canada, future research should focus on incorporating additional effects of long-term and successional
Directory of Open Access Journals (Sweden)
Mario M. Bracco
2006-08-01
OBJECTIVE: To identify biological and sociodemographic factors associated with physical inactivity in public school children. METHODS: Parents of 2,519 children (49.3% of whom were girls), aged 7 to 10 years (mean = 7.6±0.9 years), from eight public schools in São Paulo, Brazil, completed a self-administered questionnaire. We used multiple correspondence analysis to identify groups of responses related to levels of physical activity and inactivity and to obtain an optimal scale. The cluster analysis identified groups of active and inactive children. The analysis of the receiver operator characteristic (ROC) curve, for the study of diagnostic properties of a simplified scale for physical inactivity derived from the optimal scale, revealed that a cutoff point of 3 had the best sensitivity and specificity, being therefore used as the outcome variable in the regression model. A multivariate hierarchical model was built, with categorical variables treated as distal and proximal.
Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M
2015-04-01
There is a growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product for each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products either within or between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD point increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7-percentage point increase in sales (from 27.3% to 35.0%; P sales uplift from promotions was larger for higher-socioeconomic status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Attempts to limit promotions on less-healthy foods could improve the population diet but would be unlikely to reduce health
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-04-04
Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games, active games, seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgaming and a little bit lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; Pgaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07-3.75; P=.03). Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non
Choi, Kilchan; Seltzer, Michael
2010-01-01
In studies of change in education and numerous other fields, interest often centers on how differences in the status of individuals at the start of a period of substantive interest relate to differences in subsequent change. In this article, the authors present a fully Bayesian approach to estimating three-level Hierarchical Models in which latent…
Multivariable Adaptive Decoupling Controller Using Hierarchical Multiple Models
Institute of Scientific and Technical Information of China (English)
王昕; 李少远; 岳恒
2005-01-01
To solve problems such as too many models and long computing time, a hierarchical multiple models direct adaptive decoupling controller is designed. It consists of multiple levels. In the upper level, the best model is chosen according to the switching index. Then multiple fixed models are constructed online to cover the region in which the chosen fixed model lies. In the last level, one free-running and one re-initialized adaptive model are added to guarantee stability and improve the transient response. By selection of the weighting polynomial matrix, the controller not only eliminates the steady output error and places the poles of the closed loop system arbitrarily, but also decouples the system dynamically. Finally, for this multiple models switching system, global convergence is obtained under common assumptions. Compared with the conventional multiple models adaptive controller, it greatly reduces the number of fixed models. If the same number of fixed models is used, the system transient response and decoupling result are improved. The simulation example illustrates the power of the derived controller.
DEFF Research Database (Denmark)
Henneberg, Morten; Jørgensen, Bent; Eriksen, René Lynge
2016-01-01
In this paper, we present an oil condition and wear debris evaluation method for ship thruster gears using T² statistics to form control charts from a multi-sensor platform. The proposed method takes into account the different ambient conditions by multiple linear regression on the mean value...... as substitution from the normal empirical mean value. This regression approach accounts for the bias imposed on the empirical mean value due to different geographical and seasonal differences on the multi-sensor inputs. Data from a gearbox are used to evaluate the length of the run-in period in order to ensure...... only quasi-stationary data are included in phase I of the T² statistics. Data from two thruster gears onboard two different ships are presented and analyzed, and the selection of the phase I data size is discussed. A graphic overview for quick localization of T² signaling is also demonstrated using...
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Directory of Open Access Journals (Sweden)
Marcelo Antonio Morgano
2008-03-01
Near infrared (NIR) reflectance spectroscopy was used to measure the moisture content of raw coffee. Partial least squares (PLS) regression models with different data pre-processing were built from 157 NIR spectra of raw coffee samples, collected with a diffuse reflectance accessory between 4500 and 10000 cm⁻¹. The original NIR spectra went through different transformations and mathematical pre-treatments, such as the Kubelka-Munk transformation, multiplicative signal correction (MSC), spline smoothing and moving average, and the data were scaled by variance. The regression model permitted the determination of the moisture content of the raw coffee samples with a standard error of calibration (SEC) of 0.569 g/100 g; a standard error of validation of 0.298 g/100 g; correlation coefficients (r) of 0.712 and 0.818 for calibration and validation, respectively; and a mean relative error of 4.1% for the validation samples.
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O.
2017-03-01
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach to PAH (anthracene) and chiral-PAH analogue derivative (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) analyses is reported. The binding constants (Kb), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated Kb and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81 × 10⁻⁷ M for anthracene and 3.48 × 10⁻⁸ M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of the method's high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a
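The link this abstract draws between the binding constants and a negative ΔG follows from the standard relation ΔG = -RT ln Kb. The sketch below evaluates it; the abstract reports no numerical Kb, so the binding constants and the temperature of 298.15 K used here are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def gibbs_free_energy(binding_constant, temperature_k=298.15):
    """Standard Gibbs free energy of complexation in J/mol: dG = -R*T*ln(Kb)."""
    return -GAS_CONSTANT * temperature_k * math.log(binding_constant)


# Hypothetical binding constants (M^-1), for illustration only.
for kb in (1.0, 1.0e2, 1.0e3):
    print(f"Kb = {kb:g}: dG = {gibbs_free_energy(kb) / 1000:.2f} kJ/mol")
```

Any Kb above 1 gives a negative ΔG, which is the sense in which the abstract calls the inclusion complexation thermodynamically favorable.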
Institute of Scientific and Technical Information of China (English)
张希翔; 李陶深
2012-01-01
Regression analysis is often used to fill in and predict incomplete data, but when the regression equation is constructed the form of the independent variables is fixed and limited. To address this, the paper proposes an improved multivariate regression method based on heuristically constructed variables. First, optimized combinations of the existing variables are found by a greedy algorithm; then several of the newly constructed variables are chosen for the multivariate regression analysis to obtain a better goodness of fit. Calculation and evaluation on incomplete data for the mechanical strength of wheat stalks show that the proposed method is feasible and effective, and that it predicts incomplete data accurately.
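The greedy idea described in this abstract can be sketched as follows. This is not the authors' algorithm, whose details the abstract does not give: it assumes the candidate constructed variables are pairwise products of the existing variables and that candidates are added one at a time by the gain in R². All names and data are hypothetical.

```python
from itertools import combinations_with_replacement


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [u - factor * v for u, v in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def r_squared(columns, y):
    """R^2 of an OLS fit of y on an intercept plus the given columns."""
    design = [[1.0] * len(y)] + columns
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in design] for ci in design]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in design]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[k] for b, c in zip(beta, design)) for k in range(len(y))]
    mean_y = sum(y) / len(y)
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - rss / tss


def greedy_construct(base, y, n_new=1):
    """Greedily add the pairwise product of base variables that raises R^2 most."""
    candidates = {
        f"{a}*{b}": [u * v for u, v in zip(base[a], base[b])]
        for a, b in combinations_with_replacement(sorted(base), 2)
    }
    chosen, columns = [], list(base.values())
    for _ in range(n_new):
        remaining = [name for name in candidates if name not in chosen]
        best = max(remaining, key=lambda name: r_squared(columns + [candidates[name]], y))
        chosen.append(best)
        columns.append(candidates[best])
    return chosen, r_squared(columns, y)


# Hypothetical data in which the response depends on the product x1*x2.
base = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0], "x2": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]}
y = [1.0 + 0.5 * a * b for a, b in zip(base["x1"], base["x2"])]
chosen, fit = greedy_construct(base, y)
print(chosen, round(fit, 6))
```

Solving the normal equations directly keeps the sketch dependency-free; a numerically safer least squares routine would be preferable for real data with correlated constructed variables.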
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating e...
Masuda, Takanori; Nakaura, Takeshi; Funama, Yoshinori; Higaki, Toru; Kiguchi, Masao; Imada, Naoyuki; Sato, Tomoyasu; Awai, Kazuo
We evaluated the effect of the age, sex, total body weight (TBW), height (HT) and cardiac output (CO) of patients on aortic and hepatic contrast enhancement during hepatic-arterial phase (HAP) and portal venous phase (PVP) computed tomography (CT) scanning. This prospective study received institutional review board approval; prior informed consent to participate was obtained from all 168 patients. All were examined using our routine protocol; the contrast material was 600 mg/kg iodine. Cardiac output was measured with a portable electrical velocimeter within 5 minutes of starting the CT scan. We calculated contrast enhancement (per gram of iodine: ΔHU/gI) of the abdominal aorta during the HAP and of the liver parenchyma during the PVP. We performed univariate and multivariate linear regression analysis between all patient characteristics and the ΔHU/gI of aortic and liver parenchymal enhancement. Univariate linear regression analysis demonstrated statistically significant correlations between the ΔHU/gI and the age, sex, TBW, HT, and CO (all P linear regression analysis showed that only the TBW and CO were of independent predictive value (P linear regression analysis only the TBW and CO were significantly correlated with aortic and liver parenchymal enhancement; the age, sex, and HT were not. The CO was the only independent factor affecting aortic and liver parenchymal enhancement at hepatic CT when the protocol was adjusted for the TBW.
Song, Hae-Ryoung; Lawson, Andrew; D'Agostino, Ralph B; Liese, Angela D
2011-03-01
Sparse count data violate assumptions of traditional Poisson models due to the excessive amount of zeros, and modeling sparse data becomes challenging. However, since aggregation to reduce sparseness may result in biased estimates of risk, solutions need to be found at the level of disaggregated data. We investigated different statistical approaches within a Bayesian hierarchical framework for modeling sparse data without aggregation of data. We compared our proposed models with the traditional Poisson model and the zero-inflated model based on simulated data. We applied statistical models to type 1 and type 2 diabetes in youth 10-19 years known as rare diseases, and compared models using the inference results and various model diagnostic tools. We showed that one of the models we proposed, a sparse Poisson convolution model, performed better than other models in the simulation and application based on the deviance information criterion (DIC) and the mean squared prediction error.
Institute of Scientific and Technical Information of China (English)
陈璐璐
2016-01-01
We first establish a multiple linear regression equation for stock prices, compute the regression coefficients with the EVIEWS software, and test the coefficients for economic meaning and statistical significance. Using methods from the econometrics curriculum, we then test the regression equation for multicollinearity, heteroscedasticity and autocorrelation. The model is improved accordingly; the resulting regression equation has a large coefficient of determination and satisfies the classical assumptions of multiple linear regression. Finally, the improved model is applied to forecast the opening price on the target forecast day, and the prediction error is within the acceptable range.
Hegazy, Maha A.; Lotfy, Hayam M.; Rezk, Mamdouh R.; Omran, Yasmin Rostom
2015-04-01
Smart and novel spectrophotometric and chemometric methods have been developed and validated for the simultaneous determination of a binary mixture of chloramphenicol (CPL) and dexamethasone sodium phosphate (DSP) in presence of interfering substances without prior separation. The first method depends upon derivative subtraction coupled with constant multiplication. The second one is ratio difference method at optimum wavelengths which were selected after applying derivative transformation method via multiplying by a decoding spectrum in order to cancel the contribution of non labeled interfering substances. The third method relies on partial least squares with regression model updating. They are so simple that they do not require any preliminary separation steps. Accuracy, precision and linearity ranges of these methods were determined. Moreover, specificity was assessed by analyzing synthetic mixtures of both drugs. The proposed methods were successfully applied for analysis of both drugs in their pharmaceutical formulation. The obtained results have been statistically compared to that of an official spectrophotometric method to give a conclusion that there is no significant difference between the proposed methods and the official ones with respect to accuracy and precision.
Multivariate analysis with LISREL
Jöreskog, Karl G; Y Wallentin, Fan
2016-01-01
This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of the example. The book is intended for masters and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.
Durmaz, Murat; Karslioglu, Mahmut Onur
2015-04-01
There are various global and regional methods that have been proposed for the modeling of ionospheric vertical total electron content (VTEC). Global distribution of VTEC is usually modeled by spherical harmonic expansions, while tensor products of compactly supported univariate B-splines can be used for regional modeling. In these empirical parametric models, the coefficients of the basis functions as well as differential code biases (DCBs) of satellites and receivers can be treated as unknown parameters which can be estimated from geometry-free linear combinations of global positioning system observables. In this work we propose a new semi-parametric multivariate adaptive regression B-splines (SP-BMARS) method for the regional modeling of VTEC together with satellite and receiver DCBs, where the parametric part of the model is related to the DCBs as fixed parameters and the non-parametric part adaptively models the spatio-temporal distribution of VTEC. The latter is based on multivariate adaptive regression B-splines which is a non-parametric modeling technique making use of compactly supported B-spline basis functions that are generated from the observations automatically. This algorithm takes advantage of an adaptive scale-by-scale model building strategy that searches for best-fitting B-splines to the data at each scale. The VTEC maps generated from the proposed method are compared numerically and visually with the global ionosphere maps (GIMs) which are provided by the Center for Orbit Determination in Europe (CODE). The VTEC values from SP-BMARS and CODE GIMs are also compared with VTEC values obtained through calibration using local ionospheric model. The estimated satellite and receiver DCBs from the SP-BMARS model are compared with the CODE distributed DCBs. The results show that the SP-BMARS algorithm can be used to estimate satellite and receiver DCBs while adaptively and flexibly modeling the daily regional VTEC.
Hsu, C.; Cifelli, R.; Zamora, R. J.; Schneider, T.
2014-12-01
The PRISM monthly climatology has been widely used by various agencies for diverse purposes. In the River Forecast Centers (RFCs), the PRISM monthly climatology is used to support tasks such as QPE, or quality control of point precipitation observation, and fine tune QPFs. Validation studies by forecasters and researchers have shown that interpolation involving PRISM climatology can effectually reduce the estimation bias for the locations where moderate or little orographic phenomena occur. However, many studies have pointed out limitations in PRISM monthly climatology. These limitations are especially apparent in storm events with fast-moving wet air masses or with storm tracks that are different from climatology. In order to upgrade PRISM climatology so it possesses the capability to characterize the climatology of storm events, it is critical to integrate large-scale atmospheric conditions with the original PRISM predictor variables and to simulate them at a temporal resolution higher than monthly. To this end, a simple, flexible, and powerful framework for precipitation estimation modeling that can be applied to very large data sets is thus developed. In this project, a decision tree based estimation structure was developed to perform the aforementioned variable integration work. Three Atmospheric River events (ARs) were selected to explore the hierarchical relationships among these variables and how these relationships shape the event-based precipitation distribution pattern across California. Several atmospheric variables, including vertically Integrated Vapor Transport (IVT), temperature, zonal wind (u), meridional wind (v), and omega (ω), were added to enhance the sophistication of the tree-based structure in estimating precipitation. To develop a direction-based climatology, the directions the ARs moving over the Pacific Ocean were also calculated and parameterized within the tree estimation structure. The results show that the involvement of the
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, by contrast, failed to determine the quaternary mixture components simultaneously; it was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Nick, Todd G; Campbell, Kathleen M
2007-01-01
The Medical Subject Headings (MeSH) thesaurus used by the National Library of Medicine defines logistic regression models as "statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable." Logistic regression models are used to study the effects of predictor variables on categorical outcomes. Normally the outcome is binary, such as presence or absence of disease (e.g., non-Hodgkin's lymphoma), in which case the model is called a binary logistic model. When there are multiple predictors (e.g., risk factors and treatments), the model is referred to as a multiple or multivariable logistic regression model; it is one of the most frequently used statistical models in medical journals. In this chapter, we examine both simple and multiple binary logistic regression models and present related issues, including interaction, categorical predictor variables, continuous predictor variables, and goodness of fit.
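As an illustration of a multiple binary logistic model, the sketch below fits the coefficients by Newton-Raphson on simulated data. The predictor names (`exposure`, `age`), the true coefficients and the sample size are hypothetical and chosen only for the example; a statistics package would normally be used in practice.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Fit a binary logistic model by Newton-Raphson.

    X must already contain an intercept column of ones.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))            # predicted probabilities
        grad = X.T @ (y - p)                           # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])    # information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Simulated data (hypothetical): one binary and one continuous predictor.
rng = np.random.default_rng(2)
n = 5000
exposure = rng.integers(0, 2, size=n)    # binary risk factor
age = rng.normal(size=n)                 # standardised continuous covariate
X = np.column_stack([np.ones(n), exposure, age])
true_beta = np.array([-1.0, 0.8, 0.5])
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ true_beta))).astype(float)

beta_hat = fit_logistic(X, y)
odds_ratios = np.exp(beta_hat[1:])       # one adjusted odds ratio per predictor
```

Exponentiating a coefficient gives the odds ratio for a one-unit change in that predictor with the other predictors held fixed, which is the quantity usually reported in medical journals.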
Juliano da Silva, Carlos; Pasquini, Celio
2015-01-21
Conventional reflectance spectroscopy (NIRS) and hyperspectral imaging (HI) in the near-infrared region (1000-2500 nm) are evaluated and compared, using, as the case study, the determination of relevant properties related to the quality of natural rubber. Mooney viscosity (MV) and plasticity indices (PI) (PI0 - original plasticity, PI30 - plasticity after accelerated aging, and PRI - the plasticity retention index after accelerated aging) of rubber were determined using multivariate regression models. Two hundred and eighty-six samples of rubber were measured using conventional and hyperspectral near-infrared imaging reflectance instruments in the range of 1000-2500 nm. The sample set was split into regression (n = 191) and external validation (n = 95) sub-sets. Three instruments were employed for data acquisition: a line scanning hyperspectral camera and two conventional FT-NIR spectrometers. Sample heterogeneity was evaluated using hyperspectral images obtained with a resolution of 150 × 150 μm and principal component analysis. The probed sample area (5 cm²; 24,000 pixels) to achieve representativeness was found to be equivalent to the average of 6 spectra for a 1 cm diameter probing circular window of one FT-NIR instrument. The other spectrophotometer can probe the whole sample in only one measurement. The results show that the rubber properties can be determined with very similar accuracy and precision by Partial Least Square (PLS) regression models regardless of whether HI-NIR or conventional FT-NIR produce the spectral datasets. The best Root Mean Square Errors of Prediction (RMSEPs) of external validation for MV, PI0, PI30, and PRI were 4.3, 1.8, 3.4, and 5.3%, respectively. Though the quantitative results provided by the three instruments can be considered equivalent, the hyperspectral imaging instrument presents a number of advantages, being about 6 times faster than conventional bulk spectrometers, producing robust spectral data by ensuring sample
Sullivan, Paul
2017-01-01
Objectives: Previous studies found that hospital and specialty have limited influence on patient experience scores, and patient level factors are more important. This could be due to heterogeneity of experience delivery across subunits within organisations. We aimed to determine whether organisation level factors have greater impact if scores for the same subspecialty microsystem are analysed in each hospital. Setting: Acute medical admission units in all NHS Acute Trusts in England. Participants: We analysed patient experience data from the English Adult Inpatient Survey, which is administered to 850 patients annually in each acute NHS Trust in England. We selected all 8753 patients who returned the survey and who were emergency medical admissions and stayed in their admission unit for 1–2 nights, so as to isolate the experience delivered during the acute admission process. Primary and secondary outcome measures: We used multilevel logistic regression to determine the apportioned influence of host organisation and of organisation level factors (size and teaching status), and patient level factors (demographics, presence of long-term conditions and disabilities). We selected ‘being treated with respect and dignity’ and ‘pain control’ as primary outcome parameters. Other Picker Domain question scores were analysed as secondary parameters. Results: The proportion of overall variance attributable at organisational level was small; 0.5% (NS) for respect and dignity, 0.4% (NS) for pain control. Long-standing conditions and consequent disabilities were associated with low scores. Other item scores also showed that most influence was from patient level factors. Conclusions: When a single microsystem, the acute medical admission process, is isolated, variance in experience scores is mainly explainable by patient level factors with limited organisational level influence. This has implications for the use of generic patient experience surveys for comparison between
Institute of Scientific and Technical Information of China (English)
朱天琦; 蒋宗滨; 赵劲民; 彭宇; 曾金; 李锋; 蒋奕红
2011-01-01
Objective: To investigate the factors related to the development of phantom limb pain (PLP) in amputees. Methods: A total of 226 patients who had undergone amputation surgery were surveyed by telephone follow-up using a purpose-designed questionnaire. Factors that may influence the development of PLP were analyzed by univariate and multivariate regression analysis. Results: In univariate analysis, the method of anesthesia, pain before the amputation, postoperative analgesia, complications and stump pain were related to the development of PLP (P 0.05). Multivariate analysis further confirmed that pain before the amputation (OR = 2.60), stump pain (OR = 3.70), general anesthesia (OR = 2.94) and postoperative analgesia (OR = 0.44) were independent factors (P < 0.05). Conclusion: Pain before the amputation, stump pain and general anesthesia are risk factors for PLP, whereas postoperative analgesia is a protective factor and plays an important role in preventing PLP.
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Stepwise linear regression is a multi-variable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
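The idea can be sketched in a few lines. The sketch below is written in Python rather than Matlab and is not the authors' program; it implements only the forward (entry) half of stepwise selection, using a partial F statistic with a hypothetical entry threshold `f_enter`. A full stepwise procedure would also re-test entered variables for removal.

```python
import numpy as np

def rss(design, y):
    """Residual sum of squares of a least-squares fit of y on design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return resid @ resid

def forward_stepwise(X, y, f_enter=4.0):
    """Add, one at a time, the variable with the largest partial F statistic."""
    n, p = X.shape
    selected = []
    design = np.ones((n, 1))                    # start from the intercept only
    current = rss(design, y)
    while len(selected) < p:
        best = None
        for j in range(p):
            if j in selected:
                continue
            cand = rss(np.column_stack([design, X[:, j]]), y)
            df = n - design.shape[1] - 1        # residual df with j added
            f_stat = (current - cand) / (cand / df)
            if best is None or f_stat > best[1]:
                best = (j, f_stat, cand)
        if best[1] < f_enter:                   # nothing is significant enough
            break
        selected.append(best[0])
        design = np.column_stack([design, X[:, best[0]]])
        current = best[2]
    return selected

# Synthetic data (hypothetical): only columns 0 and 3 drive the response.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=200)
selected = forward_stepwise(X, y)
```

The two informative columns should enter first; whether a noise column also enters depends on the threshold and on chance.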
Prediction of road accidents: A Bayesian hierarchical approach
DEFF Research Database (Denmark)
Deublein, Markus; Schubert, Matthias; Adey, Bryan T.;
2013-01-01
In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data, e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis...
Institute of Scientific and Technical Information of China (English)
梁友嘉; 徐中民
2011-01-01
Combining GIS with statistical methods, a multivariate nonlinear regression model was built for the mountainous part of the Heihe River mainstream to simulate the spatial distribution of precipitation. The model uses precipitation data collected at 21 stations from 1971 to 2000 and five topographic factors (altitude, slope, aspect, longitude and latitude) derived from digital elevation models (DEM) at three spatial resolutions. Changes in precipitation were analyzed for three periods (whole year, wet season and dry season) combined with the three resolutions. The results show that the model explains 74.5% of the spatial variability of annual precipitation, and that it explains wet-season (May-September) precipitation better than whole-year or dry-season precipitation. Dry-season precipitation was difficult to simulate owing to little rainfall and a different synoptic system. For all three periods the 100 m resolution model performed best. Precipitation is unevenly distributed in space: at 100 m resolution it increases from less than 200 mm in the northwest to about 700 mm in the southeast, and the precipitation boundary runs northwest to southeast. At 500 m resolution the boundary is band-shaped and shifts upward to some extent; at 1000 m resolution the error in the precipitation distribution is large. The modeling approach is highly transferable, so similar models can be developed for other mountainous areas and their simulation results used for further research. Adding a spatialized wind speed variable to the model may further improve its accuracy.
Institute of Scientific and Technical Information of China (English)
刘飞
2014-01-01
There is increasing interest in identifying microbial interactions, which are important for understanding the structure and function of microbial communities. Computational methods for inferring microbial relationships are currently based on the similarity among microbial individuals. The dynamics of a single complex community over time can reveal more complex interaction patterns than samples collected from multiple distinct communities. Although similarity-based methods have been proposed for analyzing time series data, there are no efficient multivariate statistical methods to infer the estimated associations and assess their statistical significance. In this paper, we provide the first attempt to infer dynamic microbial interactions from time series of human gut microbiomes. We use a multivariate statistical method, the vector autoregression (MVAR) model, and apply it to a time series dataset of human gut microbiomes with repeated antibiotic perturbations. The inferred microbial interactions provide a dynamical view of a microbial community, which could be a novel complement to similarity- or correlation-based methods.
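A first-order vector autoregression can be fitted by ordinary least squares, as the minimal sketch below shows. The lag order (one), the three-taxon interaction matrix and the simulated series are assumptions made for illustration; the paper's data, preprocessing and significance testing are not reproduced.

```python
import numpy as np

def fit_var1(series):
    """Least-squares fit of x_t = c + A @ x_{t-1} + e_t for a (T, k) array."""
    past = np.column_stack([np.ones(len(series) - 1), series[:-1]])
    coef, *_ = np.linalg.lstsq(past, series[1:], rcond=None)
    return coef[0], coef[1:].T       # intercept c and interaction matrix A

# Simulated abundances (hypothetical): taxon 0 promotes taxon 1,
# and taxon 1 inhibits taxon 2.
rng = np.random.default_rng(4)
A_true = np.array([[0.5, 0.0, 0.0],
                   [0.4, 0.3, 0.0],
                   [0.0, -0.3, 0.6]])
x = np.zeros((2000, 3))
for t in range(1, len(x)):
    x[t] = A_true @ x[t - 1] + rng.normal(scale=0.1, size=3)

c_hat, A_hat = fit_var1(x)
```

Entry `A_hat[i, j]` estimates the effect of taxon `j` at time `t-1` on taxon `i` at time `t`, so the matrix is directed, unlike a similarity or correlation matrix.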
Directory of Open Access Journals (Sweden)
Jorcen Simon de Souza
2006-09-01
resolution of 4 cm⁻¹ and 32 scans. For multivariate regression, 14 samples were used as the calibration set and 6 samples as the validation set. For the PLS modeling, the spectral information (3715-3088 cm⁻¹ and 1634-1191 cm⁻¹), after multiplicative scatter correction (MSC) and autoscaling preprocessing, was processed in Pirouette® 2.7. The models with the best regression coefficient (R²) were further improved using the root mean square error of validation (RMSEV). The proposed DRIFTS/PLS technique is an excellent choice for quality control in the production processes of factories and drugstores that produce or handle these materials on a large scale, offering short analysis times, no sample destruction and no waste production.
Institute of Scientific and Technical Information of China (English)
任颖; 马磊; 刘述
2014-01-01
Objective: To investigate the risk factors of respiratory distress syndrome (RDS) in neonates via an epidemiologic survey. Methods: A total of 10104 neonates, including 122 neonates with RDS admitted to the intensive care unit, were surveyed between February 2011 and January 2013. Logistic regression analysis was used to calculate odds ratios (OR) and their 95% confidence intervals. Results: Gestational age below 32 weeks, neonatal sex, low birth weight, maternal age and elective caesarean section were risk factors of RDS in neonates, whereas maternal diseases during pregnancy were not an independent risk factor. Conclusion: The incidence and prognosis of RDS in neonates are influenced by multiple factors. The various high-risk factors should be guarded against, and effective measures taken, to reduce the incidence of neonatal RDS and improve the prognosis.
Institute of Scientific and Technical Information of China (English)
张庆丰; 张浩; 陆彪
2016-01-01
Alkaline steel slag cement materials were prepared with steel slag as the research object and a ternary compound activator of sodium silicate, sodium hydroxide and calcium hydroxide. The effect of each factor on the mechanical properties of the alkaline steel slag cement materials was studied by uniform design and multivariate nonlinear regression. The results show that the factors rank in importance as follows: at 3 d, steel slag dosage > sodium hydroxide dosage > sodium silicate dosage > calcium hydroxide dosage; at 7 d, steel slag dosage > sodium silicate dosage > sodium hydroxide dosage > calcium hydroxide dosage; at 28 d, steel slag dosage > sodium silicate dosage > calcium hydroxide dosage > sodium hydroxide dosage. The optimized formulation at 28 d is a steel slag dosage of 225 g, a sodium silicate dosage of 22.5 g, a sodium hydroxide dosage of 9.0 g and a calcium hydroxide dosage of 13.2 g. The optimized preparation model is well chosen; its relative error is only 2.19%.
Conceptual hierarchical modeling to describe wetland plant community organization
Little, A.M.; Guntenspergen, G.R.; Allen, T.F.H.
2010-01-01
Using multivariate analysis, we created a hierarchical modeling process that describes how differently-scaled environmental factors interact to affect wetland-scale plant community organization in a system of small, isolated wetlands on Mount Desert Island, Maine. We followed the procedure: 1) delineate wetland groups using cluster analysis, 2) identify differently scaled environmental gradients using non-metric multidimensional scaling, 3) order gradient hierarchical levels according to spatiotemporal scale of fluctuation, and 4) assemble hierarchical model using group relationships with ordination axes and post-hoc tests of environmental differences. Using this process, we determined 1) large wetland size and poor surface water chemistry led to the development of shrub fen wetland vegetation, 2) Sphagnum and water chemistry differences affected fen vs. marsh / sedge meadows status within small wetlands, and 3) small-scale hydrologic differences explained transitions between forested vs. non-forested and marsh vs. sedge meadow vegetation. This hierarchical modeling process can help explain how upper level contextual processes constrain biotic community response to lower-level environmental changes. It creates models with more nuanced spatiotemporal complexity than classification and regression tree procedures. Using this process, wetland scientists will be able to generate more generalizable theories of plant community organization, and useful management models. © Society of Wetland Scientists 2009.
Nakamura, Ryota; Suhrcke, Marc; Jebb, Susan A; Pechey, Rachel; Almiron-Roig, Eva; Marteau, Theresa M
2015-01-01
Background: There is a growing concern, but limited evidence, that price promotions contribute to a poor diet and the social patterning of diet-related disease. Objective: We examined the following questions: 1) Are less-healthy foods more likely to be promoted than healthier foods? 2) Are consumers more responsive to promotions on less-healthy products? 3) Are there socioeconomic differences in food purchases in response to price promotions? Design: With the use of hierarchical regression, we analyzed data on purchases of 11,323 products within 135 food and beverage categories from 26,986 households in Great Britain during 2010. Major supermarkets operated the same price promotions in all branches. The number of stores that offered price promotions on each product for each week was used to measure the frequency of price promotions. We assessed the healthiness of each product by using a nutrient profiling (NP) model. Results: A total of 6788 products (60%) were in healthier categories and 4535 products (40%) were in less-healthy categories. There was no significant gap in the frequency of promotion by the healthiness of products either within or between categories. However, after we controlled for the reference price, price discount rate, and brand-specific effects, the sales uplift arising from price promotions was larger in less-healthy than in healthier categories; a 1-SD point increase in the category mean NP score, implying the category becomes less healthy, was associated with an additional 7.7–percentage point increase in sales (from 27.3% to 35.0%; P ...). The sales uplift from promotions was larger for higher–socioeconomic status (SES) groups than for lower ones (34.6% for the high-SES group, 28.1% for the middle-SES group, and 23.1% for the low-SES group). Finally, there was no significant SES gap in the absolute volume of purchases of less-healthy foods made on promotion. Conclusion: Attempts to limit promotions on less-healthy foods could improve the
Gebreamlak, Bisratemariam; Dadi, Abel Fekadu; Atnafu, Azeb
2017-01-01
Background: Iron deficiency during pregnancy is a risk factor for anemia, preterm delivery, and low birth weight. Iron/Folic Acid supplementation with optimal adherence can effectively prevent anemia in pregnancy. However, studies that address this area of adherence are very limited. Therefore, the current study was conducted to assess adherence and to identify factors associated with the number of Iron/Folic Acid uptake during pregnancy among mothers attending antenatal and postnatal care follow up in Akaki kality sub city. Methods: An institution-based cross-sectional study was conducted on a sample of 557 pregnant women attending antenatal and postnatal care services. Systematic random sampling was used to select study subjects. The mothers were interviewed, and the collected data were cleaned and entered into Epi Info 3.5.1 and analyzed with R version 3.2.0. A Hierarchical Negative Binomial Poisson Regression Model was fitted to identify the factors associated with the number of Iron/Folic Acid uptake. The adjusted incidence rate ratio (IRR) with 95% confidence interval (CI) was computed to assess the strength and significance of the association. Result: More than 90% of the mothers were supplemented with at least one Iron/Folic Acid supplement pill per week during their pregnancy. Sixty percent of the mothers adhered (took four or more tablets per week) (95% CI, 56%-64.1%). Higher IRR of Iron/Folic Acid supplementation was observed among women who received health education, who were privately employed, who achieved secondary education, and who believed that Iron/Folic Acid supplements increase blood, whereas mothers who reported a side effect, who were from families with relatively better monthly income, and who took the supplement when sick were more likely to adhere. Conclusion: Adherence to Iron/Folic Acid supplementation during pregnancy among mothers attending antenatal and postnatal care was found to be high. Activities that would address the
Li, Xin; Yu, Jiaguo; Jaroniec, Mietek
2016-05-01
As a green and sustainable technology, semiconductor-based heterogeneous photocatalysis has received much attention in the last few decades because it has potential to solve both energy and environmental problems. To achieve efficient photocatalysts, various hierarchical semiconductors have been designed and fabricated at the micro/nanometer scale in recent years. This review presents a critical appraisal of fabrication methods, growth mechanisms and applications of advanced hierarchical photocatalysts. Especially, the different synthesis strategies such as two-step templating, in situ template-sacrificial dissolution, self-templating method, in situ template-free assembly, chemically induced self-transformation and post-synthesis treatment are highlighted. Finally, some important applications including photocatalytic degradation of pollutants, photocatalytic H2 production and photocatalytic CO2 reduction are reviewed. A thorough assessment of the progress made in photocatalysis may open new opportunities in designing highly effective hierarchical photocatalysts for advanced applications ranging from thermal catalysis, separation and purification processes to solar cells.
Directory of Open Access Journals (Sweden)
Marco F. Ferrão
2007-08-01
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM, as judged by their R², RMSECV and RMSEP values. LS-SVM showed superior performance to PLSR for quantifying starch, whey and sucrose in powdered milk samples. This study shows that it is possible to determine precisely the amount of one and two common adulterants simultaneously in powdered milk samples using LS-SVM and NIR spectra.
Common predictor effects for multivariate longitudinal data
Jia, Juan; Weiss, Robert E.
2009-01-01
Multivariate outcomes measured longitudinally over time are common in medicine, public health, psychology and sociology. The typical (saturated) longitudinal multivariate regression model has a separate set of regression coefficients for each outcome. However, multivariate outcomes are often quite similar and many outcomes can be expected to respond similarly to changes in covariate values. Given a set of outcomes likely to share common covariate effects, we propose the Clustered Outcome COmm...
Multivariable modeling and multivariate analysis for the behavioral sciences
Everitt, Brian S
2009-01-01
Multivariable Modeling and Multivariate Analysis for the Behavioral Sciences shows students how to apply statistical methods to behavioral science data in a sensible manner. Assuming some familiarity with introductory statistics, the book analyzes a host of real-world data to provide useful answers to real-life issues. The author begins by exploring the types and design of behavioral studies. He also explains how models are used in the analysis of data. After describing graphical methods, such as scatterplot matrices, the text covers simple linear regression, locally weighted regression, multip
Collaborative Hierarchical Sparse Modeling
Sprechmann, Pablo; Sapiro, Guillermo; Eldar, Yonina C
2010-01-01
Sparse modeling is a powerful framework for data analysis and processing. Traditionally, encoding in this framework is done by solving an l_1-regularized linear regression problem, usually called Lasso. In this work we first combine the sparsity-inducing property of the Lasso model, at the individual feature level, with the block-sparsity property of the group Lasso model, where sparse groups of features are jointly encoded, obtaining a sparsity pattern hierarchically structured. This results in the hierarchical Lasso, which shows important practical modeling advantages. We then extend this approach to the collaborative case, where a set of simultaneously coded signals share the same sparsity pattern at the higher (group) level but not necessarily at the lower one. Signals then share the same active groups, or classes, but not necessarily the same active set. This is very well suited for applications such as source separation. An efficient optimization procedure, which guarantees convergence to the global opt...
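As a hedged illustration of the three penalties this abstract contrasts, the objectives can be written as follows. The notation is an assumption introduced here, not taken from the abstract: $x$ is the signal, $D$ a dictionary (design matrix), $a$ the coefficient vector, $\mathcal{G}$ a partition of the coefficients into groups, and $\lambda, \lambda_1, \lambda_2 \ge 0$ are regularization weights.

```latex
% Lasso: sparsity at the level of individual features
\min_{a}\; \tfrac{1}{2}\,\lVert x - D a \rVert_2^2 + \lambda \lVert a \rVert_1

% Group Lasso: sparsity at the level of whole groups
\min_{a}\; \tfrac{1}{2}\,\lVert x - D a \rVert_2^2
      + \lambda \sum_{G \in \mathcal{G}} \lVert a_G \rVert_2

% Hierarchical Lasso: few active groups, and few active features inside each
\min_{a}\; \tfrac{1}{2}\,\lVert x - D a \rVert_2^2
      + \lambda_2 \sum_{G \in \mathcal{G}} \lVert a_G \rVert_2
      + \lambda_1 \lVert a \rVert_1
```

Setting $\lambda_2 = 0$ recovers the Lasso and setting $\lambda_1 = 0$ recovers the group Lasso, which is one way to read the "hierarchically structured" sparsity pattern described above.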
Institute of Scientific and Technical Information of China (English)
汤英汉
2015-01-01
By analyzing the features and current state of internet insurance development in China, this paper finds that the main cause of weak growth in the insurance industry is the conflict between people's increasing need for insurance, driven by a rapidly changing market environment, and relatively backward insurance management approaches. Internet insurance compensates for shortcomings of traditional insurance and has become a new driver of growth for the industry. Using hierarchical regression, the paper analyzes online insurance premiums and related data for China from 2003 to 2013. The results show that the driving factors of internet insurance are mainly tax, population and internet-related factors, while factors internal to the insurance industry have no significant effect. The study also indicates that internet insurance is not a replacement for, or a threat to, traditional insurance, but a response to newly discovered insurance needs that satisfies demand at multiple levels. Finally, the author argues that the development of internet insurance, as a new line of business, promotes changes in thinking across the insurance industry, and that internet technology has driven innovation in insurance channels, products and services, injecting fresh blood into China's insurance industry.
multivariate approach to the study of aquatic species diversity of ...
African Journals Online (AJOL)
User
2016-12-02
Dec 2, 2016 ... Generalized Linear Model further revealed the pattern in ... Hierarchical framework at multiple spatial levels can ... Limitations of most multivariate applications included the absence of ... HANNA Instruction Manual. Five days ...
Hao, Lingxin
2007-01-01
Quantile Regression, the first book of Hao and Naiman's two-book series, establishes the seldom recognized link between inequality studies and quantile regression models. Though separate methodological literature exists for each subject, the authors seek to explore the natural connections between this increasingly sought-after tool and research topics in the social sciences. Quantile regression as a method does not rely on assumptions as restrictive as those for the classical linear regression; though more traditional models such as least squares linear regression are more widely utilized, Hao
Multivariate Analysis and Prediction of Dioxin-Furan ...
Peer Review Draft of Regional Methods Initiative Final Report. Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin-contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin-contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE
Exploratory multivariate analysis by example using R
Husson, Francois; Pages, Jerome
2010-01-01
Full of real-world case studies and practical advice, Exploratory Multivariate Analysis by Example Using R focuses on four fundamental methods of multivariate exploratory data analysis that are most suitable for applications. It covers principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, and hierarchical cluster analysis. The authors take a geometric point of view that provides a unified vision for exploring multivariate data tables. Within this framework, they present the prin
Energy Technology Data Exchange (ETDEWEB)
Lima, Reginaldo Agapito de [Centro Universitario de Itajuba, MG (Brazil)], email: reginaldo_agapito@yahoo.com.br; Ribeiro Junior, Leopoldo Uberto [Voltalia Energia do Brasil, Sao Paulo, SP (Brazil)], email: leopoldo_junior@yahoo.com.br
2010-07-01
When a small hydropower plant (SHP) is implemented, the dam is the main structure, and its sizing accounts for 30% to 50% of the overall cost of civil works. It is therefore important to have a fast, didactic and accurate budgeting tool that also allows a quantitative analysis of the civil construction costs of concrete dams for small hydropower plants. To this end, multivariate regression is valuable because it allows preliminary, even if approximate, cost estimates for concrete dams to be established quickly, which simplifies budgeting and guides feasibility decisions on whether to select or discard new head alternatives. (author)
Kahane, Leo H
2007-01-01
Using a friendly, nontechnical approach, the Second Edition of Regression Basics introduces readers to the fundamentals of regression. Accessible to anyone with an introductory statistics background, this book builds from a simple two-variable model to a model of greater complexity. Author Leo H. Kahane weaves four engaging examples throughout the text to illustrate not only the techniques of regression but also how this empirical tool can be applied in creative ways to consider a broad array of topics. New to the Second Edition Offers greater coverage of simple panel-data estimation:
Multivariate respiratory motion prediction
Dürichen, R.; Wissel, T.; Ernst, F.; Schlaefer, A.; Schweikard, A.
2014-10-01
In extracranial robotic radiotherapy, tumour motion is compensated by tracking external and internal surrogates. To compensate system specific time delays, time series prediction of the external optical surrogates is used. We investigate whether the prediction accuracy can be increased by expanding the current clinical setup by an accelerometer, a strain belt and a flow sensor. Four previously published prediction algorithms are adapted to multivariate inputs—normalized least mean squares (nLMS), wavelet-based least mean squares (wLMS), support vector regression (SVR) and relevance vector machines (RVM)—and evaluated for three different prediction horizons. The measurement involves 18 subjects and consists of two phases, focusing on long term trends (M1) and breathing artefacts (M2). To select the most relevant and least redundant sensors, a sequential forward selection (SFS) method is proposed. Using a multivariate setting, the results show that the clinically used nLMS algorithm is susceptible to large outliers. In the case of irregular breathing (M2), the mean root mean square error (RMSE) of a univariate nLMS algorithm is 0.66 mm and can be decreased to 0.46 mm by a multivariate RVM model (best algorithm on average). To investigate the full potential of this approach, the optimal sensor combination was also estimated on the complete test set. The results indicate that a further decrease in RMSE is possible for RVM (to 0.42 mm). This motivates further research about sensor selection methods. Besides the optical surrogates, the sensors most frequently selected by the algorithms are the accelerometer and the strain belt. These sensors could be easily integrated in the current clinical setup and would allow a more precise motion compensation.
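As a minimal sketch of the normalized least mean squares (nLMS) idea named in this abstract, the following univariate predictor forecasts a signal a fixed number of samples ahead. Everything specific here is an assumption for illustration: the synthetic sinusoidal "breathing" trace, the function name `nlms_predict`, and the values chosen for `taps`, `mu` and `horizon`. It is not the authors' implementation and it omits their multivariate inputs.

```python
import math


def nlms_predict(signal, horizon, taps=4, mu=0.5, eps=1e-8):
    """Predict signal[t + horizon] from the last `taps` samples using nLMS.

    Returns the list of prediction errors, one per predicted sample.
    Offline simplification: the weights are updated immediately with the true
    future value. In a real-time system that update is only possible once
    signal[t + horizon] has actually been observed.
    """
    weights = [0.0] * taps
    errors = []
    for t in range(taps - 1, len(signal) - horizon):
        # Most recent sample first: x = [s[t], s[t-1], ..., s[t-taps+1]]
        x = [signal[t - k] for k in range(taps)]
        prediction = sum(w * xk for w, xk in zip(weights, x))
        error = signal[t + horizon] - prediction
        # Normalizing by the input energy makes the step size scale-invariant.
        energy = eps + sum(xk * xk for xk in x)
        weights = [w + mu * error * xk / energy for w, xk in zip(weights, x)]
        errors.append(error)
    return errors


def rmse(values):
    """Root mean square of a list of errors."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Hypothetical, noise-free surrogate signal with a period of 50 samples.
breathing = [math.sin(2.0 * math.pi * t / 50.0) for t in range(2000)]
errors = nlms_predict(breathing, horizon=5)

rmse_early = rmse(errors[:200])   # while the filter is still adapting
rmse_late = rmse(errors[-200:])   # after adaptation
print(rmse_early, rmse_late)
```

On this idealized signal the late error falls far below the early error because a pure sinusoid can be predicted exactly from its own past samples. Real respiratory traces are noisy and irregular, which is where the outlier sensitivity reported above becomes relevant.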
Semiparametric regression during 2003–2007
Ruppert, David
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
[Understanding logistic regression].
El Sanharawi, M; Naudet, F
2013-10-01
Logistic regression is one of the most common multivariate analysis models used in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors likely to influence it (explanatory variables). The choice of explanatory variables to include in the logistic regression model is based on prior knowledge of the disease physiopathology and on the statistical association between the variable and the event, as measured by the odds ratio. The main steps of the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.
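The link between a logistic regression coefficient and the odds ratio can be checked numerically. The sketch below uses a made-up 2x2 table (30 of 100 exposed and 10 of 100 unexposed subjects with the event) and a small Newton-Raphson fitter written here for illustration. The counts and the function name `fit_logistic_grouped` are assumptions, not data or code from the article.

```python
import math


def fit_logistic_grouped(groups, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped binary data.

    groups: list of (x, events, trials) tuples.
    Returns the maximum likelihood estimates (b0, b1).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, events, trials in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = events - trials * p       # score contribution
            w = trials * p * (1.0 - p)        # information weight
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 Newton system (information matrix) * step = score.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical 2x2 table: x = 1 means exposed, x = 0 means unexposed.
exposed_events, exposed_total = 30, 100
unexposed_events, unexposed_total = 10, 100

# Odds ratio computed directly from the table: (30 * 90) / (70 * 10).
odds_ratio = (exposed_events * (unexposed_total - unexposed_events)) / (
    (exposed_total - exposed_events) * unexposed_events
)

b0, b1 = fit_logistic_grouped(
    [(0, unexposed_events, unexposed_total), (1, exposed_events, exposed_total)]
)
print(odds_ratio, math.exp(b1))
```

With a single binary explanatory variable, the exponential of the fitted slope reproduces the table's odds ratio. With several explanatory variables, each exponentiated coefficient is instead an odds ratio adjusted for the other variables in the model.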
Al-Khatib, Issam A; Abu Fkhidah, Ismail; Khatib, Jumana I; Kontogianni, Stamatia
2016-03-01
Forecasting hospital solid waste generation is a critical challenge for future planning. The proposed methodology was applied to the composition and generation rate of solid waste in hospital units in order to validate the results and secure the outcomes of the management plan in national hospitals. A set of three multiple-variable regression models was derived for estimating the daily total hospital waste, general hospital waste, and total hazardous waste as a function of the number of inpatients, the number of total patients, and the number of beds. Several key indicators and validation procedures indicate the high significance and reliability of the developed models in predicting the solid waste of any hospital. Methodological data were drawn from the existing scientific literature, and useful raw data were obtained from international organisations and from the personnel of the investigated hospitals. The primary generation results are compared with those of other local hospitals and of hospitals in other countries. The main outcome, the results of the developed models, is presented and analysed thoroughly. The goal is for these models to act as leverage in discussions among governmental authorities on implementing a national plan for safe hospital waste management in Palestine.
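A multiple-variable regression of the kind described here can be sketched with ordinary least squares. The hospital figures and the coefficients below are invented for illustration and are not the article's data or fitted models. The waste values are generated from an exact linear rule so that the fit can be checked against known coefficients.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical hospitals: (inpatients, total patients, beds).
hospitals = [
    (20, 150, 40), (35, 220, 60), (50, 400, 80),
    (65, 380, 120), (80, 610, 130), (110, 700, 200),
]
# Assumed "true" rule used only to generate daily waste in kg for this sketch.
true_coefs = [5.0, 0.8, 0.1, 0.3]
waste = [
    true_coefs[0] + true_coefs[1] * i + true_coefs[2] * p + true_coefs[3] * b
    for i, p, b in hospitals
]

coefs = fit_ols(hospitals, waste)
print([round(c, 4) for c in coefs])
```

The normal equations are used only to keep the sketch dependency-free. For real data, a QR-based solver from a numerical library would usually be preferred, together with the validation indicators the abstract mentions.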
Applied multivariate statistics with R
Zelterman, Daniel
2015-01-01
This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Institute of Scientific and Technical Information of China (English)
陈伟; 黄燕林
2011-01-01
Objective: To study the risk factors for anxiety and depression in peritoneal dialysis patients and to provide evidence to guide clinical nurses' psychological interventions. Methods: 169 patients on peritoneal dialysis were surveyed with the Self-Rating Anxiety Scale (SAS) and the Self-Rating Depression Scale (SDS). The data were analyzed by single-factor and multi-factor logistic regression. Results: The mean anxiety score was 41.24±9.11 and the mean depression score was 48.71±12.06. The incidences of anxiety and depression were 17.8% and 52.6%, respectively. Independent risk factors for anxiety were working status, age, dry skin, skin itching and mid-upper-arm circumference. Independent risk factors for depression were educational background, medical expenses, working status, appetite, grip strength, calf circumference, edema and skin itching. Conclusion: Many factors contribute to anxiety and depression in peritoneal dialysis patients. Medical staff should pay more attention to the psychological status of these patients and tailor psychological interventions to each patient's condition.
Institute of Scientific and Technical Information of China (English)
丁锋; 汪菲菲; 汪学海
2014-01-01
For multivariate pseudo-linear regressive moving average systems, a multivariate extended stochastic gradient (ESG) algorithm is discussed. To reduce the computational cost of the identification algorithm, we decompose a multivariate system into several subsystems and derive a partially coupled (subsystem) ESG algorithm and a partially coupled (subsystem) multi-innovation ESG algorithm according to the coupling identification concept and the multi-innovation identification theory. Furthermore, we extend these methods to multivariate pseudo-linear autoregressive moving average systems and present a partially coupled (subsystem) generalized extended stochastic gradient (GESG) algorithm and a partially coupled (subsystem) multi-innovation GESG algorithm. The computational efficiencies of the multivariate ESG algorithm, the partially coupled ESG algorithm and the partially coupled multi-innovation ESG algorithm are analyzed.
Institute of Scientific and Technical Information of China (English)
向宗灿
2015-01-01
To address problems in the budgeting process of X company, this article proposes a solution. Based on X company's financial data for the past five years, a multiple linear regression model is established and the most relevant R&D cost factors are selected in order to accurately predict X company's total R&D costs for 2014.
Evaluation of multivariate surveillance
Frisén, Marianne; Andersson, Eva; Schiöler, Linus
2009-01-01
Multivariate surveillance is of interest in many areas such as industrial production, bioterrorism detection, spatial surveillance, and financial transaction strategies. Some of the suggested approaches to multivariate surveillance have been multivariate counterparts to the univariate Shewhart, EWMA, and CUSUM methods. Our emphasis is on the special challenges of evaluating multivariate surveillance methods. Some new measures are suggested and the properties of several measures are demonstrat...
DEFF Research Database (Denmark)
Thomadsen, Tommy
2005-01-01
of different types of hierarchical networks. This is supplemented by a review of ring network design problems and a presentation of a model allowing for modeling most hierarchical networks. We use methods based on linear programming to design the hierarchical networks. Thus, a brief introduction to the various....... The thesis investigates models for hierarchical network design and methods used to design such networks. In addition, ring network design is considered, since ring networks commonly appear in the design of hierarchical networks. The thesis introduces hierarchical networks, including a classification scheme...... linear programming based methods is included. The thesis is thus suitable as a foundation for study of design of hierarchical networks. The major contribution of the thesis consists of seven papers which are included in the appendix. The papers address hierarchical network design and/or ring network...
Preference learning with evolutionary Multivariate Adaptive Regression Spline model
DEFF Research Database (Denmark)
Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll
2015-01-01
for human decision making. Learning models from pairwise preference data is however an NP-hard problem. Therefore, constructing models that can effectively learn such data is a challenging task. Models are usually constructed with accuracy being the most important factor. Another vitally important aspect...... that is usually given less attention is expressiveness, i.e. how easy it is to explain the relationship between the model input and output. Most machine learning techniques are focused either on performance or on expressiveness. This paper employs MARS models which have the advantage of being a powerful method...
Hierarchical Multiagent Reinforcement Learning
2004-01-25
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multiagent tasks. We...introduce a hierarchical multiagent reinforcement learning (RL) framework and propose a hierarchical multiagent RL algorithm called Cooperative HRL. In
Should metacognition be measured by logistic regression?
Rausch, Manuel; Zehetleitner, Michael
2017-03-01
Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria. Copyright © 2017 Elsevier Inc. All rights reserved.
DEFF Research Database (Denmark)
Thomadsen, Tommy
2005-01-01
Communication networks are immensely important today, since both companies and individuals use numerous services that rely on them. This thesis considers the design of hierarchical (communication) networks. Hierarchical networks consist of layers of networks and are well-suited for coping...... the clusters. The design of hierarchical networks involves clustering of nodes, hub selection, and network design, i.e. selection of links and routing of flows. Hierarchical networks have been in use for decades, but integrated design of these networks has only been considered for very special types of networks....... The thesis investigates models for hierarchical network design and methods used to design such networks. In addition, ring network design is considered, since ring networks commonly appear in the design of hierarchical networks. The thesis introduces hierarchical networks, including a classification scheme...
Analysis of multivariate social science data
Bartholomew, David J; Galbraith, Jane; Moustaki, Irini
2008-01-01
Drawing on the authors' varied experiences working and teaching in the field, Analysis of Multivariate Social Science Data, Second Edition enables a basic understanding of how to use key multivariate methods in the social sciences. With updates in every chapter, this edition expands its topics to include regression analysis, confirmatory factor analysis, structural equation models, and multilevel models. After emphasizing the summarization of data in the first several chapters, the authors focus on regression analysis. This chapter provides a link between the two halves of the book, signal
Directory of Open Access Journals (Sweden)
Susana de Paula Risso
2011-06-01
retrieved data obtained from the Brazilian Birth and Death Certificates of neonates born to mothers living in São José dos Campos, Brazil, from 2003 up to 2004. Variables associated with neonatal death were analyzed by multivariate analysis using the Cox model. Independent variables were: maternal age, maternal educational level, number of previous stillbirths, number of children alive in the family, single or multiple pregnancy, gestation length, type of delivery, sex, birth weight, and 1st and 5th minute Apgar scores. Significance was set at p<0.05. RESULTS: There were 131 deaths up to the 28th day after birth during the study period. Results were expressed as relative risk (RR) and 95% confidence intervals (CI). Gestational age <37 weeks (RR 6.92; 95%CI 3.64-13.17), 5th minute Apgar score <7 (RR 3.14; 95%CI 1.95-5.04), 1st minute Apgar score <7 (RR 3.48; 95%CI 2.17-5.60) and low birth weight (RR 4.49; 95%CI 3.36-8.53) were associated with neonatal death in the final model. CONCLUSIONS: Variables associated with neonatal death in São José dos Campos, Brazil, are related to quality of health care during prenatal and perinatal periods.
Switching Between Multivariable Controllers
DEFF Research Database (Denmark)
Niemann, H.; Stoustrup, Jakob; Abrahamsen, R.B.
2004-01-01
A concept for implementation of multivariable controllers is presented in this paper. The concept is based on the Youla-Jabr-Bongiorno-Kucera (YJBK) parameterization of all stabilizing controllers. By using this architecture for implementation of multivariable controllers, it is shown how...
DEFF Research Database (Denmark)
Silvennoinen, Annastiina; Teräsvirta, Timo
This article contains a review of multivariate GARCH models. Most common GARCH models are presented and their properties considered. This also includes nonparametric and semiparametric models. Existing specification and misspecification tests are discussed. Finally, there is an empirical example...... in which several multivariate GARCH models are fitted to the same data set and the results compared....
Multivariate irregular sampling theorem
Institute of Scientific and Technical Information of China (English)
CHEN GuangGui; FANG GenSun
2009-01-01
In this paper, we prove a Marcinkiewicz-Zygmund type inequality for multivariate entire functions of exponential type with non-equidistant spaced sampling points. And from this result, we establish a multivariate irregular Whittaker-Kotelnikov-Shannon type sampling theorem.
Switching Between Multivariable Controllers
DEFF Research Database (Denmark)
Niemann, Hans Henrik; Stoustrup, Jakob; Abrahamsen, Rune
2004-01-01
it is possible to smoothly switch between multivariable controllers with guaranteed closed-loop stability. This includes also the case where one or more controllers are unstable. The concept for smooth online changes of multivariable controllers based on the YJBK architecture can also handle the start up...
Competing Risks Quantile Regression at Work
DEFF Research Database (Denmark)
Dlugosz, Stephan; Lo, Simon M. S.; Wilke, Ralf
2017-01-01
Despite its emergence as a frequently used method for the empirical analysis of multivariate data, quantile regression is yet to become a mainstream tool for the analysis of duration data. We present a pioneering empirical study on the grounds of a competing risks quantile regression model. We use...
Prediction of road accidents: A Bayesian hierarchical approach.
Deublein, Markus; Schubert, Matthias; Adey, Bryan T; Köhler, Jochen; Faber, Michael H
2013-03-01
In this paper a novel methodology for the prediction of the occurrence of road accidents is presented. The methodology utilizes a combination of three statistical methods: (1) gamma-updating of the occurrence rates of injury accidents and injured road users, (2) hierarchical multivariate Poisson-lognormal regression analysis taking into account correlations amongst multiple dependent model response variables and effects of discrete accident count data e.g. over-dispersion, and (3) Bayesian inference algorithms, which are applied by means of data mining techniques supported by Bayesian Probabilistic Networks in order to represent non-linearity between risk indicating and model response variables, as well as different types of uncertainties which might be present in the development of the specific models. Prior Bayesian Probabilistic Networks are first established by means of multivariate regression analysis of the observed frequencies of the model response variables, e.g. the occurrence of an accident, and observed values of the risk indicating variables, e.g. degree of road curvature. Subsequently, parameter learning is done using updating algorithms, to determine the posterior predictive probability distributions of the model response variables, conditional on the values of the risk indicating variables. The methodology is illustrated through a case study using data of the Austrian rural motorway network. In the case study, on randomly selected road segments the methodology is used to produce a model to predict the expected number of accidents in which an injury has occurred and the expected number of light, severe and fatally injured road users. Additionally, the methodology is used for geo-referenced identification of road sections with increased occurrence probabilities of injury accident events on a road link between two Austrian cities. It is shown that the proposed methodology can be used to develop models to estimate the occurrence of road accidents for any
Multivariate data analysis of 2 DE data
DEFF Research Database (Denmark)
Wulff, Tune; Jokumsen, Alfred; Jessen, Flemming
achieved by 2-DE. Protein spots, which individually or in combination with other spots varied according to hypoxia were found by multivariate data analysis (partial least squares regression) on group scaled data (normalised spot volumes) followed by selection of significant spots by jack-knifing. Tandem...
A MULTIVARIATE ANALYSIS OF CROATIAN COUNTIES ENTREPRENEURSHIP
Directory of Open Access Journals (Sweden)
Elza Jurun
2012-12-01
The focus of this paper is a multivariate analysis of entrepreneurship in Croatian counties. The complete database made available by official statistical institutions at national and regional level is used. Modern econometric methodology is applied, ranging from comparative analysis via multiple regression to multivariate cluster analysis, together with an analysis of successful or inefficient entrepreneurship measured by indicators of efficiency, profitability and productivity. The comparative analysis covers 2004 and 2010. In the multiple regression model, the accelerators of socio-economic development (number of entrepreneur investors, investment in fixed assets and current assets ratio) are selected from among twenty-six independent variables as those with the dominant influence on GDP per capita in 2010, the dependent variable. Results of the multivariate cluster analysis of twenty-one Croatian counties are also interpreted in terms of the three Croatian NUTS 2 regions according to the European nomenclature of regional territorial division.
Varying-coefficient functional linear regression
Wu, Yichao; Müller, Hans-Georg; 10.3150/09-BEJ231
2011-01-01
Functional linear regression analysis aims to model regression relations which include a functional predictor. The analog of the regression parameter vector or matrix in conventional multivariate or multiple-response linear regression models is a regression parameter function in one or two arguments. If, in addition, one has scalar predictors, as is often the case in applications to longitudinal studies, the question arises how to incorporate these into a functional regression model. We study a varying-coefficient approach where the scalar covariates are modeled as additional arguments of the regression parameter function. This extension of the functional linear regression model is analogous to the extension of conventional linear regression models to varying-coefficient models and shares its advantages, such as increased flexibility; however, the details of this extension are more challenging in the functional case. Our methodology combines smoothing methods with regularization by truncation at a finite numb...
Methods of Multivariate Analysis
Rencher, Alvin C
2012-01-01
Praise for the Second Edition "This book is a systematic, well-written, well-organized text on multivariate analysis packed with intuition and insight . . . There is much practical wisdom in this book that is hard to find elsewhere."-IIE Transactions Filled with new and timely content, Methods of Multivariate Analysis, Third Edition provides examples and exercises based on more than sixty real data sets from a wide variety of scientific fields. It takes a "methods" approach to the subject, placing an emphasis on how students and practitioners can employ multivariate analysis in real-life sit
High-dimensional regression with unknown variance
Giraud, Christophe; Verzelen, Nicolas
2011-01-01
We review recent results for high-dimensional sparse linear regression in the practical case of unknown variance. Different sparsity settings are covered, including coordinate-sparsity, group-sparsity and variation-sparsity. The emphasis is put on non-asymptotic analyses and feasible procedures. In addition, a small numerical study compares the practical performance of three schemes for tuning the Lasso estimator, and some references are collected for more general models, including multivariate regression and nonparametric regression.
Multivariate Time Series Search
National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...
Institute of Scientific and Technical Information of China (English)
赵晔
2015-01-01
Objective: To investigate the risk factors for femoral head necrosis in patients with lupus nephritis (LN). Methods: Fifty LN patients with femoral head necrosis admitted between August 2009 and August 2013 formed the observation group, and 50 LN patients without femoral head necrosis admitted during the same period formed the control group. Pearson univariate analysis and multivariate logistic regression analysis were used to analyze the risk factors for femoral head necrosis in LN patients. Results: In the Pearson univariate analysis, the two groups differed significantly in oral ulcers, Raynaud's phenomenon, vasculitis, elevated fibrinogen (Fib), total cholesterol (TC) and triglyceride (TG) levels (P < 0.05 to 0.01). In the multivariate logistic regression analysis, oral ulcers, Raynaud's phenomenon, vasculitis, elevated Fib, and TC and TG levels (P < 0.05) were risk factors for femoral head necrosis in LN. Conclusion: The risk factors for femoral head necrosis in LN patients include oral ulcers, Raynaud's phenomenon, vasculitis, elevated Fib, and TC and TG levels; in clinical practice, measures should be taken to address these factors.
Classifying hospitals as mortality outliers: logistic versus hierarchical logistic models.
Alexandrescu, Roxana; Bottle, Alex; Jarman, Brian; Aylin, Paul
2014-05-01
The use of hierarchical logistic regression for provider profiling has been recommended due to the clustering of patients within hospitals, but has some associated difficulties. We assess changes in hospital outlier status based on standard logistic versus hierarchical logistic modelling of mortality. The study population consisted of all patients admitted to acute, non-specialist hospitals in England between 2007 and 2011 with a primary diagnosis of acute myocardial infarction, acute cerebrovascular disease or fracture of neck of femur or a primary procedure of coronary artery bypass graft or repair of abdominal aortic aneurysm. We compared standardised mortality ratios (SMRs) from non-hierarchical models with SMRs from hierarchical models, without and with shrinkage estimates of the predicted probabilities (Model 1 and Model 2). The SMRs from standard logistic and hierarchical models were highly statistically significantly correlated (r > 0.91, p = 0.01). More outliers were recorded in the standard logistic regression than hierarchical modelling only when using shrinkage estimates (Model 2): 21 hospitals (out of a cumulative number of 565 pairs of hospitals under study) changed from a low outlier and 8 hospitals changed from a high outlier based on the logistic regression to a not-an-outlier based on shrinkage estimates. Both standard logistic and hierarchical modelling have identified nearly the same hospitals as mortality outliers. The choice of methodological approach should, however, also consider whether the modelling aim is judgment or improvement, as shrinkage may be more appropriate for the former than the latter.
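As a rough illustration of the quantity compared in this abstract, a standardised mortality ratio can be computed as observed deaths divided by expected deaths, where the expected count is the sum of model-predicted death probabilities. The sketch below assumes that definition; the function name and the patient data are hypothetical, and the shrinkage step of the hierarchical models is not reproduced.

```python
def standardised_mortality_ratio(deaths, predicted_probs):
    """Observed deaths divided by expected deaths for one hospital.

    deaths          -- 0/1 outcome per patient (hypothetical data)
    predicted_probs -- risk-adjusted death probability per patient, taken
                       from whichever model is used (logistic or hierarchical)
    """
    observed = sum(deaths)
    expected = sum(predicted_probs)
    return observed / expected


# Hypothetical hospital with five patients.
deaths = [1, 0, 0, 1, 0]
predicted_probs = [0.2, 0.1, 0.3, 0.4, 0.25]
smr = standardised_mortality_ratio(deaths, predicted_probs)
```

A ratio above 1 means more deaths than the model expected. Whether a hospital counts as an outlier additionally depends on the interval estimate, which this sketch omits.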
Directory of Open Access Journals (Sweden)
X. Chen
2013-09-01
A hierarchical Bayesian model for forecasting regional summer rainfall and streamflow season-ahead using exogenous climate variables for East Central China is presented. The model provides estimates of the posterior forecasted probability distribution for 12 rainfall and 2 streamflow stations, considering parameter uncertainty and cross-site correlation. The model has a multilevel structure with regression coefficients modeled from a common multivariate normal distribution, which results in partial pooling of information across multiple stations and better representation of parameter and posterior distribution uncertainty. The covariance structure of the residuals across stations is explicitly modeled. Model performance is tested under leave-10-out cross-validation. Frequentist and Bayesian performance metrics used include Receiver Operating Characteristic, Reduction of Error, Coefficient of Efficiency, Rank Probability Skill Scores, and coverage by posterior credible intervals. The ability of the model to reliably forecast regional summer rainfall and streamflow season-ahead offers potential for developing adaptive water risk management strategies.
Multivariate Birkhoff interpolation
Lorentz, Rudolph A
1992-01-01
The subject of this book is Lagrange, Hermite and Birkhoff (lacunary Hermite) interpolation by multivariate algebraic polynomials. It unifies and extends a new algorithmic approach to this subject which was introduced and developed by G.G. Lorentz and the author. One particularly interesting feature of this algorithmic approach is that it obviates the necessity of finding a formula for the Vandermonde determinant of a multivariate interpolation in order to determine its regularity (which formulas are practically unknown anyways) by determining the regularity through simple geometric manipulations in the Euclidean space. Although interpolation is a classical problem, it is surprising how little is known about its basic properties in the multivariate case. The book therefore starts by exploring its fundamental properties and its limitations. The main part of the book is devoted to a complete and detailed elaboration of the new technique. A chapter with an extensive selection of finite elements follows as well a...
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Causal diagrams and multivariate analysis II: precision work.
Jupiter, Daniel C
2014-01-01
In this Investigators' Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision. Copyright © 2014 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Micromechanics of hierarchical materials
DEFF Research Database (Denmark)
Mishnaevsky, Leon, Jr.
2012-01-01
A short overview of micromechanical models of hierarchical materials (hybrid composites, biomaterials, fractal materials, etc.) is given. Several examples of the modeling of strength and damage in hierarchical materials are summarized, among them, 3D FE model of hybrid composites with nanoengineered matrix, fiber bundle model of UD composites with hierarchically clustered fibers and 3D multilevel model of wood considered as a gradient, cellular material with layered composite cell walls. The main areas of research in micromechanics of hierarchical materials are identified, among them, the investigations of the effects of load redistribution between reinforcing elements at different scale levels, of the possibilities to control different material properties and to ensure synergy of strengthening effects at different scale levels and using the nanoreinforcement effects. The main future directions...
Hierarchical auxetic mechanical metamaterials.
Gatt, Ruben; Mizzi, Luke; Azzopardi, Joseph I; Azzopardi, Keith M; Attard, Daphne; Casha, Aaron; Briffa, Joseph; Grima, Joseph N
2015-02-11
Auxetic mechanical metamaterials are engineered systems that exhibit the unusual macroscopic property of a negative Poisson's ratio due to sub-unit structure rather than chemical composition. Although their unique behaviour makes them superior to conventional materials in many practical applications, they are limited in availability. Here, we propose a new class of hierarchical auxetics based on the rotating rigid units mechanism. These systems retain the enhanced properties from having a negative Poisson's ratio with the added benefits of being a hierarchical system. Using simulations on typical hierarchical multi-level rotating squares, we show that, through design, one can control the extent of auxeticity, degree of aperture and size of the different pores in the system. This makes the system more versatile than similar non-hierarchical ones, making them promising candidates for industrial and biomedical applications, such as stents and skin grafts.
Introduction into Hierarchical Matrices
Litvinenko, Alexander
2013-12-05
Hierarchical matrices allow us to reduce computational storage and cost from cubic to almost linear. This technique can be applied for solving PDEs, integral equations, matrix equations and approximation of large covariance and precision matrices.
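To give a feel for where the storage saving comes from, the sketch below counts stored numbers for a dense matrix and for a simple hierarchical layout. It assumes a HODLR-style format (diagonal blocks split recursively, each off-diagonal block stored as a rank-k factor pair), which is only one of several hierarchical matrix formats. The function names, the rank and the leaf size are illustrative assumptions.

```python
def dense_storage(n):
    """Number of entries stored for a dense n x n matrix."""
    return n * n


def hodlr_storage(n, rank, leaf=32):
    """Entries stored for an n x n matrix in an assumed HODLR-style layout.

    Diagonal blocks are split recursively until they are at most `leaf`
    wide and then stored densely. Each off-diagonal block of shape
    (rows x cols) is stored as two factors with `rank` columns each,
    i.e. rank * (rows + cols) numbers.
    """
    if n <= leaf:
        return n * n
    half = n // 2
    rest = n - half
    off_diagonal = 2 * rank * (half + rest)  # two low-rank blocks per level
    return hodlr_storage(half, rank, leaf) + hodlr_storage(rest, rank, leaf) + off_diagonal


n = 1024
dense = dense_storage(n)
compressed = hodlr_storage(n, rank=4, leaf=32)
```

For fixed rank, the compressed count grows roughly like n log n rather than n squared. This matches the "almost linear" behaviour described in the abstract only if the off-diagonal blocks really are well approximated at low rank.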
Hierarchical Auxetic Mechanical Metamaterials
Gatt, Ruben; Mizzi, Luke; Azzopardi, Joseph I.; Azzopardi, Keith M.; Attard, Daphne; Casha, Aaron; Briffa, Joseph; Grima, Joseph N.
2015-02-01
Auxetic mechanical metamaterials are engineered systems that exhibit the unusual macroscopic property of a negative Poisson's ratio due to sub-unit structure rather than chemical composition. Although their unique behaviour makes them superior to conventional materials in many practical applications, they are limited in availability. Here, we propose a new class of hierarchical auxetics based on the rotating rigid units mechanism. These systems retain the enhanced properties from having a negative Poisson's ratio with the added benefits of being a hierarchical system. Using simulations on typical hierarchical multi-level rotating squares, we show that, through design, one can control the extent of auxeticity, degree of aperture and size of the different pores in the system. This makes the system more versatile than similar non-hierarchical ones, making them promising candidates for industrial and biomedical applications, such as stents and skin grafts.
Applied Bayesian Hierarchical Methods
Congdon, Peter D
2010-01-01
Bayesian methods facilitate the analysis of complex models and data structures. Emphasizing data applications, alternative modeling specifications, and computer implementation, this book provides a practical overview of methods for Bayesian analysis of hierarchical models.
Programming with Hierarchical Maps
DEFF Research Database (Denmark)
Ørbæk, Peter
This report describes the hierarchical maps used as a central data structure in the Corundum framework. We describe its most prominent features, argue for its usefulness and briefly describe some of the software prototypes implemented using the technology....
Catalysis with hierarchical zeolites
DEFF Research Database (Denmark)
Holm, Martin Spangsberg; Taarning, Esben; Egeblad, Kresten
2011-01-01
Hierarchical (or mesoporous) zeolites have attracted significant attention during the first decade of the 21st century, and so far this interest continues to increase. There have already been several reviews giving detailed accounts of the developments emphasizing different aspects of this research topic. Until now, the main reason for developing hierarchical zeolites has been to achieve heterogeneous catalysts with improved performance, but this particular facet has not yet been reviewed in detail. Thus, the present paper summarizes and categorizes the catalytic studies utilizing hierarchical zeolites that have been reported hitherto. Prototypical examples from some of the different categories of catalytic reactions that have been studied using hierarchical zeolite catalysts are highlighted. This clearly illustrates the different ways that improved performance can be achieved with this family...
Semiparametric Quantile Modelling of Hierarchical Data
Institute of Scientific and Technical Information of China (English)
Mao Zai TIAN; Man Lai TANG; Ping Shing CHAN
2009-01-01
The classic hierarchical linear model formulation provides considerable flexibility for modelling the random effects structure and a powerful tool for analyzing nested data that arise in various areas such as biology, economics and education. However, it assumes the within-group errors to be independently and identically distributed (i.i.d.) and models at all levels to be linear. Most importantly, traditional hierarchical models (just like other ordinary mean regression methods) cannot characterize the entire conditional distribution of a dependent variable given a set of covariates and fail to yield robust estimators. In this article, we relax the aforementioned assumptions and the normality assumption, and develop so-called Hierarchical Semiparametric Quantile Regression Models in which the within-group errors may be heteroscedastic and models at some levels are allowed to be nonparametric. We present the ideas with a 2-level model. The level-1 model is specified as a nonparametric model whereas the level-2 model is set as a parametric model. Under the proposed semiparametric setting the vector of partial derivatives of the nonparametric function in level 1 becomes the response variable vector in level 2. The proposed method allows us to model the fixed effects in the innermost level (i.e., level 2) as a function of the covariates instead of a constant effect. We outline some mild regularity conditions required for convergence and asymptotic normality for our estimators. We illustrate our methodology with a real hierarchical data set from a laboratory study and some simulation studies.
Multivariate bubbles and antibubbles
Fry, John
2014-08-01
In this paper we develop models for multivariate financial bubbles and antibubbles based on statistical physics. In particular, we extend a rich set of univariate models to higher dimensions. Changes in market regime can be explicitly shown to represent a phase transition from random to deterministic behaviour in prices. Moreover, our multivariate models are able to capture some of the contagious effects that occur during such episodes. We are able to show that declining lending quality helped fuel a bubble in the US stock market prior to 2008. Further, our approach offers interesting insights into the spatial development of UK house prices.
DEFF Research Database (Denmark)
Hansen, Michael Adsetts Edberg
Interest in statistical methodology is increasing so rapidly in the astronomical community that accessible introductory material in this area is long overdue. This book fills the gap by providing a presentation of the most useful techniques in multivariate statistics. A wide-ranging annotated set...
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger
2011-01-01
We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement error of certain types and can also handle non-synchronous trading. It is the first estimator...
A MULTIVARIATE WEIBULL DISTRIBUTION
Directory of Open Access Journals (Sweden)
Cheng Lee
2010-07-01
A multivariate survival function of Weibull Distribution is developed by expanding the theorem by Lu and Bhattacharyya. From the survival function, the probability density function, the cumulative probability function, the determinant of the Jacobian Matrix, and the general moment are derived.
On Bayesian shared component disease mapping and ecological regression with errors in covariates.
MacNab, Ying C
2010-05-20
Recent literature on Bayesian disease mapping presents shared component models (SCMs) for joint spatial modeling of two or more diseases with common risk factors. In this study, Bayesian hierarchical formulations of shared component disease mapping and ecological models are explored and developed in the context of ecological regression, taking into consideration errors in covariates. A review of multivariate disease mapping models (MultiVMs) such as the multivariate conditional autoregressive models that are also part of the more recent Bayesian disease mapping literature is presented. Some insights into the connections and distinctions between the SCM and MultiVM procedures are communicated. Important issues surrounding (appropriate) formulation of shared- and disease-specific components, consideration/choice of spatial or non-spatial random effects priors, and identification of model parameters in SCMs are explored and discussed in the context of spatial and ecological analysis of small area multivariate disease or health outcome rates and associated ecological risk factors. The methods are illustrated through an in-depth analysis of four-variate road traffic accident injury (RTAI) data: gender-specific fatal and non-fatal RTAI rates in 84 local health areas in British Columbia (Canada). Fully Bayesian inference via Markov chain Monte Carlo simulations is presented.
What are hierarchical models and how do we analyze them?
Royle, Andy
2016-01-01
In this chapter we provide a basic definition of hierarchical models and introduce the two canonical hierarchical models in this book: site occupancy and N-mixture models. The former is a hierarchical extension of logistic regression and the latter is a hierarchical extension of Poisson regression. We introduce basic concepts of probability modeling and statistical inference including likelihood and Bayesian perspectives. We go through the mechanics of maximizing the likelihood and characterizing the posterior distribution by Markov chain Monte Carlo (MCMC) methods. We give a general perspective on topics such as model selection and assessment of model fit, although we demonstrate these topics in practice in later chapters (especially Chapters 5, 6, 7, and 10).
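The site-occupancy model mentioned in this abstract is commonly written as a two-level model: a site is occupied with probability psi, and an occupied site yields a detection on each visit with probability p. The sketch below assumes that simplest form, with constant psi and p and no covariates. The function name and parameter values are hypothetical and are not taken from the book.

```python
from math import comb


def site_likelihood(detections, visits, psi, p):
    """Marginal likelihood of one site's detection count.

    Assumed model: occupancy z ~ Bernoulli(psi); given z = 1 the number of
    detections over `visits` surveys is Binomial(visits, p); given z = 0
    no detections are possible.
    """
    occupied = psi * comb(visits, detections) * p**detections * (1 - p) ** (visits - detections)
    # An all-zero history can also arise from a site that is simply unoccupied.
    unoccupied = (1 - psi) if detections == 0 else 0.0
    return occupied + unoccupied


# Hypothetical values: 60% occupancy, 50% per-visit detection, 3 visits.
lik_zero = site_likelihood(0, visits=3, psi=0.6, p=0.5)
lik_two = site_likelihood(2, visits=3, psi=0.6, p=0.5)
total = sum(site_likelihood(y, 3, 0.6, 0.5) for y in range(4))
```

Multiplying such terms over sites gives the likelihood that would be maximized, or combined with priors for MCMC, in the workflow the abstract describes.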
Parallel hierarchical radiosity rendering
Energy Technology Data Exchange (ETDEWEB)
Carter, M.
1993-07-01
In this dissertation, the step-by-step development of a scalable parallel hierarchical radiosity renderer is documented. First, a new look is taken at the traditional radiosity equation, and a new form is presented in which the matrix of linear system coefficients is transformed into a symmetric matrix, thereby simplifying the problem and enabling a new solution technique to be applied. Next, the state-of-the-art hierarchical radiosity methods are examined for their suitability to parallel implementation, and scalability. Significant enhancements are also discovered which both improve their theoretical foundations and improve the images they generate. The resultant hierarchical radiosity algorithm is then examined for sources of parallelism, and for an architectural mapping. Several architectural mappings are discussed. A few key algorithmic changes are suggested during the process of making the algorithm parallel. Next, the performance, efficiency, and scalability of the algorithm are analyzed. The dissertation closes with a discussion of several ideas which have the potential to further enhance the hierarchical radiosity method, or provide an entirely new forum for the application of hierarchical methods.
Skopina, Maria; Protasov, Vladimir
2016-01-01
This book presents a systematic study of multivariate wavelet frames with matrix dilation, in particular, orthogonal and bi-orthogonal bases, which are a special case of frames. Further, it provides algorithmic methods for the construction of dual and tight wavelet frames with a desirable approximation order, namely compactly supported wavelet frames, which are commonly required by engineers. It particularly focuses on methods of constructing them. Wavelet bases and frames are actively used in numerous applications such as audio and graphic signal processing, compression and transmission of information. They are especially useful in image recovery from incomplete observed data due to the redundancy of frame systems. The construction of multivariate wavelet frames, especially bases, with desirable properties remains a challenging problem as although a general scheme of construction is well known, its practical implementation in the multidimensional setting is difficult. Another important feature of wavelet is ...
Multivariate calculus and geometry
Dineen, Seán
2014-01-01
Multivariate calculus can be understood best by combining geometric insight, intuitive arguments, detailed explanations and mathematical reasoning. This textbook has successfully followed this programme. It additionally provides a solid description of the basic concepts, via familiar examples, which are then tested in technically demanding situations. In this new edition the introductory chapter and two of the chapters on the geometry of surfaces have been revised. Some exercises have been replaced and others provided with expanded solutions. Familiarity with partial derivatives and a course in linear algebra are essential prerequisites for readers of this book. Multivariate Calculus and Geometry is aimed primarily at higher level undergraduates in the mathematical sciences. The inclusion of many practical examples involving problems of several variables will appeal to mathematics, science and engineering students.
Multivariate Quantitative Chemical Analysis
Kinchen, David G.; Capezza, Mary
1995-01-01
Technique of multivariate quantitative chemical analysis devised for use in determining relative proportions of two components mixed and sprayed together onto object to form thermally insulating foam. Potentially adaptable to other materials, especially in process-monitoring applications in which necessary to know and control critical properties of products via quantitative chemical analyses of products. In addition to chemical composition, also used to determine such physical properties as densities and strengths.
Multivariate $\alpha$-molecules
Flinth, Axel; Schäfer, Martin
2015-01-01
The suboptimal performance of wavelets with regard to the approximation of multivariate data gave rise to new representation systems, specifically designed for data with anisotropic features. Some prominent examples of these are given by ridgelets, curvelets, and shearlets, to name a few. The great variety of such so-called directional systems motivated the search for a common framework, which unites many under one roof and enables a simultaneous analysis, for example with respect to approxim...
Transient multivariable sensor evaluation
Energy Technology Data Exchange (ETDEWEB)
Vilim, Richard B.; Heifetz, Alexander
2017-02-21
A method and system for performing transient multivariable sensor evaluation. The method and system includes a computer system for identifying a model form, providing training measurement data, generating a basis vector, monitoring system data from sensor, loading the system data in a non-transient memory, performing an estimation to provide desired data and comparing the system data to the desired data and outputting an alarm for a defective sensor.
Regression analysis by example
National Research Council Canada - National Science Library
Chatterjee, Samprit; Hadi, Ali S
2012-01-01
.... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...
Hierarchical spatial point process analysis for a plant community with high biodiversity
DEFF Research Database (Denmark)
Illian, Janine B.; Møller, Jesper; Waagepetersen, Rasmus
2009-01-01
A complex multivariate spatial point pattern of a plant community with high biodiversity is modelled using a hierarchical multivariate point process model. In the model, interactions between plants with different post-fire regeneration strategies are of key interest. We consider initially a maximum...
Neutrosophic Hierarchical Clustering Algoritms
Directory of Open Access Journals (Sweden)
Rıdvan Şahin
2014-03-01
Interval neutrosophic set (INS) is a generalization of interval valued intuitionistic fuzzy set (IVIFS), in which the membership and non-membership values of elements are intervals, while single valued neutrosophic set (SVNS) is regarded as an extension of intuitionistic fuzzy set (IFS). In this paper, we extend the hierarchical clustering techniques proposed for IFSs and IVIFSs to SVNSs and INSs respectively. Based on the traditional hierarchical clustering procedure, the single valued neutrosophic aggregation operator, and the basic distance measures between SVNSs, we define a single valued neutrosophic hierarchical clustering algorithm for clustering SVNSs. Then we extend the algorithm to classify interval neutrosophic data. Finally, we present some numerical examples in order to show the effectiveness and availability of the developed clustering algorithms.
Directory of Open Access Journals (Sweden)
Joana Vitte
2017-08-01
Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af “precipitins,” and anti-Af IgG. Progress achieved over the last four decades led to multiple IgE and IgG(4) Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may bring about a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.
Vitte, Joana; Ranque, Stéphane; Carsin, Ania; Gomez, Carine; Romain, Thomas; Cassagne, Carole; Gouitaa, Marion; Baravalle-Einaudi, Mélisande; Bel, Nathalie Stremler-Le; Reynaud-Gaubert, Martine; Dubus, Jean-Christophe; Mège, Jean-Louis; Gaudart, Jean
2017-01-01
Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af "precipitins," and anti-Af IgG. Progress achieved over the last four decades led to multiple IgE and IgG(4) Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may bring about a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.
Multivariate methods and forecasting with IBM SPSS statistics
Aljandali, Abdulkader
2017-01-01
This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...
Multivariate Statistical Process Control
DEFF Research Database (Denmark)
Kulahci, Murat
2013-01-01
As sensor and computer technology continues to improve, it becomes a normal occurrence that we are confronted with high dimensional data sets. As in many areas of industrial statistics, this brings forth various challenges in statistical process control (SPC) and monitoring for which the aim...... is to identify the “out-of-control” state of a process using control charts in order to reduce the excessive variation caused by so-called assignable causes. In practice, the most common method of monitoring multivariate data is through a statistic akin to Hotelling’s T2. For high dimensional data with excessive...
Hierarchical Porous Structures
Energy Technology Data Exchange (ETDEWEB)
Grote, Christopher John [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)
2016-06-07
Materials design is often at the forefront of technological innovation. While there has always been a push to generate increasingly low density materials, such as aerogels or hydrogels, more recently the idea of bicontinuous structures has come more into play. This review will cover some of the methods and applications for generating both porous and hierarchically porous structures.
Institute of Scientific and Technical Information of China (English)
王莉; 尹春燕; 肖延风; 薛晚利
2013-01-01
Objective: To study the relation between BMI and risk factors for obesity-related disease in children with simple obesity, and to provide guidance for preventing the spread of obesity-related disease in children. Methods: We recruited 68 children with simple obesity and measured their height and body weight according to standardized techniques. Fasting blood samples were tested for lipids, liver function, blood glucose, insulin, C-reactive protein, monocyte chemotactic factor, and visfatin. Results: Linear correlation analysis showed that BMI had a simple linear correlation with C-reactive protein, monocyte chemotactic factor, aspartate aminotransferase, triglycerides, total cholesterol, high-density lipoprotein, low-density lipoprotein, and insulin. Multivariate linear regression showed that triglycerides, insulin, C-reactive protein, and low-density lipoprotein were closely related to BMI. Conclusion: Triglycerides, insulin, C-reactive protein, and low-density lipoprotein are closely related to obesity-related diseases. Reducing BMI in obese children is important for preventing lipid disorders, inflammatory diseases, and type 2 diabetes.
Urban water quality evaluation using multivariate analysis
Directory of Open Access Journals (Sweden)
Petr Praus
2007-06-01
A data set, obtained for the sake of drinking water quality monitoring, was analysed by multivariate methods. Principal component analysis (PCA) reduced the data dimensionality from 18 original physico-chemical and microbiological parameters determined in drinking water samples to 6 principal components explaining about 83% of the data variability. These 6 components represented inorganic salts, nitrate/pH, iron, chlorine, nitrite/ammonium traces, and heterotrophic bacteria. Using the PCA scatter plot and the Ward's clustering of the samples characterized by the first and second principal components, three clusters were revealed. These clusters sorted drinking water samples according to their origin - ground and surface water. The PCA results were confirmed by the factor analysis and hierarchical clustering of the original data.
Logistic regression: a brief primer.
Stoltzfus, Jill C
2011-10-01
Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). The resulting logistic regression model
Approximation by Multivariate Singular Integrals
Anastassiou, George A
2011-01-01
Approximation by Multivariate Singular Integrals is the first monograph to illustrate the approximation of multivariate singular integrals to the identity-unit operator. The basic approximation properties of the general multivariate singular integral operators are presented quantitatively; in particular, special cases such as the multivariate Picard, Gauss-Weierstrass, Poisson-Cauchy and trigonometric singular integral operators are examined thoroughly. This book studies the rate of convergence of these operators to the unit operator as well as the related simultaneous approximation. The last cha
Distributed Monitoring of the R2 Statistic for Linear Regression
National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR), which does not require ...
Analog multivariate counting analyzers
Nikitin, A V; Armstrong, T P
2003-01-01
Characterizing rates of occurrence of various features of a signal is of great importance in numerous types of physical measurements. Such signal features can be defined as certain discrete coincidence events, e.g. crossings of a signal with a given threshold, or occurrence of extrema of a certain amplitude. We describe measuring rates of such events by means of analog multivariate counting analyzers. Given a continuous scalar or multicomponent (vector) input signal, an analog counting analyzer outputs a continuous signal with the instantaneous magnitude equal to the rate of occurrence of certain coincidence events. The analog nature of the proposed analyzers allows us to reformulate many problems of the traditional counting measurements, and cast them in a form which is readily addressed by methods of differential calculus rather than by algebraic or logical means of digital signal processing. Analog counting analyzers can be easily implemented in discrete or integrated electronic circuits, do not suffer fro...
Noor Rashidah Rashid
2012-01-01
Cluster analysis is a multivariate method in statistics. Agglomerative hierarchical cluster analysis is one of the approaches to cluster analysis. Two of its linkage methods are single linkage and complete linkage. The purpose of this study is to compare single linkage and complete linkage in agglomerative hierarchical cluster analysis. The comparison of performance between these linkage methods was shown by using Kruskal-Wallis tes...
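The difference between the two linkage rules named in this abstract can be illustrated with a minimal sketch. It assumes one-dimensional points and a hypothetical helper `agglomerate`; it is not the study's data or procedure, and it does not reproduce the Kruskal-Wallis comparison. Single linkage measures the distance between two clusters by their closest pair of members, complete linkage by their farthest pair.

```python
from itertools import combinations


def agglomerate(points, k, linkage="single"):
    """Merge 1-D points bottom-up until k clusters remain (illustrative helper)."""
    # single linkage: closest pair of members; complete linkage: farthest pair
    pick = min if linkage == "single" else max
    clusters = [[p] for p in points]

    def dist(pair):
        a, b = pair
        return pick(abs(x - y) for x in clusters[a] for y in clusters[b])

    while len(clusters) > k:
        # find the two clusters that are nearest under the chosen linkage
        a, b = min(combinations(range(len(clusters)), 2), key=dist)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # a < b, so index a is unaffected
    return sorted(sorted(c) for c in clusters)


points = [0.0, 1.0, 2.1, 3.3, 4.6, 6.0]  # made-up points with slowly growing gaps
print(agglomerate(points, 2, "single"))
print(agglomerate(points, 2, "complete"))
```

On these made-up points single linkage chains five points into one cluster and leaves the last point alone, while complete linkage produces a more balanced split.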
Multivariate statistical methods a primer
Manly, Bryan FJ
2004-01-01
THE MATERIAL OF MULTIVARIATE ANALYSIS: Examples of Multivariate Data; Preview of Multivariate Methods; The Multivariate Normal Distribution; Computer Programs; Graphical Methods; Chapter Summary; References. MATRIX ALGEBRA: The Need for Matrix Algebra; Matrices and Vectors; Operations on Matrices; Matrix Inversion; Quadratic Forms; Eigenvalues and Eigenvectors; Vectors of Means and Covariance Matrices; Further Reading; Chapter Summary; References. DISPLAYING MULTIVARIATE DATA: The Problem of Displaying Many Variables in Two Dimensions; Plotting Index Variables; The Draftsman's Plot; The Representation of Individual Data Points; Profiles o
Fast, Linear Time Hierarchical Clustering using the Baire Metric
Contreras, Pedro
2011-01-01
The Baire metric induces an ultrametric on a dataset and is of linear computational complexity, contrasted with the standard quadratic time agglomerative hierarchical clustering algorithm. In this work we evaluate empirically this new approach to hierarchical clustering. We compare hierarchical clustering based on the Baire metric with (i) agglomerative hierarchical clustering, in terms of algorithm properties; (ii) generalized ultrametrics, in terms of definition; and (iii) fast clustering through k-means partitioning, in terms of quality of results. For the latter, we carry out an in-depth astronomical study. We apply the Baire distance to spectrometric and photometric redshifts from the Sloan Digital Sky Survey using, in this work, about half a million astronomical objects. We want to know how well the (more costly to determine) spectrometric redshifts can predict the (more easily obtained) photometric redshifts, i.e. we seek to regress the spectrometric on the photometric redshifts, and we use clusterwi...
Hierarchical manifold learning.
Bhatia, Kanwal K; Rao, Anil; Price, Anthony N; Wolz, Robin; Hajnal, Jo; Rueckert, Daniel
2012-01-01
We present a novel method of hierarchical manifold learning which aims to automatically discover regional variations within images. This involves constructing manifolds in a hierarchy of image patches of increasing granularity, while ensuring consistency between hierarchy levels. We demonstrate its utility in two very different settings: (1) to learn the regional correlations in motion within a sequence of time-resolved images of the thoracic cavity; (2) to find discriminative regions of 3D brain images in the classification of neurodegenerative disease,
Hierarchically Structured Electrospun Fibers
Directory of Open Access Journals (Sweden)
Nicole E. Zander
2013-01-01
Traditional electrospun nanofibers have a myriad of applications ranging from scaffolds for tissue engineering to components of biosensors and energy harvesting devices. The generally smooth one-dimensional structure of the fibers has stood as a limitation to several interesting novel applications. Control of fiber diameter, porosity and collector geometry will be briefly discussed, as will more traditional methods for controlling fiber morphology and fiber mat architecture. The remainder of the review will focus on new techniques to prepare hierarchically structured fibers. Fibers with hierarchical primary structures (including helical, buckled, and beads-on-a-string fibers), as well as fibers with secondary structures, such as nanopores, nanopillars, nanorods, and internally structured fibers, and their applications will be discussed. These new materials with helical/buckled morphology are expected to possess unique optical and mechanical properties with possible applications for negative refractive index materials, highly stretchable/high-tensile-strength materials, and components in microelectromechanical devices. Core-shell type fibers enable a much wider variety of materials to be electrospun and are expected to be widely applied in the sensing, drug delivery/controlled release fields, and in the encapsulation of live cells for biological applications. Materials with a hierarchical secondary structure are expected to provide new superhydrophobic and self-cleaning materials.
Pearce, Dave; Walter, Anton; Lupton, W. F.; Warren-Smith, Rodney F.; Lawden, Mike; McIlwrath, Brian; Peden, J. C. M.; Jenness, Tim; Draper, Peter W.
2015-02-01
The Hierarchical Data System (HDS) is a file-based hierarchical data system designed for the storage of a wide variety of information. It is particularly suited to the storage of large multi-dimensional arrays (with their ancillary data) where efficient access is needed. It is a key component of the Starlink software collection (ascl:1110.012) and is used by the Starlink N-Dimensional Data Format (NDF) library (ascl:1411.023). HDS organizes data into hierarchies, broadly similar to the directory structure of a hierarchical filing system, but contained within a single HDS container file. The structures stored in these files are self-describing and flexible; HDS supports modification and extension of structures previously created, as well as functions such as deletion, copying, and renaming. All information stored in HDS files is portable between the machines on which HDS is implemented. Thus, there are no format conversion problems when moving between machines. HDS can write files in a private binary format (version 4), or be layered on top of HDF5 (version 5).
Hierarchical video summarization
Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.
1998-12-01
We address the problem of key-frame summarization of video in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
Astronomical Methods for Nonparametric Regression
Steinhardt, Charles L.; Jermyn, Adam
2017-01-01
I will discuss commonly used techniques for nonparametric regression in astronomy. We find that several of them, particularly running averages and running medians, are generically biased, asymmetric between dependent and independent variables, and perform poorly in recovering the underlying function, even when errors are present only in one variable. We then examine less-commonly used techniques such as Multivariate Adaptive Regressive Splines and Boosted Trees and find them superior in bias, asymmetry, and variance both theoretically and in practice under a wide range of numerical benchmarks. In this context the chief advantage of the common techniques is runtime, which even for large datasets is now measured in microseconds compared with milliseconds for the more statistically robust techniques. This points to a tradeoff between bias, variance, and computational resources which in recent years has shifted heavily in favor of the more advanced methods, primarily driven by Moore's Law. Along these lines, we also propose a new algorithm which has better overall statistical properties than all techniques examined thus far, at the cost of significantly worse runtime, in addition to providing guidance on choosing the nonparametric regression technique most suitable to any specific problem. We then examine the more general problem of errors in both variables and provide a new algorithm which performs well in most cases and lacks the clear asymmetry of existing non-parametric methods, which fail to account for errors in both variables.
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Scale of association: hierarchical linear models and the measurement of ecological systems
Sean M. McMahon; Jeffrey M. Diez
2007-01-01
A fundamental challenge to understanding patterns in ecological systems lies in employing methods that can analyse, test and draw inference from measured associations between variables across scales. Hierarchical linear models (HLM) use advanced estimation algorithms to measure regression relationships and variance-covariance parameters in hierarchically structured...
Unitary Response Regression Models
Lipovetsky, S.
2007-01-01
The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…
Flexible survival regression modelling
DEFF Research Database (Denmark)
Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben
2009-01-01
Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by m...... treatment of the topic is based on the perspective of applied researchers using quantile regression in their empirical work....
Practical multivariate analysis
Afifi, Abdelmonem; Clark, Virginia A
2011-01-01
"First of all, it is very easy to read. … The authors manage to introduce and (at least partially) explain even quite complex concepts, e.g. eigenvalues, in an easy and pedagogical way that I suppose is attractive to readers without deeper statistical knowledge. The text is also sprinkled with references for those who want to probe deeper into a certain topic. Secondly, I personally find the book's emphasis on practical data handling very appealing. … Thirdly, the book gives very nice coverage of regression analysis. … this is a nicely written book that gives a good overview of a large number
Naghshpour, Shahdad
2012-01-01
Regression analysis is the most commonly used statistical method in the world. Although few would characterize this technique as simple, regression is in fact both simple and elegant. The complexity that many attribute to regression analysis is often a reflection of their lack of familiarity with the language of mathematics. But regression analysis can be understood even without a mastery of sophisticated mathematical concepts. This book provides the foundation and will help demystify regression analysis using examples from economics and with real data to show the applications of the method. T
A hierarchical linear model for tree height prediction.
Vicente J. Monleon
2003-01-01
Measuring tree height is a time-consuming process. Often, tree diameter is measured and height is estimated from a published regression model. Trees used to develop these models are clustered into stands, but this structure is ignored and independence is assumed. In this study, hierarchical linear models that account explicitly for the clustered structure of the data...
The toolkit for multivariate data analysis TMVA 4
Speckmayer, P; Stelzer, J; Voss, H
2010-01-01
The toolkit for multivariate analysis, TMVA, provides a large set of advanced multivariate analysis techniques for signal/background classification. In addition, TMVA now also contains regression analysis, all embedded in a framework capable of handling the preprocessing of the data and the evaluation of the output, thus allowing a simple and convenient use of multivariate techniques. The analysis techniques implemented in TMVA can be invoked easily and the direct comparison of their performance allows the user to choose the most appropriate for a particular data analysis. This article gives an overview of the TMVA package and presents recently developed features.
A Hierarchical Framework for Facial Age Estimation
Directory of Open Access Journals (Sweden)
Yuyu Liang
2014-01-01
Age estimation is a complex problem of multiclass classification or regression. To address the problems of uneven distribution of age databases and neglect of ordinal information, this paper presents a hierarchical age estimation system, comprising age group estimation and specific age estimation. In our system, two novel classifiers, sequence k-nearest neighbor (SKNN) and ranking-KNN, are introduced to predict age group and value, respectively. Notably, ranking-KNN utilizes the ordinal information between samples in the estimation process rather than regarding samples as separate individuals. Tested on the FG-NET database, our system achieves an MAE (mean absolute error) of 4.97 for age estimation.
Multivariate statistics exercises and solutions
Härdle, Wolfgang Karl
2015-01-01
The authors present tools and concepts of multivariate data analysis by means of exercises and their solutions. The first part is devoted to graphical techniques. The second part deals with multivariate random variables and presents the derivation of estimators and tests for various practical situations. The last part introduces a wide variety of exercises in applied multivariate data analysis. The book demonstrates the application of simple calculus and basic multivariate methods in real life situations. It contains altogether more than 250 solved exercises which can assist a university teacher in setting up a modern multivariate analysis course. All computer-based exercises are available in the R language. All R codes and data sets may be downloaded via the quantlet download center www.quantlet.org or via the Springer webpage. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi.
Detecting Hierarchical Structure in Networks
DEFF Research Database (Denmark)
Herlau, Tue; Mørup, Morten; Schmidt, Mikkel Nørgaard;
2012-01-01
Many real-world networks exhibit hierarchical organization. Previous models of hierarchies within relational data have focused on binary trees; however, for many networks it is unknown whether there is hierarchical structure, and if there is, a binary tree might not account well for it. We propose a generative Bayesian model that is able to infer whether hierarchies are present or not from a hypothesis space encompassing all types of hierarchical tree structures. For efficient inference we propose a collapsed Gibbs sampling procedure that jointly infers a partition and its hierarchical structure. On synthetic and real data we demonstrate that our model can detect hierarchical structure leading to better link-prediction than competing models. Our model can be used to detect if a network exhibits hierarchical structure, thereby leading to a better comprehension and statistical account of the network.
Context updates are hierarchical
Directory of Open Access Journals (Sweden)
Anton Karl Ingason
2016-10-01
This squib studies the order in which elements are added to the shared context of interlocutors in a conversation. It focuses on context updates within one hierarchical structure and argues that structurally higher elements are entered into the context before lower elements, even if the structurally higher elements are pronounced after the lower elements. The crucial data are drawn from a comparison of relative clauses in two head-initial languages, English and Icelandic, and two head-final languages, Korean and Japanese. The findings have consequences for any theory of a dynamic semantics.
Autistic epileptiform regression.
Canitano, Roberto; Zappella, Michele
2006-01-01
Autistic regression is a well known condition that occurs in one third of children with pervasive developmental disorders, who, after normal development in the first year of life, undergo a global regression during the second year that encompasses language, social skills and play. In a portion of these subjects, epileptiform abnormalities are present with or without seizures, resembling, in some respects, other epileptiform regressions of language and behaviour such as Landau-Kleffner syndrome. In these cases, for a more accurate definition of the clinical entity, the term autistic epileptiform regression has been suggested. As in other epileptic syndromes with regression, the relationships between EEG abnormalities, language and behaviour, in autism, are still unclear. We describe two cases of autistic epileptiform regression selected from a larger group of children with autistic spectrum disorders, with the aim of discussing the clinical features of the condition, the therapeutic approach and the outcome.
Scaled Sparse Linear Regression
Sun, Tingni
2011-01-01
Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual squares and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs nearly nothing beyond the computation of a path of the sparse regression estimator for penalty levels above a threshold. For the scaled Lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the method yields simultaneously an estimator for the noise level and an estimated coefficient vector in the Lasso path satisfying certain oracle inequalities for the estimation of the noise level, prediction, and the estimation of regression coefficients. These oracle inequalities provide sufficient conditions for the consistency and asymptotic...
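The alternating scheme described in this abstract (estimate the noise level from the mean squared residual, rescale the penalty in proportion to it, refit the Lasso) can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names, the fixed iteration counts, and the plain coordinate-descent Lasso solver are choices made here, and every column of `X` is assumed to be nonzero.

```python
import math


def soft(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return math.copysign(max(abs(z) - t, 0.0), z)


def lasso_cd(X, y, lam, beta, sweeps=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1, warm-started at beta."""
    n, p = len(X), len(X[0])
    beta = list(beta)
    resid = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
    for _ in range(sweeps):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            norm = sum(c * c for c in col) / n  # assumed nonzero
            # correlation of column j with the residual, with beta_j added back
            z = sum(c * r for c, r in zip(col, resid)) / n + norm * beta[j]
            new = soft(z, lam) / norm
            resid = [r - c * (new - beta[j]) for r, c in zip(resid, col)]
            beta[j] = new
    return beta, resid


def scaled_lasso(X, y, lam0, iters=10):
    """Alternate between a Lasso fit at penalty sigma*lam0 and a noise-level update."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    sigma = math.sqrt(sum(v * v for v in y) / n)  # start from the null model
    for _ in range(iters):
        beta, resid = lasso_cd(X, y, sigma * lam0, beta)
        sigma = math.sqrt(sum(r * r for r in resid) / n)  # root mean squared residual
    return beta, sigma
```

A common choice for the base penalty `lam0` is of order sqrt(2 log(p) / n); that value is an assumption of this sketch rather than something stated in the abstract.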
Rolling Regressions with Stata
Kit Baum
2004-01-01
This talk will describe some work underway to add a "rolling regression" capability to Stata's suite of time series features. Although commands such as "statsby" permit analysis of non-overlapping subsamples in the time domain, they are not suited to the analysis of overlapping (e.g. "moving window") samples. Both moving-window and widening-window techniques are often used to judge the stability of time series regression relationships. We will present an implementation of a rolling regression...
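The difference between moving-window and widening-window estimation can be illustrated with a short sketch. It is written in Python rather than Stata and does not reproduce the command described in the talk; the function names `ols_line` and `rolling_ols` and the toy series are hypothetical.

```python
def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points (x, y)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def rolling_ols(x, y, window, expanding=False):
    """Fit one regression per window end point.

    expanding=False: moving window of fixed length `window`.
    expanding=True:  widening window that always starts at the first observation.
    """
    fits = []
    for end in range(window, len(x) + 1):
        start = 0 if expanding else end - window
        fits.append(ols_line(x[start:end], y[start:end]))
    return fits


# Toy series whose slope changes from 2 to 3 at x = 5.
x = [float(t) for t in range(10)]
y = [2.0 * t + 1.0 if t < 5 else 3.0 * t - 4.0 for t in x]
moving = rolling_ols(x, y, window=4)
widening = rolling_ols(x, y, window=4, expanding=True)
```

In this toy series the moving-window slopes switch from the old value to the new one as the window passes the break, while the widening-window slopes drift gradually, which is why both are used to judge the stability of a regression relationship.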
Institute of Scientific and Technical Information of China (English)
Guijun YANG; Lu LIN; Runchu ZHANG
2007-01-01
Quasi-regression, motivated by problems arising in computer experiments, focuses mainly on speeding up evaluation. However, its theoretical properties have not been explored systematically. This paper shows that quasi-regression is unbiased, strongly convergent and asymptotically normal for parameter estimation, but that it is biased for curve fitting. Furthermore, a new method called unbiased quasi-regression is proposed. In addition to retaining the above asymptotic behavior for parameter estimation, unbiased quasi-regression is unbiased for curve fitting.
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Weisberg, Sanford
2005-01-01
Master linear regression techniques with a new edition of a classic text. Reviews of the Second Edition: "I found it enjoyable reading and so full of interesting material that even the well-informed reader will probably find something new . . . a necessity for all of those who do linear regression." -Technometrics, February 1987 "Overall, I feel that the book is a valuable addition to the now considerable list of texts on applied linear regression. It should be a strong contender as the leading text for a first serious course in regression analysis." -American Scientist, May-June 1987
Multivariate Bioclimatic Ecosystem Change Approaches
2015-02-06
conclude that an analogous patch did not exist. It must exist somewhere, but some of the other MVA techniques were restricted by the mathematical ...found that the Primarily Analogous Multivariate approach developed during this research clearly distinguished itself from the other five approaches in...Principally Analogous Multivariate (PAM) approach ............................................... 29 4.6.1 Introduction to the PAM approach
Multivariate Modelling via Matrix Subordination
DEFF Research Database (Denmark)
Nicolato, Elisa
stochastic volatility via time-change is quite ineffective when applied to the multivariate setting. In this work we propose a new class of models, which is obtained by conditioning a multivariate Brownian Motion to a so-called matrix subordinator. The obtained model-class encompasses the vast majority...
Multivariate covariance generalized linear models
DEFF Research Database (Denmark)
Bonat, W. H.; Jørgensen, Bent
2016-01-01
We propose a general framework for non-normal multivariate data analysis called multivariate covariance generalized linear models, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link...... function combined with a matrix linear predictor involving known matrices. The method is motivated by three data examples that are not easily handled by existing methods. The first example concerns multivariate count data, the second involves response variables of mixed types, combined with repeated...... are fitted by using an efficient Newton scoring algorithm based on quasi-likelihood and Pearson estimating functions, using only second-moment assumptions. This provides a unified approach to a wide variety of types of response variables and covariance structures, including multivariate extensions...
Energy Technology Data Exchange (ETDEWEB)
Gerber, Samuel [Univ. of Utah, Salt Lake City, UT (United States); Rubel, Oliver [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bremer, Peer -Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Univ. of Utah, Salt Lake City, UT (United States); Whitaker, Ross T. [Univ. of Utah, Salt Lake City, UT (United States)
2012-01-19
This paper introduces a novel partition-based regression approach that incorporates topological information. Partition-based regression typically introduces a quality-of-fit-driven decomposition of the domain. The emphasis in this work is on a topologically meaningful segmentation. Thus, the proposed regression approach is based on a segmentation induced by a discrete approximation of the Morse–Smale complex. This yields a segmentation with partitions corresponding to regions of the function with a single minimum and maximum that are often well approximated by a linear model. This approach yields regression models that are amenable to interpretation and have good predictive capacity. Typically, regression estimates are quantified by their geometrical accuracy. For the proposed regression, an important aspect is the quality of the segmentation itself. Thus, this article introduces a new criterion that measures the topological accuracy of the estimate. The topological accuracy provides a complementary measure to the classical geometrical error measures and is very sensitive to overfitting. The Morse–Smale regression is compared to state-of-the-art approaches in terms of geometry and topology and yields comparable or improved fits in many cases. Finally, a detailed study on climate-simulation data demonstrates the application of the Morse–Smale regression. Supplementary Materials are available online and contain an implementation of the proposed approach in the R package msr, an analysis and simulations on the stability of the Morse–Smale complex approximation, and additional tables for the climate-simulation study.
Flow mapping and multivariate visualization of large spatial interaction data.
Guo, Diansheng
2009-01-01
Spatial interactions (or flows), such as population migration and disease spread, naturally form a weighted location-to-location network (graph). Such geographically embedded networks (graphs) are usually very large. For example, the county-to-county migration data in the U.S. has thousands of counties and about a million migration paths. Moreover, many variables are associated with each flow, such as the number of migrants for different age groups, income levels, and occupations. It is a challenging task to visualize such data and discover network structures, multivariate relations, and their geographic patterns simultaneously. This paper addresses these challenges by developing an integrated interactive visualization framework that consists of three coupled components: (1) a spatially constrained graph partitioning method that can construct a hierarchy of geographical regions (communities), where there are more flows or connections within regions than across regions; (2) a multivariate clustering and visualization method to detect and present multivariate patterns in the aggregated region-to-region flows; and (3) a highly interactive flow mapping component to map both flow and multivariate patterns in the geographic space, at different hierarchical levels. The proposed approach can process relatively large data sets and effectively discover and visualize major flow structures and multivariate relations at the same time. User interactions are supported to facilitate the understanding of both an overview and detailed patterns.
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...
Analysis of the real EADGENE data set: Multivariate approaches and post analysis
Sorensen, P.; Bonnet, A.; Buitenhuis, B.; Closset, R.; Dejean, S.; Delmas, C.; Duval, M.; Glass, L.; Hedegaard, J.; Hornshoj, H.; Hulsegge, B.; Jaffrezic, F.; Jensen, K.; Jiang, L.; Koning, de D.J.; Lê Cao, K.A.; Nie, H.; Petzl, W.; Pool, M.H.; Robert-Granie, C.; San Cristobal, M.; Lund, M.S.; Schothorst, van E.M.; Schuberth, H.J.; Seyfert, H.M.; Tosser-klopp, G.; Waddington, D.; Watson, D.; Yang, W.; Zerbe, H.
2007-01-01
The aim of this paper was to describe, and when possible compare, the multivariate methods used by the participants in the EADGENE WP1.4 workshop. The first approach was for class discovery and class prediction using evidence from the data at hand. Several teams used hierarchical clustering (HC) or
Hierarchical partial order ranking.
Carlsen, Lars
2008-09-01
Assessing the potential impact on environmental and human health from the production and use of chemicals or from polluted sites involves a multi-criteria evaluation scheme. A priori, several parameters have to be addressed, e.g., production tonnage, specific release scenarios, geographical and site-specific factors, in addition to various substance-dependent parameters. Further, socio-economic factors may be taken into consideration. The number of parameters to be included may well appear to be prohibitive for developing a sensible model. The study introduces hierarchical partial order ranking (HPOR), which remedies this problem. In HPOR the original parameters are initially grouped based on their mutual connection, and a set of meta-descriptors is derived representing the ranking corresponding to each group of descriptors. A second partial order ranking is carried out based on the meta-descriptors, the final ranking being disclosed through average ranks. An illustrative example on the prioritization of polluted sites is given.
Trees and Hierarchical Structures
Haeseler, Arndt
1990-01-01
The "raison d'etre" of hierarchical clustering theory stems from one basic phenomenon: This is the notorious non-transitivity of similarity relations. In spite of the fact that very often two objects may be quite similar to a third without being that similar to each other, one still wants to classify objects according to their similarity. This should be achieved by grouping them into a hierarchy of non-overlapping clusters such that any two objects in one cluster appear to be more related to each other than they are to objects outside this cluster. In everyday life, as well as in essentially every field of scientific investigation, there is an urge to reduce complexity by recognizing and establishing reasonable classification schemes. Unfortunately, this is counterbalanced by the experience of seemingly unavoidable deadlocks caused by the existence of sequences of objects, each comparatively similar to the next, but the last rather different from the first.
Hierarchical Affinity Propagation
Givoni, Inmar; Frey, Brendan J
2012-01-01
Affinity propagation is an exemplar-based clustering algorithm that finds a set of data-points that best exemplify the data, and associates each datapoint with one exemplar. We extend affinity propagation in a principled way to solve the hierarchical clustering problem, which arises in a variety of domains including biology, sensor networks and decision making in operational research. We derive an inference algorithm that operates by propagating information up and down the hierarchy, and is efficient despite the high-order potentials required for the graphical model formulation. We demonstrate that our method outperforms greedy techniques that cluster one layer at a time. We show that on an artificial dataset designed to mimic the HIV-strain mutation dynamics, our method outperforms related methods. For real HIV sequences, where the ground truth is not available, we show our method achieves better results, in terms of the underlying objective function, and show the results correspond meaningfully to geographi...
Optimisation by hierarchical search
Zintchenko, Ilia; Hastings, Matthew; Troyer, Matthias
2015-03-01
Finding optimal values for a set of variables relative to a cost function gives rise to some of the hardest problems in physics, computer science and applied mathematics. Although often very simple in their formulation, these problems have a complex cost function landscape which prevents currently known algorithms from efficiently finding the global optimum. Countless techniques have been proposed to partially circumvent this problem, but an efficient method is yet to be found. We present a heuristic, general purpose approach to potentially improve the performance of conventional algorithms or special purpose hardware devices by optimising groups of variables in a hierarchical way. We apply this approach to problems in combinatorial optimisation, machine learning and other fields.
How hierarchical is language use?
Frank, Stefan L.; Bod, Rens; Christiansen, Morten H.
2012-01-01
It is generally assumed that hierarchical phrase structure plays a central role in human language. However, considerations of simplicity and evolutionary continuity suggest that hierarchical structure should not be invoked too hastily. Indeed, recent neurophysiological, behavioural and computational studies show that sequential sentence structure has considerable explanatory power and that hierarchical processing is often not involved. In this paper, we review evidence from the recent literature supporting the hypothesis that sequential structure may be fundamental to the comprehension, production and acquisition of human language. Moreover, we provide a preliminary sketch outlining a non-hierarchical model of language use and discuss its implications and testable predictions. If linguistic phenomena can be explained by sequential rather than hierarchical structure, this will have considerable impact in a wide range of fields, such as linguistics, ethology, cognitive neuroscience, psychology and computer science. PMID:22977157
Spatial assessment of air quality patterns in Malaysia using multivariate analysis
Dominick, Doreena; Juahir, Hafizan; Latif, Mohd Talib; Zain, Sharifuddin M.; Aris, Ahmad Zaharin
2012-12-01
This study aims to investigate possible sources of air pollutants and the spatial patterns within the eight selected Malaysian air monitoring stations based on a two-year database (2008-2009). Multivariate analysis was applied to the dataset. It incorporated Hierarchical Agglomerative Cluster Analysis (HACA) to assess the spatial patterns, Principal Component Analysis (PCA) to determine the major sources of the air pollution and Multiple Linear Regression (MLR) to assess the percentage contribution of each air pollutant. The HACA results grouped the eight monitoring stations into three different clusters, based on the characteristics of the air pollutants and meteorological parameters. The PCA analysis showed that the major sources of air pollution were emissions from motor vehicles, aircraft, industries and areas of high population density. The MLR analysis demonstrated that the main pollutant contributing to variability in the Air Pollutant Index (API) at all stations was particulate matter with a diameter of less than 10 μm (PM10). Further MLR analysis showed that the main air pollutant influencing the high concentration of PM10 was carbon monoxide (CO). This was due to combustion processes, particularly originating from motor vehicles. Meteorological factors such as ambient temperature, wind speed and humidity were also noted to influence the concentration of PM10.
Set Correlation as a General Multivariate Data-Analytic Method.
Cohen, Jacob
1982-01-01
Set correlation is a multivariate generalization of multiple regression/correlation analysis that features the employment of overall measures of association interpretable as proportions of variance and the use of set-partialled sets of variables. The statistical development of the theory and several examples are presented. (Author/JKS)
Directory of Open Access Journals (Sweden)
Matthias Schmid
Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1). Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fitting a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established, yet unstable, approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression, estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
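To make the model behind this abstract concrete, the sketch below evaluates a beta regression log-likelihood. It assumes the common mean-precision parameterization, in which the response follows Beta(mu * phi, (1 - mu) * phi) with a logit link for the mean mu; the abstract does not spell out the parameterization, so this choice, the function name, and the toy data are assumptions. The boosting procedure itself is not shown.

```python
import math


def beta_loglik(beta, phi, X, y):
    """Log-likelihood of a beta regression with logit link for the mean.

    beta: regression coefficients, phi: precision (> 0),
    X: list of covariate rows, y: responses strictly inside (0, 1).
    """
    total = 0.0
    for row, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, row))   # linear predictor
        mu = 1.0 / (1.0 + math.exp(-eta))             # inverse logit keeps mu in (0, 1)
        a, b = mu * phi, (1.0 - mu) * phi             # Beta shape parameters
        total += (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
                  + (a - 1.0) * math.log(yi)
                  + (b - 1.0) * math.log(1.0 - yi))
    return total


# Intercept-only toy example with responses near 0.9 (hypothetical data).
X = [[1.0], [1.0], [1.0]]
y = [0.85, 0.90, 0.95]
ll_good = beta_loglik([2.0], 10.0, X, y)    # mean about 0.88, close to the data
ll_bad = beta_loglik([-2.0], 10.0, X, y)    # mean about 0.12, far from the data
```

Under this parameterization the variance is mu * (1 - mu) / (1 + phi), so letting phi depend on covariates is one way to model the variance alongside the mean.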
Associative Hierarchical Random Fields.
Ladický, L'ubor; Russell, Chris; Kohli, Pushmeet; Torr, Philip H S
2014-06-01
This paper makes two contributions: the first is the proposal of a new model, the associative hierarchical random field (AHRF), and a novel algorithm for its optimization; the second is the application of this model to the problem of semantic segmentation. Most methods for semantic segmentation are formulated as a labeling problem for variables that might correspond to either pixels or segments such as super-pixels. It is well known that the generation of super-pixel segmentations is not unique. This has motivated many researchers to use multiple super-pixel segmentations for problems such as semantic segmentation or single view reconstruction. These super-pixels have not yet been combined in a principled manner; this is a difficult problem, as they may overlap, or be nested in such a way that the segmentations form a segmentation tree. Our new hierarchical random field model allows information from all of the multiple segmentations to contribute to a global energy. MAP inference in this model can be performed efficiently using powerful graph cut based move making algorithms. Our framework generalizes much of the previous work based on pixels or segments, and the resulting labelings can be viewed both as a detailed segmentation at the pixel level, or at the other extreme, as a segment selector that pieces together a solution like a jigsaw, selecting the best segments from different segmentations as pieces. We evaluate its performance on some of the most challenging data sets for object class segmentation, and show that this ability to perform inference using multiple overlapping segmentations leads to state-of-the-art results.
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications. This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Weisberg, Sanford
2013-01-01
Praise for the Third Edition: "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute. The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
A primer of multivariate statistics
Harris, Richard J
2014-01-01
Drawing upon more than 30 years of experience in working with statistics, Dr. Richard J. Harris has updated A Primer of Multivariate Statistics to provide a model of balance between how-to and why. This classic text covers multivariate techniques with a taste of latent variable approaches. Throughout the book there is a focus on the importance of describing and testing one's interpretations of the emergent variables that are produced by multivariate analysis. This edition retains its conversational writing style while focusing on classical techniques. The book gives the reader a feel for why
Modeling hierarchical structures - Hierarchical Linear Modeling using MPlus
Jelonek, M
2006-01-01
The aim of this paper is to present a technique (and its linkage with physics) for overcoming problems connected with modeling social structures, which are typically hierarchical. Hierarchical Linear Models provide a conceptual and statistical mechanism for drawing conclusions regarding the influence of phenomena at different levels of analysis. In the social sciences they are used to analyze many problems, such as educational, organizational or market dilemmas. This paper introduces the logic of modeling hierarchical linear equations and estimation based on MPlus software. I present my own model to illustrate the impact of different factors on school acceptation level.
Transductive Ordinal Regression
Seah, Chun-Wei; Ong, Yew-Soon
2011-01-01
Ordinal regression is commonly formulated as a multi-class problem with ordinal constraints. The challenge of designing accurate classifiers for ordinal regression generally increases with the number of classes involved, due to the large number of labeled patterns that are needed. Ordinal class labels, however, are often costly to calibrate or difficult to obtain. Unlabeled patterns, on the other hand, often exist in much greater abundance and are freely available. To take advantage of the abundance of unlabeled patterns, we present a novel transductive learning paradigm for ordinal regression in this paper, namely Transductive Ordinal Regression (TOR). The key challenge of the present study lies in the precise estimation of both the ordinal class labels of the unlabeled data and the decision functions of the ordinal classes, simultaneously. The core elements of the proposed TOR include an objective function that caters to several commonly used loss functions cast in a transductive setting...
Nonparametric Predictive Regression
Ioannis Kasparis; Elena Andreou; Phillips, Peter C.B.
2012-01-01
A unifying framework for inference is developed in predictive regressions where the predictor has unknown integration properties and may be stationary or nonstationary. Two easily implemented nonparametric F-tests are proposed. The test statistics are related to those of Kasparis and Phillips (2012) and are obtained by kernel regression. The limit distribution of these predictive tests holds for a wide range of predictors including stationary as well as non-stationary fractional and near unit...
Model Checking Multivariate State Rewards
DEFF Research Database (Denmark)
Nielsen, Bo Friis; Nielson, Flemming; Nielson, Hanne Riis
2010-01-01
We consider continuous stochastic logics with state rewards that are interpreted over continuous time Markov chains. We show how results from multivariate phase type distributions can be used to obtain higher-order moments for multivariate state rewards (including covariance). We also generalise the treatment of eventuality to unbounded path formulae. For all extensions we show how to obtain closed form definitions that are straightforward to implement and we illustrate our development on a small example.
Multivariate spatially-structured variability of ovine helminth infections
Directory of Open Access Journals (Sweden)
Annibale Biggeri
2007-11-01
A cross-sectional survey was carried out in 2004-2005 in the Campania region, southern Italy, to study the multivariate geographical distribution of four different sheep helminths, i.e. Fasciola hepatica (liver fluke), Calicophoron (Paramphistomum) daubneyi (rumen fluke), Dicrocoelium dendriticum (lancet fluke), and the gastrointestinal strongyle Haemonchus contortus. A series of multivariate Bayesian hierarchical models based on square root transformation of faecal egg counts were fitted. The results were consistent with theoretical knowledge of the biology and epidemiology of the four studied helminths. In particular, the impact of common intermediate hosts (F. hepatica and C. daubneyi share the same intermediate host species) was quantified and evidence of previously unknown ecological components was given. D. dendriticum was correlated to F. hepatica and H. contortus was found not to be spatially associated with the previously mentioned helminths.
Strategies for Industrial Multivariable Control
DEFF Research Database (Denmark)
Hangstrup, M.
Multivariable control strategies well-suited for industrial applications are suggested. The strategies combine the practical advantages of conventional SISO control schemes and -technology with the potential of multivariable controllers. Special emphasis is put on parameter-varying systems whose...... dynamics and gains strongly depend upon one or more physical parameters characterizing the operating point. This class covers many industrial systems such as airplanes, ships, robots and process control systems. Power plant boilers are representatives for process control systems in general. The dynamics...
Competing Risks Quantile Regression at Work
DEFF Research Database (Denmark)
Dlugosz, Stephan; Lo, Simon M. S.; Wilke, Ralf
2017-01-01
Despite its emergence as a frequently used method for the empirical analysis of multivariate data, quantile regression is yet to become a mainstream tool for the analysis of duration data. We present a pioneering empirical study on the grounds of a competing risks quantile regression model. We use large-scale maternity duration data with multiple competing risks derived from German linked social security records to analyse how public policies are related to the length of economic inactivity of young mothers after giving birth. Our results show that the model delivers detailed insights into the distribution of transitions out of maternity leave. It is found that cumulative incidences implied by the quantile regression model differ from those implied by a proportional hazards model. To foster the use of the model, we make an R-package (cmprskQR) available.
Multivariate analysis in thoracic research.
Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego
2015-03-01
Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods were developed to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, we should use multivariate statistical methods. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each one should be employed according to the type of variables to analyze: dependence, interdependence and structural methods. In conclusion, multivariate methods are ideal for the analysis of large data sets and for finding cause-and-effect relationships between variables; there is a wide range of analysis types that we can use.
Institute of Scientific and Technical Information of China (English)
韦光毅; 柳元; 胡天桥; 陈启玲
2012-01-01
[Objective] To identify the major factors affecting urinary arsenic levels in workers exposed to arsine, and to provide a reference for preventing arsine poisoning. [Methods] Detailed data on urinary arsenic levels and related factors were collected for 132 workers exposed to arsine and analysed by stepwise multiple regression and multiple linear regression. [Results] The fitted regression equation was $\lg Y = 0.551 D_1 + 0.281 D_2 + 0.665 X_7 + 0.059 X_8 + 0.005 X_3 - 2.279$, with adjusted $R^2 = 0.885$, where $D_1$ and $D_2$ are the two dummy variables for job type, $X_7$ is the arsine concentration, $X_8$ is self-reported symptoms, and $X_3$ is age. The corresponding standardized partial regression coefficients were 0.569, 0.268, 0.624, 0.056 and 0.054. [Conclusion] The regression equation indicates that the main factors affecting urinary arsenic in workers exposed to arsine are job type and arsine concentration, so preventing arsine poisoning requires protecting workers in high-risk jobs and controlling the airborne arsine concentration in the workplace.
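As a reading aid, the reported regression equation can be evaluated directly. The sketch below assumes that lg denotes the base-10 logarithm and that D1 and D2 are 0/1 indicators; the abstract does not give the units or the coding of X7, X8 and X3, so the function names and arguments are hypothetical and no input values are taken from the study.

```python
def log10_urinary_arsenic(d1, d2, x7, x8, x3):
    """Evaluate the reported equation for lg Y (assumed to be log base 10).

    d1, d2: job-type dummy variables, x7: arsine concentration,
    x8: self-reported symptoms, x3: age. Units and coding are not given in the abstract.
    """
    return 0.551 * d1 + 0.281 * d2 + 0.665 * x7 + 0.059 * x8 + 0.005 * x3 - 2.279


def multiplicative_effect(coefficient, change=1.0):
    """Factor by which Y changes when a predictor increases by `change`,
    holding the other predictors fixed (follows from the log10 scale)."""
    return 10.0 ** (coefficient * change)


# Switching D1 from 0 to 1 adds 0.551 to lg Y, i.e. multiplies Y by 10 ** 0.551.
d1_factor = multiplicative_effect(0.551)
```

Because the response is modelled on a logarithmic scale, each coefficient acts multiplicatively on Y, which is why job type and arsine concentration stand out as the main factors.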
Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.
Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo
2016-06-08
Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The resulting discriminative yet compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results show that the proposed SDL achieves high multivariate estimation accuracy on all tasks and largely outperforms state-of-the-art algorithms. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.
Modeling hierarchical structures - Hierarchical Linear Modeling using MPlus
Jelonek, Magdalena
2006-01-01
The aim of this paper is to present the technique (and its linkage with physics) of overcoming problems connected to modeling social structures, which are typically hierarchical. Hierarchical Linear Models provide a conceptual and statistical mechanism for drawing conclusions regarding the influence of phenomena at different levels of analysis. In the social sciences they are used to analyze many problems, such as educational, organizational or market dilemmas. This paper introduces the logic of m...
Petrov, Romain G; Boskri, Abdelkarim; Folcher, Jean-Pierre; Lagarde, Stephane; Bresson, Yves; Benkhaldoum, Zouhair; Lazrek, Mohamed; Rakshit, Suvendu
2014-01-01
The limiting magnitude is a key issue for optical interferometry. Pairwise fringe trackers based on the integrated optics concepts used for example in GRAVITY seem limited to about K=10.5 with the 8m Unit Telescopes of the VLTI, and there is a general "common sense" statement that the efficiency of fringe tracking, and hence the sensitivity of optical interferometry, must decrease as the number of apertures increases, at least in the near infrared where we are still limited by detector readout noise. Here we present a Hierarchical Fringe Tracking (HFT) concept with sensitivity at least equal to that of a two-aperture fringe tracker. HFT is based on the combination of the apertures in pairs, then in pairs of pairs, then in pairs of groups. The key HFT module is a device that behaves like a spatial filter for two telescopes (2TSF) and transmits all or most of the flux of a cophased pair in a single mode beam. We give an example of such an achromatic 2TSF, based on very broadband dispersed fringes analyzed by g...
Tunesi, Luca; Armbruster, Philippe
2004-02-01
The objective of this paper is to demonstrate a suitable hierarchical networking solution to improve capabilities and performances of space systems, with significant recurrent costs saving and more efficient design & manufacturing flows. Classically, a satellite can be split in two functional sub-systems: the platform and the payload complement. The platform is in charge of providing power, attitude & orbit control and up/down-link services, whereas the payload represents the scientific and/or operational instruments/transponders and embodies the objectives of the mission. One major possibility to improve the performance of payloads, by limiting the data return to pertinent information, is to process data on board thanks to a proper implementation of the payload data system. In this way, it is possible to share non-recurring development costs by exploiting a system that can be adopted by the majority of space missions. It is believed that the Modular and Scalable Payload Data System, under development by ESA, provides a suitable solution to fulfil a large range of future mission requirements. The backbone of the system is the standardised high data rate SpaceWire network http://www.ecss.nl/. As complement, a lower speed command and control bus connecting peripherals is required. For instance, at instrument level, there is a need for a "local" low complexity bus, which gives the possibility to command and control sensors and actuators. Moreover, most of the connections at sub-system level are related to discrete signals management or simple telemetry acquisitions, which can easily and efficiently be handled by a local bus. An on-board hierarchical network can therefore be defined by interconnecting high-speed links and local buses. Additionally, it is worth stressing another important aspect of the design process: Agencies and ESA in particular are frequently confronted with a big consortium of geographically spread companies located in different countries, each one
Hierarchical Reverberation Mapping
Brewer, Brendon J
2013-01-01
Reverberation mapping (RM) is an important technique in studies of active galactic nuclei (AGN). The key idea of RM is to measure the time lag $\tau$ between variations in the continuum emission from the accretion disc and subsequent response of the broad line region (BLR). The measurement of $\tau$ is typically used to estimate the physical size of the BLR and is combined with other measurements to estimate the black hole mass $M_{\rm BH}$. A major difficulty with RM campaigns is the large amount of data needed to measure $\tau$. Recently, Fine et al (2012) introduced a new approach to RM where the BLR light curve is sparsely sampled, but this is counteracted by observing a large sample of AGN, rather than a single system. The results are combined to infer properties of the sample of AGN. In this letter we implement this method using a hierarchical Bayesian model and contrast this with the results from the previous stacked cross-correlation technique. We find that our inferences are more precise and allow fo...
Correlative and multivariate analysis of increased radon concentration in underground laboratory.
Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena
2014-11-01
The results of an analysis of the relations between variations in increased radon concentration and climate variables in a shallow underground laboratory are presented. The analysis uses correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variables with increased radon concentrations by analysis of regression methods resulting in 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press.
Bayesian nonlinear regression for large p small n problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Multivariate Generalized Multiscale Entropy Analysis
Directory of Open Access Journals (Sweden)
Anne Humeau-Heurtier
2016-11-01
Multiscale entropy (MSE) was introduced in the 2000s to quantify systems' complexity. MSE relies on (i) a coarse-graining procedure to derive a set of time series representing the system dynamics on different time scales; (ii) the computation of the sample entropy for each coarse-grained time series. A refined composite MSE (rcMSE), based on the same steps as MSE, also exists. Compared to MSE, rcMSE increases the accuracy of entropy estimation and reduces the probability of inducing undefined entropy for short time series. The multivariate versions of MSE (MMSE) and rcMSE (MrcMSE) have also been introduced. In the coarse-graining step used in MSE, rcMSE, MMSE, and MrcMSE, the mean value is used to derive representations of the original data at different resolutions. A generalization of MSE was recently published, using the computation of different moments in the coarse-graining procedure. However, so far, this generalization only exists for univariate signals. We therefore herein propose an extension of this generalized MSE to multivariate data. The multivariate generalized algorithms of MMSE and MrcMSE presented herein (MGMSE and MGrcMSE, respectively) are first analyzed through the processing of synthetic signals. We reveal that MGrcMSE shows better performance than MGMSE for short multivariate data. We then study the performance of MGrcMSE on two sets of short multivariate electroencephalograms (EEG) available in the public domain. We report that MGrcMSE may show better performance than MrcMSE in distinguishing different types of multivariate EEG data. MGrcMSE could therefore supplement MMSE or MrcMSE in the processing of multivariate datasets.
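The coarse-graining step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the windows are taken as non-overlapping, and the use of the population variance as the "different moment" is an assumption. The sample-entropy step that follows coarse-graining is not shown.

```python
from statistics import mean, pvariance


def coarse_grain(series, scale, moment="mean"):
    """Summarise consecutive non-overlapping windows of length `scale`
    by their mean (classical MSE) or their variance (generalized variant)."""
    summarise = mean if moment == "mean" else pvariance
    n_windows = len(series) // scale  # a trailing incomplete window is dropped
    return [summarise(series[i * scale:(i + 1) * scale]) for i in range(n_windows)]


def coarse_grain_multivariate(channels, scale, moment="mean"):
    """Apply the same coarse-graining independently to every channel."""
    return [coarse_grain(channel, scale, moment) for channel in channels]


# Toy two-channel signal (made-up numbers).
signal = [[1, 3, 2, 6, 5, 5],
          [0, 2, 4, 4, 1, 3]]

means = coarse_grain_multivariate(signal, scale=2)
variances = coarse_grain_multivariate(signal, scale=2, moment="variance")
print(means)      # [[2, 4, 5], [1, 4, 2]]
print(variances)  # [[1, 4, 0], [1, 0, 1]]
```

The two outputs show why the choice of moment matters: windows with the same mean can have very different variances, so the variance-based series carries information about local volatility that the mean-based series discards.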
Constrained Sparse Galerkin Regression
Loiseau, Jean-Christophe
2016-01-01
In this work, we demonstrate the use of sparse regression techniques from machine learning to identify nonlinear low-order models of a fluid system purely from measurement data. In particular, we extend the sparse identification of nonlinear dynamics (SINDy) algorithm to enforce physical constraints in the regression, leading to energy conservation. The resulting models are closely related to Galerkin projection models, but the present method does not require the use of a full-order or high-fidelity Navier-Stokes solver to project onto basis modes. Instead, the most parsimonious nonlinear model is determined that is consistent with observed measurement data and satisfies necessary constraints. The constrained Galerkin regression algorithm is implemented on the fluid flow past a circular cylinder, demonstrating the ability to accurately construct models from data.
Hierarchical materials: Background and perspectives
DEFF Research Database (Denmark)
2016-01-01
Hierarchical design draws inspiration from analysis of biological materials and has opened new possibilities for enhancing performance and enabling new functionalities and extraordinary properties. With the development of nanotechnology, the necessary technological requirements for the manufactur...
Hierarchical clustering for graph visualization
Clémençon, Stéphan; Rossi, Fabrice; Tran, Viet Chi
2012-01-01
This paper describes a graph visualization methodology based on hierarchical maximal modularity clustering, with interactive and significant coarsening and refining possibilities. An application of this method to HIV epidemic analysis in Cuba is outlined.
Direct hierarchical assembly of nanoparticles
Xu, Ting; Zhao, Yue; Thorkelsson, Kari
2014-07-22
The present invention provides hierarchical assemblies of a block copolymer, a bifunctional linking compound and a nanoparticle. The block copolymers form one micro-domain and the nanoparticles another micro-domain.
Practical Session: Logistic Regression
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the onset of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax 'deviance function' and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question.
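For orientation, the "usual quantile regression estimator" that this abstract uses as its benchmark minimises the check (pinball) loss. The sketch below shows that loss for the simplest case, an intercept-only model, where the minimiser is a sample quantile. It does not implement the minimax deviance estimator proposed in the abstract, and the function names and data are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def constant_quantile_fit(y, tau):
    """Intercept-only fit: pick the observation minimising the total check loss.
    Searching over the observed values suffices because the loss is piecewise linear."""
    return min(y, key=lambda c: sum(check_loss(v - c, tau) for v in y))


y = [1, 2, 3, 4, 10]  # made-up data with one large value
median_fit = constant_quantile_fit(y, tau=0.5)
upper_fit = constant_quantile_fit(y, tau=0.75)
print(median_fit, upper_fit)  # 3 4
```

Note that the outlying value 10 barely affects either fit, which is the robustness property that motivates quantile regression in the first place.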
Functional annotation of hierarchical modularity.
Directory of Open Access Journals (Sweden)
Kanchana Padmanabhan
In biological networks of molecular interactions in a cell, network motifs that are biologically relevant are also functionally coherent, or form functional modules. These functionally coherent modules combine in a hierarchical manner into larger, less cohesive subsystems, thus revealing one of the essential design principles of system-level cellular organization and function: hierarchical modularity. Arguably, hierarchical modularity has not been explicitly taken into consideration by most, if not all, functional annotation systems. As a result, the existing methods would often fail to assign a statistically significant functional coherence score to biologically relevant molecular machines. We developed a methodology for hierarchical functional annotation. Given the hierarchical taxonomy of functional concepts (e.g., Gene Ontology) and the association of individual genes or proteins with these concepts (e.g., GO terms), our method will assign a Hierarchical Modularity Score (HMS) to each node in the hierarchy of functional modules; the HMS score and its p-value measure functional coherence of each module in the hierarchy. While existing methods annotate each module with a set of "enriched" functional terms in a bag of genes, our complementary method provides the hierarchical functional annotation of the modules and their hierarchically organized components. A hierarchical organization of functional modules often comes as a by-product of cluster analysis of gene expression data or protein interaction data. Otherwise, our method will automatically build such a hierarchy by directly incorporating the functional taxonomy information into the hierarchy search process and by allowing multi-functional genes to be part of more than one component in the hierarchy. In addition, its underlying HMS scoring metric ensures that functional specificity of the terms across different levels of the hierarchical taxonomy is properly treated. We have evaluated our
A note on Optimal weights and variable selections for multivariate survival data
Institute of Scientific and Technical Information of China (English)
Zhao; Sihai; Dave
2009-01-01
Fan et al. are to be congratulated for this important contribution to the analysis of multivariate failure time data. They have provided three regression parameter estimators for multiple covariates in the marginal hazard model. Using the weighted estimating equation approach,
Hierarchical architecture of active knits
Abel, Julianna; Luntz, Jonathan; Brei, Diann
2013-12-01
Nature eloquently utilizes hierarchical structures to form the world around us. Applying the hierarchical architecture paradigm to smart materials can provide a basis for a new genre of actuators which produce complex actuation motions. One promising example of cellular architecture, active knits, provides complex three-dimensional distributed actuation motions with expanded operational performance through a hierarchically organized structure. The hierarchical structure arranges a single fiber of active material, such as shape memory alloys (SMAs), into a cellular network of interlacing adjacent loops according to a knitting grid. This paper defines a four-level hierarchical classification of knit structures: the basic knit loop, knit patterns, grid patterns, and restructured grids. Each level of the hierarchy provides increased architectural complexity, resulting in expanded kinematic actuation motions of active knits. The range of kinematic actuation motions is displayed through experimental examples of different SMA active knits. The results from this paper illustrate and classify the ways in which each level of the hierarchical knit architecture leverages the performance of the base smart material to generate unique actuation motions, providing necessary insight to best exploit this new actuation paradigm.
Multivariate predictors of failed prehospital endotracheal intubation.
Wang, Henry E; Kupas, Douglas F; Paris, Paul M; Bates, Robyn R; Costantino, Joseph P; Yealy, Donald M
2003-07-01
Conventionally trained out-of-hospital rescuers (such as paramedics) often fail to accomplish endotracheal intubation (ETI) in patients requiring invasive airway management. Previous studies have identified univariate variables associated with failed out-of-hospital ETI but have not examined the interaction between the numerous factors impacting ETI success. This study sought to use multivariate logistic regression to identify a set of factors associated with failed adult out-of-hospital ETI. The authors obtained clinical and demographic data from the Prehospital Airway Collaborative Evaluation, a prospective, multicentered observational study involving advanced life support (ALS) emergency medical services (EMS) systems in the Commonwealth of Pennsylvania. Providers used standard forms to report details of attempted ETI, including system and patient demographics, methods used, difficulties encountered, and initial outcomes. The authors excluded data from sedation-facilitated and neuromuscular blockade-assisted intubations. The main outcome measure was ETI failure, defined as failure to successfully place an endotracheal tube on the last out-of-hospital laryngoscopy attempt. Logistic regression was performed to develop a multivariate model identifying factors associated with failed ETI. Data were used from 45 ALS systems on 663 adult ETIs attempted during the period June 1, 2001, to November 30, 2001. There were 89 cases of failed ETI (failure rate 13.4%). Of 61 factors potentially related to ETI failure, multivariate logistic regression revealed the following significant covariates associated with ETI failure (odds ratio; 95% confidence interval; likelihood ratio p-value): presence of clenched jaw/trismus (9.718; 95% CI = 4.594 to 20.558; p < 0.0001); inability to pass the endotracheal tube through the vocal cords (7.653; 95% CI = 3.561 to 16.447; p < 0.0001); inability to visualize the vocal cords (7.638; 95% CI = 3.966 to 14.707; p < 0.0001); intact gag reflex (7.060; 95% CI = 3.552 to 14
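The figures reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below recomputes the failure rate from the counts and recovers the log-odds coefficient and an approximate standard error from the odds ratio and 95% confidence interval quoted for clenched jaw/trismus. It assumes the interval is a symmetric Wald interval on the log-odds scale, which the abstract does not state, and the function name is hypothetical. The abstract's own p-values are likelihood-ratio values, so the Wald z statistic here is only an approximation.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover the log-odds coefficient, its standard error and the Wald z
    statistic from a reported odds ratio and its 95% confidence interval."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    return beta, se, beta / se


# Counts taken from the abstract: 89 failures among 663 attempted intubations.
failure_rate = 89 / 663

# Odds ratio and interval quoted for clenched jaw/trismus.
beta, se, z = wald_summary(9.718, 4.594, 20.558)

# If the interval is symmetric on the log scale, its midpoint equals beta.
log_midpoint = (math.log(4.594) + math.log(20.558)) / 2

print(f"failure rate = {failure_rate:.1%}")
print(f"beta = {beta:.3f}, log-scale CI midpoint = {log_midpoint:.3f}")
print(f"approximate SE = {se:.3f}, Wald z = {z:.2f}")
```

The midpoint of the log-transformed interval agrees with the log of the odds ratio to about three decimals, which supports the symmetric-interval assumption for this covariate.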
Advanced hierarchical distance sampling
Royle, Andy
2016-01-01
In this chapter, we cover a number of important extensions of the basic hierarchical distance-sampling (HDS) framework from Chapter 8. First, we discuss the inclusion of “individual covariates,” such as group size, in the HDS model. This is important in many surveys where animals form natural groups that are the primary observation unit, with the size of the group expected to have some influence on detectability. We also discuss HDS integrated with time-removal and double-observer or capture-recapture sampling. These “combined protocols” can be formulated as HDS models with individual covariates, and thus they have a commonality with HDS models involving group structure (group size being just another individual covariate). We cover several varieties of open-population HDS models that accommodate population dynamics. On one end of the spectrum, we cover models that allow replicate distance sampling surveys within a year, which estimate abundance relative to availability and temporary emigration through time. We consider a robust design version of that model. We then consider models with explicit dynamics based on the Dail and Madsen (2011) model and the work of Sollmann et al. (2015). The final major theme of this chapter is relatively newly developed spatial distance sampling models that accommodate explicit models describing the spatial distribution of individuals known as Point Process models. We provide novel formulations of spatial DS and HDS models in this chapter, including implementations of those models in the unmarked package using a hack of the pcount function for N-mixture models.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
regression by minimising a cross-validation estimate of the generalisation error. This allows the importance of different dimensions to be adjusted automatically. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Software Regression Verification
2013-12-11
of recursive procedures. Acta Informatica, 45(6):403-439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verification. Technical Report...functions. Therefore, we need to redefine m-term. – Mutual termination. If either function f or function f′ (or both) is nondeterministic, then their
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject.
* Expanded coverage of diagnostics and methods of model fitting.
* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.
* More than 200 problems throughout the book plus outline solutions for the exercises.
* This revision has been extensively class-tested.
Multivariate Modelling via Matrix Subordination
DEFF Research Database (Denmark)
Nicolato, Elisa
Extending the vast library of univariate models to price multi-asset derivatives is still a challenge in the field of Quantitative Finance. Within the literature on multivariate modelling, a dichotomy may be noticed. On one hand, the focus has been on the construction of models displaying stochastic correlation within the framework of diffusion processes (see e.g. Pigorsh and Stelzer (2008), Hubalek and Nicolato (2008) and Zhu (2000)). On the other hand, a number of authors have proposed multivariate Levy models, which allow for flexible modelling of returns, but at the expense of a constant correlation structure (see e.g. Leoni and Schoutens (2007) and Leoni and Schoutens (2007) among others). Building tractable multivariate models that combine flexible and stochastic correlation structures with jumps is proving to be rather problematic. In particular, the classical technique of introducing...
Multivariable q-Racah polynomials
Van Diejen, J F
1996-01-01
The Koornwinder-Macdonald multivariable generalization of the Askey-Wilson polynomials is studied for parameters satisfying a truncation condition such that the orthogonality measure becomes discrete with support on a finite grid. For this parameter regime the polynomials may be seen as a multivariable counterpart of the (one-variable) q-Racah polynomials. We present the discrete orthogonality measure, expressions for the normalization constants converting the polynomials into an orthonormal system (in terms of the normalization constant for the unit polynomial), and we discuss the limit $q \rightarrow 1$ leading to multivariable Racah type polynomials. Of special interest is the situation that q lies on the unit circle; in that case it is found that there exists a natural parameter domain for which the discrete orthogonality measure (which is complex in general) becomes real-valued and positive. We investigate the properties of a finite-dimensional discrete integral transform for functions over the grid, whose ...
Sparse reduced-rank regression with covariance estimation
Chen, Lisha
2014-12-08
Improving the predictive performance of multiple-response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Multivariate stochastic simulation with subjective multivariate normal distributions
P. J. Ince; J. Buongiorno
1991-01-01
In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
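One common way to draw correlated normal variables in a Monte Carlo simulation is to transform independent standard normal draws with the Cholesky factor of the correlation matrix. The sketch below does this for two variables. It is a generic illustration and not necessarily the technique of the report, whose abstract is truncated here. The means, standard deviations and correlation stand in for subjectively assessed values and are made up, as are the function names.

```python
import math
import random


def correlated_normal_pair(mean1, sd1, mean2, sd2, rho, rng):
    """Draw one pair from a bivariate normal with correlation `rho`,
    using the 2x2 Cholesky factor [[1, 0], [rho, sqrt(1 - rho^2)]]."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x1 = mean1 + sd1 * z1
    x2 = mean2 + sd2 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x1, x2


def sample_correlation(pairs):
    """Pearson correlation of a list of (x1, x2) pairs."""
    n = len(pairs)
    m1 = sum(p[0] for p in pairs) / n
    m2 = sum(p[1] for p in pairs) / n
    cov = sum((p[0] - m1) * (p[1] - m2) for p in pairs)
    v1 = sum((p[0] - m1) ** 2 for p in pairs)
    v2 = sum((p[1] - m2) ** 2 for p in pairs)
    return cov / math.sqrt(v1 * v2)


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical subjective assessments for two correlated inputs.
pairs = [correlated_normal_pair(100.0, 10.0, 50.0, 5.0, 0.8, rng)
         for _ in range(20000)]
r = sample_correlation(pairs)
print(f"target correlation 0.8, sample correlation {r:.3f}")
```

Setting `rho` to zero reproduces the independence assumption that the abstract describes as the usual simplification, which makes it easy to compare simulation outputs with and without correlation.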
When to Use Hierarchical Linear Modeling
Directory of Open Access Journals (Sweden)
Veronika Huta
2014-04-01
Previous publications on hierarchical linear modeling (HLM) have provided guidance on how to perform the analysis, yet there is relatively little information on two questions that arise even before analysis: Does HLM apply to one's data and research question? And if it does apply, how does one choose between HLM and other methods sometimes used in these circumstances, including multiple regression, repeated-measures or mixed ANOVA, and structural equation modeling or path analysis? The purpose of this tutorial is to briefly introduce HLM and then to review some of the considerations that are helpful in answering these questions, including the nature of the data, the model to be tested, and the information desired on the output. Some examples of how the same analysis could be performed in HLM, repeated-measures or mixed ANOVA, and structural equation modeling or path analysis are also provided.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:
* A separate chapter on Bayesian methods
* Complete revision of the chapter on estimation
* A major example from the field of near infrared spectroscopy
* More emphasis on cross-validation
* Greater focus on bootstrapping
* Stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible
* Software available on the Internet for implementing many of the algorithms presented
* More examples
Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Building multivariate systems biology models
Kirwan, G.M.; Johansson, E.; Kleemann, R.; Verheij, E.R.; Wheelock, A.M.; Goto, S.; Trygg, J.; Wheelock, C.E.
2012-01-01
Systems biology methods using large-scale "omics" data sets face unique challenges: integrating and analyzing near limitless data space, while recognizing and removing systematic variation or noise. Herein we propose a complementary multivariate analysis workflow to both integrate "omics" data from
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Stelzer, Robert
2011-01-01
Univariate superpositions of Ornstein–Uhlenbeck-type processes (OU), called supOU processes, provide a class of continuous time processes capable of exhibiting long memory behavior. This paper introduces multivariate supOU processes and gives conditions for their existence and finiteness of momen...
The Multivariate Gaussian Probability Distribution
DEFF Research Database (Denmark)
Ahrendt, Peter
2005-01-01
This technical report gathers information about the multivariate Gaussian distribution that previously could not (at least to my knowledge) be found in one place, and is written as a reference manual. Additionally, some useful tips and tricks are collected that may be useful in practical...
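For reference, the density of the distribution this report is concerned with can be written in standard textbook notation. The symbols are the conventional ones and are an assumption here, not taken from the report: $d$ is the dimension, $\boldsymbol{\mu}$ the mean vector and $\boldsymbol{\Sigma}$ a positive definite covariance matrix.

$$p(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\boldsymbol{\Sigma}|^{1/2}} \exp\!\left(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})\right)$$

For $d = 1$ this reduces to the familiar univariate normal density with variance $\sigma^2 = \boldsymbol{\Sigma}$.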
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
. There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes....
Robust Nonstationary Regression
1993-01-01
This paper provides a robust statistical approach to nonstationary time series regression and inference. Fully modified extensions of traditional robust statistical procedures are developed which allow for endogeneities in the nonstationary regressors and serial dependence in the shocks that drive the regressors and the errors that appear in the equation being estimated. The suggested estimators involve semiparametric corrections to accommodate these possibilities and they belong to the same ...
Hierarchical topic modeling with nested hierarchical Dirichlet process
Institute of Scientific and Technical Information of China (English)
Yi-qun DING; Shan-ping LI; Zhen ZHANG; Bin SHEN
2009-01-01
This paper deals with the statistical modeling of latent topic hierarchies in text corpora. The height of the topic tree is assumed to be fixed, while the number of topics on each level is unknown a priori and is to be inferred from data. Taking a nonparametric Bayesian approach to this problem, we propose a new probabilistic generative model based on the nested hierarchical Dirichlet process (nHDP) and present a Markov chain Monte Carlo sampling algorithm for the inference of the topic tree structure as well as the word distribution of each topic and the topic distribution of each document. Our theoretical analysis and experimental results show that this model can produce a more compact hierarchical topic structure and captures more fine-grained topic relationships compared to the hierarchical latent Dirichlet allocation model.
TWO REGRESSION CREDIBILITY MODELS
Directory of Open Access Journals (Sweden)
Constanţa-Nicoleta BODEA
2010-03-01
In this communication we will discuss two regression credibility models from Non-Life Insurance Mathematics that can be solved by means of matrix theory. In the first regression credibility model, starting from a well-known representation formula of the inverse for a special class of matrices, a risk premium will be calculated for a contract with risk parameter θ. In the next regression credibility model, we will obtain a credibility solution in the form of a linear combination of the individual estimate (based on the data of a particular state) and the collective estimate (based on aggregate USA data). To illustrate the solution with the properties mentioned above, we shall need the well-known representation theorem for a special class of matrices, the properties of the trace for a square matrix, the scalar product of two vectors, the norm with respect to a positive definite matrix given in advance and the complicated mathematical properties of conditional expectations and of conditional covariances.
REGRESSION ANALYSIS OF PRODUCTIVITY USING MIXED EFFECT MODEL
Directory of Open Access Journals (Sweden)
Siana Halim
2007-01-01
Production plants of a company are located in several areas spread across Middle and East Java. As the production process employs mostly manpower, we suspected that each location has different characteristics affecting productivity. Thus, the production data may have a spatial and hierarchical structure. Fitting a linear regression using ordinary techniques requires some assumptions about the nature of the residuals, i.e. that they are independent, identically and normally distributed. However, these assumptions are rarely fulfilled, especially for data that have a spatial and hierarchical structure. We worked out the problem using a mixed effect model. This paper discusses the model construction of productivity and several characteristics in the production line by taking location as a random effect. A simple model with high utility that satisfies the necessary regression assumptions was built using the free statistical software R, version 2.6.1.
Institute of Scientific and Technical Information of China (English)
张秀敏; 南卓铜; 吴吉春; 杜二计; 王通; 游艳辉
2011-01-01
In order to understand the distribution patterns of permafrost in the Wenquan area on the Qinghai-Tibet Plateau, the effects of altitude, slope, aspect and other topo-climatic factors on the distribution of permafrost were studied using cluster and correlation analysis with digital elevation model (DEM) data, borehole observations and measurements from ground penetrating radar (GPR) and the electromagnetic sounding method. A permafrost distribution model based on the nonlinear multivariate adaptive regression splines (MARS) method was developed, taking elevation and direct solar radiation as variables. Five-fold cross validation shows that this model has a good simulation capability in describing the spatial pattern of permafrost distribution in the study area. Applying the model to the study area indicates that the Wenquan area contains 1 881 km² of permafrost, accounting for 76% of its total area. The MARS model is better than the mean annual ground temperature model and the logistic model, because it takes into account not only elevation, the predominant control on permafrost distribution, but also solar radiation, a local composite factor that adjusts that distribution, and so reproduces the permafrost found in the relatively low-elevation low mountains.
Deliberate change without hierarchical influence?
DEFF Research Database (Denmark)
Nørskov, Sladjana; Kesting, Peter; Ulhøi, John Parm
2017-01-01
Purpose This paper aims to present that deliberate change is strongly associated with formal structures and top-down influence. Hierarchical configurations have been used to structure processes, overcome resistance and get things done. But is deliberate change also possible without formal...... reveals that deliberate change is indeed achievable in a non-hierarchical collaborative OSS community context. However, it presupposes the presence and active involvement of informal change agents. The paper identifies and specifies four key drivers for change agents’ influence. Originality....../value The findings contribute to organisational analysis by providing a deeper understanding of the importance of leadership in making deliberate change possible in non-hierarchical settings. It points to the importance of “change-by-conviction”, essentially based on voluntary behaviour. This can open the door...
Static Correctness of Hierarchical Procedures
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff
1990-01-01
A system of hierarchical, fully recursive types in a truly imperative language allows program fragments written for small types to be reused for all larger types. To exploit this property to enable type-safe hierarchical procedures, it is necessary to impose a static requirement on procedure calls....... We introduce an example language and prove the existence of a sound requirement which preserves static correctness while allowing hierarchical procedures. This requirement is further shown to be optimal, in the sense that it imposes as few restrictions as possible. This establishes the theoretical...... basis for a general type hierarchy with static type checking, which enables first-order polymorphism combined with multiple inheritance and specialization in a language with assignments. We extend the results to include opaque types. An opaque version of a type is different from the original but has...
Structural integrity of hierarchical composites
Directory of Open Access Journals (Sweden)
Marco Paggi
2012-01-01
Interface mechanical problems are of paramount importance in engineering and materials science. Traditionally, due to the complexity of modelling their mechanical behaviour, interfaces are often treated as defects and their features are not explored. In this study, a different approach is illustrated, where the interfaces play an active role in the design of innovative hierarchical composites and are fundamental for their structural integrity. Numerical examples regarding cutting tools made of hierarchical cellular polycrystalline materials are proposed, showing that tailoring of interface properties at the different scales is the way to achieve superior mechanical responses that cannot be obtained using standard materials.
Analyzing the Dynamics of Nonlinear Multivariate Time Series Models
Institute of Scientific and Technical Information of China (English)
Denghua Zhong; Zhengfeng Zhang; Donghai Liu; Stefan Mittnik
2004-01-01
This paper analyzes the dynamics of nonlinear multivariate time series models as represented by generalized impulse response functions and asymmetric functions. We illustrate the measures of shock persistence and of asymmetric effects of shocks derived from the generalized impulse response functions and asymmetric functions in bivariate smooth transition regression models. The empirical work investigates a bivariate smooth transition model of US GDP and the unemployment rate.
Analysis of Forest Foliage Using a Multivariate Mixture Model
Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.
1997-01-01
Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.
Visualization of Multivariate Metabolomic Data
Institute of Scientific and Technical Information of China (English)
ZHOU Jun; CAO Bei; ZHENG Tian; LIU Lin-sheng; GUO Sheng; DUAN Jin-ao; AA Ji-ye; WANG Guang-ji; ZHANG Feng-yi; GU Rong-rong; WANG Xin-wen; ZHAO Chun-yan; LI Meng-jie; SHI Jian
2011-01-01
Objective: Although principal components analysis profiles greatly facilitate the visualization and interpretation of multivariate data, the quantitative concepts in both the scores plot and the loading plot are rather obscure. This article introduces three profiles that assist the better understanding of metabolomic data. Methods: The discriminatory profile, heat map, and statistic profile were developed to visualize the multivariate data obtained from high-throughput GC-TOF-MS analysis. Results: The discriminatory profile and heat map clearly showed the discriminatory metabolites between the two groups, while the statistic profile showed the potential markers of statistical significance. Conclusion: The three types of profiles greatly facilitate our understanding of the metabolomic data and the identification of the potential markers.
Flotation control -- A multivariable stabilizer
Energy Technology Data Exchange (ETDEWEB)
Schubert, J.H.; Henning, R.G.D.; Hulbert, D.G.; Craig, I.K. [Mintek, Randburg (South Africa)
1995-12-31
This paper presents a stabilizing controller for flotation plants which uses a quasi-multivariable technique. The controller monitors all the levels in the plant, and by anticipating interactions between various parts of the plant, is able to stabilize the plant far more successfully than the normal plant control. Once stabilizing control has been achieved, optimization of the process becomes easier and more sustainable. An estimate of the improvement in metallurgical performance is made and a singular value analysis was conducted to verify that the multivariable algorithm will theoretically control better than a collection of individual PID loops. Metallurgical results are presented to show that the improvements are attainable in practice. Control by the Mintek algorithm was alternated with normal plant control, to show that the improvements are statistically significant.
Sparse Linear Identifiable Multivariate Modeling
DEFF Research Database (Denmark)
Henao, Ricardo; Winther, Ole
2011-01-01
In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully...... Bayesian hierarchy for sparse models using slab and spike priors (two-component δ-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated...... and bench-marked on artificial and real biological data sets. SLIM is closest in spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in inference, Bayesian network structure learning and model comparison. Experimentally, SLIM performs equally well or better than LiNGAM with comparable...
Multivariate Evolutionary Analyses in Astrophysics
Fraix-Burnet, Didier
2011-01-01
The large amount of data on galaxies, up to higher and higher redshifts, calls for sophisticated statistical approaches to build adequate classifications. Multivariate cluster analyses, which compare objects for their global similarities, are still little known in astrophysics, probably because their results are somewhat difficult to interpret. We believe that the missing key is the unavoidable characteristic of our Universe: evolution. Our approach, known as Astrocladistics, is based on the evolutionary nature of both galaxies and their properties. It gathers objects according to their "histories" and establishes an evolutionary scenario among groups of objects. In this presentation, I show two recent results on globular clusters and early-type galaxies to illustrate how the evolutionary concepts of Astrocladistics can also be useful for multivariate analyses such as K-means Cluster Analysis.
Nonnegative Decomposition of Multivariate Information
Williams, Paul L
2010-01-01
Of the various attempts to generalize information theory to multiple variables, the most widely utilized, interaction information, suffers from the problem that it is sometimes negative. Here we reconsider from first principles the general structure of the information that a set of sources provides about a given variable. We begin with a new definition of redundancy as the minimum information that any source provides about each possible outcome of the variable, averaged over all possible outcomes. We then show how this measure of redundancy induces a lattice over sets of sources that clarifies the general structure of multivariate information. Finally, we use this redundancy lattice to propose a definition of partial information atoms that exhaustively decompose the Shannon information in a multivariate system in terms of the redundancy between synergies of subsets of the sources. Unlike interaction information, the atoms of our partial information decomposition are never negative and always support a clear i...
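The redundancy measure described in this abstract (the minimum information any source provides about each outcome, averaged over outcomes) can be sketched as follows. This is a minimal illustration, assuming the specific information of a source A about an outcome s is taken as the sum over a of p(a|s)[log2 p(s|a) - log2 p(s)]; the function name `i_min` and the two toy distributions (XOR and AND of two fair bits) are illustrative choices, not taken from the record.

```python
from collections import defaultdict
from math import log2


def i_min(joint, n_sources):
    """Redundancy as the expected minimum specific information over sources.

    `joint` maps tuples (s, a_1, ..., a_n) to probabilities.
    """
    # Marginal distribution of the target variable S.
    p_s = defaultdict(float)
    for outcome, p in joint.items():
        p_s[outcome[0]] += p

    def specific_info(s, k):
        # I(S=s; A_k) = sum_a p(a|s) * [log2 p(s|a) - log2 p(s)]
        p_a = defaultdict(float)   # marginal of source k
        p_sa = defaultdict(float)  # joint of (S=s, A_k=a)
        for outcome, p in joint.items():
            a = outcome[1 + k]
            p_a[a] += p
            if outcome[0] == s:
                p_sa[a] += p
        return sum(
            (p / p_s[s]) * (log2(p / p_a[a]) - log2(p_s[s]))
            for a, p in p_sa.items() if p > 0
        )

    # Average, over outcomes s, of the smallest specific information.
    return sum(
        p * min(specific_info(s, k) for k in range(n_sources))
        for s, p in p_s.items() if p > 0
    )


# Two fair, independent binary inputs; the target is their XOR or their AND.
inputs = [(x1, x2) for x1 in (0, 1) for x2 in (0, 1)]
xor_joint = {(x1 ^ x2, x1, x2): 0.25 for x1, x2 in inputs}
and_joint = {(x1 & x2, x1, x2): 0.25 for x1, x2 in inputs}

print(round(i_min(xor_joint, 2), 4))
print(round(i_min(and_joint, 2), 4))
```

In the XOR case neither input alone says anything about the target, so the measure is zero; in the AND case both inputs share some information about the target, so it is positive. In both cases the value is non-negative, which is the property the abstract emphasizes.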
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Stelzer, Robert
Univariate superpositions of Ornstein-Uhlenbeck (OU) type processes, called supOU processes, provide a class of continuous time processes capable of exhibiting long memory behaviour. This paper introduces multivariate supOU processes and gives conditions for their existence and finiteness...... of OU type processes, which has been suggested in [2] in the univariate case. Finally, as an important special case, we introduce positive semi-definite supOU processes....
Polo Miranda, Carlos
2002-01-01
Este libro ha sido elaborado y editado para los estudios de segundo ciclo de Ingeniería de Organización Industrial, que se imparten en la ETSEIT de la UPC. La estadística multivariable permite el análisis y la interpretación del comportamiento de múltiples variables de interés, asociadas a un mismo individuo, de las que se dispone de un gran número de observaciones.
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking Status and Plaque Index. The multiple linear regression model was built to evaluate the influence of the IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues, defined by an Attachment Loss greater than or equal to 4 mm in at least 25% of the sites surveyed (AL≥4mm/≥25%). The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
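As a rough illustration of the association measure named above, the following sketch computes an odds ratio and a Wald-type 95% confidence interval from a single 2×2 table. This is the unadjusted, single-factor analogue of the logistic-model odds ratios, not the study's analysis; the function name `odds_ratio_ci` and the counts are hypothetical.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a, b: exposed cases / exposed non-cases
    c, d: unexposed cases / unexposed non-cases
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 exposed and 10 of 100 unexposed subjects are cases.
or_hat, ci_low, ci_high = odds_ratio_ci(20, 80, 10, 90)
print(or_hat, ci_low, ci_high)
```

An interval that contains 1 would indicate that the data are compatible with no association at the chosen confidence level.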
Novel algorithm for constructing support vector machine regression ensemble
Institute of Scientific and Technical Information of China (English)
Li Bo; Li Xinjun; Zhao Zhiyan
2006-01-01
A novel algorithm for constructing a support vector machine regression ensemble is proposed. For regression prediction, the support vector machine regression (SVMR) ensemble is built by repeatedly resampling from the given training data set and aggregating several independent SVMRs, each of which is trained on a replicated training set. After training, the independently trained SVMRs need to be aggregated in an appropriate combination manner. Generally, linear weighting is used, like the expert weighting score in Boosting Regression, and it has no optimization capacity. Three combination techniques are proposed: simple arithmetic mean, linear least square error weighting, and nonlinear hierarchical combining that uses another upper-layer SVMR to combine several lower-layer SVMRs. Finally, simulation experiments demonstrate the accuracy and validity of the presented algorithm.
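The resample-then-combine idea can be sketched as below, covering the first two combination techniques (arithmetic mean and least-squares linear weighting). This is a sketch under stated assumptions rather than the authors' algorithm: cubic polynomial fits stand in for the SVMR base learners, the data are synthetic, and the third technique (an upper-layer SVMR) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: noisy samples of a smooth function.
x = np.linspace(0.0, 3.0, 60)
y = np.sin(x) + rng.normal(scale=0.1, size=x.size)

# Train each base learner on a bootstrap replicate of the training set.
# Cubic polynomials are a stand-in for the SVMRs described in the abstract.
n_learners = 3
base_models = []
for _ in range(n_learners):
    idx = rng.integers(0, x.size, size=x.size)  # sample with replacement
    base_models.append(np.polyfit(x[idx], y[idx], deg=3))

# Column j holds learner j's predictions on the training inputs.
preds = np.column_stack([np.polyval(coefs, x) for coefs in base_models])

# Combination 1: simple arithmetic mean of the base predictions.
mean_pred = preds.mean(axis=1)

# Combination 2: linear weights minimising squared error on the training set.
weights, *_ = np.linalg.lstsq(preds, y, rcond=None)
ls_pred = preds @ weights

mse_mean = float(np.mean((y - mean_pred) ** 2))
mse_ls = float(np.mean((y - ls_pred) ** 2))
print(mse_mean, mse_ls)
```

On the training data the least-squares weights can do no worse than equal weights, since equal weighting is one of the candidates the fit considers; whether that advantage carries over to new data would need a held-out check.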
TMVA - Tool-kit for Multivariate Data Analysis in ROOT
Energy Technology Data Exchange (ETDEWEB)
Therhaag, Jan; Von Toerne, Eckhard [Univ. Bonn, Physikalisches Institut, Nussallee 12, 53115 Bonn (Germany); Hoecker, Andreas; Speckmayer, Peter [European Organization for Nuclear Research - CERN, CH-1211 Geneve 23 (Switzerland); Stelzer, Joerg [Deutsches Elektronen-Synchrotron - DESY, Platanenallee 6, D-15738 Zeuthen (Germany); Voss, Helge [Max-Planck-Institut fuer Kernphysik - MPI, Postfach 10 39 80, Saupfercheckweg 1, DE-69117 Heidelberg (Germany)
2010-07-01
Given the ever-increasing complexity of modern HEP data analysis, multivariate analysis techniques have proven an indispensable tool in extracting the most valuable information from the data. TMVA, the Tool-kit for Multivariate Data Analysis, provides a large variety of advanced multivariate analysis techniques for both signal/background classification and regression problems. In TMVA, all methods are embedded in a user-friendly framework capable of handling the pre-processing of the data as well as the evaluation of the results, thus allowing for a simple use of even the most sophisticated multivariate techniques. Convenient assessment and comparison of different analysis techniques enable the user to choose the most efficient approach for any particular data analysis task. TMVA is an integral part of the ROOT data analysis framework and is widely-used in the LHC experiments. In this talk I will review recent developments in TMVA, discuss typical use-cases in HEP and present the performance of our most important multivariate techniques on example data by comparing it to theoretical performance limits. (authors)
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study concerns measures of predictive power for the Generalized Linear Model (GLM), which are widely used but often have some restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
Sensory Hierarchical Organization and Reading.
Skapof, Jerome
The purpose of this study was to judge the viability of an operational approach aimed at assessing response styles in reading using the hypothesis of sensory hierarchical organization. A sample of 103 middle-class children from a New York City public school, between the ages of five and seven, took part in a three phase experiment. Phase one…
Memory Stacking in Hierarchical Networks.
Westö, Johan; May, Patrick J C; Tiitinen, Hannu
2016-02-01
Robust representations of sounds with a complex spectrotemporal structure are thought to emerge in hierarchically organized auditory cortex, but the computational advantage of this hierarchy remains unknown. Here, we used computational models to study how such hierarchical structures affect temporal binding in neural networks. We equipped individual units in different types of feedforward networks with local memory mechanisms storing recent inputs and observed how this affected the ability of the networks to process stimuli context dependently. Our findings illustrate that these local memories stack up in hierarchical structures and hence allow network units to exhibit selectivity to spectral sequences longer than the time spans of the local memories. We also illustrate that short-term synaptic plasticity is a potential local memory mechanism within the auditory cortex, and we show that it can bring robustness to context dependence against variation in the temporal rate of stimuli, while introducing nonlinearities to response profiles that are not well captured by standard linear spectrotemporal receptive field models. The results therefore indicate that short-term synaptic plasticity might provide hierarchically structured auditory cortex with computational capabilities important for robust representations of spectrotemporal patterns.
Directory of Open Access Journals (Sweden)
Karim Hardani
2012-05-01
A 10-month-old baby presented with developmental delay. He had flaccid paralysis on physical examination. An MRI of the spine revealed malformation of the ninth and tenth thoracic vertebral bodies with complete agenesis of the rest of the spine below that level. The thoracic spinal cord ends at the level of the fifth thoracic vertebra, with agenesis of the posterior arches of the eighth, ninth and tenth thoracic vertebral bodies. The roots of the cauda equina appear tightened downward and backward and end in subdermal fibrous fatty tissue at the level of the ninth and tenth thoracic vertebral bodies (closed meningocele). These findings are consistent with caudal regression syndrome.
Analysis of the real EADGENE data set: Multivariate approaches and post analysis
Schuberth Hans-Joachim; van Schothorst Evert M; Lund Mogens; San Cristobal Magali; Robert-Granié Christèle; Pool Marco H; Petzl Wolfram; Nie Haisheng; Cao Kim-Anh; de Koning Dirk-Jan; Jiang Li; Jensen Kirsty; Hulsegge Ina; Jaffrézic Florence; Hornshøj Henrik
2007-01-01
Abstract The aim of this paper was to describe, and when possible compare, the multivariate methods used by the participants in the EADGENE WP1.4 workshop. The first approach was for class discovery and class prediction using evidence from the data at hand. Several teams used hierarchical clustering (HC) or principal component analysis (PCA) to identify groups of differentially expressed genes with a similar expression pattern over time points and infective agent (E. coli or S. aureus). The m...
Upadhyay, Rohit; Mishra, Hari Niwas
2015-08-01
The sunflower oil-oleoresin rosemary (Rosmarinus officinalis L.) blends (SORB) at 9 different concentrations (200 to 2000 mg/kg), sunflower oil-tertiary butyl hydroquinone (SOTBHQ) at 200 mg/kg and a control without preservatives (SOcontrol) were oxidized using Rancimat (temperature: 100 to 130 °C; airflow rate: 20 L/h). The oxidative stability of the blends was expressed using induction period (IP), oil stability index and photochemiluminescence assay. Linear regression models were generated by plotting ln IP against temperature to estimate the shelf life at 20 °C (SL20; R² > 0.90). Principal component analysis (PCA) and hierarchical cluster analysis (HCA) were used to classify the oil blends according to their oxidative stability and kinetic parameters. The Arrhenius equation adequately described the temperature-dependent kinetics (R² > 0.90, P < 0.05), and the kinetic parameters, viz. activation energies, activation enthalpies, and entropies, were calculated in the ranges of 92.07 to 100.50 kJ/mol, 88.85 to 97.28 kJ/mol, and -33.33 to -1.13 J/mol K, respectively. Using PCA, a satisfactory discrimination was noted among the SORB, SOTBHQ, and SOcontrol samples. HCA classified the oil blends into 3 different clusters (I, II, and III), where SORB1200 and SORB1500 were grouped together in close proximity with SOTBHQ, indicating comparable oxidative stability. The SL20 was estimated to be 3790, 6974, and 4179 h for SOcontrol, SOTBHQ, and SORB1500, respectively. The multivariate kinetic approach effectively screened SORB1500 as the best blend conferring the highest oxidative stability to sunflower oil. This approach can be adopted for quick and reliable estimation of the oxidative stability of oil samples.
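The shelf-life extrapolation described in this abstract (a straight-line fit of ln IP against temperature, evaluated at 20 °C) can be sketched as below. The induction periods here are synthetic, generated from an assumed line ln IP = 10 - 0.07·T purely for illustration; they are not the study's measurements, and `fit_line` is a hypothetical helper.

```python
from math import exp, log


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Rancimat temperatures (deg C) spanning the range quoted in the abstract.
temps_c = [100.0, 110.0, 120.0, 130.0]
# Synthetic induction periods (h) from the assumed line ln IP = 10 - 0.07 * T.
ip_hours = [exp(10.0 - 0.07 * t) for t in temps_c]

# Fit ln IP against temperature, then extrapolate to 20 deg C.
slope, intercept = fit_line(temps_c, [log(ip) for ip in ip_hours])
sl20_hours = exp(intercept + slope * 20.0)
print(slope, intercept, sl20_hours)
```

Because 20 °C lies far outside the fitted temperature range, the extrapolated value is only as trustworthy as the assumption that the log-linear relationship continues to hold there.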
Simulation of multivariate diffusion bridges
DEFF Research Database (Denmark)
Bladt, Mogens; Finch, Samuel; Sørensen, Michael
We propose simple methods for multivariate diffusion bridge simulation, which plays a fundamental role in simulation-based likelihood and Bayesian inference for stochastic differential equations. By a novel application of classical coupling methods, the new approach generalizes a previously...... proposed simulation method for one-dimensional bridges to the multivariate setting. First a method of simulating approximate, but often very accurate, diffusion bridges is proposed. These approximate bridges are used as proposals for easily implementable MCMC algorithms that produce exact diffusion bridges...
Likelihood estimators for multivariate extremes
Huser, Raphaël
2015-11-17
The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.
Aspects of multivariate statistical theory
Muirhead, Robb J
2009-01-01
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. ". . . the wealth of material on statistics concerning the multivariate normal distribution is quite exceptional. As such it is a very useful source of information for the general statistician and a must for anyone wanting to pen
Exploratory and multivariate data analysis
Jambu, Michel
1991-01-01
With a useful index of notations at the beginning, this book explains and illustrates the theory and application of data analysis methods from univariate to multidimensional and how to learn and use them efficiently. This book is well illustrated and is a useful and well-documented review of the most important data analysis techniques. Key Features: describes, in detail, exploratory data analysis techniques from the univariate to the multivariate ones; features a complete description of correspondence analysis and factor analysis techniques as multidimensional statistical data a
Multivariate residues and maximal unitarity
Søgaard, Mads; Zhang, Yang
2013-12-01
We extend the maximal unitarity method to amplitude contributions whose cuts define multidimensional algebraic varieties. The technique is valid to all orders and is explicitly demonstrated at three loops in gauge theories with any number of fermions and scalars in the adjoint representation. Deca-cuts realized by replacement of real slice integration contours by higher-dimensional tori encircling the global poles are used to factorize the planar triple box onto a product of trees. We apply computational algebraic geometry and multivariate complex analysis to derive unique projectors for all master integral coefficients and obtain compact analytic formulae in terms of tree-level data.
Essentials of multivariate data analysis
Spencer, Neil H
2013-01-01
""… this text provides an overview at an introductory level of several methods in multivariate data analysis. It contains in-depth examples from one data set woven throughout the text, and a free [Excel] Add-In to perform the analyses in Excel, with step-by-step instructions provided for each technique. … could be used as a text (possibly supplemental) for courses in other fields where researchers wish to apply these methods without delving too deeply into the underlying statistics.""-The American Statistician, February 2015
Ultrasonic sensor for predicting sugar concentration using multivariate calibration.
Krause, D; Hussein, W B; Hussein, M A; Becker, T
2014-08-01
This paper presents a multivariate regression method for the prediction of maltose concentration in aqueous solutions. For this purpose, the time and frequency domains of ultrasonic signals are analyzed. It is shown that the prediction of concentration at different temperatures is possible by using several multivariate regression models for individual temperature points. Combining these models by a linear approximation of each coefficient over temperature results in a unified solution, which takes temperature effects into account. The benefits of the proposed method are the low processing time required for analyzing online signals and the non-invasive sensor setup, which can be used in pipelines. In addition, the ultrasonic signal sections used in the presented investigation were extracted from buffer reflections, which remain largely unaffected by bubble and particle interference. Model calibration was performed in order to investigate the feasibility of online monitoring in fermentation processes. The temperature range investigated was from 10 °C to 21 °C, which matches fermentation processes used in the brewing industry. This paper describes the processing of ultrasonic signals for regression, the model evaluation and the input variable selection. The statistical approach used for creating the final prediction solution was partial least squares (PLS) regression validated by cross validation. The overall minimum root mean squared error achieved was 0.64 g/100 g.
Forensic discrimination of dyed hair color: II. Multivariate statistical analysis.
Barrett, Julie A; Siegel, Jay A; Goodpaster, John V
2011-01-01
This research is intended to assess the ability of UV-visible microspectrophotometry to successfully discriminate the color of dyed hair. Fifty-five red hair dyes were analyzed and evaluated using multivariate statistical techniques including agglomerative hierarchical clustering (AHC), principal component analysis (PCA), and discriminant analysis (DA). The spectra were grouped into three classes, which were visually consistent with different shades of red. A two-dimensional PCA observations plot was constructed, describing 78.6% of the overall variance. The wavelength regions associated with the absorbance of hair and dye were highly correlated. Principal components were selected to represent 95% of the overall variance for analysis with DA. A classification accuracy of 89% was observed for the comprehensive dye set, while external validation using 20 of the dyes resulted in a prediction accuracy of 75%. Significant color loss from successive washing of hair samples was estimated to occur within 3 weeks of dye application.
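The rule of keeping enough principal components to represent 95% of the overall variance can be sketched as below. This is only an illustration: the function name and the example eigenvalues are hypothetical, and the eigenvalues of the covariance matrix are assumed to have been computed already.

```python
def n_components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose cumulative share of
    the total variance reaches `target`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Made-up eigenvalues (component variances) for illustration only.
example = [4.0, 3.0, 2.0, 1.0]
kept = n_components_for_variance(example, target=0.8)
```

The retained component scores would then be passed to the discriminant analysis step in place of the full spectra.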
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. The calculation intensive nature of multivariate analysis
DEFF Research Database (Denmark)
Sørensen, Jens Benn; Badsberg, Jens Henrik; Olsen, Jens
1989-01-01
and degree of differentiation, the new international staging system for lung cancer, and seven laboratory parameters. Staging of the patients included bone marrow examination but was otherwise nonextensive without routine bone, liver, and brain scans. Factors predicting poor survival were low performance...... status, stage IV disease, no prior nonradical resection, liver metastases, high values of white blood cell count, and lactate dehydrogenase, and low values of aspartate aminotransaminase. The nonradical resection may not be a prognostic factor because of the resection itself but may rather serve...... in the former Cox model to be of importance (performance status, stage, surgical resection, WBC, aspartate aminotransaminase, and lactate dehydrogenase). This simplified model appears to be a feasible clinical tool, allowing for prognostic stratification of patients when first the inoperability of the patient...
M. Ahmadlou; M. R. Delavar; Tayyebi, A.; H. Shafizadeh-Moghadam
2015-01-01
Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure or model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the m...
A Multivariate Regression Approach to Adjust AATSR Sea Surface Temperature to In Situ Measurements
TANDEO, Pierre; Autret, Emmanuelle; Piolle, Jean-francois; Tournadre, Jean; Ailliot, Pierre
2009-01-01
The Advanced Along-Track Scanning Radiometer (AATSR) onboard Envisat is designed to provide very accurate measurements of sea surface temperature (SST). Using colocated in situ drifting buoys, a dynamical matchup database (MDB) is used to assess the AATSR-derived SST products more precisely. SST biases are then computed. Currently, Medspiration AATSR SST biases are discrete values and can introduce artificial discontinuities in AATSR level-2 SST fields. The new AATSR SST biases presented in t...
Liou, Jyun-you; Smith, Elliot H.; Bateman, Lisa M.; McKhann, Guy M., II; Goodman, Robert R.; Greger, Bradley; Davis, Tyler S.; Kellis, Spencer S.; House, Paul A.; Schevon, Catherine A.
2017-08-01
Objective. Epileptiform discharges, an electrophysiological hallmark of seizures, can propagate across cortical tissue in a manner similar to traveling waves. Recent work has focused attention on the origination and propagation patterns of these discharges, yielding important clues to their source location and mechanism of travel. However, systematic studies of methods for measuring propagation are lacking. Approach. We analyzed epileptiform discharges in microelectrode array recordings of human seizures. The array records multiunit activity and local field potentials at 400 micron spatial resolution, from a small cortical site free of obstructions. We evaluated several computationally efficient statistical methods for calculating traveling wave velocity, benchmarking them to analyses of associated neuronal burst firing. Main results. Over 90% of discharges met statistical criteria for propagation across the sampled cortical territory. Detection rate, direction and speed estimates derived from a multiunit estimator were compared to four field potential-based estimators: negative peak, maximum descent, high gamma power, and cross-correlation. Interestingly, the methods that were computationally simplest and most efficient (negative peak and maximal descent) offer non-inferior results in predicting neuronal traveling wave velocities compared to the other two, more complex methods. Moreover, the negative peak and maximal descent methods proved to be more robust against reduced spatial sampling challenges. Using least absolute deviation in place of least squares error minimized the impact of outliers, and reduced the discrepancies between local field potential-based and multiunit estimators. Significance. Our findings suggest that ictal epileptiform discharges typically take the form of exceptionally strong, rapidly traveling waves, with propagation detectable across millimeter distances. 
The sequential activation of neurons in space can be inferred from clinically-observable EEG data, with a variety of straightforward computation methods available. This opens possibilities for systematic assessments of ictal discharge propagation in clinical and research settings.
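One way to picture the estimators compared in this abstract is to regress discharge arrival time on electrode position and read the velocity off the fitted plane. The sketch below makes that assumption explicit: the plane model t = a*x + b*y + c, the iteratively reweighted approximation to least absolute deviation, the function names, and the synthetic grid are all hypothetical and are not taken from the paper.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - tail) / a[r][r]
    return x


def fit_plane(points, times, weights=None):
    """Weighted least-squares fit of t = a*x + b*y + c; returns [a, b, c]."""
    if weights is None:
        weights = [1.0] * len(times)
    m = [[0.0] * 3 for _ in range(3)]
    v = [0.0] * 3
    for (x, y), t, w in zip(points, times, weights):
        row = (x, y, 1.0)
        for i in range(3):
            v[i] += w * row[i] * t
            for j in range(3):
                m[i][j] += w * row[i] * row[j]
    return solve3(m, v)


def fit_plane_lad(points, times, iterations=50, eps=1e-6):
    """Approximate least-absolute-deviation fit via reweighted least squares."""
    coef = fit_plane(points, times)
    for _ in range(iterations):
        residuals = [
            t - (coef[0] * x + coef[1] * y + coef[2])
            for (x, y), t in zip(points, times)
        ]
        weights = [1.0 / max(abs(r), eps) for r in residuals]
        coef = fit_plane(points, times, weights)
    return coef


def wave_velocity(coef):
    """Velocity vector of a planar wave whose arrival-time gradient is (a, b)."""
    a, b, _ = coef
    norm_sq = a * a + b * b
    return a / norm_sq, b / norm_sq


# Synthetic 4x4 grid at 0.4 mm spacing; wave moving along +x at 500 mm/s.
grid = [(0.4 * i, 0.4 * j) for i in range(4) for j in range(4)]
arrival = [0.002 * x + 0.01 for x, _ in grid]
vx, vy = wave_velocity(fit_plane(grid, arrival))
```

With a single outlying arrival time, the reweighted fit stays closer to the true slope than the plain least-squares fit, which is the kind of robustness the abstract attributes to least absolute deviation.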
Data-driven fuel consumption estimation: A multivariate adaptive regression spline approach
Energy Technology Data Exchange (ETDEWEB)
Chen, Yuche; Zhu, Lei; Gonder, Jeffrey; Young, Stanley; Walkowicz, Kevin
2017-10-01
Providing guidance and information to drivers to help them make fuel-efficient route choices remains an important and effective strategy in the near term to reduce fuel consumption from the transportation sector. One key component in implementing this strategy is a fuel-consumption estimation model. In this paper, we developed a mesoscopic fuel consumption estimation model that can be implemented into an eco-routing system. Our proposed model presents a framework that utilizes large-scale, real-world driving data, clusters road links by free-flow speed and fits one statistical model for each cluster. This model includes predictor variables that were rarely or never considered before, such as free-flow speed and number of lanes. We applied the model to a real-world driving data set based on a global positioning system travel survey in the Philadelphia-Camden-Trenton metropolitan area. Results from the statistical analyses indicate that the independent variables we chose influence the fuel consumption rates of vehicles. However, the magnitude and direction of these influences depend on the type of road link, specifically its free-flow speed. A statistical diagnostic is conducted to ensure the validity of the models and results. Although the real-world driving data we used to develop statistical relationships are specific to one region, the framework we developed can be easily adjusted and used to explore the fuel consumption relationship in other regions.
Hierarchical Prisoner's Dilemma in Hierarchical Public-Goods Game
Fujimoto, Yuma; Kaneko, Kunihiko
2016-01-01
The dilemma in cooperation is one of the major concerns in game theory. In a public-goods game, each individual pays a cost for cooperation, or to prevent defection, and receives a reward from the collected cost in a group. Thus, defection is beneficial for each individual, while cooperation is beneficial for the group. Now, groups (say, countries) consisting of individual players also play games. To study such a multi-level game, we introduce a hierarchical public-goods (HPG) game in which two groups compete for finite resources by utilizing costs collected from individuals in each group. Analyzing this HPG game, we found a hierarchical prisoner's dilemma, in which groups choose the defection policy (say, armaments) as a Nash strategy to optimize each group's benefit, while cooperation optimizes the total benefit. On the other hand, for each individual within a group, refusing to pay the cost (say, tax) is a Nash strategy, which turns out to be a cooperation policy for the group, thus leading to a hierarchical d...
Multivariate image analysis in biomedicine.
Nattkemper, Tim W
2004-10-01
In recent years, multivariate imaging techniques have been developed and applied in biomedical research to an increasing degree. In research projects as well as in clinical studies, m-dimensional multivariate images (MVI) are recorded and stored in databases for subsequent analysis. The complexity of the m-dimensional data and the growing number of high-throughput applications call for new strategies for the application of image processing and data mining to support the direct interactive analysis by human experts. This article provides an overview of proposed approaches for MVI analysis in biomedicine. After summarizing the biomedical MVI techniques, the two-level framework for MVI analysis is illustrated. Following this framework, the state-of-the-art solutions from the fields of image processing and data mining are reviewed and discussed. Motivations for MVI data mining in biology and medicine are characterized, followed by an overview of graphical and auditory approaches for interactive data exploration. The paper concludes by summarizing open problems in MVI analysis and remarking upon the future development of biomedical MVI analysis.
Kwon, Yong-Kook; Jie, Eun Yee; Sartie, Alieu; Kim, Dong Jin; Liu, Jang Ryol; Min, Byung Whan; Kim, Suk Weon
2015-01-01
To determine whether FT-IR spectroscopy could be used for taxonomic and metabolic discrimination of African yam lines, tuber samples from African and Asian yam species were subjected to FT-IR. The most remarkable spectral differences between African and Asian yams were found in the 1750-1700 cm⁻¹ region and in the polysaccharide (1200-900 cm⁻¹) and protein/amide I and II (1700-1500 cm⁻¹) regions of the FT-IR spectra. A hierarchical dendrogram based on partial least squares-discriminant analysis (PLS-DA) of FT-IR data from 7 African yam species shows their phylogenetic relationship. In addition, the content of dioscin, a steroidal saponin found in yam tubers, was predicted using a PLS regression model; the regression coefficient R² = 0.7208 indicated that the prediction model had average accuracy. Thus, considering these results, we suggest that FT-IR combined with multivariate analysis could be applied as a novel tool for metabolic evaluation and high-throughput screening of African yam lines with higher content of dioscin.
An architecture for implementation of multivariable controllers
DEFF Research Database (Denmark)
Niemann, Hans Henrik; Stoustrup, Jakob
1999-01-01
An architecture for implementation of multivariable controllers is presented in this paper. The architecture is based on the Youla-Jabr-Bongiorno-Kucera parameterization of all stabilizing controllers. By using this architecture for implementation of multivariable controllers...
Unsupervised classification of multivariate geostatistical data: Two algorithms
Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques
2015-12-01
With the increasing development of remote sensing platforms and the evolution of sampling facilities in the mining and oil industries, spatial datasets are becoming increasingly large, describe a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables at hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performance of both algorithms is assessed on toy examples and a mining dataset.
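The idea of agglomerative clustering with a proximity condition (two clusters may merge only if a graph edge connects them) can be sketched as below. This is a toy illustration, not the authors' algorithm: the single scalar variable, the distance between cluster means, the function name, and the example graph are all assumptions.

```python
def constrained_agglomerative(values, edges, n_clusters):
    """Repeatedly merge the pair of graph-adjacent clusters whose mean
    values are closest, until n_clusters remain (or no edge is left)."""
    clusters = {i: [i] for i in range(len(values))}
    adjacency = {i: set() for i in range(len(values))}
    for i, j in edges:
        adjacency[i].add(j)
        adjacency[j].add(i)

    def mean(members):
        return sum(values[k] for k in members) / len(members)

    while len(clusters) > n_clusters:
        best = None
        for a in clusters:
            for b in adjacency[a]:
                if a < b:  # consider each adjacent pair once
                    d = abs(mean(clusters[a]) - mean(clusters[b]))
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None:  # no adjacent clusters left to merge
            break
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
        # Rewire the neighbours of b so that they now point at a.
        for nb in adjacency.pop(b):
            if nb != a:
                adjacency[nb].discard(b)
                adjacency[nb].add(a)
                adjacency[a].add(nb)
        adjacency[a].discard(b)
    return sorted(sorted(c) for c in clusters.values())


# Five samples along a line; only consecutive samples are neighbours.
values = [1.0, 1.1, 5.0, 5.2, 1.05]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
groups = constrained_agglomerative(values, edges, 3)
```

In this toy case the last sample has a value close to the first two but is not adjacent to them, so it stays in its own cluster; an unconstrained method would have grouped it with them.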
Multivariate analysis of bistable flow; Analisis multivariable de flujo biestable
Energy Technology Data Exchange (ETDEWEB)
Castillo D, R.; Ortiz V, J.; Ruiz E, J.A. [ININ, 52750 La Marquesa, Estado de Mexico (Mexico); Calleros M, G. [CFE, Alto LUcero, Veracruz (Mexico)]. e-mail: rcd@nuclear.inin.mx
2007-07-01
This work presents an analysis of bistable flow using an autoregressive multivariate model. Bistable flow occurs in boiling water nuclear reactors with external recirculation pumps, and it appears in the discharge header of the recirculation loop that feeds the central jet pumps. The phenomenon has two flow patterns, one with greater hydraulic loss than the other. At irregular time intervals, the flow switches pattern at random. The NOISE program, under development at ININ, was used; it fits an autoregressive multivariate model to determine the autoregression coefficients, which contain the dynamic information of the signals and are then used to obtain the relative power contribution, which allows the influence among the analyzed variables to be established. A bistable flow event that occurred in a BWR5 operating at 80% power and 69% of total core flow was analyzed. The signals analyzed were the flow noise in each of the 20 jet pumps, the power from an average power monitor, the recirculation drive flows, the controllers and the positions of the control valves in the loops, the recirculation pump instrumentation (power, current, pressure drop and suction temperature), and the buses that supply voltage to the pump motors. Among the main results, it was found that bistable flow affects the pressure drop across the recirculation pump of the loop in which it occurs, which in turn affects the drive flow in that loop, to which the opening system of the loop's recirculation flow control valve responds. (Author)
A kernel version of multivariate alteration detection
DEFF Research Database (Denmark)
Nielsen, Allan Aasbjerg; Vestergaard, Jacob Schack
2013-01-01
Based on the established methods kernel canonical correlation analysis and multivariate alteration detection we introduce a kernel version of multivariate alteration detection. A case study with SPOT HRV data shows that the kMAD variates focus on extreme change observations.
Multivariate normal-Laplace distribution and processes
Directory of Open Access Journals (Sweden)
Kanichukattu Korakutty Jose
2014-12-01
The normal-Laplace distribution is considered and its properties are discussed. A multivariate normal-Laplace distribution is introduced and its properties are studied. First order autoregressive processes with these stationary marginal distributions are developed and studied. A generalized multivariate normal-Laplace distribution is introduced. Multivariate geometric normal-Laplace distribution and multivariate geometric generalized normal-Laplace distributions are introduced and their properties are studied. Estimation of parameters and some applications are also discussed.
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
Multivariate strategies in functional magnetic resonance imaging
DEFF Research Database (Denmark)
Hansen, Lars Kai
2007-01-01
We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a `mind reading' predictive multivariate fMRI model....
Multivariate extensions of expectiles risk measures
Directory of Open Access Journals (Sweden)
Maume-Deschamps Véronique
2017-01-01
This paper is devoted to the introduction and study of a new family of multivariate elicitable risk measures. We call the obtained vector-valued measures multivariate expectiles. We present the different approaches used to construct our measures. We discuss the coherence properties of these multivariate expectiles. Furthermore, we propose a stochastic approximation tool of these risk measures.
Hierarchical structure of biological systems
Alcocer-Cuarón, Carlos; Rivera, Ana L; Castaño, Victor M
2014-01-01
A general theory of biological systems, based on a few fundamental propositions, allows a generalization of both the Wiener and Bertalanffy approaches to theoretical biology. Here, a biological system is defined as a set of self-organized, differentiated elements that interact pair-wise through various networks and media, isolated from other sets by boundaries. Their relation to other systems can be described as a closed loop in a steady-state, which leads to a hierarchical structure and functioning of the biological system. Our thermodynamical approach of hierarchical character can be applied to biological systems of varying sizes through some general principles, based on the exchange of energy, information and/or mass from and within the systems. PMID:24145961
Automatic Hierarchical Color Image Classification
Directory of Open Access Journals (Sweden)
Jing Huang
2003-02-01
Organizing images into semantic categories can be extremely useful for content-based image retrieval and image annotation. Grouping images into semantic classes is a difficult problem, however. Image classification attempts to solve this hard problem by using low-level image features. In this paper, we propose a method for hierarchical classification of images via supervised learning. This scheme relies on using a good low-level feature and subsequently performing feature-space reconfiguration using singular value decomposition to reduce noise and dimensionality. We use the training data to obtain a hierarchical classification tree that can be used to categorize new images. Our experimental results suggest that this scheme not only performs better than standard nearest-neighbor techniques, but also has both storage and computational advantages.
Intuitionistic fuzzy hierarchical clustering algorithms
Institute of Scientific and Technical Information of China (English)
Xu Zeshui
2009-01-01
Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful to describe vagueness and uncertainty. However, it seems that little attention has been focused on the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, normalized Hamming, weighted Hamming, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally the algorithm and its extended form are applied to the classifications of building materials and enterprises respectively.
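As an illustration of one of the distance measures listed in this abstract, the sketch below computes a normalized Hamming distance between two IFSs. It assumes a commonly used form that also includes the hesitancy degree (one minus membership minus nonmembership); the abstract does not state which exact form the paper adopts, and the function name and data are hypothetical.

```python
def normalized_hamming(ifs_a, ifs_b):
    """Normalized Hamming distance between two intuitionistic fuzzy sets.

    Each set is a list of (membership, nonmembership) pairs over the same
    universe. The hesitancy degree 1 - membership - nonmembership is
    included as a third term (an assumption, see the lead-in).
    """
    if len(ifs_a) != len(ifs_b):
        raise ValueError("both sets must be defined over the same universe")
    total = 0.0
    for (mu_a, nu_a), (mu_b, nu_b) in zip(ifs_a, ifs_b):
        pi_a = 1.0 - mu_a - nu_a
        pi_b = 1.0 - mu_b - nu_b
        total += abs(mu_a - mu_b) + abs(nu_a - nu_b) + abs(pi_a - pi_b)
    return total / (2.0 * len(ifs_a))


# Two made-up IFSs over a universe of three elements.
material_a = [(0.7, 0.2), (0.5, 0.4), (0.1, 0.8)]
material_b = [(0.6, 0.3), (0.5, 0.3), (0.3, 0.6)]
distance = normalized_hamming(material_a, material_b)
```

A hierarchical procedure would start from the matrix of such pairwise distances and merge the closest sets first.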
Hierarchical Formation of Galactic Clusters
Elmegreen, B G
2006-01-01
Young stellar groupings and clusters have hierarchical patterns ranging from flocculent spiral arms and star complexes on the largest scale to OB associations, OB subgroups, small loose groups, clusters and cluster subclumps on the smallest scales. There is no obvious transition in morphology at the cluster boundary, suggesting that clusters are only the inner parts of the hierarchy where stars have had enough time to mix. The power-law cluster mass function follows from this hierarchical structure: $n(M_{cl}) \propto M_{cl}^{-b}$ for $b \sim 2$. This value of b is independently required by the observation that the summed IMFs from many clusters in a galaxy equals approximately the IMF of each cluster.
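For orientation, a short worked step shows what an exponent of 2 means for a power-law mass function. It assumes that $n(M_{cl})\,dM_{cl}$ counts clusters with masses between $M_{cl}$ and $M_{cl} + dM_{cl}$ and that the power law holds between two hypothetical limits $M_1$ and $M_2$; neither limit comes from the abstract.

\[
\int_{M_1}^{M_2} M_{cl}\, n(M_{cl})\, dM_{cl}
\;\propto\; \int_{M_1}^{M_2} M_{cl}^{\,1-b}\, dM_{cl}
\;=\; \ln\frac{M_2}{M_1} \qquad (b = 2)
\]

So for $b = 2$ the total mass in clusters depends only on the ratio $M_2/M_1$: every logarithmic mass interval holds the same total mass, which is the scale-free behavior one would expect from a hierarchy.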
Hierarchical matrices algorithms and analysis
Hackbusch, Wolfgang
2015-01-01
This self-contained monograph presents matrix algorithms and their analysis. The new technique enables not only the solution of linear systems but also the approximation of matrix functions, e.g., the matrix exponential. Other applications include the solution of matrix equations, e.g., the Lyapunov or Riccati equation. The required mathematical background can be found in the appendix. The numerical treatment of fully populated large-scale matrices is usually rather costly. However, the technique of hierarchical matrices makes it possible to store matrices and to perform matrix operations approximately with almost linear cost and a controllable degree of approximation error. For important classes of matrices, the computational cost increases only logarithmically with the approximation error. The operations provided include the matrix inversion and LU decomposition. Since large-scale linear algebra problems are standard in scientific computing, the subject of hierarchical matrices is of interest to scientists ...
Hierarchical Cont-Bouchaud model
Paluch, Robert; Holyst, Janusz A
2015-01-01
We extend the well-known Cont-Bouchaud model to include a hierarchical topology of agents' interactions. The influence of hierarchy on system dynamics is investigated by two models. The first one is based on a multi-level, nested Erdos-Renyi random graph and individual decisions by agents according to Potts dynamics. This approach does not lead to a broad return distribution outside a parameter regime close to the original Cont-Bouchaud model. In the second model we introduce a limited hierarchical Erdos-Renyi graph, where merging of clusters at a level h+1 involves only clusters that have merged at the previous level h and we use the original Cont-Bouchaud agent dynamics on resulting clusters. The second model leads to a heavy-tail distribution of cluster sizes and relative price changes in a wide range of connection densities, not only close to the percolation threshold.
Multivariate semiparametric spatial methods for imaging data.
Chen, Huaihou; Cao, Guanqun; Cohen, Ronald A
2017-04-01
Univariate semiparametric methods are often used in modeling nonlinear age trajectories for imaging data, which may result in efficiency loss and lower power for identifying important age-related effects that exist in the data. As observed in multiple neuroimaging studies, age trajectories show similar nonlinear patterns for the left and right corresponding regions and for the different parts of a big organ such as the corpus callosum. To incorporate the spatial similarity information without assuming spatial smoothness, we propose a multivariate semiparametric regression model with a spatial similarity penalty, which constrains the variation of the age trajectories among similar regions. The proposed method is applicable to both cross-sectional and longitudinal region-level imaging data. We show the asymptotic rates for the bias and covariance functions of the proposed estimator and its asymptotic normality. Our simulation studies demonstrate that by borrowing information from similar regions, the proposed spatial similarity method improves the efficiency remarkably. We apply the proposed method to two neuroimaging data examples. The results reveal that accounting for the spatial similarity leads to more accurate estimators and better functional clustering results for visualizing brain atrophy pattern. Keywords: Functional clustering; Longitudinal magnetic resonance imaging (MRI); Penalized B-splines; Region of interest (ROI); Spatial penalty.
A multivariate exploration of basic symptoms.
Rubino, I Alex; Ciani, Nicola
2002-01-01
Little is known about the relationship between the different categories of basic symptoms (BS). Researchers of the Bonn School have accurately described the progression from second-level BS (relatively characteristic BS) to first-rank Schneiderian symptoms. Using a multiple regression model, the present study tried to investigate which kind of dynamic deficiencies (DDs; uncharacteristic first-level BS) mostly lead to each type of second-level BS. A group of 108 patients with a DSM-III-R diagnosis of schizophrenia completed an inventory on BS, with all items in strict accordance with those of the Bonn Scale. Five dependent variables (cognitive thought disorders, cognitive perception disorders, cognitive action disorders, increased impressionability, cenesthesias) and four independent variables (DDs with direct negative symptoms, DDs with indirect negative symptoms, affective DDs, relational DDs) were considered. Among the significant findings, a widespread contribution of DDs with indirect negative symptoms to most of the dependent variables, and the special role of DDs with direct negative symptoms as a predictor of cognitive thought disorders, must be emphasized. Suggestions for further multivariate studies in the field of BS are presented.
Hierarchical Clustering and Active Galaxies
Hatziminaoglou, E; Manrique, A
2000-01-01
The growth of Super Massive Black Holes and the parallel development of activity in galactic nuclei are implemented in an analytic code of hierarchical clustering. The evolution of the luminosity function of quasars and AGN will be computed with special attention paid to the connection between quasars and Seyfert galaxies. One of the major interests of the model is the parallel study of quasar formation and evolution and the History of Star Formation.
Hybrid and hierarchical composite materials
Kim, Chang-Soo; Sano, Tomoko
2015-01-01
This book addresses a broad spectrum of areas in both hybrid materials and hierarchical composites, including recent development of processing technologies, structural designs, modern computer simulation techniques, and the relationships between the processing-structure-property-performance. Each topic is introduced at length with numerous and detailed examples and over 150 illustrations. In addition, the authors present a method of categorizing these materials, so that representative examples of all material classes are discussed.
Treatment Protocols as Hierarchical Structures
Ben-Bassat, Moshe; Carlson, Richard W.; Puri, Vinod K.; Weil, Max Harry
1978-01-01
We view a treatment protocol as a hierarchical structure of therapeutic modules. The lowest level of this structure consists of individual therapeutic actions. Combinations of individual actions define higher level modules, which we call routines. Routines are designed to manage limited clinical problems, such as the routine for fluid loading to correct hypovolemia. Combinations of routines and additional actions, together with comments, questions, or precautions organized in a branching logic, in turn, define the treatment protocol for a given disorder. Adoption of this modular approach may facilitate the formulation of treatment protocols, since the physician is not required to prepare complex flowcharts. This hierarchical approach also allows protocols to be updated and modified in a flexible manner. By use of such a standard format, individual components may be fitted together to create protocols for multiple disorders. The technique is suited for computer implementation. We believe that this hierarchical approach may facilitate standardization of patient care as well as aid in clinical teaching. A protocol for acute pancreatitis is used to illustrate this technique.
Observability of multivariate differential embeddings
Energy Technology Data Exchange (ETDEWEB)
Aguirre, Luis Antonio [Laboratorio de Modelagem, Analise e Controle de Sistemas Nao Lineares, Departamento de Engenharia Eletronica, Universidade Federal de Minas Gerais, Av. Antonio Carlos 6627, 31270-901 Belo Horizonte, MG (Brazil); Letellier, Christophe [Universite de Rouen-CORIA UMR 6614, Av. de l' Universite, BP 12, F-76801 Saint-Etienne du Rouvray Cedex (France)
2005-07-15
The present paper extends some results recently developed for the analysis of observability in nonlinear dynamical systems. The aim of the paper is to address the problem of embedding an attractor using more than one observable. A multivariate nonlinear observability matrix is proposed which includes the monovariable nonlinear and linear observability matrices as particular cases. Using the developed framework and a number of worked examples, it is shown that the choice of embedding coordinates is critical. Moreover, in some cases, reconstructing the dynamics from more than one observable can be worse than reconstructing them from a scalar measurement. Finally, using the developed framework, it is shown that as the embedding dimension increases, observability problems diminish and can even be eliminated. This seems to be a physically meaningful interpretation of the Takens embedding theorem.
Multivariate Analysis of Swap Bribery
Dorn, Britta
2010-01-01
We consider the computational complexity of a problem modeling bribery in the context of voting systems. In the scenario of Swap Bribery, each voter assigns a certain price for swapping the positions of two consecutive candidates in his preference ranking. The question is whether it is possible, without exceeding a given budget, to bribe the voters in a way that the preferred candidate wins in the election. We initiate a parameterized and multivariate complexity analysis of Swap Bribery, focusing on the case of k-approval. We investigate how different cost functions affect the computational complexity of the problem. We identify a special case of k-approval for which the problem can be solved in polynomial time, whereas we prove NP-hardness for a slightly more general scenario. We obtain fixed-parameter tractability as well as W[1]-hardness results for certain natural parameters.
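To make the setting concrete, the following minimal Python sketch computes k-approval scores for a small preference profile and shows how a single swap of two consecutive candidates can change the winner. The function names, the example profile, and the bribed voter are illustrative assumptions and do not come from the paper; swap prices and the budget check are left out.

```python
from collections import Counter

def k_approval_scores(profile, k):
    """Each voter gives one point to each of their top-k candidates."""
    scores = Counter()
    for ranking in profile:
        for candidate in ranking[:k]:
            scores[candidate] += 1
    return scores

def swap_adjacent(ranking, i):
    """Return a copy of the ranking with positions i and i+1 exchanged."""
    swapped = list(ranking)
    swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
    return swapped

# Hypothetical profile: three voters ranking candidates a, b, c.
profile = [["a", "b", "c"], ["a", "c", "b"], ["b", "c", "a"]]
before = k_approval_scores(profile, k=1)

# Hypothetical bribe: voter 0 swaps the first two candidates.
bribed = [swap_adjacent(profile[0], 0)] + profile[1:]
after = k_approval_scores(bribed, k=1)
```

In a full Swap Bribery instance each such swap would carry a voter-specific price, and the question is whether some set of swaps within the budget makes the preferred candidate win.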
Acoustic multivariate condition monitoring - AMCM
Energy Technology Data Exchange (ETDEWEB)
Rosenhave, P.E. [Vestfold College, Maritime Dept., Toensberg (Norway)
1997-12-31
In Norway, Vestfold College, Maritime Department presents new opportunities for non-invasive, on- or off-line acoustic monitoring of rotating machinery such as off-shore pumps and diesel engines. New developments within acoustic sensor technology coupled with chemometric data analysis of complex signals now allow condition monitoring of hitherto unavailable flexibility and diagnostic specificity. Chemometrics paired with existing knowledge yields a new and powerful tool for condition monitoring. By the use of multivariate techniques and acoustics it is possible to quantify wear and tear as well as predict the performance of working components in complex machinery. This presentation describes the AMCM method and one result of a feasibility study conducted onboard the LPG/C 'Norgas Mariner' owned by Norwegian Gas Carriers AS (NGC), Oslo. (orig.) 6 refs.
A Bayesian approach to linear regression in astronomy
Sereno, Mauro
2015-01-01
Linear regression is common in astronomical analyses. I discuss a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter. The method fully accounts for time evolution. The slope, the normalization, and the intrinsic scatter of the relation can evolve with the redshift. The intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time. The method can address scatter in the measured independent variable (a kind of Eddington bias), selection effects in the response variable (Malmquist bias), and departure from linearity in the form of a knee. I tested the method with toy models and simulations and quantified the effect of biases and inefficient modeling. The R-package LIRA (LInear Regression in Astronomy) is made available to perform the regression.
Robust Bayesian Regularized Estimation Based on t Regression Model
Directory of Open Access Journals (Sweden)
Zean Li
2015-01-01
The t distribution is a useful extension of the normal distribution, which can be used for statistical modeling of data sets with heavy tails, and provides robust estimation. In this paper, in view of the advantages of Bayesian analysis, we propose a new robust coefficient estimation and variable selection method based on Bayesian adaptive Lasso t regression. A Gibbs sampler is developed based on the Bayesian hierarchical model framework, where we treat the t distribution as a mixture of normal and gamma distributions and put different penalization parameters for different regression coefficients. We also consider the Bayesian t regression with adaptive group Lasso and obtain the Gibbs sampler from the posterior distributions. Both simulation studies and a real data example show that our method performs well compared with other existing methods when the error distribution has heavy tails and/or outliers.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsivity, but often entails both. This paper critically reviews the phenomena of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Voxelwise multivariate analysis of multimodality magnetic resonance imaging.
Naylor, Melissa G; Cardenas, Valerie A; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin
2014-03-01
Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. Copyright © 2013 Wiley Periodicals, Inc.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
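As an illustration of the five slope estimators listed above, here is a minimal Python sketch. The closed-form expressions are the ones commonly quoted for these estimators and are stated here as an assumption; the uncertainty formulas that the abstract mentions are not reproduced. The function name and the sample data are hypothetical.

```python
import math

def regression_slopes(x, y):
    """Slopes of five symmetric and asymmetric line fits to bivariate data.

    Assumes the sample covariance of x and y is nonzero.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sign = math.copysign(1.0, sxy)

    b1 = sxy / sxx  # OLS regression of Y on X
    b2 = syy / sxy  # OLS regression of X on Y, expressed as dy/dx
    bisector = (b1 * b2 - 1 + math.sqrt((1 + b1 ** 2) * (1 + b2 ** 2))) / (b1 + b2)
    diff = b2 - 1 / b1
    orthogonal = 0.5 * (diff + sign * math.sqrt(4 + diff ** 2))
    rma = sign * math.sqrt(b1 * b2)  # reduced major axis
    return {"ols_yx": b1, "ols_xy": b2, "bisector": bisector,
            "orthogonal": orthogonal, "rma": rma}

# Hypothetical noisy data with a positive trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1]
slopes = regression_slopes(x, y)
# For each method the intercept follows as mean(y) - slope * mean(x).
```

For perfectly collinear data all five slopes coincide; with scatter, the two OLS slopes bracket the symmetric estimators.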
Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data.
Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian
2015-01-01
In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize a boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies.
U-Scores for Multivariate Data in Sports.
Wittkowski, Knut M; Song, Tingting; Anderson, Kent; Daniels, John E
2008-07-18
In many sport competitions athletes, teams, or countries are evaluated based on several variables. The strong assumptions underlying traditional 'linear weight' scoring systems (that the relative importance, interactions and linearizing transformations of the variables are known) can often not be justified on theoretical grounds, and empirical 'validation' of weights, interactions and transformations, is problematic when a 'gold standard' is lacking. With μ-scores (u-scores for multivariate data) one can integrate information even if the variables have different scales and unknown interactions or if the events counted are not directly comparable, as long as the variables have an 'orientation'. Using baseball as an example, we discuss how measures based on μ-scores can complement the existing measures for 'performance' (which may depend on the situation) by providing the first multivariate measures for 'ability' (which should be independent of the situation). Recently, μ-scores have been extended to situations where count variables are graded by importance or relevance, such as medals in the Olympics (Wittkowski 2003) or Tour-de-France jerseys (Cherchye and Vermeulen 2006, 2007). Here, we present extensions to 'censored' variables (life-time achievements of active athletes), penalties (counting a win more than two ties) and hierarchically structured variables (Nordic, alpine, outdoor, and indoor Olympic events). The methods presented are not restricted to sports. Other applications of the method include medicine (adverse events), finance (risk analysis), social choice theory (voting), and economy (long-term profit).
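The idea of scoring subjects from a partial ordering rather than from weighted sums can be sketched in a few lines of Python. The definition used here is an assumption made for illustration: one subject is counted as superior to another if it is at least as good on every variable and strictly better on at least one, and the score is the number of inferior subjects minus the number of superior subjects. The function names and the example data are hypothetical, and all variables are oriented so that higher is better.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    at_least_as_good = all(ai >= bi for ai, bi in zip(a, b))
    strictly_better = any(ai > bi for ai, bi in zip(a, b))
    return at_least_as_good and strictly_better

def mu_scores(rows):
    """Score each row as (#rows it dominates) - (#rows dominating it)."""
    scores = []
    for a in rows:
        inferior = sum(dominates(a, b) for b in rows)
        superior = sum(dominates(b, a) for b in rows)
        scores.append(inferior - superior)
    return scores

# Hypothetical athletes measured on two variables with different scales.
players = [(3, 10), (2, 12), (1, 5), (3, 12)]
scores = mu_scores(players)
```

Pairs that cannot be ordered, such as the first two athletes here, contribute nothing to either score, which is why no relative weights for the variables are needed.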
TMVA (Toolkit for Multivariate Analysis) new architectures design and implementation.
Zapata Mesa, Omar Andres
2016-01-01
Toolkit for Multivariate Analysis (TMVA) is a package in ROOT for machine learning algorithms for classification and regression of the events in the detectors. In TMVA, we are developing new high-level algorithms to perform multivariate analysis, such as cross validation, hyperparameter optimization, and variable importance. Almost all the algorithms are expensive and designed to process a huge amount of data. It is very important to implement the new technologies on parallel computing to reduce the processing times.
Wong, Vivian C.; Steiner, Peter M.; Cook, Thomas D.
2013-01-01
In a traditional regression-discontinuity design (RDD), units are assigned to treatment on the basis of a cutoff score and a continuous assignment variable. The treatment effect is measured at a single cutoff location along the assignment variable. This article introduces the multivariate regression-discontinuity design (MRDD), where multiple…
Moment-based estimation of smooth transition regression models with endogenous variables
W.D. Areosa (Waldyr Dutra); M.J. McAleer (Michael); M.C. Medeiros (Marcelo)
2008-01-01
Nonlinear regression models have been widely used in practice for a variety of time series and cross-section datasets. For purposes of analyzing univariate and multivariate time series data, in particular, Smooth Transition Regression (STR) models have been shown to be very useful for re
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
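The linear optimization formulation mentioned above rests on the asymmetric "check" (pinball) loss. The following Python sketch shows only that loss and a brute-force constant fit; it is not the simplex-based or time-adaptive algorithm of the paper, and the function names and data are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau for positive residuals, (1 - tau) for negative."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def constant_quantile_fit(y, tau):
    """Constant minimising the total check loss.

    Some minimiser always lies at a data point, so it is enough to
    search over the observations themselves.
    """
    return min(sorted(y), key=lambda q: sum(pinball_loss(v - q, tau) for v in y))

# Hypothetical sample with one large outlier.
y = [1.0, 2.0, 3.0, 4.0, 100.0]
median = constant_quantile_fit(y, 0.5)
upper = constant_quantile_fit(y, 0.9)
```

Replacing the constant by a linear predictor and splitting each residual into nonnegative positive and negative parts turns the same objective into a linear program, which is the form a simplex method can work on.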
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Hierarchical Control for Smart Grids
DEFF Research Database (Denmark)
Trangbæk, K; Bendtsen, Jan Dimon; Stoustrup, Jakob
2011-01-01
This paper deals with hierarchical model predictive control (MPC) of smart grid systems. The design consists of a high level MPC controller, a second level of so-called aggregators, which reduces the computational and communication-related load on the high-level control, and a lower level...... of autonomous consumers. The control system is tasked with balancing electric power production and consumption within the smart grid, and makes active use of the flexibility of a large number of power producing and/or power consuming units. The objective is to accommodate the load variation on the grid, arising...
Polynomial Regression on Riemannian Manifolds
Hinkle, Jacob; Fletcher, P Thomas; Joshi, Sarang
2012-01-01
In this paper we develop the theory of parametric polynomial regression in Riemannian manifolds and Lie groups. We show application of Riemannian polynomial regression to shape analysis in Kendall shape space. Results are presented, showing the power of polynomial regression on the classic rat skull growth data of Bookstein as well as the analysis of the shape changes associated with aging of the corpus callosum from the OASIS Alzheimer's study.
Evaluating Differential Effects Using Regression Interactions and Regression Mixture Models
Van Horn, M. Lee; Jaki, Thomas; Masyn, Katherine; Howe, George; Feaster, Daniel J.; Lamont, Andrea E.; George, Melissa R. W.; Kim, Minjung
2015-01-01
Research increasingly emphasizes understanding differential effects. This article focuses on understanding regression mixture models, which are relatively new statistical methods for assessing differential effects by comparing results to using an interactive term in linear regression. The research questions which each model answers, their…
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Business applications of multiple regression
Richardson, Ronny
2015-01-01
This second edition of Business Applications of Multiple Regression describes the use of the statistical procedure called multiple regression in business situations, including forecasting and understanding the relationships between variables. The book assumes a basic understanding of statistics but reviews correlation analysis and simple regression to prepare the reader to understand and use multiple regression. The techniques described in the book are illustrated using both Microsoft Excel and a professional statistical program. Along the way, several real-world data sets are analyzed in deta
Multivariate Longitudinal Analysis with Bivariate Correlation Test.
Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory
2016-01-01
In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model's parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated.
Handbook of univariate and multivariate data analysis with IBM SPSS
Ho, Robert
2013-01-01
Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows. New to the Second Edition: Three new chapters on multiple discriminant analysis, logistic regression, and canonical correlation; New section on how to deal with missing data; Coverage of te
Hierarchical Structures in Hypertext Learning Environments
Bezdan, Eniko; Kester, Liesbeth; Kirschner, Paul A.
2011-01-01
Bezdan, E., Kester, L., & Kirschner, P. A. (2011, 9 September). Hierarchical Structures in Hypertext Learning Environments. Presentation for the visit of KU Leuven, Open University, Heerlen, The Netherlands.
Multivariate pluvial flood damage models
Energy Technology Data Exchange (ETDEWEB)
Van Ootegem, Luc [HIVA — University of Louvain (Belgium); SHERPPA — Ghent University (Belgium); Verhofstadt, Elsy [SHERPPA — Ghent University (Belgium); Van Herck, Kristine; Creten, Tom [HIVA — University of Louvain (Belgium)
2015-09-15
Depth–damage-functions, relating the monetary flood damage to the depth of the inundation, are commonly used in the case of fluvial floods (floods caused by a river overflowing). We construct four multivariate damage models for pluvial floods (caused by extreme rainfall) by differentiating on the one hand between ground floor floods and basement floods and on the other hand between damage to residential buildings and damage to housing contents. We do not only take into account the effect of flood-depth on damage, but also incorporate the effects of non-hazard indicators (building characteristics, behavioural indicators and socio-economic variables). By using a Tobit-estimation technique on identified victims of pluvial floods in Flanders (Belgium), we take into account the effect of cases of reported zero damage. Our results show that the flood depth is an important predictor of damage, but with a diverging impact between ground floor floods and basement floods. Also non-hazard indicators are important. For example being aware of the risk just before the water enters the building reduces content damage considerably, underlining the importance of warning systems and policy in this case of pluvial floods. - Highlights: • Prediction of damage of pluvial floods using also non-hazard information • We include ‘no damage cases’ using a Tobit model. • The damage of flood depth is stronger for ground floor than for basement floods. • Non-hazard indicators are especially important for content damage. • Potential gain of policies that increase awareness of flood risks.
Dynamic Organization of Hierarchical Memories.
Kurikawa, Tomoki; Kaneko, Kunihiko
2016-01-01
In the brain, external objects are categorized in a hierarchical way. Although it is widely accepted that objects are represented as static attractors in neural state space, this view does not take into account the interaction between intrinsic neural dynamics and external input, which is essential to understand how the neural system responds to inputs. Indeed, structured spontaneous neural activity without external inputs is known to exist, and its relationship with evoked activities is discussed. Then, how categorical representation is embedded into the spontaneous and evoked activities has to be uncovered. To address this question, we studied the bifurcation process with increasing input after hierarchically clustered associative memories are learned. We found a "dynamic categorization": neural activity without input wanders globally over the state space including all memories. Then, with the increase of input strength, the diffuse representation of a higher category exhibits transitions to focused ones specific to each object. The hierarchy of memories is embedded in the transition probability from one memory to another during the spontaneous dynamics. With increased input strength, neural activity wanders over a narrower state space including a smaller set of memories, showing a more specific category or memory corresponding to the applied input. Moreover, such coarse-to-fine transitions are also observed temporally during the transient process under constant input, which agrees with experimental findings in the temporal cortex. These results suggest that the hierarchy emerging through interaction with an external input underlies the hierarchy during the transient process, as well as in the spontaneous activity.
On the Security of Multivariate Hash Functions
Institute of Scientific and Technical Information of China (English)
LUO Yi-yuan; LAI Xue-jia
2009-01-01
Multivariate hash functions are a type of hash function whose compression function is explicitly defined as a sequence of multivariate equations. Billet et al. designed the hash function MQ-HASH, and Ding et al. proposed a similar construction. In this paper, we analyze the security of multivariate hash functions and conclude that low-degree multivariate functions such as MQ-HASH are neither pseudo-random nor unpredictable. There may be trivial collisions and fixed point attacks if the parameters of the compression function have been chosen. They are also not computation-resistant, which makes MAC forgery easy.
Detrended fluctuation analysis of multivariate time series
Xiong, Hui; Shang, P.
2017-01-01
In this work, we generalize the detrended fluctuation analysis (DFA) to the multivariate case, named multivariate DFA (MVDFA). The validity of the proposed MVDFA is illustrated by numerical simulations on synthetic multivariate processes, where the cases that initial data are generated independently from the same system and from different systems as well as the correlated variate from one system are considered. Moreover, the proposed MVDFA works well when applied to the multi-scale analysis of the returns of stock indices in Chinese and US stock markets. Generally, connections between the multivariate system and the individual variate are uncovered, showing the solid performances of MVDFA and the multi-scale MVDFA.
Multivariate meta-analysis: potential and promise.
Jackson, Dan; Riley, Richard; White, Ian R
2011-09-10
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
Collaborative regression-based anatomical landmark detection
Gao, Yaozong; Shen, Dinggang
2015-12-01
Anatomical landmark detection plays an important role in medical image analysis, e.g. for registration, segmentation and quantitative analysis. Among the various existing methods for landmark detection, regression-based methods have recently attracted much attention due to their robustness and efficiency. In these methods, landmarks are localised through voting from all image voxels, which is completely different from the classification-based methods that use voxel-wise classification to detect landmarks. Despite their robustness, the accuracy of regression-based landmark detection methods is often limited due to (1) the inclusion of uninformative image voxels in the voting procedure, and (2) the lack of effective ways to incorporate inter-landmark spatial dependency into the detection step. In this paper, we propose a collaborative landmark detection framework to address these limitations. The concept of collaboration is reflected in two aspects. (1) Multi-resolution collaboration. A multi-resolution strategy is proposed to hierarchically localise landmarks by gradually excluding uninformative votes from faraway voxels. Moreover, for informative voxels near the landmark, a spherical sampling strategy is also designed at the training stage to improve their prediction accuracy. (2) Inter-landmark collaboration. A confidence-based landmark detection strategy is proposed to improve the detection accuracy of ‘difficult-to-detect’ landmarks by using spatial guidance from ‘easy-to-detect’ landmarks. To evaluate our method, we conducted experiments extensively on three datasets for detecting prostate landmarks and head & neck landmarks in computed tomography images, and also dental landmarks in cone beam computed tomography images. The results show the effectiveness of our collaborative landmark detection framework in improving landmark detection accuracy, compared to other state-of-the-art methods.
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Dynamic system multivariate calibration by system identification methods
Directory of Open Access Journals (Sweden)
Rolf Ergon
1998-04-01
In the first part of the paper, the optimal estimator for normally nonmeasured primary outputs from a linear and time invariant dynamic system is developed. The estimator is based on an underlying Kalman filter, utilizing all available information in known inputs and measured secondary outputs. Assuming sufficient experimental data, the optimal estimator can be identified by specifying an output error model in a standard prediction error identification method. It is further shown that static estimators found by the ordinary least squares method or multivariate calibration by means of principal component regression (PCR) or partial least squares regression (PLSR) can be seen as special cases of the optimal dynamic estimator. Finally, it is shown that dynamic system PCR and PLSR solutions can be developed as special cases of the general estimator for dynamic systems.
First Look at Photometric Reduction via Mixed-Model Regression (Poster abstract)
Dose, E.
2016-12-01
(Abstract only) Mixed-model regression is proposed as a new approach to photometric reduction, especially for variable-star photometry in several filters. Mixed-model regression adds to normal multivariate regression certain "random effects": categorical-variable terms that model and extract specific systematic errors such as image-to-image zero-point fluctuations (cirrus effect) or even errors in comp-star catalog magnitudes.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights.
Regression Testing Cost Reduction Suite
Directory of Open Access Journals (Sweden)
Mohamed Alaa El-Din
2014-08-01
The estimated cost of software maintenance exceeds 70 percent of total software costs [1], and a large portion of these maintenance expenses is devoted to regression testing. Regression testing is an expensive and frequently executed maintenance activity used to revalidate the modified software. Any reduction in the cost of regression testing would help to reduce the software maintenance cost. Test suites, once developed, are reused and updated frequently as the software evolves. As a result, some test cases in the test suite may become redundant when the software is modified over time, since the requirements covered by them are also covered by other test cases. Due to the resource and time constraints for re-executing large test suites, it is important to develop techniques to minimize available test suites by removing redundant test cases. In general, the test suite minimization problem is NP-complete. This paper focuses on proposing an effective approach for reducing the cost of the regression testing process. The proposed approach is applied to a real-time case study. It was found that the reduction in the cost of regression testing for each regression testing cycle is considerably greater for programs containing a high number of selected statements, which in turn maximizes the benefits of using the approach in regression testing of complex software systems. The reduction in the regression test suite size will reduce the effort and time required by the testing teams to execute the regression test suite. Since regression testing is done more frequently in the software maintenance phase, the overall software maintenance cost can be reduced considerably by applying the proposed approach.
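Because test suite minimization is NP-complete in general, heuristics are commonly used. The sketch below shows a standard greedy set-cover heuristic, which is not necessarily the approach proposed in the paper; the function name, the test names and the requirement labels are hypothetical.

```python
def minimize_test_suite(coverage):
    """Greedily pick tests until every covered requirement stays covered.

    coverage: dict mapping a test name to the set of requirements it covers.
    Returns the selected test names in the order they were picked.
    """
    uncovered = set().union(*coverage.values())
    selected = []
    while uncovered:
        # Pick the test covering the most still-uncovered requirements;
        # sorting the names first makes tie-breaking deterministic.
        best = max(sorted(coverage), key=lambda t: len(coverage[t] & uncovered))
        selected.append(best)
        uncovered -= coverage[best]
    return selected


# Hypothetical suite: t1 and t2 are redundant because t3 covers r1, r2 and r3.
coverage = {
    "t1": {"r1", "r2"},
    "t2": {"r2", "r3"},
    "t3": {"r1", "r2", "r3"},
    "t4": {"r4"},
}
reduced = minimize_test_suite(coverage)
```

The greedy choice does not guarantee a minimum-size suite, but it preserves requirement coverage while dropping tests whose requirements are covered elsewhere.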
Analysis of stability of community structure across multiple hierarchical levels
Li, Hui-Jia
2015-01-01
The analysis of the stability of community structure is an important problem for scientists from many fields. Here, we propose a new framework to reveal hidden properties of community structure by quantitatively analyzing the dynamics of the Potts model. Specifically, we model the Potts procedure of community structure detection by a Markov process, which has a clear mathematical explanation. Critical topological information regarding the multivariate spin configuration can also be inferred from the spectral significance of the Markov process. We test our framework on some example networks and find that it does not have the resolution limit problem at all. The results show that the proposed model is able to uncover hierarchical structure at different scales effectively and efficiently.
Multivariate return periods of sea storms for coastal erosion risk assessment
Directory of Open Access Journals (Sweden)
S. Corbella
2012-08-01
The erosion of a beach depends on various storm characteristics. Ideally, the risk associated with a storm would be described by a single multivariate return period that is also representative of the erosion risk, i.e. a 100 yr multivariate storm return period would cause a 100 yr erosion return period. Unfortunately, a specific probability level may be associated with numerous combinations of storm characteristics. These combinations, despite having the same multivariate probability, may cause very different erosion outcomes. This paper explores this ambiguity problem in the context of copula-based multivariate return periods, using a case study at Durban on the east coast of South Africa. Simulations were used to correlate multivariate return periods of historical events to return periods of estimated storm-induced erosion volumes. In addition, the relationship of the most-likely design event (Salvadori et al., 2011) to coastal erosion was investigated. It was found that the multivariate return periods for wave height and duration had the highest correlation to erosion return periods. The most-likely design event was found to be an inadequate design method in its current form. We explore the inclusion of conditions based on the physical realizability of wave events and the use of multivariate linear regression to relate storm parameters to erosion computed from a process-based model. Establishing a link between storm statistics and erosion consequences can resolve the ambiguity between multivariate storm return periods and associated erosion return periods.
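The ambiguity described in this abstract can be seen in a toy calculation. The sketch below assumes the common definition of a joint "AND" return period as the mean time between events divided by the probability that both variables exceed their thresholds, and it uses an empirical frequency in place of the copula model of the paper. The storm values and the function name are hypothetical.

```python
def and_return_period(events, height, duration, mean_interarrival_years):
    """Empirical joint ("AND") return period in years.

    events: list of (wave_height, storm_duration) pairs, one per storm.
    The exceedance probability is the fraction of storms in which BOTH
    values exceed their thresholds (empirical, not copula-based).
    """
    exceed = sum(1 for h, d in events if h > height and d > duration)
    if exceed == 0:
        return float("inf")
    probability = exceed / len(events)
    return mean_interarrival_years / probability


# Hypothetical record: 8 storms in 4 years, i.e. one storm every 0.5 years.
storms = [(2.0, 10), (3.5, 20), (4.0, 15), (5.0, 30),
          (2.5, 40), (6.0, 12), (3.0, 25), (4.5, 35)]

# Two different threshold pairs (height in m, duration in h).
t_long_storms = and_return_period(storms, 4.0, 25, 0.5)
t_high_waves = and_return_period(storms, 4.9, 11, 0.5)
```

Both threshold pairs give the same return period here even though one describes long storms and the other high waves, so a single multivariate return period does not identify a unique storm, and hence not a unique erosion outcome.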
Discovering hierarchical structure in normal relational data
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Herlau, Tue; Mørup, Morten
2014-01-01
Hierarchical clustering is a widely used tool for structuring and visualizing complex data using similarity. Traditionally, hierarchical clustering is based on local heuristics that do not explicitly provide assessment of the statistical saliency of the extracted hierarchy. We propose a non-param...
Discursive Hierarchical Patterning in Economics Cases
Lung, Jane
2011-01-01
This paper attempts to apply Lung's (2008) model of the discursive hierarchical patterning of cases to a closer and more specific study of Economics cases and proposes a model of the distinct discursive hierarchical patterning of the same. It examines a corpus of 150 Economics cases with a view to uncovering the patterns of discourse construction.…