Ting, Hui-Min; Chang, Liyun; Huang, Yu-Jie; Wu, Jia-Ming; Wang, Hung-Yu; Horng, Mong-Fong; Chang, Chun-Ming; Lan, Jen-Hong; Huang, Ya-Yu; Fang, Fu-Min; Leung, Stephen Wan
2014-01-01
Purpose: The aim of this study was to develop a multivariate logistic regression model with least absolute shrinkage and selection operator (LASSO) to make valid predictions about the incidence of moderate-to-severe patient-rated xerostomia among head and neck cancer (HNC) patients treated with IMRT. Methods and Materials: Quality of life questionnaire datasets from 206 patients with HNC were analyzed. The European Organization for Research and Treatment of Cancer QLQ-H&N35 and QLQ-C30 questionnaires were used as the endpoint evaluation. The primary endpoint (grade 3+ xerostomia) was defined as moderate-to-severe xerostomia at 3 (XER3m) and 12 months (XER12m) after the completion of IMRT. Normal tissue complication probability (NTCP) models were developed. The optimal and suboptimal numbers of prognostic factors for a multivariate logistic regression model were determined using the LASSO with bootstrapping technique. Statistical analysis was performed using the scaled Brier score, Nagelkerke R², chi-squared test, Omnibus, Hosmer-Lemeshow test, and the AUC. Results: Eight prognostic factors were selected by LASSO for the 3-month time point: Dmean-c, Dmean-i, age, financial status, T stage, AJCC stage, smoking, and education. Nine prognostic factors were selected for the 12-month time point: Dmean-i, education, Dmean-c, smoking, T stage, baseline xerostomia, alcohol abuse, family history, and node classification. In the selection of the suboptimal number of prognostic factors by LASSO, three suboptimal prognostic factors were fine-tuned by Hosmer-Lemeshow test and AUC, i.e., Dmean-c, Dmean-i, and age for the 3-month time point. Five suboptimal prognostic factors were also selected for the 12-month time point, i.e., Dmean-i, education, Dmean-c, smoking, and T stage. The overall performance for both time points of the NTCP model in terms of scaled Brier score, Omnibus, and Nagelkerke R² was satisfactory and corresponded well with the expected values. Conclusions:
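One of the performance measures named in this abstract, the scaled Brier score, can be sketched in a few lines. The definition used here (one minus the Brier score divided by the Brier score of a non-informative model that always predicts the outcome prevalence) is a common convention and is an assumption about what the authors computed; the function names and the toy data are hypothetical.

```python
def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def scaled_brier_score(y_true, p_pred):
    """Scale the Brier score so that 1 is perfect and 0 is non-informative.

    Assumed convention: the reference model predicts the observed
    prevalence for every patient, which gives prevalence * (1 - prevalence).
    """
    prevalence = sum(y_true) / len(y_true)
    brier_max = prevalence * (1.0 - prevalence)
    return 1.0 - brier_score(y_true, p_pred) / brier_max


# Hypothetical outcomes (1 = moderate-to-severe xerostomia) and predictions.
outcomes = [0, 0, 1, 1]
perfect = [0.0, 0.0, 1.0, 1.0]
uninformative = [0.5, 0.5, 0.5, 0.5]
```

With these toy inputs the perfect predictions score 1 and the constant prevalence predictions score 0, which is the range the scaling is meant to produce.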
Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C
2018-06-29
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR), is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
We explore Bayesian inference of a multivariate linear regression model using a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such a structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when, in a stepwise regression, a descriptor is included in or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. At different steps of the stepwise regression, this process typically causes several previously used descriptors to be replaced by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes, which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of a greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined, showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Bayesian Estimation of Multivariate Latent Regression Models: Gauss versus Laplace
Culpepper, Steven Andrew; Park, Trevor
2017-01-01
A latent multivariate regression model is developed that employs a generalized asymmetric Laplace (GAL) prior distribution for regression coefficients. The model is designed for high-dimensional applications where an approximate sparsity condition is satisfied, such that many regression coefficients are near zero after accounting for all the model…
A Scalable Local Algorithm for Distributed Multivariate Regression
National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm can be used for distributed...
An Efficient Local Algorithm for Distributed Multivariate Regression
National Aeronautics and Space Administration — This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm is designed for distributed...
Variable Selection in Multivariable Regression Using SAS/IML
Directory of Open Access Journals (Sweden)
Ali A. Al-Subaihi
2002-11-01
This paper introduces a SAS/IML program to select among the multivariate model candidates based on a few well-known multivariate model selection criteria. Stepwise regression and all-possible-regression are considered. The program is user friendly and requires the user to paste or read the data at the beginning of the module, include the names of the dependent and independent variables (the y's and the x's), and then run the module. The program produces the multivariate candidate models based on the following criteria: Forward Selection, Forward Stepwise Regression, Backward Elimination, Mean Square Error, Coefficient of Multiple Determination, Adjusted Coefficient of Multiple Determination, Akaike's Information Criterion, the Corrected Form of Akaike's Information Criterion, Hannan and Quinn Information Criterion, the Corrected Form of Hannan and Quinn (HQc) Information Criterion, Schwarz's Criterion, and Mallows' Cp. The output also includes detailed as well as summarized results.
Multivariate Regression of Liver on Intestine of Mice: A ...
African Journals Online (AJOL)
FIRST LADY
Key Words: Schistosomiasis; Multivariate Regression; Likelihood-Ratio Statistics. Introduction. Schistosomiasis (Bilharziasis) is one of the world's major public health problems for rural and agricultural communities living near slow-moving water in the tropics and subtropics. Schistosomiasis is a disease caused by digenean ...
Asymptotics of Multivariate Regression with Consecutively Added Dependent Variables
Raats, V.M.; van der Genugten, B.B.; Moors, J.J.A.
2004-01-01
We consider multivariate regression where new dependent variables are consecutively added during the experiment (or in time). So, viewed at the end of the experiment, the number of observations decreases with each added variable. The explanatory variables are observed throughout. In a previous paper
Multivariate Local Polynomial Regression with Application to Shenzhen Component Index
Directory of Open Access Journals (Sweden)
Liyun Su
2011-01-01
This study attempts to characterize and predict stock index series in the Shenzhen stock market using the concepts of multivariate local polynomial regression. Based on nonlinearity and chaos of the stock index time series, multivariate local polynomial prediction methods and a univariate local polynomial prediction method, all of which use the concept of phase space reconstruction according to Takens' Theorem, are considered. To fit the stock index series, the single series is changed into a bivariate series. To evaluate the results, the multivariate predictor for bivariate time series based on the multivariate local polynomial model is compared with the univariate predictor on the same Shenzhen stock index data. The numerical results obtained for the Shenzhen component index show that the prediction mean squared error of the multivariate predictor is much smaller than that of the univariate one and is much better than that of the three existing methods. Even if the last half of the training data are used in the multivariate predictor, the prediction mean squared error is smaller than that of the univariate predictor. The multivariate local polynomial prediction model for nonsingle time series is a useful tool for stock market price prediction.
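The phase space reconstruction mentioned in this abstract is usually carried out by time-delay embedding. The following is a minimal sketch of that step only, assuming a scalar series and hypothetical parameter names `dim` (embedding dimension) and `lag` (delay); the abstract does not state which values the authors used, and the local polynomial fit itself is not shown.

```python
def delay_embed(series, dim, lag):
    """Build delay vectors [x(t), x(t + lag), ..., x(t + (dim - 1) * lag)].

    Each returned vector is one point of the reconstructed phase space.
    """
    n_vectors = len(series) - (dim - 1) * lag
    if n_vectors <= 0:
        raise ValueError("series is too short for this dim and lag")
    return [
        [series[start + j * lag] for j in range(dim)]
        for start in range(n_vectors)
    ]


# Hypothetical toy series standing in for an index time series.
toy_series = [1, 2, 3, 4, 5]
embedded = delay_embed(toy_series, dim=3, lag=1)
```

A predictor would then regress the next value of the series on these delay vectors, using neighbours of the current point in the reconstructed space.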
Keithley, Richard B; Heien, Michael L; Wightman, R Mark
2009-10-01
Data analysis is an essential tenet of analytical chemistry, extending the possible information obtained from the measurement of chemical phenomena. Chemometric methods have grown considerably in recent years, but their wide use is hindered because some still consider them too complicated. The purpose of this review is to describe a multivariate chemometric method, principal component regression, in a simple manner from the point of view of an analytical chemist, to demonstrate the need for proper quality-control (QC) measures in multivariate analysis and to advocate the use of residuals as a proper QC method.
Regularized multivariate regression models with skew-t error distributions
Chen, Lianfu
2014-06-01
We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.
Preference learning with evolutionary Multivariate Adaptive Regression Spline model
DEFF Research Database (Denmark)
Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll
2015-01-01
This paper introduces a novel approach for pairwise preference learning through combining an evolutionary method with Multivariate Adaptive Regression Spline (MARS). Collecting users' feedback through pairwise preferences is recommended over other ranking approaches as this method is more appealing...... for function approximation as well as being relatively easy to interpret. MARS models are evolved based on their efficiency in learning pairwise data. The method is tested on two datasets that collectively provide pairwise preference data of five cognitive states expressed by users. The method is analysed...
REGSTEP - stepwise multivariate polynomial regression with singular extensions
International Nuclear Information System (INIS)
Davierwalla, D.M.
1977-09-01
The program REGSTEP determines a polynomial approximation, in the least squares sense, to tabulated data. The polynomial may be univariate or multivariate. The computational method is that of stepwise regression. A variable is inserted into the regression basis if it is significant with respect to an appropriate F-test at a preselected risk level. In addition, should a variable already in the basis become nonsignificant (again with respect to an appropriate F-test) after the entry of a new variable, it is expelled from the model. Thus only significant variables are retained in the model. Although written expressly to be incorporated into CORCOD, a code for predicting nuclear cross sections for given values of power, temperature, void fractions, boron content, etc., there is nothing to limit the use of REGSTEP to nuclear applications, as the examples demonstrate. A separate version has been incorporated into RSYST for the general user. (Auth.)
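The enter-and-expel logic described in this abstract can be sketched as follows. This is an illustrative reconstruction, not the REGSTEP code: the function names are hypothetical, the critical F values `f_enter` and `f_remove` are passed in directly instead of being derived from a risk level, and the normal equations are solved by plain Gaussian elimination.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of y regressed on an intercept plus columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(design[0])
    xtx = [[sum(r[a] * r[b] for r in design) for b in range(k)] for a in range(k)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def stepwise(candidates, y, f_enter, f_remove, max_steps=100):
    """Forward stepwise selection with removal, driven by partial F statistics."""
    n, selected = len(y), []
    cols = lambda names: [candidates[v] for v in names]
    for _ in range(max_steps):
        changed = False
        # Entry: add the candidate with the largest partial F above f_enter.
        rss_cur, best, best_f = rss(cols(selected), y), None, f_enter
        for name in candidates:
            if name in selected:
                continue
            trial = selected + [name]
            rss_new = rss(cols(trial), y)
            f = (rss_cur - rss_new) / (rss_new / (n - len(trial) - 1))
            if f > best_f:
                best, best_f = name, f
        if best is not None:
            selected.append(best)
            changed = True
        # Removal: expel any variable whose partial F fell below f_remove.
        for name in list(selected):
            full = rss(cols(selected), y)
            reduced = rss(cols([v for v in selected if v != name]), y)
            f = (reduced - full) / (full / (n - len(selected) - 1))
            if f < f_remove:
                selected.remove(name)
                changed = True
        if not changed:
            break
    return selected


# Hypothetical data: y depends on x1 only; x2 is unrelated to y.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [1.0, -1.0, 1.0, -1.0, -1.0, 1.0, -1.0, 1.0]
noise = [0.1, -0.1, -0.1, 0.1, 0.1, -0.1, -0.1, 0.1]
y = [3.0 + 2.0 * a + e for a, e in zip(x1, noise)]
chosen = stepwise({"x1": x1, "x2": x2}, y, f_enter=4.0, f_remove=3.9)
```

Keeping `f_remove` no larger than `f_enter` is the usual guard against a variable being added and expelled in a cycle. The sketch assumes the candidate columns are not collinear and that no fit is exact, since either would cause a division by zero.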
Li, Yanming; Nan, Bin; Zhu, Ji
2015-06-01
We propose a multivariate sparse group lasso variable selection and estimation method for data with high-dimensional predictors as well as high-dimensional response variables. The method is carried out through a penalized multivariate multiple linear regression model with an arbitrary group structure for the regression coefficient matrix. It suits many biology studies well in detecting associations between multiple traits and multiple predictors, with each trait and each predictor embedded in some biological functional groups such as genes, pathways or brain regions. The method is able to effectively remove unimportant groups as well as unimportant individual coefficients within important groups, particularly for large p small n problems, and is flexible in handling various complex group structures such as overlapping or nested or multilevel hierarchical structures. The method is evaluated through extensive simulations with comparisons to the conventional lasso and group lasso methods, and is applied to an eQTL association study. © 2015, The International Biometric Society.
Collision prediction models using multivariate Poisson-lognormal regression.
El-Basyouny, Karim; Sayed, Tarek
2009-07-01
This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with the independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform, which facilitates computation of posterior distributions and provides a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN, leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a better fit than the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis was restricted to the univariate models.
Multivariate Frequency-Severity Regression Models in Insurance
Directory of Open Access Journals (Sweden)
Edward W. Frees
2016-02-01
In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i) property; (ii) motor vehicle; and (iii) contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.
Multivariate study and regression analysis of gluten-free granola
Directory of Open Access Journals (Sweden)
Lilian Maria Pagamunici
2014-03-01
This study developed a gluten-free granola and evaluated it during storage with the application of multivariate and regression analysis of the sensory and instrumental parameters. The physicochemical, sensory, and nutritional characteristics of a product containing quinoa, amaranth and linseed were evaluated. The crude protein and lipid contents were 97.49 and 122.72 g kg⁻¹ of food, respectively. The polyunsaturated/saturated and n-6:n-3 fatty acid ratios were 2.82 and 2.59:1, respectively. Granola had the best alpha-linolenic acid content, nutritional indices in the lipid fraction, and mineral content. There were good hygienic and sanitary conditions during storage, probably due to the low water activity of the formulation, which contributed to inhibiting microbial growth. The sensory attributes ranged from 'like very much' to 'like slightly', and the regression models were highly fitted and correlated during the storage period. A reduction in the sensory attribute levels and in the product physical stabilisation was verified by principal component analysis. The use of the affective acceptance test and instrumental analysis combined with statistical methods allowed us to obtain promising results about the characteristics of gluten-free granola.
Real estate value prediction using multivariate regression models
Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav
2017-11-01
The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly with many factors; this makes it a prime field in which to apply machine learning to optimize and predict prices with high accuracy. In this paper, we therefore present various important features to use when predicting housing prices with good accuracy. We describe regression models that use various features to achieve a lower residual sum of squares error. When features are used in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used to obtain a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
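The shrinkage effect of ridge regression mentioned in this abstract can be illustrated with a minimal single-feature sketch. The function name `ridge_fit_1d`, the penalty symbol `lam`, and the toy data are assumptions for illustration; the paper's own models use many features and are not reproduced here. The intercept is left unpenalized, which is the usual convention.

```python
def ridge_fit_1d(xs, ys, lam):
    """Fit y = intercept + slope * x, penalizing slope**2 with weight lam.

    For one centered predictor the ridge slope has the closed form
    Sxy / (Sxx + lam); lam = 0 recovers ordinary least squares.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data lying exactly on y = 2x.
sizes = [1.0, 2.0, 3.0, 4.0, 5.0]
prices = [2.0, 4.0, 6.0, 8.0, 10.0]
ols_fit = ridge_fit_1d(sizes, prices, lam=0.0)
ridge_fit = ridge_fit_1d(sizes, prices, lam=10.0)
```

Increasing `lam` pulls the slope toward zero, which is how the penalty limits the variance of high-degree polynomial terms at the cost of some bias.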
Optimization of ridge parameters in multivariate generalized ridge regression by plug-in methods
Nagai, Isamu; Yanagihara, Hirokazu; Satoh, Kenichi
2012-01-01
Generalized ridge (GR) regression for a univariate linear model was proposed simultaneously with ridge regression by Hoerl and Kennard (1970). In this paper, we deal with a GR regression for a multivariate linear model, referred to as a multivariate GR (MGR) regression. From the viewpoint of reducing the mean squared error (MSE) of a predicted value, many authors have proposed several GR estimators consisting of ridge parameters optimized by non-iterative methods. By expanding...
Nonparametric Regression Estimation for Multivariate Null Recurrent Processes
Directory of Open Access Journals (Sweden)
Biqing Cai
2015-04-01
This paper discusses nonparametric kernel regression with the regressor being a \(d\)-dimensional \(\beta\)-null recurrent process in the presence of conditional heteroscedasticity. We show that the mean function estimator is consistent with convergence rate \(\sqrt{n(T)h^{d}}\), where \(n(T)\) is the number of regenerations for a \(\beta\)-null recurrent process, and that the limiting distribution (with proper normalization) is normal. Furthermore, we show that the two-step estimator for the volatility function is consistent. The finite sample performance of the estimate is quite reasonable when the leave-one-out cross validation method is used for bandwidth selection. We apply the proposed method to study the relationship of the Federal funds rate with 3-month and 5-year T-bill rates and discover the existence of nonlinearity in the relationship. Furthermore, the in-sample and out-of-sample performance of the nonparametric model is far better than that of the linear model.
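Kernel regression with leave-one-out bandwidth selection, as referred to in this abstract, can be sketched for the simplest case. The sketch below assumes a scalar regressor, a Gaussian kernel, and a Nadaraya-Watson (local constant) estimator, none of which is confirmed by the abstract; the function names and the bandwidth grid are hypothetical.

```python
import math


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate of E[y | x = x0] with a Gaussian kernel.

    The observation at index `skip`, if given, is left out of the average.
    """
    num = 0.0
    den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        weight = math.exp(-0.5 * ((x0 - x) / h) ** 2)
        num += weight * y
        den += weight
    return num / den


def loo_cv_error(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    errors = [
        (ys[i] - nw_estimate(xs[i], xs, ys, h, skip=i)) ** 2
        for i in range(len(xs))
    ]
    return sum(errors) / len(errors)


def select_bandwidth(xs, ys, grid):
    """Pick the bandwidth in grid with the smallest leave-one-out error."""
    return min(grid, key=lambda h: loo_cv_error(xs, ys, h))


# Hypothetical toy data on an evenly spaced design.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [math.sin(x) for x in xs]
grid = [0.3, 1.0, 3.0]
h_best = select_bandwidth(xs, ys, grid)
```

Very small bandwidths can make every kernel weight underflow to zero, so a practical grid should stay at or above the typical spacing of the observations.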
Ultracentrifuge separative power modeling with multivariate regression using covariance matrix
International Nuclear Information System (INIS)
Migliavacca, Elder
2004-01-01
In this work, the least-squares methodology with covariance matrix is applied to fit a curve to the data and obtain a performance function for the separative power δU of an ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 460 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related to these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process variables that significantly influence the δU values are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow rate F, cut θ and product line pressure P_p. After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning its randomness and independence, and mainly the existence of residual heteroscedasticity with any explanatory variable of the regression model. Surface curves relating the separative power to the control variables F, θ and P_p are drawn to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)
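For orientation, least squares with a data covariance matrix is commonly written as the generalized least-squares problem below. The notation is an assumption, not taken from the work itself: \(y\) collects the experimental δU values, \(X\) holds the model terms built from F, θ and P_p, \(V\) is the experimental covariance matrix of \(y\), and \(\beta\) is the vector of fitted coefficients.

```latex
\hat{\beta}
  = \arg\min_{\beta}\,(y - X\beta)^{\mathsf T} V^{-1} (y - X\beta)
  = \left(X^{\mathsf T} V^{-1} X\right)^{-1} X^{\mathsf T} V^{-1} y,
\qquad
\operatorname{Cov}(\hat{\beta}) = \left(X^{\mathsf T} V^{-1} X\right)^{-1}.
```

When \(V\) is diagonal this reduces to ordinary weighted least squares, with each experiment weighted by the inverse of its variance.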
Khoshravesh, Mojtaba; Sefidkouhi, Mohammad Ali Gholami; Valipour, Mohammad
2017-07-01
The proper evaluation of evapotranspiration is essential in food security investigation, farm management, pollution detection, irrigation scheduling, nutrient flows, carbon balance as well as hydrologic modeling, especially in arid environments. To achieve sustainable development and to ensure water supply, especially in arid environments, irrigation experts need tools to estimate reference evapotranspiration on a large scale. In this study, the monthly reference evapotranspiration was estimated by three different regression models including the multivariate fractional polynomial (MFP), robust regression, and Bayesian regression in Ardestan, Esfahan, and Kashan. The results were compared with Food and Agriculture Organization (FAO)-Penman-Monteith (FAO-PM) to select the best model. The results show that at a monthly scale, all models provided close agreement with the calculated values for FAO-PM (R² > 0.95 and RMSE < 12.07 mm month⁻¹). However, the MFP model gives better estimates than the other two models for estimating reference evapotranspiration at all stations.
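The two agreement statistics quoted in this abstract can be computed as in the following sketch. It assumes that R² means the coefficient of determination, one minus the ratio of residual to total sum of squares; the abstract does not say whether that or the squared correlation was used. The function names and the numbers are hypothetical.

```python
import math


def rmse(reference, estimate):
    """Root mean squared error between reference and estimated values."""
    n = len(reference)
    return math.sqrt(sum((r - e) ** 2 for r, e in zip(reference, estimate)) / n)


def r_squared(reference, estimate):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


# Hypothetical monthly values in mm per month; not data from the study.
fao_pm = [1.0, 2.0, 3.0]
exact_model = [1.0, 2.0, 3.0]
mean_model = [2.0, 2.0, 2.0]
```

A model that reproduces the reference exactly has RMSE 0 and R² 1, while a model that always returns the reference mean has R² 0.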
Directory of Open Access Journals (Sweden)
Lassi Rieppo
Fourier Transform Infrared (FT-IR) spectroscopic imaging has been earlier applied for the spatial estimation of the collagen and the proteoglycan (PG) contents of articular cartilage (AC). However, earlier studies have been limited to the use of univariate analysis techniques. Current analysis methods lack the needed specificity for collagen and PGs. The aim of the present study was to evaluate the suitability of partial least squares regression (PLSR) and principal component regression (PCR) methods for the analysis of the PG content of AC. Multivariate regression models were compared with earlier used univariate methods and tested with a sample material consisting of healthy and enzymatically degraded steer AC. Chondroitinase ABC enzyme was used to increase the variation in PG content levels as compared to intact AC. Digital densitometric measurements of Safranin O-stained sections provided the reference for PG content. The results showed that multivariate regression models predict PG content of AC significantly better than earlier used absorbance spectrum (i.e. the area of the carbohydrate region with or without amide I normalization) or second derivative spectrum univariate parameters. Increased molecular specificity favours the use of multivariate regression models, but they require more knowledge of chemometric analysis and extended laboratory resources for gathering reference data for establishing the models. When true molecular specificity is required, the multivariate models should be used.
Depth-weighted robust multivariate regression with application to sparse data
Dutta, Subhajit
2017-04-05
A robust method for multivariate regression is developed based on robust estimators of the joint location and scatter matrix of the explanatory and response variables using the notion of data depth. The multivariate regression estimator possesses desirable affine equivariance properties, achieves the best breakdown point of any affine equivariant estimator, and has an influence function which is bounded in both the response as well as the predictor variable. To increase the efficiency of this estimator, a re-weighted estimator based on robust Mahalanobis distances of the residual vectors is proposed. In practice, the method is more stable than existing methods that are constructed using subsamples of the data. The resulting multivariate regression technique is computationally feasible, and turns out to perform better than several popular robust multivariate regression methods when applied to various simulated data as well as a real benchmark data set. When the data dimension is quite high compared to the sample size it is still possible to use meaningful notions of data depth along with the corresponding depth values to construct a robust estimator in a sparse setting.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
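The repeated partitioning that this abstract describes rests on one basic step: choosing the split of a predictor that makes the two resulting subgroups as homogeneous as possible. The sketch below shows that step for a regression tree with one continuous predictor, using the within-group sum of squared errors as the criterion; this criterion is the standard choice for regression trees but is an assumption here, and the function names and data are hypothetical.

```python
def sse(values):
    """Sum of squared deviations of values from their mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) of the split on x that minimizes total SSE.

    Candidate thresholds are midpoints between consecutive distinct x values.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold separates identical x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical predictor and continuous outcome with two clear subgroups.
predictor = [1, 2, 3, 10, 11, 12]
outcome = [5, 5, 5, 20, 20, 20]
split = best_split(predictor, outcome)
```

A full tree applies the same search to every predictor, keeps the best split overall, and then repeats the procedure inside each subgroup until a stopping rule is met.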
Jakubowski, J.; Stypulkowski, J. B.; Bernardeau, F. G.
2017-12-01
The first phase of the Abu Hamour drainage and storm tunnel was completed in early 2017. The 9.5 km long, 3.7 m diameter tunnel was excavated with two Earth Pressure Balance (EPB) Tunnel Boring Machines from Herrenknecht. TBM operation processes were monitored and recorded by a Data Acquisition and Evaluation System. The authors coupled the collected TBM drive data with available information on rock mass properties; the data were then cleansed, completed with secondary variables, and aggregated by weeks and shifts. Correlations and descriptive statistics charts were examined. Multivariate Linear Regression and CART regression tree models linking TBM penetration rate (PR), penetration per revolution (PPR) and field penetration index (FPI) with TBM operational and geotechnical characteristics were built for the conditions of the weak/soft rock of Doha. Both regression methods are interpretable, and the data were screened with different computational approaches, allowing enriched insight. The primary goal of the analysis was to investigate empirical relations between multiple explanatory and response variables, to search for the best subsets of explanatory variables and to evaluate the strength of linear and non-linear relations. For each of the penetration indices, a predictive model coupling both regression methods was built and validated. The resultant models appeared to be stronger than the constituent ones and indicated an opportunity for more accurate and robust TBM performance predictions.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables; 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
Multivariate nonparametric regression and visualization with R and applications to finance
Klemelä, Jussi
2014-01-01
A modern approach to statistical learning and its applications through visualization methods. With a unique and innovative presentation, Multivariate Nonparametric Regression and Visualization provides readers with the core statistical concepts to obtain complete and accurate predictions when given a set of data. Focusing on nonparametric methods to adapt to the multiple types of data generating mechanisms, the book begins with an overview of classification and regression. The book then introduces and examines various tested and proven visualization techniques for learning samples and functio
Dynamic prediction of cumulative incidence functions by direct binomial regression.
Grand, Mia K; de Witte, Theo J M; Putter, Hein
2018-03-25
In recent years there have been a series of advances in the field of dynamic prediction. Among those is the development of methods for dynamic prediction of the cumulative incidence function in a competing risk setting. These models enable the predictions to be updated as time progresses and more information becomes available, for example when a patient comes back for a follow-up visit after completing a year of treatment, the risk of death, and adverse events may have changed since treatment initiation. One approach to model the cumulative incidence function in competing risks is by direct binomial regression, where right censoring of the event times is handled by inverse probability of censoring weights. We extend the approach by combining it with landmarking to enable dynamic prediction of the cumulative incidence function. The proposed models are very flexible, as they allow the covariates to have complex time-varying effects, and we illustrate how to investigate possible time-varying structures using Wald tests. The models are fitted using generalized estimating equations. The method is applied to bone marrow transplant data and the performance is investigated in a simulation study. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Over the past decade, computational models such as multivariate linear regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and facial, vocal, gestural, physiological, and central nervous signals. Among these, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, while nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures from the International Affective Picture System to induce affective states in thirty subjects, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for the estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain's motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
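A higher-order multivariable polynomial model is linear in its coefficients once the inputs are expanded into monomials. The sketch below shows only that expansion and the resulting prediction; the function names, the input values, and the coefficients are hypothetical, and the abstract does not state how the authors fitted their model.

```python
from itertools import combinations_with_replacement


def polynomial_terms(x, degree):
    """Expand input vector x into all monomials up to `degree`,
    starting with the constant term 1."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            value = 1.0
            for idx in combo:
                value *= x[idx]
            terms.append(value)
    return terms


def predict(x, coefficients, degree):
    """Evaluate the polynomial model: one coefficient per monomial."""
    terms = polynomial_terms(x, degree)
    if len(terms) != len(coefficients):
        raise ValueError("expected %d coefficients" % len(terms))
    return sum(c * t for c, t in zip(coefficients, terms))


# Two hypothetical inputs, second-order model.
# Term order: 1, x0, x1, x0*x0, x0*x1, x1*x1
terms = polynomial_terms([2.0, 3.0], 2)
estimate = predict([2.0, 3.0], [1.0, 0.0, 0.0, 0.0, 1.0, 0.0], 2)
print(terms, estimate)
```

Because the expanded model is linear in the coefficients, they could be estimated with ordinary least squares on the expanded terms; the number of terms grows quickly with the degree and the number of inputs.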
Directory of Open Access Journals (Sweden)
Soyoung Park
2017-07-01
This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
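The AUC values reported here summarize how well a model's scores rank positive locations above negative ones. As a minimal sketch, the AUC of a receiver operating characteristic curve can be computed as the fraction of positive/negative pairs that are ranked correctly, counting ties as half. The function name `roc_auc`, the labels, and the scores below are hypothetical and are not the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical validation set: 1 = productive well, 0 = not productive.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.65]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of cases; for data sets of the size described in the abstract a rank-based formula gives the same value more cheaply.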
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...
Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza
2014-10-01
The aim of this research work is to build a regression model of the particulate matter up to 10 micrometers in size (PM10) by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. This research work explores the use of a nonparametric regression algorithm known as multivariate adaptive regression splines (MARS) which has the ability to approximate the relationship between the inputs and outputs, and express the relationship mathematically. In this sense, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, the experimental dataset of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and they are used to create a highly nonlinear model of the PM10 in the Oviedo urban nucleus (Northern Spain) based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence between PM10 pollutant in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establishes the limit values of the main pollutants in the atmosphere in order to ensure the health of healthy people. Firstly, this MARS regression model captures the main perception of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of
Predicting Cumulative Incidence Probability by Direct Binomial Regression
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard
A refined method for multivariate meta-analysis and meta-regression
Jackson, Daniel; Riley, Richard D
2014-01-01
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
Multivariate linear regression of high-dimensional fMRI data with multiple target variables.
Valente, Giancarlo; Castellanos, Agustin Lage; Vanacore, Gianluca; Formisano, Elia
2014-05-01
Multivariate regression is increasingly used to study the relation between fMRI spatial activation patterns and experimental stimuli or behavioral ratings. With linear models, informative brain locations are identified by mapping the model coefficients. This is a central aspect in neuroimaging, as it provides the sought-after link between the activity of neuronal populations and subject's perception, cognition or behavior. Here, we show that mapping of informative brain locations using multivariate linear regression (MLR) may lead to incorrect conclusions and interpretations. MLR algorithms for high dimensional data are designed to deal with targets (stimuli or behavioral ratings, in fMRI) separately, and the predictive map of a model integrates information deriving from both neural activity patterns and experimental design. Not accounting explicitly for the presence of other targets whose associated activity spatially overlaps with the one of interest may lead to predictive maps of troublesome interpretation. We propose a new model that can correctly identify the spatial patterns associated with a target while achieving good generalization. For each target, the training is based on an augmented dataset, which includes all remaining targets. The estimation on such datasets produces both maps and interaction coefficients, which are then used to generalize. The proposed formulation is independent of the regression algorithm employed. We validate this model on simulated fMRI data and on a publicly available dataset. Results indicate that our method achieves high spatial sensitivity and good generalization and that it helps disentangle specific neural effects from interaction with predictive maps associated with other targets. Copyright © 2013 Wiley Periodicals, Inc.
Van der Elst, Wim; Molenberghs, Geert; van Tetering, Marleen; Jolles, Jelle
Multi-trial memory tests are widely used in research and clinical practice because they allow for assessing different aspects of memory and learning in a single comprehensive test procedure. However, the use of multi-trial memory tests also raises some key data analysis issues. Indeed, the different trial scores are typically all correlated, and this correlation has to be properly accounted for in the statistical analyses. In the present paper, the focus is on the setting where normative data have to be established for multi-trial memory tests. At present, normative data for such tests are typically based on a series of univariate analyses, i.e. a statistical model is fitted for each of the test scores separately. This approach is suboptimal because (1) the correlated nature of the data is not accounted for, (2) multiple testing issues may arise, and (3) the analysis is not parsimonious. Here, a normative approach that is not hampered by these issues is proposed (the so-called multivariate regression-based approach). The methodology is exemplified in a sample of N = 221 Dutch-speaking children (aged between 5.82 and 15.49 years) who were administered Rey's Auditory Verbal Learning Test. An online Appendix that details how the analyses can be conducted in practice (using the R software) is also provided. The multivariate normative regression-based approach has some substantial methodological advantages over univariate regression-based methods. In addition, the method allows for testing substantive hypotheses that cannot be addressed in a univariate framework (e.g. trial by covariate interactions can be modeled).
Nieto, P J García; Antón, J C Álvarez; Vilán, J A Vilán; García-Gonzalo, E
2015-05-01
The aim of this research work is to build a regression model of air quality by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (northern Spain) at a local scale. To accomplish the objective of this study, the experimental data set made up of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), and dust (PM10) was collected over 3 years (2006-2008). The US National Ambient Air Quality Standards (NAAQS) establish the limit values of the main pollutants in the atmosphere in order to protect public health. Firstly, this MARS regression model captures the main ideas of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations, using the MARS technique, the conclusions of this research work are presented.
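The easy-to-interpret form of a MARS model comes from its building blocks, hinge functions of the form max(0, x - t) and max(0, t - x) placed at a knot t. The sketch below evaluates a one-predictor additive model built from such hinges. The function names, knot, intercept, and coefficients are hypothetical; a real MARS fit also selects the knots, prunes terms, and may multiply hinges together to form interactions.

```python
def hinge(x, knot, direction):
    """direction=+1 gives max(0, x - knot); direction=-1 gives max(0, knot - x)."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive model; terms is a list of
    (coefficient, knot, direction) tuples."""
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model with one knot at x = 5:
# slope +2 above the knot, slope +1 below it.
terms = [(2.0, 5.0, +1), (-1.0, 5.0, -1)]
above = mars_predict(7.0, 10.0, terms)
below = mars_predict(3.0, 10.0, terms)
at_knot = mars_predict(5.0, 10.0, terms)
print(above, below, at_knot)
```

Each term contributes only on one side of its knot, which is why the contribution of an input variable can be read directly from the fitted coefficients.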
Directory of Open Access Journals (Sweden)
Yoonsu Shin
2016-01-01
In the 5G era, the operational cost of mobile wireless networks will significantly increase. Further, massive network capacity and zero latency will be needed because everything will be connected to mobile networks. Thus, self-organizing networks (SON) are needed, which expedite automatic operation of mobile wireless networks, but they face challenges in satisfying the 5G requirements. Therefore, researchers have proposed a framework to empower SON using big data. The recent framework of a big data-empowered SON analyzes the relationship between key performance indicators (KPIs) and related network parameters (NPs) using machine-learning tools, and it develops regression models using a Gaussian process with those parameters. The problem, however, is that the methods of finding the NPs related to the KPIs differ individually. Moreover, the Gaussian process regression model cannot determine the relationship between a KPI and its various related NPs. In this paper, to solve these problems, we propose multivariate multiple regression models to determine the relationship between various KPIs and NPs. If we assume one KPI and multiple NPs as one set, the proposed models help us process multiple sets at one time. Also, we can find out whether some KPIs are conflicting or not. We implement the proposed models using MapReduce.
Hardy, Krista L; Davis, Kathryn E; Constantine, Ryan S; Chen, Mo; Hein, Rachel; Jewell, James L; Dirisala, Karunakar; Lysikowski, Jerzy; Reed, Gary; Kenkel, Jeffrey M
2014-05-01
Little evidence within plastic surgery literature supports the precept that longer operative times lead to greater morbidity. The authors investigate surgery duration as a determinant of morbidity, with the goal of defining a clinically relevant time for increased risk. A retrospective chart review was conducted of patients who underwent a broad range of complex plastic surgical procedures (n = 1801 procedures) at UT Southwestern Medical Center in Dallas, Texas, from January 1, 2008 to January 31, 2012. Adjusting for possible confounders, multivariate logistic regression assessed surgery duration as an independent predictor of morbidity. To define a cutoff for increased risk, incidence of complications was compared among quintiles of surgery duration. Stratification by type of surgery controlled for procedural complexity. A total of 1753 cases were included in multivariate analyses with an overall complication rate of 27.8%. Most operations were combined (75.8%), averaging 4.9 concurrent procedures. Each hour increase in surgery duration was associated with a 21% rise in odds of morbidity (P surgery (odds ratio, 1.6; P = .017), with progressively greater odds increases of 3.1 times after 4.5 hours (P surgery, longer operations continued to be associated with greater morbidity. Surgery duration is an independent predictor of complications, with a significantly increased risk above 3 hours. Although procedural complexity undoubtedly affects morbidity, operative time should factor into surgical decision making.
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements... -specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies in the Fibrinogen Studies Collaboration to assess the relationship between usual levels of plasma fibrinogen and the risk of coronary heart disease, allowing for measurement error in plasma fibrinogen and several confounders. Publication date: 2009/3/30
On the degrees of freedom of reduced-rank estimators in multivariate regression.
Mukherjee, A; Chen, K; Wang, N; Zhu, J
We study the effective degrees of freedom of a general class of reduced-rank estimators for multivariate regression in the framework of Stein's unbiased risk estimation. A finite-sample exact unbiased estimator is derived that admits a closed-form expression in terms of the thresholded singular values of the least-squares solution and hence is readily computable. The results continue to hold in the high-dimensional setting where both the predictor and the response dimensions may be larger than the sample size. The derived analytical form facilitates the investigation of theoretical properties and provides new insights into the empirical behaviour of the degrees of freedom. In particular, we examine the differences and connections between the proposed estimator and a commonly-used naive estimator. The use of the proposed estimator leads to efficient and accurate prediction risk estimation and model selection, as demonstrated by simulation studies and a data example.
Giacomo, Della Riccia; Stefania, Del Zotto
2013-12-15
Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas the fungi damage plants, fumonisins cause disease in both cattle and human beings. Legal limits set the tolerable daily intake of fumonisins for several maize-based feeds and foods. Chemical techniques assure the most reliable and accurate measurements, but they are expensive and time consuming. A method based on near-infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply Partial Least Squares with full cross validation. Two models are described, having high correlation of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. The description of the observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize with respect to the European legal limit of 4 mg kg⁻¹ should thus be feasible. Copyright © 2013 Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Sørensen, Jens Benn; Badsberg, Jens Henrik; Olsen, Jens
1989-01-01
The prognostic factors for survival in advanced adenocarcinoma of the lung were investigated in a consecutive series of 259 patients treated with chemotherapy. Twenty-eight pretreatment variables were investigated by use of Cox's multivariate regression model, including histological subtypes... status, stage IV disease, no prior nonradical resection, liver metastases, high values of white blood cell count, and lactate dehydrogenase, and low values of aspartate aminotransaminase. The nonradical resection may not be a prognostic factor because of the resection itself but may rather serve as an indicator for patients having minimal disease spread. Liver metastases were of limited clinical value as a prognostic factor because they were detected in only seven cases in this patient population. A new Cox analysis ignoring the influence of this variable revealed no other variables than those occurring...
Wilms, M.; Werner, R.; Ehrhardt, J.; Schmidt-Richberg, A.; Schlemmer, H.-P.; Handels, H.
2014-03-01
Breathing-induced location uncertainties of internal structures are still a relevant issue in the radiation therapy of thoracic and abdominal tumours. Motion compensation approaches like gating or tumour tracking are usually driven by low-dimensional breathing signals, which are acquired in real-time during the treatment. These signals are only surrogates of the internal motion of target structures and organs at risk, and, consequently, appropriate models are needed to establish correspondence between the acquired signals and the sought internal motion patterns. In this work, we present a diffeomorphic framework for correspondence modelling based on the Log-Euclidean framework and multivariate regression. Within the framework, we systematically compare standard and subspace regression approaches (principal component regression, partial least squares, canonical correlation analysis) for different types of common breathing signals (1D: spirometry, abdominal belt, diaphragm tracking; multi-dimensional: skin surface tracking). Experiments are based on 4D CT and 4D MRI data sets and cover intra- and inter-cycle as well as intra- and inter-session motion variations. Only small differences in internal motion estimation accuracy are observed between the 1D surrogates. Increasing the surrogate dimensionality, however, improved the accuracy significantly; this is shown for both 2D signals, which consist of a common 1D signal and its time derivative, and high-dimensional signals containing the motion of many skin surface points. Eventually, comparing the standard and subspace regression variants when applied to the high-dimensional breathing signals, only small differences in terms of motion estimation accuracy are found.
Gong, Xu; Cui, Jianli; Jiang, Ziping; Lu, Laijin; Li, Xiucun
2018-03-01
Few clinical retrospective studies have reported the risk factors of pedicled flap necrosis in hand soft tissue reconstruction. The aim of this study was to identify non-technical risk factors associated with pedicled flap perioperative necrosis in hand soft tissue reconstruction via a multivariate logistic regression analysis. For patients with hand soft tissue reconstruction, we carefully reviewed hospital records and identified 163 patients who met the inclusion criteria. The characteristics of these patients, flap transfer procedures and postoperative complications were recorded. Eleven predictors were identified. The correlations between pedicled flap necrosis and risk factors were analysed using a logistic regression model. Of 163 skin flaps, 125 flaps survived completely without any complications. The pedicled flap necrosis rate in hands was 11.04%, which included partial flap necrosis (7.36%) and total flap necrosis (3.68%). Soft tissue defects in fingers were noted in 68.10% of all cases. The logistic regression analysis indicated that the soft tissue defect site (P = 0.046, odds ratio (OR) = 0.079, confidence interval (CI) (0.006, 0.959)), flap size (P = 0.020, OR = 1.024, CI (1.004, 1.045)) and postoperative wound infection (P < 0.001, OR = 17.407, CI (3.821, 79.303)) were statistically significant risk factors for pedicled flap necrosis of the hand. Soft tissue defect site, flap size and postoperative wound infection were risk factors associated with pedicled flap necrosis in hand soft tissue defect reconstruction. © 2017 Royal Australasian College of Surgeons.
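The odds ratios and confidence intervals reported in this abstract can be mapped back to the logistic regression scale, where a coefficient equals the logarithm of the odds ratio. The sketch below assumes the interval is a 95% Wald interval, symmetric on the log scale; the abstract does not state which interval type was used, and the function name `wald_from_ci` is hypothetical.

```python
import math


def wald_from_ci(lower, upper, z=1.96):
    """Recover the log-odds coefficient and its standard error from the
    limits of a confidence interval, assuming exp(beta +/- z * se)."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Interval reported for postoperative wound infection: (3.821, 79.303).
beta, se = wald_from_ci(3.821, 79.303)
odds_ratio = math.exp(beta)
print(round(beta, 3), round(se, 3), round(odds_ratio, 2))
```

Under that assumption the recovered odds ratio agrees closely with the reported value of 17.407, and the width of the interval reflects a large standard error, as expected when few patients have the risk factor.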
Cannon, Alex
2017-04-01
univariate technique, and cannot incorporate information from additional covariates, for example ENSO state or physiographic controls on extreme rainfall within a region. Here, the univariate MQR model is extended to allow the use of multiple covariates. Multivariate monotone quantile regression (MMQR) is based on a single hidden-layer feedforward network with the quantile regression error function and partial monotonicity constraints. The MMQR model is demonstrated via Monte Carlo simulations and the estimation and visualization of regional trends in moderate rainfall extremes based on homogenized sub-daily precipitation data at stations in Canada.
Lebl, Darren R; Bono, Christopher M; Velmahos, George; Metkar, Umesh; Nguyen, Joseph; Harris, Mitchel B
2013-07-15
Retrospective analysis of prospective registry data. To determine the patient characteristics, risk factors, and fracture patterns associated with vertebral artery injury (VAI) in patients with blunt cervical spine injury. VAI associated with cervical spine trauma has the potential for catastrophic clinical sequelae. The patterns of cervical spine injury and patient characteristics associated with VAI remain to be determined. A retrospective review of prospectively collected data from the American College of Surgeons trauma registries at 3 level-1 trauma centers identified all patients with a cervical spine injury on multidetector computed tomographic scan during a 3-year period (January 1, 2007, to January 1, 2010). Fracture pattern and patient characteristics were recorded. Logistic multivariate regression analysis of independent predictors for VAI and subgroup analysis of neurological events related to VAI were performed. Twenty-one percent of 1204 patients with cervical injuries (n = 253) underwent screening for VAI by multidetector computed tomography angiogram. VAI was diagnosed in 17% (42 of 253), unilateral in 15% (38 of 253), and bilateral in 1.6% (4 of 253) and was associated with a lower Glasgow coma scale (P < 0.001), a higher injury severity score (P < 0.01), and a higher mortality (P < 0.001). VAI was associated with ankylosing spondylitis/diffuse idiopathic skeletal hyperostosis (crude odds ratio [OR] = 8.04; 95% confidence interval [CI], 1.30-49.68; P = 0.034) and occipitocervical dissociation (P < 0.001) by univariate analysis, and with fracture displacement into the transverse foramen of 1 mm or more (adjusted OR = 3.29; 95% CI, 1.15-9.41; P = 0.026) and basilar skull fracture (adjusted OR = 4.25; 95% CI, 1.25-14.47; P = 0.021) by multivariate regression model. Neurological events secondary to VAI occurred in 14% (6 of 42) and the stroke-related mortality rate was 4.8% (2 of 42). Neurological events were associated with male sex (P
Forghani, Ali; Peralta, Richard C.
2017-10-01
The study presents a procedure using solute transport and statistical models to evaluate the performance of aquifer storage and recovery (ASR) systems designed to earn additional water rights in freshwater aquifers. The recovery effectiveness (REN) index quantifies the performance of these ASR systems. REN is the proportion of the injected water that the same ASR well can recapture during subsequent extraction periods. To estimate REN for individual ASR wells, the presented procedure uses finely discretized groundwater flow and contaminant transport modeling. Then, the procedure uses multivariate adaptive regression splines (MARS) analysis to identify the significant variables affecting REN, and to identify the most recovery-effective wells. Achieving REN values close to 100% is the desire of the studied 14-well ASR system operator. This recovery is feasible for most of the ASR wells by extracting three times the injectate volume during the same year as injection. Most of the wells would achieve RENs below 75% if extracting merely the same volume as they injected. In other words, recovering almost all the same water molecules that are injected requires having a pre-existing water right to extract groundwater annually. MARS shows that REN most significantly correlates with groundwater flow velocity, or hydraulic conductivity and hydraulic gradient. MARS results also demonstrate that maximizing REN requires utilizing the wells located in areas with background Darcian groundwater velocities less than 0.03 m/d. The study also highlights the superiority of MARS over regular multiple linear regressions to identify the wells that can provide the maximum REN. This is the first reported application of MARS for evaluating performance of an ASR system in fresh water aquifers.
Directory of Open Access Journals (Sweden)
Wengang Zhang
2016-01-01
Piles are long, slender structural elements used to transfer the loads from the superstructure through weak strata onto stiffer soils or rocks. For driven piles, the impact of the piling hammer induces compression and tension stresses in the piles. Hence, an important design consideration is to check that the strength of the pile is sufficient to resist the stresses caused by the impact of the pile hammer. Due to its complexity, pile drivability lacks a precise analytical solution with regard to the phenomena involved. In situations where measured data or numerical hypothetical results are available, neural networks stand out in mapping the nonlinear interactions and relationships between the system's predictors and dependent responses. In addition, unlike most computational tools, no mathematical relationship assumption between the dependent and independent variables has to be made. Nevertheless, neural networks have been criticized for their long trial-and-error training process since the optimal configuration is not known a priori. This paper investigates the use of a fairly simple nonparametric regression algorithm known as multivariate adaptive regression splines (MARS), as an alternative to neural networks, to approximate the relationship between the inputs and dependent response, and to mathematically interpret the relationship between the various parameters. In this paper, the Back propagation neural network (BPNN) and MARS models are developed for assessing pile drivability in relation to the prediction of the Maximum compressive stresses (MCS), Maximum tensile stresses (MTS), and Blow per foot (BPF). A database of more than four thousand piles is utilized for model development and comparative performance between BPNN and MARS predictions.
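As a rough illustration of the kind of model MARS produces, the sketch below evaluates a piecewise-linear function built from hinge basis functions of the form max(0, x - t) and max(0, t - x). The knot, the coefficients, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch leaves out the forward and backward passes that an actual MARS fit uses to choose knots and prune terms.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction +1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model for a single predictor x.

    terms is a list of (coefficient, knot, direction) tuples.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)


# Hypothetical model: constant at 2.0 below the knot x = 5, slope 1.5 above it.
terms = [(1.5, 5.0, +1)]
low = mars_predict(3.0, 2.0, terms)   # below the knot, the hinge term is zero
high = mars_predict(7.0, 2.0, terms)  # above the knot, the hinge term is active
```

Because each term is zero on one side of its knot, the fitted coefficients can be read as slope changes at specific predictor values, which is what makes such a model easier to interpret than a neural network.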
Directory of Open Access Journals (Sweden)
Mario Menéndez Álvarez
2017-06-01
Modeling of a cylindrical heavy media separator has been conducted in order to predict its optimum operating parameters. As far as the authors know, this is the first such application in the literature. The aim of the present research is to predict the separation efficiency based on the adjustment of the device's dimensions and media flow rates. A variety of heavy media separators exist that are extensively used to separate particles by density, and their application in the recycling sector is growing in importance. The cylindrical variety is reported to be the most suited for processing a large range of particle sizes, but the optimization of its operating parameters remains to be documented. The multivariate adaptive regression splines methodology has been applied in order to predict the separation efficiencies using, as inputs, the device dimension and media flow rate variables. The results obtained show that it is possible to predict the device separation efficiency according to laboratory experiments performed and, therefore, forecast results obtainable with different operating conditions.
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. In addition, variable importance for projection (VIP
Directory of Open Access Journals (Sweden)
Hairong Huang
This study identified potential general influencing factors for a mathematical prediction of implant stability quotient (ISQ) values in clinical practice. We collected the ISQ values of 557 implants from 2 different brands (SICace and Osstem) placed by 2 surgeons in 336 patients. Surgeon 1 placed 329 SICace implants, and surgeon 2 placed 113 SICace implants and 115 Osstem implants. ISQ measurements were taken at T1 (immediately after implant placement) and T2 (before dental restoration). A multivariate linear regression model was used to analyze the influence of the following 11 candidate factors for stability prediction: sex, age, maxillary/mandibular location, bone type, immediate/delayed implantation, bone grafting, insertion torque, I-stage or II-stage healing pattern, implant diameter, implant length and T1-T2 time interval. The need for bone grafting as a predictor significantly influenced ISQ values in all three groups at T1 (weight coefficients ranging from -4 to -5). In contrast, implant diameter consistently influenced the ISQ values in all three groups at T2 (weight coefficients ranging from 3.4 to 4.2). Other factors, such as sex, age, I/II-stage implantation and bone type, did not significantly influence ISQ values at T2, and implant length did not significantly influence ISQ values at T1 or T2. These findings provide a rational basis for mathematical models to quantitatively predict the ISQ values of implants in clinical practice.
Brandstätter, Christian; Laner, David; Prantl, Roman; Fellner, Johann
2014-12-01
Municipal solid waste landfills pose a threat to the environment and human health, especially old landfills which lack facilities for collection and treatment of landfill gas and leachate. Consequently, missing information about emission flows prevents site-specific environmental risk assessments. To overcome this gap, the combination of waste sampling and analysis with statistical modeling is one option for estimating present and future emission potentials. Optimizing the tradeoff between investigation costs and reliable results requires knowledge of both the number of samples to be taken and the variables to be analyzed. This article aims to identify the optimized number of waste samples and variables in order to predict a larger set of variables. To that end, we introduce a multivariate linear regression model and test its applicability using two case studies. Landfill A was used to set up and calibrate the model based on 50 waste samples and twelve variables. The calibrated model was applied to Landfill B including 36 waste samples and twelve variables with four predictor variables. The case study results are twofold. First, the reliable and accurate prediction of the twelve variables can be achieved with the knowledge of four predictor variables (Loi, EC, pH and Cl). Second, for Landfill B, only ten full measurements would be needed for a reliable prediction of most response variables. The four predictor variables have comparably low analytical costs in comparison to the full set of measurements. This cost reduction could be used to increase the number of samples, yielding an improved understanding of the spatial waste heterogeneity in landfills. In conclusion, the future application of the developed model potentially improves the reliability of predicted emission potentials. The model could become a standard screening tool for old landfills if its applicability and reliability were tested in additional case studies. Copyright © 2014 Elsevier Ltd
Predictors of anemia after bariatric surgery using multivariate adaptive regression splines.
Lee, Yi-Chih; Lee, Tian-Shyug; Lee, Wei-Jei; Lin, Yang-Chu; Lee, Chia-Ko; Liew, Phui-Ly
2012-01-01
Anemia is the most common nutritional deficiency after bariatric surgery. The predictors of anemia have not been clearly identified. Identifying them is useful for selecting an appropriate surgical procedure for morbid obesity. From December 2000 to October 2007, a retrospective study of 442 obese patients after bariatric surgery with two years' follow-up data was conducted. Anemia was defined as hemoglobin (Hb) under 13 g/dL in males and 11.5 g/dL in females. We analyzed the clinical information and laboratory data from the initial evaluation of patients referred for bariatric surgery for predictors of anemia development after surgery. All data were analyzed using the multivariate adaptive regression splines (MARS) method. The mean age of the patients was 30.8±8.6 years; mean BMI was 40.7±7.8 kg/m2 and mean preoperative hemoglobin (Hb) was 13.7±1.5 g/dL. The prevalence of anemia increased from 5.4% preoperatively to 38.0% two years after surgery. Mean Hb was significantly lower in patients receiving gastric bypass than in those receiving restrictive-type surgery (11.9 g/dL vs. 13.1 g/dL, p=0.040) two years after surgery. In addition, the optimal preoperative hemoglobin value for predicting future anemia in the MARS model was 15.6 g/dL. The prevalence of anemia increased to 38.0% two years after bariatric surgery. We obtained an optimal preoperative hemoglobin value of 15.6 g/dL for predicting postoperative anemia, which is important in preoperative assessment for bariatric surgery. Patients who had undergone gastric bypass surgery developed more severe anemia than those who had undergone gastric banding or sleeve gastrectomy.
Mehdizadeh, Saeid; Behmanesh, Javad; Khalili, Keivan
2017-07-01
Soil temperature (Ts) and its thermal regime are the most important factors in plant growth, biological activities, and water movement in soil. Due to scarcity of the Ts data, estimation of soil temperature is an important issue in different fields of sciences. The main objective of the present study is to investigate the accuracy of multivariate adaptive regression splines (MARS) and support vector machine (SVM) methods for estimating the Ts. For this aim, the monthly mean data of the Ts (at depths of 5, 10, 50, and 100 cm) and meteorological parameters of 30 synoptic stations in Iran were utilized. To develop the MARS and SVM models, various combinations of minimum, maximum, and mean air temperatures (Tmin, Tmax, T); actual and maximum possible sunshine duration; sunshine duration ratio (n, N, n/N); actual, net, and extraterrestrial solar radiation data (Rs, Rn, Ra); precipitation (P); relative humidity (RH); wind speed at 2 m height (u2); and water vapor pressure (Vp) were used as input variables. Three error statistics including root-mean-square-error (RMSE), mean absolute error (MAE), and determination coefficient (R2) were used to check the performance of MARS and SVM models. The results indicated that the MARS was superior to the SVM at different depths. In the test and validation phases, the most accurate estimations for the MARS were obtained at the depth of 10 cm for Tmax, Tmin, T inputs (RMSE = 0.71 °C, MAE = 0.54 °C, and R2 = 0.995) and for RH, Vp, P, and u2 inputs (RMSE = 0.80 °C, MAE = 0.61 °C, and R2 = 0.996), respectively.
Ahmadlou, M.; Delavar, M. R.; Tayyebi, A.; Shafizadeh-Moghadam, H.
2015-12-01
Land use change (LUC) models used for modelling urban growth are different in structure and performance. Local models divide the data into separate subsets and fit distinct models on each of the subsets. Non-parametric models are data driven and usually do not have a fixed model structure or model structure is unknown before the modelling process. On the other hand, global models perform modelling using all the available data. In addition, parametric models have a fixed structure before the modelling process and they are model driven. Since few studies have compared local non-parametric models with global parametric models, this study compares a local non-parametric model called multivariate adaptive regression spline (MARS), and a global parametric model called artificial neural network (ANN) to simulate urbanization in Mumbai, India. Both models determine the relationship between a dependent variable and multiple independent variables. We used receiver operating characteristic (ROC) to compare the power of the both models for simulating urbanization. Landsat images of 1991 (TM) and 2010 (ETM+) were used for modelling the urbanization process. The drivers considered for urbanization in this area were distance to urban areas, urban density, distance to roads, distance to water, distance to forest, distance to railway, distance to central business district, number of agricultural cells in a 7 by 7 neighbourhoods, and slope in 1991. The results showed that the area under the ROC curve for MARS and ANN was 94.77% and 95.36%, respectively. Thus, ANN performed slightly better than MARS to simulate urban areas in Mumbai, India.
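The area under the ROC curve used to compare MARS and ANN can be computed directly from predicted scores and observed outcomes. The sketch below uses the rank (Mann-Whitney) formulation, in which the AUC is the share of positive/negative pairs where the positive case has the higher score, with ties counted as one half. The labels and scores are made-up toy values, not data from the study, and the function name is hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for the positive class (e.g. cell became urban), 0 otherwise.
    scores: model outputs, higher meaning more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Toy example with two negatives and two positives.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
```

The double loop is quadratic in the number of cases, which is fine for a sketch; for raster-scale data a sort-based rank computation would be the usual choice.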
Bayesian classification and regression trees for predicting incidence of cryptosporidiosis.
Directory of Open Access Journals (Sweden)
Wenbiao Hu
BACKGROUND: Classification and regression tree (CART) models are tree-based exploratory data analysis methods which have been shown to be very useful in identifying and estimating complex hierarchical relationships in ecological and medical contexts. In this paper, a Bayesian CART model is described and applied to the problem of modelling the cryptosporidiosis infection in Queensland, Australia. METHODOLOGY/PRINCIPAL FINDINGS: We compared the results of a Bayesian CART model with those obtained using a Bayesian spatial conditional autoregressive (CAR) model. Overall, the analyses indicated that the nature and magnitude of the effect estimates were similar for the two methods in this study, but the CART model more easily accommodated higher order interaction effects. CONCLUSIONS/SIGNIFICANCE: A Bayesian CART model for identification and estimation of the spatial distribution of disease risk is useful in monitoring and assessment of infectious diseases prevention and control.
Significant drivers of the virtual water trade evaluated with a multivariate regression analysis
Tamea, Stefania; Laio, Francesco; Ridolfi, Luca
2014-05-01
International trade of food is vital for the food security of many countries, which rely on trade to compensate for an agricultural production insufficient to feed the population. At the same time, food trade has implications for the distribution and use of water resources, because through the international trade of food commodities, countries virtually displace the water used for food production, known as "virtual water". Trade thus implies a network of virtual water fluxes from exporting to importing countries, which has been estimated to displace more than 2 billion m3 of water per year, or about 2% of the annual global precipitation above land. It is thus important to adequately identify the dynamics and the controlling factors of the virtual water trade, in that it supports and enables world food security. Using the FAOSTAT database of international trade and the virtual water content available from the Water Footprint Network, we reconstructed 25 years (1986-2010) of virtual water fluxes. We then analyzed the dependence of exchanged fluxes on a set of major relevant factors, including population, gross domestic product, arable land, virtual water embedded in agricultural production and dietary consumption, and geographical distance between countries. Significant drivers have been identified by means of a multivariate regression analysis, applied separately to the export and import fluxes of each country; temporal trends are outlined and the relative importance of drivers is assessed by a commonality analysis. Results indicate that population, gross domestic product and geographical distance are the major drivers of virtual water fluxes, with a minor (but non-negligible) contribution given by the agricultural production of exporting countries. Such drivers have become relevant for an increasing number of countries throughout the years, with an increasing variance explained by the distance between countries and a decreasing role of the gross
Dinç, Erdal; Ozdemir, Abdil
2005-01-01
A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this calibration model, which has a simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for eliminating fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves reducing the multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
Warton, David I; Thibaut, Loïc; Wang, Yi Alice
2017-01-01
Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods. PMID:28738071
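The abstract does not spell out how a PIT-residual is computed. One common construction for count data, assumed here purely for illustration, is the randomized form: draw a uniform value between F(y - 1) and F(y), where F is the fitted marginal distribution function. The sketch below does this for a Poisson marginal; the Poisson choice, the fitted means, the counts, and the function names are all assumptions and are not taken from the paper. The resampling step of the PIT-trap itself is not shown.

```python
import math
import random


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); returns 0.0 when k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)      # P(Y = 0)
    total = term
    for i in range(1, k + 1):
        term *= mu / i        # P(Y = i) from P(Y = i - 1)
        total += term
    return min(total, 1.0)


def pit_residual(y, mu, rng):
    """Randomized PIT-residual: uniform draw between F(y - 1) and F(y)."""
    lower = poisson_cdf(y - 1, mu)
    upper = poisson_cdf(y, mu)
    return lower + rng.random() * (upper - lower)


# Hypothetical counts and fitted means from some regression model.
rng = random.Random(0)
counts = [0, 2, 5, 1]
fitted_means = [1.0, 2.5, 3.0, 0.5]
residuals = [pit_residual(y, m, rng) for y, m in zip(counts, fitted_means)]
```

If the assumed marginal model is correct, residuals built this way are uniform on (0, 1) regardless of the fitted mean, which is the sense in which they are identically distributed and can therefore be resampled.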
Bayes Wavelet Regression Approach to Solve Problems in Multivariable Calibration Modeling
Directory of Open Access Journals (Sweden)
Setiawan Setiawan
2010-05-01
In multiple regression modeling, serious problems arise if the independent variables are correlated with each other (the problem of ill-conditioning) and the number of observations is much smaller than the number of independent variables (the problem of singularity). Bayes Regression (BR) is an approach that can be used to solve the problem of ill-conditioning, but it runs into computational constraints, so a pre-processing method that reduces the dimension of the independent variables is necessary. Empirical studies and the literature show that the discrete wavelet transform (WT) gives better regression model estimates than other pre-processing methods. This experiment studies a combination of BR with WT as a pre-processing method to solve the problems of ill-conditioning and singularity. One application of calibration in the field of chemistry is modeling the relationship between the concentration of an active substance as measured by High Performance Liquid Chromatography (HPLC) and the Fourier Transform Infrared (FTIR) absorbance spectrum. The spectrum pattern is expected to predict the concentration of the active substance. An exploration of Continuum Regression Wavelet Transform (CR-WT), Partial Least Squares Regression Wavelet Transform (PLS-WT), and Bayes Regression Wavelet Transform (BR-WT) shows that BR-WT performs well. BR-WT is superior to the PLS-WT method and is about as good as the CR-WT method.
Lehermeier, Christina; Schön, Chris-Carolin; de Los Campos, Gustavo
2015-09-01
Plant breeding populations exhibit varying levels of structure and admixture; these features are likely to induce heterogeneity of marker effects across subpopulations. Traditionally, structure has been dealt with as a potential confounder, and various methods exist to "correct" for population stratification. However, these methods induce a mean correction that does not account for heterogeneity of marker effects. The animal breeding literature offers a few recent studies that consider modeling genetic heterogeneity in multibreed data, using multivariate models. However, these methods have received little attention in plant breeding where population structure can have different forms. In this article we address the problem of analyzing data from heterogeneous plant breeding populations, using three approaches: (a) a model that ignores population structure [A-genome-based best linear unbiased prediction (A-GBLUP)], (b) a stratified (i.e., within-group) analysis (W-GBLUP), and (c) a multivariate approach that uses multigroup data and accounts for heterogeneity (MG-GBLUP). The performance of the three models was assessed on three different data sets: a diversity panel of rice (Oryza sativa), a maize (Zea mays L.) half-sib panel, and a wheat (Triticum aestivum L.) data set that originated from plant breeding programs. The estimated genomic correlations between subpopulations varied from null to moderate, depending on the genetic distance between subpopulations and traits. Our assessment of prediction accuracy features cases where ignoring population structure leads to a parsimonious more powerful model as well as others where the multivariate and stratified approaches have higher predictive power. In general, the multivariate approach appeared slightly more robust than either the A- or the W-GBLUP. Copyright © 2015 by the Genetics Society of America.
Stagewise pseudo-value regression for time-varying effects on the cumulative incidence
DEFF Research Database (Denmark)
Zöller, Daniela; Schmidtmann, Irene; Weinmann, Arndt
2016-01-01
In a competing risks setting, the cumulative incidence of an event of interest describes the absolute risk for this event as a function of time. For regression analysis, one can either choose to model all competing events by separate cause-specific hazard models or directly model the association between covariates and the cumulative incidence of one of the events. With a suitable link function, direct regression models allow for a straightforward interpretation of covariate effects on the cumulative incidence. In practice, where data can be right-censored, these regression models are implemented using a pseudo-value approach. For a grid of time points, the possibly unobserved binary event status is replaced by a jackknife pseudo-value based on the Aalen-Johansen method. We combine a stagewise regression technique with the pseudo-value approach to provide variable selection while allowing...
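The jackknife pseudo-value idea can be sketched in a few lines. For an estimator theta computed on n subjects, the pseudo-value for subject i is n * theta(all) - (n - 1) * theta(all but i). The sketch below is a simplification that assumes no censoring, so a plain proportion stands in for the Aalen-Johansen estimate of the cumulative incidence; the data and function names are hypothetical.

```python
def jackknife_pseudo_values(data, estimator):
    """Pseudo-value for unit i: n * theta(all) - (n - 1) * theta(all but i)."""
    n = len(data)
    theta_all = estimator(data)
    return [
        n * theta_all - (n - 1) * estimator(data[:i] + data[i + 1:])
        for i in range(n)
    ]


def event_proportion_by(t, cause=1):
    """Build an estimator: share of subjects with an event of `cause` by time t.

    Assumes complete (uncensored) data; with censoring the Aalen-Johansen
    estimator would be used in its place.
    """
    def estimator(data):
        return sum(1 for time, c in data if time <= t and c == cause) / len(data)
    return estimator


# Hypothetical (event time, cause) pairs; cause 1 is the event of interest.
subjects = [(2.0, 1), (3.5, 2), (1.0, 1), (6.0, 1), (4.0, 2)]
pseudo = jackknife_pseudo_values(subjects, event_proportion_by(t=3.0))
```

Without censoring, each pseudo-value reduces (up to floating-point rounding) to the subject's own binary event status at time t, which is why pseudo-values can serve as a response variable in a regression model when that status is only partly observed.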
Jiménez-Huete, Adolfo; Riva, Elena; Toledano, Rafael; Campo, Pablo; Esteban, Jesús; Barrio, Antonio Del; Franch, Oriol
2014-12-01
The validity of neuropsychological tests for the differential diagnosis of degenerative dementias may depend on the clinical context. We constructed a series of logistic models taking into account this factor. We retrospectively analyzed the demographic and neuropsychological data of 301 patients with probable Alzheimer's disease (AD), frontotemporal degeneration (FTLD), or dementia with Lewy bodies (DLB). Nine models were constructed taking into account the diagnostic question (eg, AD vs DLB) and subpopulation (incident vs prevalent). The AD versus DLB model for all patients, including memory recovery and phonological fluency, was highly accurate (area under the curve = 0.919, sensitivity = 90%, and specificity = 80%). The results were comparable in incident and prevalent cases. The FTLD versus AD and DLB versus FTLD models were both inaccurate. The models constructed from basic neuropsychological variables allowed an accurate differential diagnosis of AD versus DLB but not of FTLD versus AD or DLB. © The Author(s) 2014.
Shi, Jinfei; Zhu, Songqing; Chen, Ruwen
2017-12-01
An order selection method based on multiple stepwise regression is proposed for the General Expression of Nonlinear Autoregressive (GNAR) model; it converts the model order problem into a variable selection problem for a multiple linear regression equation. The partial autocorrelation function is adopted to define the linear terms in the GNAR model. The result is set as the initial model, and then the nonlinear terms are introduced gradually. Statistics are chosen to assess how both the newly introduced and the existing variables improve the model, and these are used to decide which model variables to retain or eliminate. The optimal model is then obtained through a measurement of data fitting or a significance test. Simulation and experiments on classic time-series data show that the proposed method is simple and reliable and can be applied to practical engineering.
Dental age assessment of young Iranian adults using third molars: A multivariate regression study.
Bagherpour, Ali; Anbiaee, Najmeh; Partovi, Parnia; Golestani, Shayan; Afzalinasab, Shakiba
2012-10-01
In recent years, a noticeable increase in forensic age estimations of living individuals has been observed. Radiologic assessment of the mineralisation stage of third molars is of particular importance, with regard to the relevant age group. To attain a referral database and regression equations for dental age estimation of unaccompanied minors in an Iranian population was the goal of this study. Moreover, determination was made concerning the probability of an individual being over the age of 18 in case of full third molar(s) development. Using the scoring system of Gleiser and Hunt, modified by Köhler, an investigation of a cross-sectional sample of 1274 orthopantomograms of 885 females and 389 males aged between 15 and 22 years was carried out. Using kappa statistics, intra-observer reliability was tested. With Spearman correlation coefficient, correlation between the scores of all four wisdom teeth, was evaluated. We also carried out the Wilcoxon signed-rank test on asymmetry and calculated the regression formulae. A strong intra-observer agreement was displayed by the kappa value. No significant difference (p-value for upper and lower jaws were 0.07 and 0.59, respectively) was discovered by Wilcoxon signed-rank test for left and right asymmetry. The developmental stage of upper right and upper left third molars yielded the greatest correlation coefficient. The probability of an individual being over the age of 18 is 95.6% for males and 100.0% for females in case four fully developed third molars are present. Taking into consideration gender, location and number of wisdom teeth, regression formulae were arrived at. Use of population-specific standards is recommended as a means of improving the accuracy of forensic age estimates based on third molars mineralisation. To obtain more exact regression formulae, wider age range studies are recommended. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
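Intra-observer agreement of the kind reported above is often summarized with Cohen's kappa, which compares observed agreement with the agreement expected by chance. The sketch below computes the unweighted statistic for two rating sessions by the same observer; whether the study used the unweighted or a weighted variant is not stated, so this choice is an assumption, and the stage labels are made up for illustration.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    n = len(first)
    if n == 0 or n != len(second):
        raise ValueError("ratings must be non-empty and of equal length")
    observed = sum(a == b for a, b in zip(first, second)) / n
    c1, c2 = Counter(first), Counter(second)
    # Chance agreement: product of marginal proportions, summed over categories.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical mineralisation stages assigned in two sessions.
session_1 = ["G", "H", "H", "G", "H", "G"]
session_2 = ["G", "H", "G", "G", "H", "G"]
kappa = cohens_kappa(session_1, session_2)
```

For ordinal staging systems a weighted kappa, which penalizes near misses less than distant ones, is a frequent alternative.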
International Nuclear Information System (INIS)
Lima, Reginaldo Agapito de; Ribeiro Junior, Leopoldo Uberto
2010-01-01
In implementing a small hydropower plant (SHP), the barrage is the main structure, and its sizing accounts for 30%-50% of the overall cost of civil works. It is therefore very important to have a fast, instructive and accurate budgeting tool that also allows a quantitative analysis of the costs inherent in the civil construction of concrete barrages for small hydropower plants. Multivariate regression is well suited to this purpose: it allows preliminary, if approximate, cost estimates for concrete barrages to be established quickly and correctly, which simplifies budgeting and guides feasibility decisions on selecting or rejecting new head alternatives. (author)
Prediction of grindability with multivariable regression and neural network in Chinese coal
Energy Technology Data Exchange (ETDEWEB)
Li Peisheng; Xiong Youhui; Yu Dunxi; Sun Xuexin [Huazhong University of Science and Technology, Wuhan (China). State Key Laboratory of Coal Combustion
2005-12-01
The grindability of coal is usually expressed by the Hardgrove Grindability Index (HGI). The correlation between the proximate analysis of Chinese coal and HGI was studied. Statistical analysis showed that the higher the moisture and volatile matter content of the coal, the lower the HGI. Conversely, the higher the ash and fixed carbon content, the higher the HGI. However, the correlation between proximate analysis and HGI in coals is nonlinear. The prediction equation for HGI reported in the literature, which is based on proximate analysis of coal and linear regression, is not correct for coals in China. In this paper, the generalized regression neural network (GRNN) method was used to predict the HGI, and a higher prediction precision was obtained with this new method. With this method, the HGI can be estimated indirectly from the proximate analysis of coal when HGI measurement equipment is not available. 12 refs., 2 figs., 1 tab.
The analysis of internet addiction scale using multivariate adaptive regression splines.
Kayri, M
2010-01-01
Determining the real effects on internet dependency with an unbiased and robust statistical method is crucial. MARS is a new non-parametric method used in the literature for parameter estimation in cause-and-effect research. MARS can both produce legible model curves and make unbiased parametric predictions. To examine the performance of MARS, its findings are compared with those of Classification and Regression Tree (C&RT) analysis, which is considered in the literature to be efficient in revealing correlations between variables. The data set for the study is taken from "The Internet Addiction Scale" (IAS), which attempts to reveal addiction levels of individuals. The population of the study consists of 754 secondary school students (301 female and 443 male students, with 10 missing data). The MARS 2.0 trial version was used for the MARS analysis, and the C&RT analysis was done with SPSS. MARS obtained six basis functions for the model, and the regression equation of the model was derived from these six functions. MARS showed that average daily Internet-use time, the purpose of Internet use, the grade of the students and the occupations of their mothers had a significant effect on the prediction of dependency level. The study also showed that MARS reveals the extent to which each variable considered significant changes the character of the model.
DEFF Research Database (Denmark)
Cheng, Yongcun; Andersen, Ole Baltazar; Knudsen, Per
2010-01-01
of GMES marine core service. One such added value will be a multivariate regression model of sea level variability of multisatellite and in-situ tide gauge observations with the aim at improved future high spatial and temporal sea level prediction for i.e., human safety. Tide gauges and satellite...... altimetry data from the last seventeen years have been compared for an area around UK and temporal correlation coefficients between them were calculated. The results are extremely encouraging, as we have shown that the detided signal from response method correlates to more than 90% for nearly all tide gauge...
Deconinck, E; Zhang, M H; Petitet, F; Dubus, E; Ijjaali, I; Coomans, D; Vander Heyden, Y
2008-02-18
The use of some unconventional non-linear modeling techniques, i.e. classification and regression trees and multivariate adaptive regression splines-based methods, was explored to model the blood-brain barrier (BBB) passage of drugs and drug-like molecules. The data set contains BBB passage values for 299 structural and pharmacological diverse drugs, originating from a structured knowledge-based database. Models were built using boosted regression trees (BRT) and multivariate adaptive regression splines (MARS), as well as their respective combinations with stepwise multiple linear regression (MLR) and partial least squares (PLS) regression in two-step approaches. The best models were obtained using combinations of MARS with either stepwise MLR or PLS. It could be concluded that the use of combinations of a linear with a non-linear modeling technique results in some improved properties compared to the individual linear and non-linear models and that, when the use of such a combination is appropriate, combinations using MARS as non-linear technique should be preferred over those with BRT, due to some serious drawbacks of the BRT approaches.
Prediction of diffuse solar irradiance using machine learning and multivariable regression
International Nuclear Information System (INIS)
Lou, Siwei; Li, Danny H.W.; Lam, Joseph C.; Chan, Wilco W.H.
2016-01-01
Highlights: • 54.9% of the annual global irradiance in HK is made up of its diffuse part. • Hourly diffuse irradiance was predicted from accessible variables. • The importance of each variable in the prediction was assessed by machine learning. • Simple prediction equations were developed with the knowledge of variable importance. - Abstract: The paper studies the horizontal global, direct-beam and sky-diffuse solar irradiance data measured in Hong Kong from 2008 to 2013. A machine learning algorithm was employed to predict the horizontal sky-diffuse irradiance and conduct sensitivity analysis for the meteorological variables. Apart from the clearness index (horizontal global/extra atmospheric solar irradiance), we found that predictors including solar altitude, air temperature, cloud cover and visibility are also important in predicting the diffuse component. The mean absolute error (MAE) of the logistic regression using the aforementioned predictors was less than 21.5 W/m² and 30 W/m² for Hong Kong and Denver, USA, respectively. With the systematic recording of the five variables for more than 35 years, the proposed model would be appropriate for estimating long-term diffuse solar radiation, studying climate change and developing typical meteorological years in Hong Kong and places with similar climates.
Directory of Open Access Journals (Sweden)
Tao Gao
2014-01-01
Extreme precipitation is likely to be one of the most severe meteorological disasters in China; however, studies on the physical factors affecting precipitation extremes and on corresponding prediction models are still lacking. From a new point of view, the sensible heat flux (SHF) and latent heat flux (LHF), which have significant impacts on summer extreme rainfall in the Yangtze River basin (YRB), were quantified, and the impact factors were then selected. Firstly, a regional extreme precipitation index was applied to determine Regions of Significant Correlation (RSC) by analyzing the spatial distribution of correlation coefficients between this index and SHF, LHF, and sea surface temperature (SST) on the global ocean scale; then the time series of SHF, LHF, and SST in the RSCs during 1967–2010 were selected. Furthermore, other factors that significantly affect variations in precipitation extremes over the YRB were also selected. Multiple stepwise regression and leave-one-out cross-validation (LOOCV) were used to analyze and test the influencing factors and the statistical prediction model. The correlation coefficient between the observed regional extreme index and the model simulation result is 0.85, significant at the 99% level. This suggests that the forecast skill was acceptable, although many aspects of the prediction model should be improved.
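The leave-one-out cross-validation (LOOCV) step named in this abstract can be sketched as follows. This is a minimal illustration that assumes a single predictor and ordinary least squares; the study itself used multiple stepwise regression, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loocv_predictions(xs, ys):
    """Predict each observation from a model fitted on all the other observations."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # drop observation i from the training set
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return predictions


# Hypothetical yearly predictor and index values (not data from the study).
predictor = [1.0, 2.0, 3.0, 4.0, 5.0]
index = [3.1, 4.9, 7.2, 8.8, 11.0]
print(loocv_predictions(predictor, index))
```

Correlating the held-out predictions with the observed values gives a skill measure comparable in spirit to the 0.85 reported above, because each prediction comes from a model that never saw the year being predicted.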
Lees, Mackenzie C.; Merani, Shaheed; Tauh, Keerit; Khadaroo, Rachel G.
2015-01-01
Background Older adults (≥ 65 yr) are the fastest growing population and are presenting in increasing numbers for acute surgical care. Emergency surgery is frequently life threatening for older patients. Our objective was to identify predictors of mortality and poor outcome among elderly patients undergoing emergency general surgery. Methods We conducted a retrospective cohort study of patients aged 65–80 years undergoing emergency general surgery between 2009 and 2010 at a tertiary care centre. Demographics, comorbidities, in-hospital complications, mortality and disposition characteristics of patients were collected. Logistic regression analysis was used to identify covariate-adjusted predictors of in-hospital mortality and discharge of patients home. Results Our analysis included 257 patients with a mean age of 72 years; 52% were men. In-hospital mortality was 12%. Mortality was associated with patients who had higher American Society of Anesthesiologists (ASA) class (odds ratio [OR] 3.85, 95% confidence interval [CI] 1.43–10.33, p = 0.008) and in-hospital complications (OR 1.93, 95% CI 1.32–2.83, p = 0.001). Nearly two-thirds of patients discharged home were younger (OR 0.92, 95% CI 0.85–0.99, p = 0.036), had lower ASA class (OR 0.45, 95% CI 0.27–0.74, p = 0.002) and fewer in-hospital complications (OR 0.69, 95% CI 0.53–0.90, p = 0.007). Conclusion American Society of Anesthesiologists class and in-hospital complications are perioperative predictors of mortality and disposition in the older surgical population. Understanding the predictors of poor outcome and the importance of preventing in-hospital complications in older patients will have important clinical utility in terms of preoperative counselling, improving health care and discharging patients home. PMID:26204143
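The odds ratios and confidence intervals reported here can be related back to logistic regression coefficients. The sketch below assumes the intervals are Wald intervals, symmetric on the log-odds scale with z = 1.96; the abstract does not say how they were computed, so this is an illustration rather than the authors' procedure.

```python
import math


def log_odds_from_ci(odds_ratio, lower, upper, z=1.96):
    """Recover the coefficient and its standard error from an OR and its CI."""
    beta = math.log(odds_ratio)
    std_err = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, std_err


def ci_from_log_odds(beta, std_err, z=1.96):
    """Rebuild the confidence interval for the OR from a coefficient and SE."""
    return math.exp(beta - z * std_err), math.exp(beta + z * std_err)


# ASA class and in-hospital mortality, values as reported in the abstract.
beta, std_err = log_odds_from_ci(3.85, 1.43, 10.33)
print(beta, std_err)
print(ci_from_log_odds(beta, std_err))
```

Under that assumption the rebuilt interval agrees with the reported 1.43–10.33 to within rounding, which is a quick consistency check on published odds ratios.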
Monakhova, Yulia B; Diehl, Bernd W K
2016-03-22
In recent years the number of spectroscopic studies utilizing multivariate techniques and involving different laboratories has increased dramatically. In this paper a protocol was established for calibration transfer of a partial least squares regression model between high-resolution nuclear magnetic resonance (NMR) spectrometers of different frequencies and equipped with different probes. A previously published quantitative model for predicting the concentration of blended soy species in sunflower lecithin was used as the test system. For multivariate modelling, piecewise direct standardization (PDS), direct standardization, and hybrid calibration were employed. PDS showed the best performance for estimating lecithin falsification regarding its vegetable origin, resulting in a significant decrease in root mean square error of prediction from 5.0 to 7.3% without standardization to 2.9-3.2% for PDS. An acceptable calibration transfer model was obtained by direct standardization, but this standardization approach introduces unfavourable noise to the spectral data. Hybrid calibration is least recommended for high-resolution NMR data. The sensitivity of instrument transfer methods with respect to the type of spectrometer, the number of samples and the subset selection was also discussed. The study showed the necessity of applying a proper standardization procedure when a multivariate model has to be applied to spectra recorded on a secondary NMR spectrometer, even one with the same magnetic field strength. Copyright © 2016 John Wiley & Sons, Ltd.
International Nuclear Information System (INIS)
Li, Yanting; He, Yong; Su, Yan; Shu, Lianjie
2016-01-01
Highlights: • Suggests a nonparametric model based on MARS for output power prediction. • Compare the MARS model with a wide variety of prediction models. • Show that the MARS model is able to provide an overall good performance in both the training and testing stages. - Abstract: Both linear and nonlinear models have been proposed for forecasting the power output of photovoltaic systems. Linear models are simple to implement but less flexible. Due to the stochastic nature of the power output of PV systems, nonlinear models tend to provide better forecast than linear models. Motivated by this, this paper suggests a fairly simple nonlinear regression model known as multivariate adaptive regression splines (MARS), as an alternative to forecasting of solar power output. The MARS model is a data-driven modeling approach without any assumption about the relationship between the power output and predictors. It maintains simplicity of the classical multiple linear regression (MLR) model while possessing the capability of handling nonlinearity. It is simpler in format than other nonlinear models such as ANN, k-nearest neighbors (KNN), classification and regression tree (CART), and support vector machine (SVM). The MARS model was applied on the daily output of a grid-connected 2.1 kW PV system to provide the 1-day-ahead mean daily forecast of the power output. The comparisons with a wide variety of forecast models show that the MARS model is able to provide reliable forecast performance.
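The claim that MARS keeps the simplicity of a linear model while handling nonlinearity follows from its building block, the hinge function. The sketch below evaluates a fitted additive MARS-style model; the intercept, coefficients and knot are hypothetical placeholders, and the knot search and pruning steps of a full MARS fit are not shown.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate an additive model; each term is (coefficient, knot, direction)."""
    return intercept + sum(coef * hinge(x, knot, d) for coef, knot, d in terms)


# Hypothetical model: the slope of the response changes at a knot of 10.0.
example_terms = [(0.5, 10.0, 1), (-0.3, 10.0, -1)]
print(mars_predict(14.0, 2.0, example_terms))
print(mars_predict(6.0, 2.0, example_terms))
```

Each term is linear on one side of its knot and zero on the other, so the overall model is piecewise linear and remains as easy to read as a multiple linear regression equation.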
Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.
2017-07-01
During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor-intensive and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
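The true and false positive rates quoted for a fixed threshold can be computed as in the sketch below. It assumes that a pixel is flagged as contaminated when the regression output is at or above the threshold (the abstract does not state the direction of the comparison), and the scores and labels are hypothetical.

```python
def rates_at_threshold(scores, labels, threshold):
    """Return (TPR, FPR) when scores >= threshold are classified as positive."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return tpr, fpr


# Hypothetical regression outputs; label 1 marks a contaminated pixel.
sample_scores = [0.2, 0.4, 1.1, 1.3, 0.9, 1.05]
sample_labels = [0, 0, 1, 1, 0, 0]
print(rates_at_threshold(sample_scores, sample_labels, 1.0))
```

Sweeping the threshold over the observed scores and recording each (FPR, TPR) pair traces the ROC curve from which an operating point such as T = 1 is chosen.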
Directory of Open Access Journals (Sweden)
Peng Nai
2016-03-01
A large immigrant population resides permanently in the Yunnan border area of China. To some extent, these people are refugees or immigrants under international rules, which is a significant feature of the social diversity of this area. However, this kind of social diversity often impairs social order. Research on local immigrant integration can therefore have a positive influence on local social governance. This essay attempts to acquire data on the living situation of immigrants and refugees in the border area. The social integration of refugees and immigrants in the Yunnan border area is then analyzed with a multivariable linear regression model based on these data in order to propose more achievable solutions.
Energy Technology Data Exchange (ETDEWEB)
Dey, Prasenjit; Dad, Ajoy K. [Mechanical Engineering Department, National Institute of Technology, Agartala (India)
2016-12-15
The present study aims to predict the heat transfer characteristics around a square cylinder with different corner radii using multivariate adaptive regression splines (MARS). Further, the MARS-generated objective function is optimized by particle swarm optimization. The data for the prediction are taken from the recently published article by the present authors [P. Dey, A. Sarkar, A.K. Das, Development of GEP and ANN model to predict the unsteady forced convection over a cylinder, Neural Comput. Appl. (2015)]. Further, the MARS model is compared with artificial neural network and gene expression programming. It has been found that the MARS model is very efficient in predicting the heat transfer characteristics. It has also been found that MARS is more efficient than artificial neural network and gene expression programming in predicting the forced convection data, and that particle swarm optimization can efficiently optimize the heat transfer rate.
Directory of Open Access Journals (Sweden)
Abdelfattah M. Selim
2018-03-01
Aim: The present cross-sectional study was conducted to determine the seroprevalence of and potential risk factors associated with Bovine viral diarrhea virus (BVDV) disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR) models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected between November 2012 and March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with an indirect ELISA test for antibody detection. Data were analyzed with different statistical approaches such as the Chi-square test, odds ratios (OR), and univariable and multivariable LR models. Results: Results revealed a non-significant association between BVDV seropositivity and all risk factors except animal species. Seroprevalence was 40% for cattle and 23% for buffaloes. The OR for all categories were close to one, with the highest OR, 2.237, for cattle relative to buffaloes. Likelihood ratio tests showed a significant drop in the -2LL from the univariable LR models to the multivariable LR model. Conclusion: There was evidence of high seroprevalence of BVDV among cattle as compared with buffaloes, with the possibility of infection in different age groups of animals. In addition, the multivariable LR model was shown to provide more information for association and prediction purposes than univariable LR models and Chi-square tests when there is more than one predictor.
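The odds ratio for cattle relative to buffaloes can be checked against the two seroprevalence figures. The sketch below uses the rounded percentages from the abstract, so it reproduces the reported 2.237 only approximately (about 2.23); the function name is illustrative.

```python
def odds_ratio(prevalence_exposed, prevalence_reference):
    """Odds ratio computed from two prevalences expressed as proportions."""
    odds_exposed = prevalence_exposed / (1.0 - prevalence_exposed)
    odds_reference = prevalence_reference / (1.0 - prevalence_reference)
    return odds_exposed / odds_reference


# Rounded seroprevalences from the abstract: cattle 40%, buffaloes 23%.
print(round(odds_ratio(0.40, 0.23), 3))
```

The small gap between this value and 2.237 is what one would expect if the published figure was computed from the unrounded counts.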
DEFF Research Database (Denmark)
Søraas, Camilla L; Wachtell, Kristian; Okin, Peter M
2010-01-01
Regression of left ventricular (LV) hypertrophy and albuminuria in hypertension has previously been shown to reduce clinical cardiovascular events and death. We aimed to investigate the associations of regression of electrocardiographic (ECG) LV hypertrophy and albuminuria with the incidence...
Directory of Open Access Journals (Sweden)
Maogui Hu
BACKGROUND: Over the past two decades, major epidemics of hand, foot, and mouth disease (HFMD) have occurred throughout most of the West-Pacific Region countries, causing thousands of deaths among children. However, few studies have examined potential determinants of the incidence of HFMD. METHODS: Reported HFMD cases from 2912 counties in China were obtained for May 2008. The monthly HFMD cumulative incidence was calculated for children aged 9 years and younger. Child population density (CPD) and six climate factors (average-temperature [AT], average-minimum-temperature [AT(min)], average-maximum-temperature [AT(max)], average-temperature-difference [AT(diff)], average-relative-humidity [ARH], and monthly precipitation [MP]) were selected as potential explanatory variables for the study. Geographically weighted regression (GWR) models were used to explore the associations between the selected factors and HFMD incidence at county level. RESULTS: There were 176,111 HFMD cases reported in the studied counties. The adjusted monthly cumulative incidence by county ranged from 0.26 cases per 100,000 children to 2549.00 per 100,000 children. For local univariate GWR models, the percentages of counties with a statistically significant association (p<0.05) between HFMD incidence and each of the seven factors were: CPD 84.3%, AT(max) 54.9%, AT 57.8%, AT(min) 61.2%, ARH 54.4%, MP 50.3%, and AT(diff) 51.6%. The R² values for the seven factors' univariate GWR models were CPD 0.56, AT(max) 0.53, AT 0.52, MP 0.51, AT(min) 0.52, ARH 0.51, and AT(diff) 0.51, respectively. CPD, MP, AT, ARH and AT(diff) were further included in the multivariate GWR model, with R² 0.62, and all counties showed a statistically significant relationship. CONCLUSION: Child population density and climate factors are potential determinants of the HFMD incidence in most areas in China. The strength and direction of association between these factors and the incidence of HFMD is spatially heterogeneous at the local geographic
Yoon, Richard S; Gage, Mark J; Galos, David K; Donegan, Derek J; Liporace, Frank A
2017-06-01
Intramedullary nailing (IMN) has become the standard of care for the treatment of most femoral shaft fractures. Different IMN options include trochanteric and piriformis entry as well as retrograde nails, which may result in varying degrees of femoral rotation. The objective of this study was to analyze postoperative femoral version between three types of nails and to delineate any significant differences in femoral version (DFV) and revision rates. Over a 10-year period, 417 patients underwent IMN of a diaphyseal femur fracture (AO/OTA 32A-C). Of these patients, 316 met inclusion criteria and obtained postoperative computed tomography (CT) scanograms to calculate femoral version and were thus included in the study. In this study, our main outcome measure was the difference in femoral version (DFV) between the uninjured limb and the injured limb. The effect of the following variables on DFV and revision rates were determined via univariate, multivariate, and ordinal regression analyses: gender, age, BMI, ethnicity, mechanism of injury, operative side, open fracture, and table type/position. Statistical significance was set at ptrochanteric entry nails (n=67). Univariate regression analysis revealed that a lower BMI was significantly associated with a lower DFV (p=0.006). Controlling for possible covariables, multivariate analysis yielded a significantly lower DFV for trochanteric entry nails than piriformis or retrograde nails (7.9±6.10 vs. 9.5±7.4 vs. 9.4±7.8°, ptrochanteric entry nails also had a significantly lower revision rate, even when controlling for all other variables (ptrochanteric nails had a significantly lower DFV and a lower revision rate, even after regression analysis. However, this is not to state that the other nail types exhibited abnormal DFV. Translation to the clinical impact of a few degrees of DFV is also unknown. Future studies to more in-depth study the intricacies of femoral version may lead to improved technology in addition to
Directory of Open Access Journals (Sweden)
Kehinde Anthony Mogaji
2016-07-01
This study developed a GIS-based multivariate regression (MVR) yield rate prediction model of groundwater resource sustainability in the hard-rock geology terrain of southwestern Nigeria. This model can economically manage the aquifer yield rate potential predictions that are often overlooked in groundwater resources development. The proposed model relates the borehole yield rate inventory of the area to geoelectrically derived parameters. Three sets of geoelectrically derived parameters that condition borehole yield rate, namely aquifer unit resistivity (ρ), aquifer unit thickness (D) and coefficient of anisotropy (λ), were determined from the acquired and interpreted geophysical data. The extracted borehole yield rate values and the geoelectrically derived parameter values were regressed to develop the MVR relationship model by applying linear regression and GIS techniques. The sensitivity analysis results of the MVR model evaluated at P ⩽ 0.05 for the predictors ρ, D and λ provided values of 2.68 × 10⁻⁵, 2 × 10⁻² and 2.09 × 10⁻⁶, respectively. The accuracy and predictive power tests conducted on the MVR model using the Theil inequality coefficient measurement approach, coupled with the sensitivity analysis results, confirmed the model's yield rate estimation and prediction capability. The MVR borehole yield prediction model estimates were processed in a GIS environment to model an aquifer yield potential prediction map of the area. The information on the prediction map can serve as a scientific basis for predicting aquifer yield potential rates relevant to groundwater resources sustainability management. The developed MVR borehole yield rate prediction model provides a good alternative to other methods used for this purpose.
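The Theil inequality coefficient used for the accuracy test can be sketched as below. The abstract does not say which variant was applied; this example assumes the common U1 form, the root mean square error divided by the sum of the root mean squares of the observed and predicted series, and the yield values are hypothetical.

```python
import math


def theil_u1(actual, predicted):
    """Theil's U1: 0 means a perfect fit, values near 1 mean a very poor fit."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    scale = (math.sqrt(sum(a * a for a in actual) / n)
             + math.sqrt(sum(p * p for p in predicted) / n))
    if scale == 0.0:
        return 0.0  # both series are identically zero
    return rmse / scale


# Hypothetical observed and predicted borehole yield rates (not study data).
observed_yield = [1.2, 0.8, 2.5, 1.9]
predicted_yield = [1.0, 0.9, 2.2, 2.1]
print(round(theil_u1(observed_yield, predicted_yield), 3))
```

Because the denominator bounds the numerator, U1 always lies between 0 and 1, which makes it convenient for comparing models fitted to yield data on different scales.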
Spontaneous regression of retinopathy of prematurity: incidence and predictive factors
Directory of Open Access Journals (Sweden)
Rui-Hong Ju
2013-08-01
AIM: To evaluate the incidence of spontaneous regression of changes in the retina and vitreous in the active stage of retinopathy of prematurity (ROP) and to identify possible related factors during the regression. METHODS: This was a retrospective, hospital-based study. The study consisted of 39 premature infants with mild ROP that showed spontaneous regression (Group A) and 17 with severe ROP who had been treated before naturally involuting (Group B) from August 2008 through May 2011. Data on gender, single or multiple pregnancy, gestational age (GA), birth weight, weight gain from birth to the sixth week of life, use of oxygen in mechanical ventilation, total duration of oxygen inhalation, whether surfactant was given, need for and number of blood transfusions, 1-, 5- and 10-min Apgar scores, presence of bacterial, fungal or combined infection, hyaline membrane disease (HMD), patent ductus arteriosus (PDA), duration of stay in the neonatal intensive care unit (NICU) and duration of ROP were recorded. RESULTS: The incidence of spontaneous regression of ROP was 86.7% for stage 1, 57.1% for stage 2 and 5.9% for stage 3. Regression was detected in 100% of cases with changes in zone Ⅲ, 46.2% in zone Ⅱ and 0% in zone Ⅰ. The mean duration of ROP in the spontaneous regression group was 5.65±3.14 weeks, shorter than that of the treated ROP group (7.34±4.33 weeks), but this difference was not statistically significant (P=0.201). GA, 1-min Apgar score, 5-min Apgar score, duration of NICU stay, postnatal age at initial screening and oxygen therapy longer than 10 days were significant predictive factors for the spontaneous regression of ROP (P < 0.05). Retinal hemorrhage was the only independent predictive factor for the spontaneous regression of ROP (OR 0.030, 95%CI 0.001-0.775, P=0.035). CONCLUSION: This study showed that most stage 1 and 2 ROP and changes in zone Ⅲ can regress spontaneously. Retinal hemorrhage is weakly inversely associated with spontaneous regression.
Heddam, Salim; Kisi, Ozgur
2018-04-01
In the present study, three types of artificial intelligence techniques, least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5T), are applied for modeling daily dissolved oxygen (DO) concentration using several water quality variables as inputs. The DO concentration and water quality data from three stations operated by the United States Geological Survey (USGS) were used for developing the three models. The selected water quality data, consisting of daily measurements of water temperature (TE, °C), pH (std. unit), specific conductance (SC, μS/cm) and discharge (DI, cfs), are used as inputs to the LSSVM, MARS and M5T models. The three models were applied to each station separately and compared with each other. According to the results obtained, it was found that: (i) the DO concentration could be successfully estimated using the three models and (ii) the best model differs from one station to another.
Directory of Open Access Journals (Sweden)
Jairo Vanegas
2017-05-01
Multivariate Adaptive Regression Splines (MARS) is a non-parametric modeling method that extends the linear model by incorporating nonlinearities and interactions between variables. It is a flexible tool that automates the construction of prediction models by selecting relevant variables, transforming the predictor variables, handling missing values and preventing overfitting through a self-test. It also allows prediction that takes into account structural factors that might influence the response variable, generating hypothetical models. The final result can serve to identify relevant cut-off points in data series. It is rarely used in the health field, so it is proposed as one more tool for the evaluation of relevant public health indicators. For demonstration purposes, mortality data series for children under 5 years of age in Costa Rica for the period 1978-2008 were used.
Smith, R.; Kasprzyk, J. R.; Balaji, R.
2017-12-01
In light of deeply uncertain factors like future climate change and population shifts, responsible resource management will require new types of information and strategies. For water utilities, this entails potential expansion and efficient management of water supply infrastructure systems for changes in overall supply; changes in frequency and severity of climate extremes such as droughts and floods; and variable demands, all while accounting for conflicting long and short term performance objectives. Multiobjective Evolutionary Algorithms (MOEAs) are emerging decision support tools that have been used by researchers and, more recently, water utilities to efficiently generate and evaluate thousands of planning portfolios. The tradeoffs between conflicting objectives are explored in an automated way to produce (often large) suites of portfolios that strike different balances of performance. Once generated, the sets of optimized portfolios are used to support relatively subjective assertions of priorities and human reasoning, leading to adoption of a plan. These large tradeoff sets contain information about complex relationships between decisions and between groups of decisions and performance that, until now, has not been quantitatively described. We present a novel use of Multivariate Regression Trees (MRTs) to analyze tradeoff sets to reveal these relationships and critical decisions. Additionally, when MRTs are applied to tradeoff sets developed for different realizations of an uncertain future, they can identify decisions that are robust across a wide range of conditions and produce fundamental insights about the system being optimized.
Xu, A; Zhang, Y; Ran, T; Liu, H; Lu, S; Xu, J; Xiong, X; Jiang, Y; Lu, T; Chen, Y
2015-01-01
Bruton's tyrosine kinase (BTK) plays a crucial role in B-cell activation and development, and has emerged as a new molecular target for the treatment of autoimmune diseases and B-cell malignancies. In this study, two- and three-dimensional quantitative structure-activity relationship (2D and 3D-QSAR) analyses were performed on a series of pyridine and pyrimidine-based BTK inhibitors by means of genetic algorithm optimized multivariate adaptive regression spline (GA-MARS) and comparative molecular similarity index analysis (CoMSIA) methods. Here, we propose a modified MARS algorithm to develop 2D-QSAR models. The top ranked models showed satisfactory statistical results (2D-QSAR: Q² = 0.884, r² = 0.929, r²pred = 0.878; 3D-QSAR: q² = 0.616, r² = 0.987, r²pred = 0.905). Key descriptors selected by 2D-QSAR were in good agreement with the conclusions of 3D-QSAR, and the 3D-CoMSIA contour maps facilitated interpretation of the structure-activity relationship. A new molecular database was generated by molecular fragment replacement (MFR) and further evaluated with GA-MARS and CoMSIA prediction. Twenty-five pyridine and pyrimidine derivatives as novel potential BTK inhibitors were finally selected for further study. These results also demonstrated that our method can be a very efficient tool for the discovery of novel potent BTK inhibitors.
Grinn-Gofroń, Agnieszka; Strzelczak, Agnieszka
2009-11-01
A study was made of the link between time of day, weather variables and the hourly content of certain fungal spores in the atmosphere of the city of Szczecin, Poland, in 2004-2007. Sampling was carried out with a Lanzoni 7-day-recording spore trap. The spores analysed belonged to the taxa Alternaria and Cladosporium. These spores were selected both for their allergenic capacity and for their high level presence in the atmosphere, particularly during summer. Spearman correlation coefficients between spore concentrations, meteorological parameters and time of day showed different indices depending on the taxon being analysed. Relative humidity (RH), air temperature, air pressure and clouds most strongly and significantly influenced the concentration of Alternaria spores. Cladosporium spores correlated less strongly and significantly than Alternaria. Multivariate regression tree analysis revealed that, at air pressures lower than 1,011 hPa the concentration of Alternaria spores was low. Under higher air pressure spore concentrations were higher, particularly when RH was lower than 36.5%. In the case of Cladosporium, under higher air pressure (>1,008 hPa), the spores analysed were more abundant, particularly after 0330 hours. In artificial neural networks, RH, air pressure and air temperature were the most important variables in the model for Alternaria spore concentration. For Cladosporium, clouds, time of day, air pressure, wind speed and dew point temperature were highly significant factors influencing spore concentration. The maximum abundance of Cladosporium spores in air fell between 1200 and 1700 hours.
Directory of Open Access Journals (Sweden)
Shahab Karimi
2014-01-01
In this study, the effects of ratios of dolomite, base/acid, silica, SiO2/Al2O3, and Fe2O3/CaO, base and acid oxides, and 11 oxides (SiO2, Al2O3, CaO, MgO, MnO, Na2O, K2O, Fe2O3, TiO2, P2O5, and SO3) on ash fusion temperatures for 1040 US coal samples from 12 states were evaluated using regression and adaptive neurofuzzy inference system (ANFIS) methods. Different combinations of independent variables were examined to predict ash fusion temperatures in the multivariable procedure. The combination of the “11 oxides + (Base/Acid) + Silica ratio” was the best predictor. Correlation coefficients (R2) of 0.891, 0.917, and 0.94 were achieved using nonlinear equations for the prediction of initial deformation temperature (IDT), softening temperature (ST), and fluid temperature (FT), respectively. The mentioned “best predictor” was used as input to the ANFIS system as well, and the correlation coefficients (R2) of the prediction were enhanced to 0.97, 0.98, and 0.99 for IDT, ST, and FT, respectively. The prediction precision that was achieved in this work exceeded that reported in previously published works.
Qi, Danyi; Roe, Brian E
2016-01-01
We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.
A common error in the ecological regression of cancer incidence on the deprivation index
Directory of Open Access Journals (Sweden)
Gemma Renart
2013-08-01
OBJECTIVE: To determine if introducing age as another explanatory variable in an ecological regression model relating crude rates of cancer incidence and a deprivation index provides better results than the usual practice of using the standard incidence ratio (SIR) as the response variable, introducing the non-standardized index, and not including age in the model. METHODS: Relative risks associated with the deprivation index for some locations of cancer in Spain's Girona Health Region were estimated using two different models. Model 1 estimated relative risks with the indirect method, using the SIR as the response variable. Model 2 estimated relative risks using age as an explanatory variable and crude cancer rates as the response variable. Two scenarios and two sub-scenarios were simulated to test the properties of the estimators and the goodness of fit of the two models. RESULTS: The results obtained from Model 2's estimates were slightly better (less biased) than those from Model 1. The results of the simulation showed that in all cases (two scenarios and two sub-scenarios) Model 2 had a better fit than Model 1. The probability density for the parameter of interest provided evidence that Model 1 leads to biased estimates. CONCLUSIONS: When attempting to explain the relative risk of incidence of cancer using ecological models that control geographic variability, introducing age as another explanatory variable and crude rates as a response variable provides less biased results.
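The indirect method behind Model 1 can be illustrated with a minimal sketch. It assumes the textbook definition of the SIR as observed cases divided by the cases expected from age-specific reference rates; the age bands, rates, populations and function names below are made up for illustration and are not taken from the study.

```python
def expected_cases(reference_rates, population):
    """Expected cases if the area experienced the reference age-specific rates."""
    return sum(reference_rates[band] * population[band] for band in population)


def standardized_incidence_ratio(observed, reference_rates, population):
    """SIR = observed cases / expected cases (indirect standardization)."""
    expected = expected_cases(reference_rates, population)
    if expected <= 0:
        raise ValueError("expected number of cases must be positive")
    return observed / expected


# Hypothetical inputs: rates are cases per person-year in a reference population.
reference_rates = {"0-49": 0.001, "50-69": 0.005, "70+": 0.010}
area_population = {"0-49": 10000, "50-69": 4000, "70+": 1000}
observed_cases = 50

expected = expected_cases(reference_rates, area_population)
sir = standardized_incidence_ratio(observed_cases, reference_rates, area_population)
```

With these made-up numbers the area has 25% more cases than its age structure alone would predict. Model 2 in the abstract instead keeps the crude rate as the response and enters age directly as an explanatory variable.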
Golkarian, Ali; Naghibi, Seyed Amir; Kalantar, Bahareh; Pradhan, Biswajeet
2018-02-17
Ever increasing demand for water resources for different purposes makes it essential to have better understanding and knowledge about water resources. As known, groundwater resources are one of the main water resources especially in countries with arid climatic condition. Thus, this study seeks to provide groundwater potential maps (GPMs) employing new algorithms. Accordingly, this study aims to validate the performance of C5.0, random forest (RF), and multivariate adaptive regression splines (MARS) algorithms for generating GPMs in the eastern part of Mashhad Plain, Iran. For this purpose, a dataset was produced consisting of spring locations as indicator and groundwater-conditioning factors (GCFs) as input. In this research, 13 GCFs were selected including altitude, slope aspect, slope angle, plan curvature, profile curvature, topographic wetness index (TWI), slope length, distance from rivers and faults, rivers and faults density, land use, and lithology. The mentioned dataset was divided into two classes of training and validation with 70 and 30% of the springs, respectively. Then, C5.0, RF, and MARS algorithms were employed using R statistical software, and the final values were transformed into GPMs. Finally, two evaluation criteria including Kappa and area under receiver operating characteristics curve (AUC-ROC) were calculated. According to the findings of this research, MARS had the best performance with AUC-ROC of 84.2%, followed by RF and C5.0 algorithms with AUC-ROC values of 79.7 and 77.3%, respectively. The results indicated that AUC-ROC values for the employed models are more than 70% which shows their acceptable performance. As a conclusion, the produced methodology could be used in other geographical areas. GPMs could be used by water resource managers and related organizations to accelerate and facilitate water resource exploitation.
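The two evaluation criteria named in this abstract can be sketched in a few lines. This is a pure-Python illustration that assumes binary labels (spring present or absent), the rank-based definition of AUC-ROC, and Cohen's kappa as the Kappa statistic; the validation data are invented and the study itself used R.

```python
def auc_roc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def cohen_kappa(labels, predictions):
    """Cohen's kappa for binary labels: (p_o - p_e) / (1 - p_e)."""
    n = len(labels)
    observed = sum(1 for y, p in zip(labels, predictions) if y == p) / n
    expected = 0.0
    for cls in (0, 1):
        expected += (labels.count(cls) / n) * (predictions.count(cls) / n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Made-up validation set (1 = spring present, 0 = absent); not from the study.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

auc = auc_roc(y_true, y_score)
kappa = cohen_kappa(y_true, y_pred)
```

AUC-ROC depends only on how the scores rank the samples, whereas kappa depends on the 0.5 cut-off chosen here, which is why the two criteria can disagree.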
Directory of Open Access Journals (Sweden)
Sepedeh Gholizadeh
2016-07-01
Background: Obesity and hypertension are among the most important non-communicable diseases, and many studies have examined their prevalence and risk factors univariately in individual geographic regions. Studying the factors that affect both obesity and hypertension jointly may be important, and this is addressed in the present study. Materials & Methods: This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times, and the average was considered one of the response variables. Hypertension was defined as systolic blood pressure ≥140 and/or diastolic blood pressure ≥90, and obesity was defined as body mass index ≥25. Data were analyzed using a multilevel, multivariate logistic regression model in MLwiN software. Results: Intraclass correlations at the cluster level were 33% for high blood pressure and 37% for obesity, so a two-level model was fitted to the data. The prevalence of obesity and hypertension was 43.6% (95% CI: 40.6-46.5) and 29.4% (95% CI: 26.6-32.1), respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption, and physical activity were the factors affecting blood pressure (p≤0.05). Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity, and place of residence affected obesity (p≤0.05). Conclusion: Multilevel models that take the level structure of the data into account provide more precise estimates. Because obesity and hypertension are major risk factors for cardiovascular disease, identifying high-risk groups allows careful planning for the prevention of non-communicable diseases and the promotion of public health.
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression as a mineral prospectivity mapping (MPM) method to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main aim is to apply multivariate regression analysis to the mapped iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), coefficient of determination (R2) and adjusted coefficient of determination (Radj2), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Ukkonen, Mika; Kivivuori, Antti; Rantanen, Tuomo; Paajanen, Hannu
2015-12-01
This study is intended to ascertain whether the outcome of acute abdominal surgery among elderly patients with acute abdominal pain has improved. Altogether 456 patients aged >65 years underwent emergency abdominal surgery between the years 2007 and 2009 in our hospital. After excluding emergency reoperations of elective surgery, a total of 430 consecutive patients were included in this retrospective audit. The key factors under analysis in this study were the occurrence of major complications and death from any cause within 30 days after the operation. In addition, we compared our results with the data we published some 20 years ago. The most common diagnoses were cholecystitis (n = 139, 32.3 %, incidence of 125 per 100,000 elderly persons), incarcerated hernia (n = 60, 13.9 %, 54/100,000), malignancy related (n = 50, 11.6 %, 45/100,000), or acute appendicitis (n = 46, 10.7 %, 41/100,000). The majority of operations (80.7 %) were performed using open technique. Of all 112 laparoscopic procedures, 25.9 % were converted to open surgery. Reoperations were rare, and postoperative surgical complications were not associated with a statistically significant increase in mortality, even if reoperation was needed. The 30-day mortality and morbidity rates were 14.2 and 31.9 %, respectively. Logistic regression analysis showed that patient's age (p = 0.014), atrial fibrillation (p = 0.017), low body mass index (p = 0.001), open surgery (p = 0.029), ASA grade III or more (p abdominal surgery still have relatively high morbidity and mortality as reported in earlier studies.
Ytsma, Cai R.; Dyar, M. Darby
2018-01-01
Hydrogen (H) is a critical element to measure on the surface of Mars because its presence in mineral structures is indicative of past hydrous conditions. The Curiosity rover uses the laser-induced breakdown spectrometer (LIBS) on the ChemCam instrument to analyze rocks for their H emission signal at 656.6 nm, from which H can be quantified. Previous LIBS calibrations for H used small data sets measured on standards and/or manufactured mixtures of hydrous minerals and rocks and applied univariate regression to spectra normalized in a variety of ways. However, matrix effects common to LIBS make these calibrations of limited usefulness when applied to the broad range of compositions on the Martian surface. In this study, 198 naturally-occurring hydrous geological samples covering a broad range of bulk compositions with directly-measured H content are used to create more robust prediction models for measuring H in LIBS data acquired under Mars conditions. Both univariate and multivariate prediction models, including partial least square (PLS) and the least absolute shrinkage and selection operator (Lasso), are compared using several different methods for normalization of H peak intensities. Data from the ChemLIBS Mars-analog spectrometer at Mount Holyoke College are compared against spectra from the same samples acquired using a ChemCam-like instrument at Los Alamos National Laboratory and the ChemCam instrument on Mars. Results show that all current normalization and data preprocessing variations for quantifying H result in models with statistically indistinguishable prediction errors (accuracies) ca. ± 1.5 weight percent (wt%) H2O, limiting the applications of LIBS in these implementations for geological studies. This error is too large to allow distinctions among the most common hydrous phases (basalts, amphiboles, micas) to be made, though some clays (e.g., chlorites with ≈ 12 wt% H2O, smectites with 15-20 wt% H2O) and hydrated phases (e.g., gypsum with ≈ 20
Deo, Ravinesh C.; Kisi, Ozgur; Singh, Vijay P.
2017-02-01
Drought forecasting using standardized metrics of rainfall is a core task in hydrology and water resources management. Standardized Precipitation Index (SPI) is a rainfall-based metric that caters for different time-scales at which the drought occurs, and due to its standardization, is well-suited for forecasting drought at different periods in climatically diverse regions. This study advances drought modelling using multivariate adaptive regression splines (MARS), least square support vector machine (LSSVM), and M5Tree models by forecasting SPI in eastern Australia. MARS model incorporated rainfall as mandatory predictor with month (periodicity), Southern Oscillation Index, Pacific Decadal Oscillation Index and Indian Ocean Dipole, ENSO Modoki and Nino 3.0, 3.4 and 4.0 data added gradually. The performance was evaluated with root mean square error (RMSE), mean absolute error (MAE), and coefficient of determination (r2). Best MARS model required different input combinations, where rainfall, sea surface temperature and periodicity were used for all stations, but ENSO Modoki and Pacific Decadal Oscillation indices were not required for Bathurst, Collarenebri and Yamba, and the Southern Oscillation Index was not required for Collarenebri. Inclusion of periodicity increased the r2 value by 0.5-8.1% and reduced RMSE by 3.0-178.5%. Comparisons showed that MARS superseded the performance of the other counterparts for three out of five stations with lower MAE by 15.0-73.9% and 7.3-42.2%, respectively. For the other stations, M5Tree was better than MARS/LSSVM with lower MAE by 13.8-13.4% and 25.7-52.2%, respectively, and for Bathurst, LSSVM yielded more accurate result. For droughts identified by SPI ≤ - 0.5, accurate forecasts were attained by MARS/M5Tree for Bathurst, Yamba and Peak Hill, whereas for Collarenebri and Barraba, M5Tree was better than LSSVM/MARS. Seasonal analysis revealed disparate results where MARS/M5Tree was better than LSSVM. The results highlight the
Michael S. Balshi; A. David McGuire; Paul Duffy; Mike Flannigan; John Walsh; Jerry Melillo
2009-01-01
We developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5o (latitude x longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was...
Lee, Tsair-Fwu; Chao, Pei-Ju; Chang, Liyun; Ting, Hui-Min; Huang, Yu-Jie
2015-01-01
Symptomatic radiation pneumonitis (SRP), which decreases quality of life (QoL), is the most common pulmonary complication in patients receiving breast irradiation. If it occurs, acute SRP usually develops 4-12 weeks after completion of radiotherapy and presents as a dry cough, dyspnea and low-grade fever. If the incidence of SRP is reduced, not only the QoL but also the compliance of breast cancer patients may be improved. Therefore, we investigated the incidence of SRP in breast cancer patients after hybrid intensity modulated radiotherapy (IMRT) to find the risk factors, which may have important effects on the risk of radiation-induced complications. In total, 93 patients with breast cancer were evaluated. The final endpoint for acute SRP was defined as those who had density changes together with symptoms, as measured using computed tomography. The risk factors for a multivariate normal tissue complication probability model of SRP were determined using the least absolute shrinkage and selection operator (LASSO) technique. Five risk factors were selected using LASSO: the percentage of the ipsilateral lung volume that received more than 20-Gy (IV20), energy, age, body mass index (BMI) and T stage. Positive associations were demonstrated among the incidence of SRP, IV20, and patient age. Energy, BMI and T stage showed a negative association with the incidence of SRP. Our analyses indicate that the risk of SRP following hybrid IMRT in elderly or low-BMI breast cancer patients is increased once the percentage of the ipsilateral lung volume receiving more than 20-Gy is controlled below a limitation. We suggest defining a dose-volume percentage constraint of IV20 < 37% (or AIV20 < 310 cc) for the irradiated ipsilateral lung in radiation therapy treatment planning to maintain the incidence of SRP below 20%, and paying attention to the sequelae, especially in elderly or low-BMI breast cancer patients. (AIV20: the absolute ipsilateral lung volume that received more than 20 Gy (cc).)
DEFF Research Database (Denmark)
Henneberg, Morten; Jørgensen, Bent; Eriksen, René Lynge
2016-01-01
In this paper, we present an oil condition and wear debris evaluation method for ship thruster gears using T2 statistics to form control charts from a multi-sensor platform. The proposed method takes into account the different ambient conditions by multiple linear regression on the mean value...... as substitution from the normal empirical mean value. This regression approach accounts for the bias imposed on the empirical mean value due to different geographical and seasonal differences on the multi-sensor inputs. Data from a gearbox are used to evaluate the length of the run-in period in order to ensure...
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reports of surgical complications are common, but few provide information about the severity of complications or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent a surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were identified as statistically significant in univariate logistic regression analysis, and the 11 significant variables that entered multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation, p = exp(ΣBiXi) / (1 + exp(ΣBiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1 / (1 + exp(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
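The predictive equation quoted in this abstract can be evaluated directly, as the minimal sketch below suggests. The constant and coefficients are copied from the abstract; the abstract does not say which risk factor corresponds to which of X1..X11 or how each is coded, so the 0/1 indicator coding and the function name are assumptions made only for illustration.

```python
import math

# Constant and coefficients as printed in the abstract's equation:
# p = 1 / (1 + exp(4.810 - 1.287*X1 - ... - 0.138*X11))
CONSTANT = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """Predicted probability of postoperative morbidity for values X1..X11.

    Assumption: each X is a 0/1 indicator; the abstract does not give
    the actual coding or the order of the risk factors.
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 risk-factor values")
    exponent = CONSTANT - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


baseline = complication_risk([0] * 11)     # no risk factor present
all_present = complication_risk([1] * 11)  # every indicator set to 1
```

Under the assumed coding, the predicted risk rises from under 1% with no risk factors to roughly 49% with all eleven present. This is a property of the printed coefficients, not a result reported in the abstract.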
Directory of Open Access Journals (Sweden)
Yuan Chuanlai
2014-06-01
This paper uses a software method for temperature compensation to address the drift problem of a piezoresistive waist force sensor. The compensation algorithm is modeled using the PLS method, and the compensation module is designed based on the multiple regression method. According to the simulation results, the designed system meets the basic requirements for drift correction. Finally, actual data are used to validate the algorithm.
Kiss, I.; Cioată, V. G.; Alexa, V.; Raţiu, S. A.
2017-05-01
The braking system is one of the most important and complex subsystems of railway vehicles, especially with regard to safety. Therefore, installing efficient and safe brakes on modern railway vehicles is essential. Attention is now being devoted to solving problems connected with the use of high-performance brake materials and their impact on the thermal and mechanical loading of railway wheels. The main factor that influences the selection of a friction material for railway applications is the performance criterion, because the interaction between the brake block and the wheel produces complex thermo-mechanical phenomena. In this work, the subjects investigated are cast-iron brake shoes, which are still widely used on freight wagons. Cast-iron brake shoes with lamellar graphite and a high phosphorus content (0.8-1.1%) therefore need a special investigation. In order to establish the optimal conditions for cast-iron brake shoes, we propose a mathematical modelling study using statistical analysis and multiple regression equations. Multivariate research is important in cast-iron brake shoe manufacturing, because many variables interact with each other simultaneously. Multivariate visualization comes to the fore when researchers have difficulty comprehending many dimensions at one time. Technological data (hardness and chemical composition) obtained from cast-iron brake shoes were used for this purpose. In order to establish the multiple correlation between the hardness of the cast-iron brake shoes and the elements of their chemical composition, several types of regression equation models have been proposed. Because a three-dimensional surface with variables on three axes is a common way to illustrate multivariate data, in which the maximum and minimum values are easily highlighted, we plotted graphical representations of the regression equations in order to explain the interaction of the variables and locate the optimal level of each variable for
Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.
2013-01-01
In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
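The workflow described here (fit a classifier on the ground before flight, then evaluate it in real time) might look like the sketch below. It is only an illustration under stated assumptions: the single normalized feature, the made-up training samples, the learning rate and the function names are hypothetical and do not come from the EFT-1 analysis.

```python
import math


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Ground step: fit p(deploy | x) = sigmoid(w*x + b) by gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def should_deploy(x, w, b, threshold=0.5):
    """Flight step: evaluate the stored classifier on one noisy measurement."""
    return sigmoid(w * x + b) >= threshold


# Made-up simulation samples: x is a normalized noisy measurement and
# y = 1 means the true state was inside the deployment conditions.
xs = [-2.0, -1.5, -1.0, -0.5, 0.2, -0.2, 0.5, 1.0, 1.5, 2.0]
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

w, b = fit_logistic(xs, ys)
deploy_high = should_deploy(1.5, w, b)
deploy_low = should_deploy(-1.5, w, b)
```

Only the fitted weights need to be carried on board, so the in-flight cost is one multiply-add and one logistic evaluation per measurement. The two mislabeled points near zero stand in for measurement noise.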
Directory of Open Access Journals (Sweden)
Tsair-Fwu Lee
Symptomatic radiation pneumonitis (SRP), which decreases quality of life (QoL), is the most common pulmonary complication in patients receiving breast irradiation. If it occurs, acute SRP usually develops 4-12 weeks after completion of radiotherapy and presents as a dry cough, dyspnea and low-grade fever. If the incidence of SRP is reduced, not only the QoL but also the compliance of breast cancer patients may be improved. Therefore, we investigated the incidence of SRP in breast cancer patients after hybrid intensity modulated radiotherapy (IMRT) to find the risk factors, which may have important effects on the risk of radiation-induced complications. In total, 93 patients with breast cancer were evaluated. The final endpoint for acute SRP was defined as those who had density changes together with symptoms, as measured using computed tomography. The risk factors for a multivariate normal tissue complication probability model of SRP were determined using the least absolute shrinkage and selection operator (LASSO) technique. Five risk factors were selected using LASSO: the percentage of the ipsilateral lung volume that received more than 20-Gy (IV20), energy, age, body mass index (BMI) and T stage. Positive associations were demonstrated among the incidence of SRP, IV20, and patient age. Energy, BMI and T stage showed a negative association with the incidence of SRP. Our analyses indicate that the risk of SRP following hybrid IMRT in elderly or low-BMI breast cancer patients is increased once the percentage of the ipsilateral lung volume receiving more than 20-Gy is controlled below a limitation. We suggest defining a dose-volume percentage constraint of IV20 < 37% (or AIV20 < 310 cc) for the irradiated ipsilateral lung in radiation therapy treatment planning to maintain the incidence of SRP below 20%, and paying attention to the sequelae, especially in elderly or low-BMI breast cancer patients. (AIV20: the absolute ipsilateral lung volume that received more than
Bjelanovic, Milena; Sørheim, Oddvin; Slinde, Erik; Puolanne, Eero; Isaksson, Tomas; Egelandsdal, Bjørg
2013-11-01
Seventy-two samples of ground beef from M. semimembranosus of two 5-year-old and two 1.5-year-old animals were prepared. Two types of fat tissues from either beef or pork were added to the ground beef. The samples were prepared to contain predominantly deoxymyoglobin (DMb), oxymyoglobin (OMb) and metmyoglobin (MMb) states on surfaces using selected methods based on chemical treatment (for MMb) and oxygen pressure packaging to induce the two other states. Reflectance spectra were measured on ground beef after three storage times. Partial least squares regression analysis was used to make calibration models of the desired myoglobin states. Validated models using leave-one-sample-out cross validation gave, after correction and normalization, prediction errors of about 5%. Long term storage of ground beef was unsuitable for preparing pure MMb states due to gradual reduction of the pigment to DMb, presumably by bacteria. Copyright © 2013 Elsevier Ltd. All rights reserved.
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The properties determined were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models were more predictive than PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. Thus, these models provided a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties and its suitability for use in the food industry.
Directory of Open Access Journals (Sweden)
Guo Junqiao
2008-09-01
Background The effects of climate variations on bacillary dysentery incidence have recently attracted growing concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. Methods As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.
Huang, Desheng; Guan, Peng; Guo, Junqiao; Wang, Ping; Zhou, Baosen
2008-09-25
The effects of climate variations on bacillary dysentery incidence have recently attracted growing concern. However, the multi-collinearity among meteorological factors affects the accuracy of correlation with bacillary dysentery incidence. As a remedy, a modified method to combine ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. All weather indicators, temperatures, precipitation, evaporation and relative humidity have shown positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987-1996, relative humidity, temperatures and air pressure affected the transmission of the bacillary dysentery. During this period, all meteorological factors were divided into three categories. Relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations.
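As a rough illustration of why ridge regression is a common remedy for multi-collinearity, the sketch below computes the closed-form ridge estimate for two nearly collinear, centred predictors. The function name, the data values and the penalty values are hypothetical and are not taken from the study; the abstract does not describe the authors' modified procedure, so this shows only the textbook estimator.

```python
def ridge_two_predictors(x1, x2, y, lam):
    """Closed-form ridge estimate for two centred predictors (no intercept).

    Solves (X'X + lam * I) beta = X'y using the explicit 2x2 inverse.
    """
    # Entries of X'X and X'y
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    # Add the ridge penalty to the diagonal, then invert the 2x2 system
    a11, a22 = s11 + lam, s22 + lam
    det = a11 * a22 - s12 * s12
    beta1 = (a22 * t1 - s12 * t2) / det
    beta2 = (a11 * t2 - s12 * t1) / det
    return beta1, beta2


# Hypothetical centred data: x1 and x2 are almost identical (collinear)
x1 = [-2.0, -1.0, 0.0, 1.0, 2.0]
x2 = [-2.1, -0.9, 0.1, 0.9, 2.0]
y = [-4.0, -2.1, 0.1, 1.9, 4.1]

for lam in (0.0, 1.0):
    print(lam, ridge_two_predictors(x1, x2, y, lam))
```

With `lam = 0` the function reduces to ordinary least squares, where collinearity makes the two coefficients very unequal; a positive penalty shrinks the coefficient vector and spreads the effect more evenly across the correlated predictors.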
Oubida, Regis W; Gantulga, Dashzeveg; Zhang, Man; Zhou, Lecong; Bawa, Rajesh; Holliday, Jason A
2015-01-01
Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13-0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.
Directory of Open Access Journals (Sweden)
Regis Wendpouire Oubida
2015-03-01
Local adaptation to climate in temperate forest trees involves the integration of multiple physiological, morphological, and phenological traits. Latitudinal clines are frequently observed for these traits, but environmental constraints also track longitude and altitude. We combined extensive phenotyping of 12 candidate adaptive traits, multivariate regression trees, quantitative genetics, and a genome-wide panel of SNP markers to better understand the interplay among geography, climate, and adaptation to abiotic factors in Populus trichocarpa. Heritabilities were low to moderate (0.13 to 0.32) and population differentiation for many traits exceeded the 99th percentile of the genome-wide distribution of FST, suggesting local adaptation. When climate variables were taken as predictors and the 12 traits as response variables in a multivariate regression tree analysis, evapotranspiration (Eref) explained the most variation, with subsequent splits related to mean temperature of the warmest month, frost-free period (FFP), and mean annual precipitation (MAP). These groupings matched relatively well the splits using geographic variables as predictors: the northernmost groups (short FFP and low Eref) had the lowest growth, and lowest cold injury index; the southern British Columbia group (low Eref and intermediate temperatures) had average growth and cold injury index; the group from the coast of California and Oregon (high Eref and FFP) had the highest growth performance and the highest cold injury index; and the southernmost, high-altitude group (with high Eref and low FFP) performed poorly, had high cold injury index, and lower water use efficiency. Taken together, these results suggest variation in both temperature and water availability across the range shape multivariate adaptive traits in poplar.
Hamidi, Omid; Tapak, Leili; Abbasi, Hamed; Maryanaji, Zohreh
2017-10-01
We have conducted a case study to investigate the performance of support vector machine, multivariate adaptive regression splines, and random forest time series methods in snowfall modeling. These models were applied to a data set of monthly snowfall collected during six cold months at Hamadan Airport sample station located in the Zagros Mountain Range in Iran. We considered monthly data of snowfall from 1981 to 2008 during the period from October/November to April/May as the training set and the data from 2009 to 2015 as the testing set. The root mean square errors (RMSE), mean absolute errors (MAE), determination coefficient (R 2), coefficient of efficiency (E%), and intra-class correlation coefficient (ICC) statistics were used as evaluation criteria. Our results indicated that the random forest time series model outperformed the support vector machine and multivariate adaptive regression splines models in predicting monthly snowfall in terms of several criteria. The RMSE, MAE, R 2, E, and ICC for the testing set were 7.84, 5.52, 0.92, 0.89, and 0.93, respectively. The overall results indicated that the random forest time series model could be successfully used to estimate monthly snowfall values. Moreover, the support vector machine model showed substantial performance as well, suggesting it may also be applied to forecast snowfall in this area.
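The evaluation criteria named in this abstract can be sketched in a few lines. The definitions below are the common textbook ones (R2 as the squared Pearson correlation, E as the Nash-Sutcliffe coefficient of efficiency); the abstract does not state which exact formulas the authors used, so treat these as assumptions. The function names and the toy values are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo = sum(obs) / len(obs)
    mp = sum(pred) / len(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov * cov / (var_o * var_p)


def efficiency(obs, pred):
    """Nash-Sutcliffe coefficient of efficiency: 1 - SSE / SST."""
    mo = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mo) ** 2 for o in obs)
    return 1.0 - sse / sst


# Hypothetical values: predictions are perfectly correlated but biased by +1
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
print(r_squared(observed, predicted), efficiency(observed, predicted))
```

The toy data show why several criteria are usually reported together: a constant bias leaves the squared correlation at its maximum while RMSE, MAE and the efficiency coefficient all penalize it.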
Directory of Open Access Journals (Sweden)
Paulino José García Nieto
2016-05-01
Remaining useful life (RUL) estimation is considered as one of the most central points in the prognostics and health management (PHM). The present paper describes a nonlinear hybrid ABC–MARS-based model for the prediction of the remaining useful life of aircraft engines. Indeed, it is well-known that an accurate RUL estimation allows failure prevention in a more controllable way so that the effective maintenance can be carried out in appropriate time to correct impending faults. The proposed hybrid model combines multivariate adaptive regression splines (MARS), which have been successfully adopted for regression problems, with the artificial bee colony (ABC) technique. This optimization technique involves parameter setting in the MARS training procedure, which significantly influences the regression accuracy. However, its use in reliability applications has not yet been widely explored. Bearing this in mind, remaining useful life values have been predicted here by using the hybrid ABC–MARS-based model from the remaining measured parameters (input variables) for aircraft engines with success. A correlation coefficient equal to 0.92 was obtained when this hybrid ABC–MARS-based model was applied to experimental data. The agreement of this model with experimental data confirmed its good performance. The main advantage of this predictive model is that it does not require information about the previous operation states of the aircraft engine.
Rossi, M.; Apuani, T.; Felletti, F.
2009-04-01
The aim of this paper is to compare the results of two statistical methods for landslide susceptibility analysis: 1) univariate probabilistic method based on landslide susceptibility index, 2) multivariate method (logistic regression). The study area is the Febbraro valley, located in the central Italian Alps, where different types of metamorphic rocks crop out. On the eastern part of the studied basin a quaternary cover represented by colluvial and, secondarily, by glacial deposits is dominant. In this study 110 earth flows, mainly located toward the NE portion of the catchment, were analyzed. They involve only the colluvial deposits and their extension mainly ranges from 36 to 3173 m2. Both statistical methods require establishing a spatial database, in which each landslide is described by several parameters that can be assigned using a main scarp central point of landslide. The spatial database is constructed using a Geographical Information System (GIS). Each landslide is described by several parameters corresponding to the value of main scarp central point of the landslide. Based on bibliographic review a total of 15 predisposing factors were utilized. The width of the intervals, in which the maps of the predisposing factors have to be reclassified, has been defined assuming constant intervals for: elevation (100 m), slope (5 °), solar radiation (0.1 MJ/cm2/year), profile curvature (1.2 1/m), tangential curvature (2.2 1/m), drainage density (0.5), lineament density (0.00126). For the other parameters, the results of the probability-probability plots analysis and the statistical indexes of landslide sites were used. In particular slope length (0 ÷ 2, 2 ÷ 5, 5 ÷ 10, 10 ÷ 20, 20 ÷ 35, 35 ÷ 260), accumulation flow (0 ÷ 1, 1 ÷ 2, 2 ÷ 5, 5 ÷ 12, 12 ÷ 60, 60 ÷ 27265), Topographic Wetness Index (0 ÷ 0.74, 0.74 ÷ 1.94, 1.94 ÷ 2.62, 2.62 ÷ 3.48, 3.48 ÷ 6.00, 6.00 ÷ 9.44), Stream Power Index (0 ÷ 0.64, 0.64 ÷ 1.28, 1.28 ÷ 1.81, 1.81 ÷ 4.20, 4.20 ÷ 9
Elfaki, Tayseer Elamin Mohamed; Arndts, Kathrin; Wiszniewsky, Anna; Ritter, Manuel; Goreish, Ibtisam A; Atti El Mekki, Misk El Yemen A; Arriens, Sandra; Pfarr, Kenneth; Fimmers, Rolf; Doenhoff, Mike; Hoerauf, Achim; Layland, Laura E
2016-05-01
In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers in order to
Directory of Open Access Journals (Sweden)
Oilson Alberto Gonzatto Junior
2017-06-01
Data with excess zeros are frequently found in practice, and the recommended analysis is to use models that adequately address the counting of zero observations. In this study, the Zero Inflated Beta Regression Model (BeZI) was used on experimental data to describe the mean incidence of leaf citrus canker in orange groves under the influence of genotype and rootstocks of origin. Based on the model, it was possible to quantify the odds that a null observation of mean incidence comes from a particular plant according to genotype and rootstock, and estimate its expected value according to this combination. Laranja Caipira rootstock proved to be the most resistant to leaf citrus canker, whereas Limão Cravo proved to be the most fragile. The Ipiguá IAC, Arapongas, EEL and Olímpia genotypes have statistically equivalent chances.
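To make the zero-inflated beta idea concrete, here is a minimal sketch of the mixture density: a point mass at zero plus a beta density on (0, 1). It assumes the widely used mean-precision parameterization (shape parameters mu * phi and (1 - mu) * phi); the abstract does not give the parameterization or link functions used in the study, and the names `p0`, `mu`, `phi` and the numeric values are hypothetical.

```python
import math


def beta_pdf(y, a, b):
    """Beta(a, b) density at 0 < y < 1, computed on the log scale."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))


def bezi_density(y, p0, mu, phi):
    """Zero-inflated beta: mass p0 at y == 0, scaled beta density on (0, 1)."""
    if y == 0:
        return p0
    if not 0.0 < y < 1.0:
        raise ValueError("y must lie in [0, 1)")
    return (1.0 - p0) * beta_pdf(y, mu * phi, (1.0 - mu) * phi)


def bezi_mean(p0, mu):
    """Expected value of the mixture: zeros contribute nothing."""
    return (1.0 - p0) * mu


# Hypothetical parameters: 30% structural zeros, mean incidence 0.2 otherwise
print(bezi_density(0, 0.3, 0.2, 10.0))
print(bezi_density(0.25, 0.3, 0.2, 10.0))
print(bezi_mean(0.3, 0.2))
```

In a regression setting, `p0` and `mu` would each be tied to covariates such as genotype and rootstock through link functions; that step is omitted here.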
Bevan, Melody G; Asrani, Varsha M; Bharmal, Sakina; Wu, Landy M; Windsor, John A; Petrov, Maxim S
2017-06-01
Tolerance of oral food is an important criterion for hospital discharge in patients with acute pancreatitis. Patients who develop oral feeding intolerance have prolonged hospitalisation, use additional healthcare resources, and have impaired quality of life. This study aimed to quantify the incidence of oral feeding intolerance, the effect of confounders, and determine the best predictors of oral feeding intolerance. Clinical studies indexed in three electronic databases (EMBASE, MEDLINE, and the Cochrane Central Register of Controlled Trials) were reviewed. Incidence and predictor data were meta-analysed and possible confounders were investigated by meta-regression analysis. A total of 22 studies with 2024 patients met the inclusion criteria, 17 of which (with 1550 patients) were suitable for meta-analysis. The incidence of oral feeding intolerance was 16.3%, and was not affected by WHO region, age, sex, or aetiology of acute pancreatitis. Nine of the 22 studies investigated a total of 62 different predictors of oral feeding intolerance. Serum lipase level prior to refeeding, pleural effusions, (peri)pancreatic collections, Ranson score, and Balthazar score were found to be statistically significant in meta-analyses. Oral feeding intolerance affects approximately 1 in 6 patients with acute pancreatitis. Serum lipase levels of more than 2.5 times the upper limit of normal prior to refeeding is a potentially useful threshold to identify patients at high risk of developing oral feeding intolerance. Copyright © 2016 Elsevier Ltd and European Society for Clinical Nutrition and Metabolism. All rights reserved.
Marson, Breno Maurício; de Oliveira Vilhena, Raquel; de Souza Madeira, Camilla Regina; Pontes, Flávia Lada Degaut; Piantavini, Mário Sérgio; Pontarolo, Roberto
2016-02-24
Malaria is one of the most lethal and life-threatening infectious diseases in the world, causing more than half a million deaths annually. Treatment with mefloquine and artesunate is currently recommended by the World Health Organization, and was historically the first artemisinin-based combination therapy used clinically for treatment of Plasmodium falciparum. The problem of poor-quality medicines, such as counterfeit and sub-standard anti-malarials, is a worldwide issue; therefore, it is essential to develop rapid, low cost, solvent-free, and reliable methods for routine quality control for these drugs. The aim of this study was to develop and validate a novel multivariate method for direct simultaneous quantification of mefloquine and artesunate in tablets by diffuse reflectance, middle infrared spectroscopy and partial least squares regression (MIR-PLS). Diffuse reflectance infrared Fourier transform spectroscopy (DRIFTS) and partial least squares regression were applied for simultaneous quantification of artesunate and mefloquine in tablets provided by the Brazilian Government. The model was obtained with full spectra (4000-400 cm(-1)) preprocessed by first derivative and Savitzky-Golay smoothing followed by mean centring, and built with three latent variables. The method was validated according to Brazilian and international guidelines through the measuring of figures of merit, such as trueness, precision, linearity, analytical sensitivity, selectivity, bias, and residual prediction deviation. The results were compared to an HPLC-MS/MS method. The MIR-PLS method provided root mean square errors of prediction lower than 2.0 mg per 100 mg of powder for the two analytes, and proved to be valid according to guidelines for analytical methods that use infrared (IR) spectroscopy with multivariate calibration. For the samples obtained from Brazilian healthcare units, the method provided results statistically similar to those obtained by HPLC-MS/MS. MIR-PLS was found
Directory of Open Access Journals (Sweden)
Tayseer Elamin Mohamed Elfaki
2016-05-01
In the Sudan, Schistosoma mansoni infections are a major cause of morbidity in school-aged children and infection rates are associated with available clean water sources. During infection, immune responses pass through a Th1 followed by Th2 and Treg phases and patterns can relate to different stages of infection or immunity. This retrospective study evaluated immunoepidemiological aspects in 234 individuals (range 4-85 years old) from Kassala and Khartoum states in 2011. Systemic immune profiles (cytokines and immunoglobulins) and epidemiological parameters were surveyed in n = 110 persons presenting patent S. mansoni infections (egg+), n = 63 individuals positive for S. mansoni via PCR in sera but egg negative (SmPCR+) and n = 61 people who were infection-free (Sm uninf). Immunoepidemiological findings were further investigated using two binary multivariable regression analyses. Nearly all egg+ individuals had no access to latrines and over 90% obtained water via the canal stemming from the Atbara River. With regards to age, infection and an egg+ status were linked to young and adolescent groups. In terms of immunology, S. mansoni infection per se was strongly associated with increased SEA-specific IgG4 but not IgE levels. IL-6, IL-13 and IL-10 were significantly elevated in patently-infected individuals and positively correlated with egg load. In contrast, IL-2 and IL-1β were significantly lower in SmPCR+ individuals when compared to Sm uninf and egg+ groups, which was further confirmed during multivariate regression analysis. Schistosomiasis remains an important public health problem in the Sudan with a high number of patent individuals. In addition, SmPCR diagnostics revealed another cohort of infected individuals with a unique immunological profile and provides an avenue for future studies on non-patent infection states. Future studies should investigate the downstream signalling pathways/mechanisms of IL-2 and IL-1β as potential diagnostic markers
Energy Technology Data Exchange (ETDEWEB)
Bondt, R.B.J. de; Bakers, F.; Hofman, P.A.M. [Maastricht University Medical Center, Department of Radiology, Maastricht (Netherlands); Nelemans, P.J. [Maastricht University Medical Center, Department of Epidemiology, Maastricht (Netherlands); Casselman, J.W. [AZ St. Jan Hospital, Department of Radiology, Bruges (Belgium); Peutz-Kootstra, C. [Maastricht University Medical Center, Department of Pathology, Maastricht (Netherlands); Kremer, B. [Maastricht University Medical Center, Department of Otolaryngology/ Head and Neck Surgery, Maastricht (Netherlands); Beets-Tan, R.G.H. [Academic Hospital Maastricht, Department of Radiology, Maastricht (Netherlands)
2009-03-15
The aim was to evaluate whether morphological criteria in addition to the size criterion result in better diagnostic performance of MRI for the detection of cervical lymph node metastases in patients with head and neck squamous cell carcinoma (HNSCC). Two radiologists evaluated 44 consecutive patients in which lymph node characteristics were assessed with histopathological correlation as gold standard. Assessed criteria were the short axial diameter and morphological criteria such as border irregularity and homogeneity of signal intensity on T2-weighted and contrast-enhanced T1-weighted images. Multivariate logistic regression analysis was performed: diagnostic odds ratios (DOR) with 95% confidence intervals (95% CI) and areas under the curve (AUCs) of receiver-operating characteristic (ROC) curves were determined. Border irregularity and heterogeneity of signal intensity on T2-weighted images showed significantly increased DORs. AUCs increased from 0.67 (95% CI: 0.61-0.73) using size only to 0.81 (95% CI: 0.75-0.87) using all four criteria for observer 1 and from 0.68 (95% CI: 0.62-0.74) to 0.96 (95% CI: 0.94-0.98) for observer 2 (p < 0.001). This study demonstrated that the morphological criteria border irregularity and heterogeneity of signal intensity on T2-weighted images in addition to size significantly improved the detection of cervical lymph node metastases.
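The area under the ROC curve reported here has a simple probabilistic reading: it is the chance that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it by direct pairwise comparison; the function name and the example scores are hypothetical, and this is not the software the authors used.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney convention).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for metastatic and benign nodes
metastatic = [0.9, 0.4]
benign = [0.5, 0.1]
print(auc(metastatic, benign))  # 3 of the 4 pairs are ordered correctly
```

The pairwise form is quadratic in the number of cases, which is fine for a cohort of this size; a rank-based formula gives the same value for larger data sets.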
Kjekshus, Lars Erik; Bernstrøm, Vilde Hoff; Dahl, Espen; Lorentzen, Thomas
2014-02-03
Hospitals are merging to become more cost-effective. Mergers are often complex and difficult processes with variable outcomes. The aim of this study was to analyze the effect of mergers on long-term sickness absence among hospital employees. Long-term sickness absence was analyzed among hospital employees (N = 107 209) in 57 hospitals involved in 23 mergers in Norway between 2000 and 2009. Variation in long-term sickness absence was explained through a fixed effects multivariate regression analysis using panel data with years-since-merger as the independent variable. We found a significant but modest effect of mergers on long-term sickness absence in the year of the merger, and in years 2, 3 and 4; analyzed by gender there was a significant effect for women, also for these years, but only in year 4 for men. However, men are less represented among the hospital workforce; this could explain the lack of significance. Mergers have a significant effect on employee health that should be taken into consideration when deciding to merge hospitals. This study illustrates the importance of analyzing the effects of mergers over several years and the need for more detailed analyses of merger processes and of the changes that may occur as a result of such mergers.
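A fixed effects panel regression removes stable differences between units (here, hospitals or employees) by comparing each unit only with itself over time. A minimal sketch of the "within" estimator for a single regressor follows; it is an illustration under simplifying assumptions (one numeric regressor, no time dummies or further covariates), and the function name and data are hypothetical rather than the study's specification.

```python
from collections import defaultdict


def within_estimator(groups, x, y):
    """Slope of a one-regressor fixed-effects (within) regression.

    Each observation is demeaned by its group's mean before fitting,
    which removes any time-constant group-level intercept.
    """
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # sum of x, sum of y, count
    for g, xi, yi in zip(groups, x, y):
        totals[g][0] += xi
        totals[g][1] += yi
        totals[g][2] += 1
    sxy = sxx = 0.0
    for g, xi, yi in zip(groups, x, y):
        mean_x = totals[g][0] / totals[g][2]
        mean_y = totals[g][1] / totals[g][2]
        sxy += (xi - mean_x) * (yi - mean_y)
        sxx += (xi - mean_x) ** 2
    return sxy / sxx


# Hypothetical panel: two units with very different baselines, same slope
units = ["A", "A", "A", "B", "B", "B"]
years_since_event = [0, 1, 2, 3, 4, 5]
outcome = [5.0, 5.5, 6.0, 1.5, 2.0, 2.5]
print(within_estimator(units, years_since_event, outcome))
```

In the toy data each unit's outcome rises by the same amount per year, while the second unit starts from a much lower baseline; demeaning within units recovers the common slope that a pooled fit across both units would obscure.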
Thorsen, Kenneth; Søreide, Jon Arne; Søreide, Kjetil
2014-07-01
Mortality rates in perforated peptic ulcer (PPU) have remained unchanged. The aim of this study was to compare known clinical factors and three scoring systems (American Society of Anesthesiologists (ASA), Boey and peptic ulcer perforation (PULP)) in the ability to predict mortality in PPU. This is a consecutive, observational cohort study of patients surgically treated for perforated peptic ulcer over a decade (January 2001 through December 2010). Primary outcome was 30-day mortality. A total of 172 patients were included, of whom 28 (16 %) died within 30 days. Among the factors associated with mortality, the PULP score had an odds ratio (OR) of 18.6 and the ASA score had an OR of 11.6, both with an area under the curve (AUC) of 0.79. The Boey score had an OR of 5.0 and an AUC of 0.75. Hypoalbuminaemia alone (≤37 g/l) achieved an OR of 8.7 and an AUC of 0.78. In multivariable regression, mortality was best predicted by a combination of increasing age, presence of active cancer and delay from admission to surgery of >24 h, together with hypoalbuminaemia, hyperbilirubinaemia and increased creatinine values, for a model AUC of 0.89. Six clinical factors predicted 30-day mortality better than available risk scores. Hypoalbuminaemia was the strongest single predictor of mortality and may be included for improved risk estimation.
Kisi, Ozgur; Parmar, Kulwinder Singh
2016-03-01
This study investigates the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS) and M5 model tree (M5Tree) in modeling river water pollution. Various combinations of water quality parameters, Free Ammonia (AMM), Total Kjeldahl Nitrogen (TKN), Water Temperature (WT), Total Coliform (TC), Fecal Coliform (FC) and Potential of Hydrogen (pH) monitored at Nizamuddin, Delhi Yamuna River in India were used as inputs to the applied models. Results indicated that the LSSVM and MARS models had almost the same accuracy and they performed better than the M5Tree model in modeling monthly chemical oxygen demand (COD). The average root mean square error (RMSE) of the LSSVM and M5Tree models was decreased by 1.47% and 19.1% using the MARS model, respectively. Adding TC input to the models did not increase their accuracy in modeling COD, while adding FC and pH inputs to the models generally decreased the accuracy. The overall results indicated that the MARS and LSSVM models could be successfully used in estimating monthly river water pollution level by using AMM, TKN and WT parameters as inputs.
Directory of Open Access Journals (Sweden)
Goyal Neeraj
2010-01-01
To compare the accuracy of artificial neural network (ANN) analysis and multi-variate regression analysis (MVRA) for renal stone fragmentation by extracorporeal shock wave lithotripsy (ESWL). A total of 276 patients with renal calculus were treated by ESWL during December 2001 to December 2006. Of them, the data of 196 patients were used for training the ANN. The predictability of trained ANN was tested on 80 subsequent patients. The input data included age of patient, stone size, stone burden, number of sittings and urinary pH. The output values (predicted values) were number of shocks and shock power. Of these 80 patients, the input was analyzed and output was also calculated by MVRA. The output values (predicted values) from both the methods were compared and the results were drawn. The predicted and observed values of shock power and number of shocks were compared using 1:1 slope line. The results were calculated as coefficient of correlation (COC) (r2). For prediction of power, the MVRA COC was 0.0195 and ANN COC was 0.8343. For prediction of number of shocks, the MVRA COC was 0.5726 and ANN COC was 0.9329. In conclusion, ANN gives better COC than MVRA, hence could be a better tool to analyze the optimum renal stone fragmentation by ESWL.
Directory of Open Access Journals (Sweden)
Goovaerts Pierre
2011-12-01
Background Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated
Goovaerts, Pierre; Xiao, Hong
2011-12-05
Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explain such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981, a decline that accelerated in the 1990s when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a
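The joinpoint idea mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single joinpoint, a continuous piecewise-linear trend, and a plain grid search over candidate years; the function name and the synthetic series are hypothetical, and the sketch does not reproduce the permutation tests or the model selection used by dedicated joinpoint software.

```python
import numpy as np

def fit_one_joinpoint(x, y, candidates):
    """Grid-search one joinpoint for a continuous piecewise-linear trend."""
    best = None
    for tau in candidates:
        # Columns: intercept, baseline slope, change in slope after tau.
        A = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - tau)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        sse = float(np.sum((y - A @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, tau, coef)
    return best[1], best[2]

# Hypothetical noise-free series: slow decline until 1992, faster afterwards.
years = np.arange(1981, 2008, dtype=float)
prop = 0.40 - 0.002 * (years - 1981) - 0.008 * np.maximum(0.0, years - 1992)

# Exclude the first and last two years so each segment has some support.
tau, coef = fit_one_joinpoint(years, prop, candidates=years[2:-2])
```

Here `coef[1]` is the slope before the joinpoint and `coef[1] + coef[2]` the slope after it; with real, noisy proportions one would also need a test of whether the extra segment is justified.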
Huijbregts, Henricus J T A M; Khan, Riaz J K; Fick, Daniel P; Jarrett, Olivia M; Haebich, Samantha
2016-06-01
Approximately 18% of the patients are dissatisfied with the result of total knee replacement (TKR). However, the relation between dissatisfaction and prosthetic alignment has not been investigated before. We retrospectively analysed prospectively gathered data of all patients who had a primary TKR, preoperative and one-year postoperative Oxford Knee Scores (OKS) and postoperative computed tomography (CT). The CT protocol measures hip-knee-ankle (HKA) angle, and coronal, sagittal and axial component alignment. Satisfaction was defined using a five-item Likert scale. We dichotomised dissatisfaction by combining '(very) dissatisfied' and 'neutral/not sure'. Associations with dissatisfaction and change in OKS were calculated using multivariable logistic and linear regression models. 230 TKRs were implanted in 105 men and 106 women. At one year, 12% were (very) dissatisfied and 10% neutral. Coronal alignment of the femoral component was 0.5 degrees more accurate in patients who were satisfied at one year. The other alignment measurements were not different between satisfied and dissatisfied patients. All radiographic measurements had a P-value > 0.10 on univariate analyses. At one year, dissatisfaction was associated with the three-months OKS. Change in OKS was associated with three-months OKS, preoperative physical SF-12, preoperative pain and cruciate retaining design. Neither mechanical axis, nor component alignment, is associated with dissatisfaction at one year following TKR. Patients get the best outcome when pain reduction and function improvement are optimal during the first three months and when the indication to embark on surgery is based on physical limitations rather than on a high pain score. Copyright © 2016 Elsevier B.V. All rights reserved.
International Nuclear Information System (INIS)
Callen, M.S.; Lopez, J.M.; Mastral, A.M.
2010-01-01
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less than or equal to 10 microns (PM10) was carried out during 2008-2009 in four locations of Spain to determine BaP concentrations experimentally by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R² = 0.817, PRESS/SSY = 0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²_CV = 0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²_ext = 0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q²_ext = 0.551, PRESS/SSY = 0.449).
Callén, M S; López, J M; Mastral, A M
2010-08-15
The estimation of benzo(a)pyrene (BaP) concentrations in ambient air is very important from an environmental point of view, especially with the introduction of the Directive 2004/107/EC and due to the carcinogenic character of this pollutant. A sampling campaign of particulate matter less than or equal to 10 microns (PM10) was carried out during 2008-2009 in four locations of Spain to determine BaP concentrations experimentally by gas chromatography mass-spectrometry mass-spectrometry (GC-MS-MS). Multivariate linear regression models (MLRM) were used to predict BaP air concentrations in two sampling places, taking PM10 and meteorological variables as possible predictors. The model obtained with data from two sampling sites (all sites model) (R² = 0.817, PRESS/SSY = 0.183) included significant variables such as PM10, temperature, solar radiation and wind speed and was internally and externally validated. The first validation was performed by cross validation and the last one by BaP concentrations from previous campaigns carried out in Zaragoza from 2001-2004. The proposed model constitutes a first approximation to estimate BaP concentrations in urban atmospheres with very good internal prediction (Q²_CV = 0.813, PRESS/SSY = 0.187) and with the maximal external prediction for the 2001-2002 campaign (Q²_ext = 0.679 and PRESS/SSY = 0.321) versus the 2001-2004 campaign (Q²_ext = 0.551, PRESS/SSY = 0.449). Copyright 2010 Elsevier B.V. All rights reserved.
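Each pair of statistics reported in the abstract above sums to one (for example 0.817 + 0.183), which is consistent with the usual definitions R² = 1 - RSS/SSY and Q² = 1 - PRESS/SSY. The sketch below shows how such values could be computed for a multivariate linear model. It assumes leave-one-out cross-validation (the abstract only says "cross validation"), and the function name, the four stand-in predictors and the synthetic data are all hypothetical.

```python
import numpy as np

def r2_and_q2_loo(X, y):
    """Return (R2, Q2) for an ordinary least-squares fit with intercept.

    R2 = 1 - RSS/SSY uses in-sample residuals;
    Q2 = 1 - PRESS/SSY uses leave-one-out prediction residuals.
    """
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    ssy = float(np.sum((y - y.mean()) ** 2))

    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ coef) ** 2))

    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # drop observation i
        c, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += float((y[i] - A[i] @ c) ** 2)

    return 1.0 - rss / ssy, 1.0 - press / ssy

# Synthetic stand-ins for four predictors (e.g. PM10 and three weather variables).
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 4))
y = X @ np.array([1.0, -0.5, 0.3, 0.8]) + rng.normal(scale=0.5, size=40)

r2, q2 = r2_and_q2_loo(X, y)
```

Because every leave-one-out residual is at least as large in magnitude as the corresponding in-sample residual, Q² computed this way cannot exceed R², matching the ordering of the internal values in the abstract (0.813 versus 0.817).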
Jeandron, Aurélie; Saidi, Jaime Mufitini; Kapama, Alois; Burhole, Manu; Birembano, Freddy; Vandevelde, Thierry; Gasparrini, Antonio; Armstrong, Ben; Cairncross, Sandy; Ensink, Jeroen H. J.
2015-01-01
Background The eastern provinces of the Democratic Republic of the Congo have been identified as endemic areas for cholera transmission, and despite continuous control efforts, they continue to experience regular cholera outbreaks that occasionally spread to the rest of the country. In a region where access to improved water sources is particularly poor, the question of which improvements in water access should be prioritized to address cholera transmission remains unresolved. This study aimed at investigating the temporal association between water supply interruptions and Cholera Treatment Centre (CTC) admissions in a medium-sized town. Methods and Findings Time-series patterns of daily incidence of suspected cholera cases admitted to the Cholera Treatment Centre in Uvira in South Kivu Province between 2009 and 2014 were examined in relation to the daily variations in volume of water supplied by the town water treatment plant. Quasi-Poisson regression and distributed lag nonlinear models up to 12 d were used, adjusting for daily precipitation rates, day of the week, and seasonal variations. A total of 5,745 patients over 5 y of age with acute watery diarrhoea symptoms were admitted to the CTC over the study period of 1,946 d. Following a day without tap water supply, the suspected cholera incidence rate increased on average by 155% over the next 12 d, corresponding to a rate ratio of 2.55 (95% CI: 1.54–4.24), compared to the incidence experienced after a day with optimal production (defined as the 95th percentile, 4,794 m³). Suspected cholera cases attributable to a suboptimal tap water supply reached 23.2% of total admissions (95% CI 11.4%–33.2%). Although generally reporting fewer admissions to the CTC, neighbourhoods with a higher consumption of tap water were more affected by water supply interruptions, with a rate ratio of 3.71 (95% CI: 1.91–7.20) and an attributable fraction of cases of 31.4% (95% CI: 17.3%–42.5%). The analysis did not suggest any
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-04-04
Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming
Balshi, M. S.; McGuire, A.D.; Duffy, P.; Flannigan, M.; Walsh, J.; Melillo, J.
2009-01-01
Fire is a common disturbance in the North American boreal forest that influences ecosystem structure and function. The temporal and spatial dynamics of fire are likely to be altered as climate continues to change. In this study, we ask the question: how will area burned in boreal North America by wildfire respond to future changes in climate? To evaluate this question, we developed temporally and spatially explicit relationships between air temperature and fuel moisture codes derived from the Canadian Fire Weather Index System to estimate annual area burned at 2.5° (latitude × longitude) resolution using a Multivariate Adaptive Regression Spline (MARS) approach across Alaska and Canada. Burned area was substantially more predictable in the western portion of boreal North America than in eastern Canada. Burned area was also not very predictable in areas of substantial topographic relief and in areas along the transition between boreal forest and tundra. At the scale of Alaska and western Canada, the empirical fire models explain on the order of 82% of the variation in annual area burned for the period 1960-2002. July temperature was the most frequently occurring predictor across all models, but the fuel moisture codes for the months June through August (as a group) entered the models as the most important predictors of annual area burned. To predict changes in the temporal and spatial dynamics of fire under future climate, the empirical fire models used output from the Canadian Climate Center CGCM2 global climate model to predict annual area burned through the year 2100 across Alaska and western Canada. Relative to 1991-2000, the results suggest that average area burned per decade will double by 2041-2050 and will increase on the order of 3.5-5.5 times by the last decade of the 21st century. To improve the ability to better predict wildfire across Alaska and Canada, future research should focus on incorporating additional effects of long-term and successional
de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a
International Nuclear Information System (INIS)
Hirotsu, Yuko; Suzuki, Kunihiko; Takano, Kenichi; Kojima, Mitsuhiro
2000-01-01
To prevent the recurrence of human error incidents, it is essential to analyze and evaluate them with an emphasis on human factors. Detailed and structured analyses of all incidents at domestic nuclear power plants (NPPs) reported during the last 31 years have been conducted based on J-HPES, in which a total of 193 human error cases were identified. The results of the analyses have been stored in the J-HPES database. In the previous study, applying multivariate analysis to these case studies suggested that there are several patterns of how errors occur at NPPs. It was also clarified that the causes related to each human error differ depending on the period in which it occurred. This paper describes the results with respect to the periodical transition of human error occurrence patterns. Applying multivariate analysis to the above data suggested that there are two types of error occurrence patterns for each human error type. The first type consists of common occurrence patterns that do not depend on the period, and the second type consists of patterns influenced by periodical characteristics. (author)
Ferrão, Marco F.; Mello, Cesar; Borin, Alessandra; Maretto, Danilo A.; Poppi, Ronei J.
2007-01-01
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM for determining R², RMSECV and RMSEP values. LS-SVMs show superior performance for quantifying starch, whey and sucrose in powdered milk samples in relation to PLSR. This study shows that it is possible to determine prec...
Broe, Rebecca; Rasmussen, Malin Lundberg; Frydkjaer-Olsen, Ulrik; Olsen, Birthe Susanne; Mortensen, Henrik Bindesboel; Peto, Tunde; Grauslund, Jakob
2014-01-01
The aim was to investigate the long-term incidence of proliferative diabetic retinopathy (PDR), and progression and regression of diabetic retinopathy (DR) and associated risk factors in young Danish patients with Type 1 diabetes mellitus. In 1987-89, a pediatric cohort involving approximately 75 % of all children with Type 1 diabetes in Denmark retinopathy graded and all relevant diabetic parameters assessed. Of those, 185 (54.6 %) were evaluated again in 2011 for the same clinical parameters. All retinal images were graded using modified early treatment of DR study for 1995 and 2011. In 1995, mean age was 21.0 years and mean diabetes duration 13.5 years. The 16-year incidence of proliferative retinopathy, 2-step progression and 2-step regression of DR was 31.0, 64.4 and 0.0 %, respectively, while the incidence of DR was 95.1 %. In a multivariate logistic regression model, progression to PDR was significantly associated with 1995 HbA1c (OR 2.61 per 1 % increase, 95 % CI 1.85-3.68) and 1995 diastolic blood pressure (OR 1.79 per 10 mmHg increase, 95 % CI 1.04-3.07). Two-step progression of DR was associated with male gender (OR 2.37 vs. female, 95 % CI 1.07-5.27), 1995 HbA1c (OR 3.02 per 1 % increase, 95 % CI 2.04-4.48) and 1995 vibration perception threshold (OR 1.19 per 1 Volt increase, 95 % CI 1.02-1.40). In conclusion, one in three progressed to PDR and two in three had 2-step progression despite young age and increased awareness of the importance of metabolic control. After 30 years duration of diabetes, the presence of DR is almost universal.
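The odds ratios in the abstract above are quoted per unit of the predictor (per 1 % HbA1c, per 10 mmHg). A short worked sketch shows how such a figure rescales to a different increment, assuming only the standard logistic-model property that the log-odds are linear in the predictor. The helper name is hypothetical and the rescaled values are arithmetic illustrations, not results reported by the study.

```python
import math

def rescale_odds_ratio(or_per_unit, delta_units):
    """Odds ratio for a change of delta_units, given the OR per one unit.

    Assumes log-odds are linear in the predictor, so the logistic
    coefficient is beta = ln(OR per unit) and OR(delta) = exp(beta * delta).
    """
    beta = math.log(or_per_unit)
    return math.exp(beta * delta_units)

# OR 2.61 per 1 % HbA1c, rescaled to a 2 % difference (two unit steps).
or_hba1c_2pct = rescale_odds_ratio(2.61, 2.0)

# OR 1.79 per 10 mmHg diastolic pressure, rescaled to 5 mmHg (half a step).
or_dbp_5mmhg = rescale_odds_ratio(1.79, 0.5)
```

Under this assumption a 2 % difference in HbA1c corresponds to an odds ratio of 2.61 squared, about 6.81, which illustrates why the reporting unit matters when comparing odds ratios across predictors.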
Peng, Ying; Li, Su-Ning; Pei, Xuexue; Hao, Kun
2018-03-01
A multivariate regression statistic strategy was developed to clarify the multi-component content-effect correlation of panax ginseng saponins extract and to predict the pharmacological effect from the component content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. In both example 1 and example 2, the relationship between the component content and the pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data points marked on the partial least squares regression model. This study provides evidence that the multi-component content is promising information for predicting the pharmacological effects of traditional Chinese medicine.
Directory of Open Access Journals (Sweden)
Ying Peng
2018-03-01
A multivariate regression statistic strategy was developed to clarify the multi-component content-effect correlation of panax ginseng saponins extract and to predict the pharmacological effect from the component content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. In both example 1 and example 2, the relationship between the component content and the pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data points marked on the partial least squares regression model. This study provides evidence that the multi-component content is promising information for predicting the pharmacological effects of traditional Chinese medicine.
DEFF Research Database (Denmark)
Broe, Rebecca; Rasmussen, Malin Lundberg; Frydkjaer-Olsen, Ulrik
2014-01-01
The aim was to investigate the long-term incidence of proliferative diabetic retinopathy (PDR), and progression and regression of diabetic retinopathy (DR) and associated risk factors in young Danish patients with Type 1 diabetes mellitus. In 1987-89, a pediatric cohort involving approximately 75...... % of all children with Type 1 diabetes in Denmark diabetic parameters assessed. Of those, 185 (54.6 %) were evaluated again in 2011 for the same clinical parameters. All retinal images...... were graded using modified early treatment of DR study for 1995 and 2011. In 1995, mean age was 21.0 years and mean diabetes duration 13.5 years. The 16-year incidence of proliferative retinopathy, 2-step progression and 2-step regression of DR was 31.0, 64.4 and 0.0 %, respectively, while...
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O.
2017-03-01
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach for PAH (anthracene) and chiral-PAH analogue derivative (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) analyses is reported. The binding constants (Kb), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated Kb and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in squared correlation coefficients of 0.997530 or better and a low LOD of 3.81 × 10⁻⁷ M for anthracene and 3.48 × 10⁻⁸ M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a
DEFF Research Database (Denmark)
Jørgensen, Lasse Vigel; Huss, Hans Henrik; Dalgaard, Paw
2001-01-01
, 1-penten-3-ol, and 1-propanol. The potency and importance of these compounds was confirmed by gas chromatography-olfactometry. The present study provides valuable information on the bacterial reactions responsible for spoilage off-flavors of cold-smoked salmon, which can be used to develop......Changes were studied in the concentration of 38 volatile compounds during chilled storage at 5 °C of six lots of commercially produced vacuum-packed cold-smoked salmon and sterile cold-smoked salmon. The majority of volatile compounds produced during spoilage of cold-smoked salmon were...... alcohols, which were produced by microbial activity. Partial least-squares regression of volatile compounds and sensory results allowed for a multiple compound quality index to be developed. This index was based on volatile bacterial metabolites, 1-propanol and 2-butanone, and 2-furan...
Luna, Aderval S.; Gonzaga, Fabiano B.; da Rocha, Werickson F. C.; Lima, Igor C. A.
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) analysis was carried out on eleven steel samples to quantify the concentrations of chromium, nickel, and manganese. LIBS spectral data were correlated to known concentrations of the samples using different strategies in partial least squares (PLS) regression models. For the PLS analysis, one predictive model was separately generated for each element, while different approaches were used for the selection of variables (VIP: variable importance in projection and iPLS: interval partial least squares) in the PLS model to quantify the contents of the elements. The comparison of the performance of the models showed that there was no significant statistical difference using the Wilcoxon signed rank test. The elliptical joint confidence region (EJCR) did not detect systematic errors in these proposed methodologies for each metal.
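A single-response PLS model of the kind described above (one model per element) can be sketched with the classical NIPALS recursion. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the data are synthetic stand-ins for spectral intensities and one concentration, and the VIP and iPLS variable-selection steps from the abstract are not included.

```python
import numpy as np

def pls1_coefficients(X, y, n_components):
    """Regression vector of a PLS1 model fitted with the NIPALS recursion."""
    Xk = X - X.mean(axis=0)              # centered predictors, deflated in place
    yk = y - y.mean()                    # centered response, deflated in place
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)        # weight vector
        t = Xk @ w                       # scores
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        q_a = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)         # remove the explained part of X
        yk = yk - q_a * t                # remove the explained part of y
        W.append(w)
        P.append(p)
        q.append(q_a)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def pls1_predict(X_train, y_train, X_new, n_components):
    """Predict the response for X_new from a PLS1 fit on the training data."""
    b = pls1_coefficients(X_train, y_train, n_components)
    return (X_new - X_train.mean(axis=0)) @ b + y_train.mean()

# Synthetic stand-in data: 30 samples, 3 predictor variables, one response.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=30)

b_full = pls1_coefficients(X, y, n_components=3)
y_hat = pls1_predict(X, y, X, n_components=2)
```

With as many components as predictors the PLS1 regression vector coincides with ordinary least squares; in calibration work the number of components is normally chosen smaller, by cross-validation, which is where statistics such as RMSECV come in.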
Zorrilla-Vaca, Andres; Healy, Ryan; Zorrilla-Vaca, Carolina
2016-10-01
Post-dural puncture headache (PDPH) is a well-known neurological outcome caused by leakage of cerebrospinal fluid during neuraxial anesthesia. Studies aimed at assessing the efficacy of finer gauged spinal needles to reduce the incidence of PDPH have produced conflicting results. We have therefore examined the effect of the gauge of cutting needles and pencil-point needles, separately, on the incidence of PDPH. The PubMed, EMBASE and Google Scholar databases were searched for randomized studies which compared PDPH incidence in a head-to-head analysis of individual needle gauges of similar needle designs (cutting and pencil-point). A meta-regression analysis was performed taking into account various covariates, such as needle gauge and design, mean age of patient population, surgery type, percentage of males and females in study population and year of publication. Of the 22 studies (n = 5631) included in the analysis, 12 (n = 3148) and ten (n = 2483) compared different gauges of cutting needles and pencil-point needles, respectively. After adjusting for covariates, meta-regression analysis was performed for all studies that randomly compared individual needle gauges of similar needle design. Whereas the incidence of PDPH inversely correlated with gauge in cutting needles (β = -1.36 % per gauge, P = 0.037), no relationship was noted in pencil-point needles (β = -0.32 % per gauge, P = 0.114). Female gender was the only covariate that reached a statistically significant correlation with the incidence of PDPH in both models. A significant relationship between needle gauge and subsequent rate of PDPH was noted in cutting needles, but not pencil-point needles.
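The slope reported above (a change in PDPH incidence per gauge) is the kind of quantity a weighted regression across study arms produces. The sketch below is a simplified illustration: it assumes a single covariate and sample-size weights, whereas a full meta-regression would typically use inverse-variance weights, a random-effects term and the additional covariates listed in the abstract. The function name and all numbers are hypothetical illustrations, not data from the included trials.

```python
import numpy as np

def weighted_slope(x, y, weights):
    """Weighted least-squares fit of y on x; returns [intercept, slope]."""
    A = np.column_stack([np.ones_like(x), x])
    sw = np.sqrt(weights)                # scale rows by sqrt(weight)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Hypothetical study arms: needle gauge, PDPH incidence (%), sample size.
gauge = np.array([22.0, 25.0, 26.0, 27.0, 29.0])
incidence = np.array([9.0, 5.0, 3.5, 2.0, 0.5])
n_patients = np.array([120.0, 300.0, 250.0, 400.0, 90.0])

intercept, slope = weighted_slope(gauge, incidence, n_patients)
```

The fitted `slope` is expressed in percentage points of incidence per gauge, the same unit as the β values quoted in the abstract.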
Directory of Open Access Journals (Sweden)
Patricio Peralta-Zamora
2005-10-01
In this work, a partial least squares regression routine was used to develop a multivariate calibration model to predict the chemical oxygen demand (COD) in substrates of environmental relevance (paper effluents and landfill leachates) from UV-Vis spectral data. The calibration models permit the fast determination of the COD with typical relative errors lower than 10% with respect to the conventional methodology.
Masuda, Takanori; Nakaura, Takeshi; Funama, Yoshinori; Higaki, Toru; Kiguchi, Masao; Imada, Naoyuki; Sato, Tomoyasu; Awai, Kazuo
We evaluated the effect of the age, sex, total body weight (TBW), height (HT) and cardiac output (CO) of patients on aortic and hepatic contrast enhancement during hepatic-arterial phase (HAP) and portal venous phase (PVP) computed tomography (CT) scanning. This prospective study received institutional review board approval; prior informed consent to participate was obtained from all 168 patients. All were examined using our routine protocol; the contrast material was 600 mg/kg iodine. Cardiac output was measured with a portable electrical velocimeter within 5 minutes of starting the CT scan. We calculated contrast enhancement (per gram of iodine: ΔHU/gI) of the abdominal aorta during the HAP and of the liver parenchyma during the PVP. We performed univariate and multivariate linear regression analysis between all patient characteristics and the ΔHU/gI of aortic- and liver parenchymal enhancement. Univariate linear regression analysis demonstrated statistically significant correlations between the ΔHU/gI and the age, sex, TBW, HT, and CO (all P linear regression analysis showed that only the TBW and CO were of independent predictive value (P linear regression analysis only the TBW and CO were significantly correlated with aortic and liver parenchymal enhancement; the age, sex, and HT were not. The CO was the only independent factor affecting aortic and liver parenchymal enhancement at hepatic CT when the protocol was adjusted for the TBW.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Kitasako, Yuichi; Sasaki, Y; Takagaki, T; Sadr, A; Tagami, J
2017-11-01
The aim of this study was to evaluate factors associated with the incidence of erosive tooth wear (ETW) among adults at different ages in Tokyo using multifactorial logistic regression analysis. The study sample consisted of a total of 1108 subjects aged 15 to 89 years in Tokyo, Japan. Two examiners evaluated ETW in a full-mouth recording. The subjects were asked to complete a self-administered daily diet, habit, and health condition questionnaire. Subjects who had frequent acid consumption or gastric reflux and at least one tooth with initial enamel wear were placed in the ETW-positive group, and the remaining subjects were placed in the ETW-negative group. Logistic regression analyses were carried out to identify factors collectively associated with ETW. Logistic regression analysis showed that greater frequencies of carbonated or sports drink consumption were associated with higher incidence of ETW for all age groups except for 70-89 years. Adults in the 30-39-year group who reported suffering from heartburn were about 22.3 times more likely to develop ETW, while 40-49-year adults who had repeated vomiting were about 33.5 times more likely to exhibit ETW compared with those who did not experience vomiting. Age-specific dietary habits were clearly observed among adults at different ages in Tokyo, and there were significant differences in intrinsic and extrinsic factors between ETW-positive and ETW-negative groups for each age group. Greater frequencies of both carbonated and sports drink consumption were associated with higher incidence of ETW among adults at different ages in Tokyo.
Olive, David J
2017-01-01
This text presents methods that are robust to the assumption of a multivariate normal distribution or methods that are robust to certain types of outliers. Instead of using exact theory based on the multivariate normal distribution, the simpler and more applicable large sample theory is given. The text develops among the first practical robust regression and robust multivariate location and dispersion estimators backed by theory. The robust techniques are illustrated for methods such as principal component analysis, canonical correlation analysis, and factor analysis. A simple way to bootstrap confidence regions is also provided. Much of the research on robust multivariate analysis in this book is being published for the first time. The text is suitable for a first course in Multivariate Statistical Analysis or a first course in Robust Statistics. This graduate text is also useful for people who are familiar with the traditional multivariate topics, but want to know more about handling data sets with...
Multivariate analysis with LISREL
Jöreskog, Karl G; Y Wallentin, Fan
2016-01-01
This book traces the theory and methodology of multivariate statistical analysis and shows how it can be conducted in practice using the LISREL computer program. It presents not only the typical uses of LISREL, such as confirmatory factor analysis and structural equation models, but also several other multivariate analysis topics, including regression (univariate, multivariate, censored, logistic, and probit), generalized linear models, multilevel analysis, and principal component analysis. It provides numerous examples from several disciplines and discusses and interprets the results, illustrated with sections of output from the LISREL program, in the context of the example. The book is intended for masters and PhD students and researchers in the social, behavioral, economic and many other sciences who require a basic understanding of multivariate statistical theory and methods for their analysis of multivariate data. It can also be used as a textbook on various topics of multivariate statistical analysis.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
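A minimal sketch of the idea, under stated assumptions: it uses the simplest (identity-weighted) form of reduced rank regression, in which the ordinary least squares coefficients are projected onto the leading singular directions of the fitted values. The canonical-correlation form mentioned in the abstract corresponds to a different weighting of the responses and is not reproduced here. The function name `reduced_rank_regression` and the simulated data are illustrative only.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    """Least squares fit of Y ~ X @ B subject to rank(B) <= rank (identity weighting assumed)."""
    # Unrestricted OLS coefficient matrix, shape (p, m)
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Right singular vectors of the fitted values give the dominant response directions
    _, _, Vt = np.linalg.svd(X @ B_ols, full_matrices=False)
    V_r = Vt[:rank].T                      # shape (m, rank)
    # Projecting the OLS coefficients onto those directions enforces the rank constraint
    return B_ols @ V_r @ V_r.T

# Simulated example (hypothetical data): 5 predictors, 4 responses, true rank 1
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
B_true = rng.normal(size=(5, 1)) @ rng.normal(size=(1, 4))
Y = X @ B_true + 0.01 * rng.normal(size=(200, 4))
B_hat = reduced_rank_regression(X, Y, rank=1)
```

The projection step is what distinguishes the reduced rank fit from running a separate regression for each response: all responses share the same low-dimensional combination of the predictors.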
Cormanich, Rodrigo A; Goodarzi, Mohammad; Freitas, Matheus P
2009-02-01
Inhibition of the tyrosine kinase enzyme WEE1 is an important step in the treatment of cancer. The bioactivities of a series of WEE1 inhibitors have been previously modeled through comparative molecular field analyses (CoMFA and CoMSIA), but a two-dimensional image-based quantitative structure-activity relationship approach has been shown to be highly predictive for other compound classes. This method, called multivariate image analysis applied to quantitative structure-activity relationship, was applied here to derive quantitative structure-activity relationship models. Whilst the well-known bilinear and multilinear partial least squares regressions (PLS and N-PLS, respectively) correlated multivariate image analysis descriptors with the corresponding dependent variables only reasonably well, the use of wavelet and principal component ranking as variable selection methods, together with least-squares support vector machine, significantly improved the prediction statistics. These recently implemented mathematical tools, particularly novel in quantitative structure-activity relationship studies, represent an important advance for the development of more predictive quantitative structure-activity relationship models and, consequently, new drugs.
Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-05
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, in contrast, failed to determine the quaternary mixture components simultaneously; it was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.
Shimaponda-Mataa, Nzooma M; Tembo-Mwase, Enala; Gebreslasie, Michael; Achia, Thomas N O; Mukaratirwa, Samson
2017-11-01
Although malaria morbidity and mortality are greatly reduced globally owing to great control efforts, the disease remains the main contributor. In Zambia, all provinces are malaria endemic. However, the transmission intensities vary mainly depending on environmental factors as they interact with the vectors. Generally in Africa, possibly due to the varying perspectives and methods used, there is variation on the relative importance of malaria risk determinants. In Zambia, the role climatic factors play on malaria case rates has not been determined in combination of space and time using robust methods in modelling. This is critical considering the reversal in malaria reduction after the year 2010 and the variation by transmission zones. Using a geoadditive or structured additive semiparametric Poisson regression model, we determined the influence of climatic factors on malaria incidence in four endemic provinces of Zambia. We demonstrate a strong positive association between malaria incidence and precipitation as well as minimum temperature. The risk of malaria was 95% lower in Lusaka (ARR=0.05, 95% CI=0.04-0.06) and 68% lower in the Western Province (ARR=0.31, 95% CI=0.25-0.41) compared to Luapula Province. North-western Province did not vary from Luapula Province. The effects of geographical region are clearly demonstrated by the unique behaviour and effects of minimum and maximum temperatures in the four provinces. Environmental factors such as landscape in urbanised places may also be playing a role. Copyright © 2017 Elsevier B.V. All rights reserved.
Wei, Wudi; Jiang, Junjun; Liang, Hao; Gao, Lian; Liang, Bingyu; Huang, Jiegang; Zang, Ning; Liao, Yanyan; Yu, Jun; Lai, Jingzhen; Qin, Fengxiang; Su, Jinming; Ye, Li; Chen, Hui
2016-01-01
Hepatitis is a serious public health problem with increasing cases and property damage in Heng County. It is necessary to develop a model to predict the hepatitis epidemic that could be useful for preventing this disease. The autoregressive integrated moving average (ARIMA) model and the generalized regression neural network (GRNN) model were used to fit the incidence data from the Heng County CDC (Center for Disease Control and Prevention) from January 2005 to December 2012. Then, the ARIMA-GRNN hybrid model was developed. The incidence data from January 2013 to December 2013 were used to validate the models. Several parameters, including mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE) and mean square error (MSE), were used to compare the performance among the three models. The morbidity of hepatitis from January 2005 to December 2012 showed seasonal variation and a slightly rising trend. The ARIMA(0,1,2)(1,1,1)12 model was the most appropriate one, with the residual test showing a white noise sequence. The smoothing factors of the basic GRNN model and the combined model were 1.8 and 0.07, respectively. The four parameters of the hybrid model were lower than those of the two single models in the validation. The parameter values of the GRNN model were the lowest of the three models in the fitting. The hybrid ARIMA-GRNN model showed better hepatitis incidence forecasting in Heng County than the single ARIMA model and the basic GRNN model. It is a potential decision-support tool for controlling hepatitis in Heng County.
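The four comparison measures named in the abstract have standard definitions, which the following minimal sketch computes. The helper name `forecast_errors` and the three-point series are hypothetical and are not the Heng County data.

```python
import math

def forecast_errors(actual, predicted):
    """Return MAE, MSE, RMSE and MAPE (in percent) for paired observations and forecasts."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE divides by the observed value, so it is undefined when an observation is zero
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}

# Hypothetical values for illustration only
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
metrics = forecast_errors(actual, predicted)
```

Lower values indicate a closer fit on all four measures, which is the sense in which the abstract compares the models.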
Directory of Open Access Journals (Sweden)
Ferreira Márcia M. C.
2002-01-01
In this work, the chemometric techniques most frequently used in QSAR (quantitative structure-activity relationships) studies are reviewed. They are introduced in chronological order, beginning with Hansch analysis and the exploratory data analysis methods of principal components and hierarchical clustering (PCA and HCA). Principal component regression and partial least squares regression methods (PCR and PLS) are discussed, followed by the pattern recognition methods (KNN and SIMCA). Different applications are presented to illustrate these chemometric techniques. The methodology used for regression in 3D-QSAR is presented (unfolding PLS). Finally, the higher order method called Multilinear PLS, already used in analytical chemistry but not yet explored by the QSAR community, is introduced. This method maintains the multiway structure of the data and has several advantages over bilinear PLS including speed in calculation, simplicity and stability, since the number of parameters to be estimated can be greatly reduced.
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Stepwise linear regression is a multi-variable regression method for identifying statistically significant variables in the linear regression equation. In the present study, we present a Matlab program for stepwise regression.
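The following is a hedged sketch of the forward half of a stepwise procedure, written in Python rather than Matlab and not taken from the authors' program. It assumes a fixed partial F-to-enter threshold (`f_to_enter=4.0`, a conventional choice used here as an assumption) in place of a p-value, and it omits the removal step that a full stepwise procedure would also perform. The function name and simulated data are illustrative.

```python
import numpy as np

def forward_stepwise(X, y, f_to_enter=4.0):
    """Add predictors one at a time while the best candidate's partial F exceeds the threshold."""
    n, p = X.shape

    def sse(cols):
        # Residual sum of squares of an OLS fit with an intercept and the given columns
        A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        r = y - A @ coef
        return float(r @ r)

    selected = []
    current = sse(selected)
    while len(selected) < p:
        trial = {j: sse(selected + [j]) for j in range(p) if j not in selected}
        best = min(trial, key=trial.get)
        # Residual degrees of freedom after adding the candidate: n - (k + 1) - 1
        df_resid = n - len(selected) - 2
        f_stat = (current - trial[best]) / (trial[best] / df_resid)
        if f_stat < f_to_enter:
            break
        selected.append(best)
        current = trial[best]
    return selected

# Simulated example (hypothetical data): only columns 0 and 2 truly matter
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + rng.normal(scale=0.5, size=100)
selected = forward_stepwise(X, y)
```

Because the threshold is applied at every step, an irrelevant column can occasionally enter by chance, which is one reason stepwise results are usually checked on independent data.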
Ferrao M.F.; Mello C.; Borin A.; Maretto D.A.; Poppi R.J.
2007-01-01
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM for determining R², RMSECV and RMSEP values. LS-SVMs show superior performance for quantifying starch, whey and sucrose in powdered milk samples in relation to PLSR. This study shows that it is possible to determine pre...
Directory of Open Access Journals (Sweden)
Marco F. Ferrão
2007-08-01
Least-squares support vector machines (LS-SVM) were used as an alternative multivariate calibration method for the simultaneous quantification of some common adulterants found in powdered milk samples, using near-infrared spectroscopy. Excellent models were built using LS-SVM for determining R², RMSECV and RMSEP values. LS-SVMs show superior performance for quantifying starch, whey and sucrose in powdered milk samples in relation to PLSR. This study shows that it is possible to determine precisely the amount of one and two common adulterants simultaneously in powdered milk samples using LS-SVM and NIR spectra.
Tully, P J; Turnbull, D A; Beltrame, J; Horowitz, J; Cosh, S; Baumeister, H; Wittert, G A
2015-10-01
Substantial healthcare resources are devoted to panic disorder (PD) and coronary heart disease (CHD); however, the association between these conditions remains controversial. Our objective was to conduct a systematic review of studies assessing the association between PD, related syndromes, and incident CHD. Relevant studies were retrieved from Medline, EMBASE, SCOPUS and PsycINFO without restrictions from inception to January 2015 supplemented with hand-searching. We included studies that reported hazard ratios (HR) or sufficient data to calculate the risk ratio and 95% confidence interval (CI) which were pooled using a random-effects model. Studies utilizing self-reported CHD were ineligible. Twelve studies were included comprising 1 131 612 persons and 58 111 incident CHD cases. PD was associated with the primary incident CHD endpoint [adjusted HR (aHR) 1.47, 95% CI 1.24-1.74, p < 0.00001] even after excluding angina (aHR 1.49, 95% CI 1.22-1.81, p < 0.00001). High to moderate quality evidence suggested an association with incident major adverse cardiac events (MACE; aHR 1.40, 95% CI 1.16-1.69, p = 0.0004) and myocardial infarction (aHR 1.36, 95% CI 1.12-1.66, p = 0.002). The risk for CHD was significant after excluding depression (aHR 1.64, 95% CI 1.45-1.85) and after depression adjustment (aHR 1.38, 95% CI 1.03-1.87). Age, sex, length of follow-up, socioeconomic status and diabetes were sources of heterogeneity in the primary endpoint. Meta-analysis showed that PD was independently associated with incident CHD, myocardial infarction and MACE; however, reverse causality cannot be ruled out and there was evidence of heterogeneity.
Multivariable modeling and multivariate analysis for the behavioral sciences
Everitt, Brian S
2009-01-01
Multivariable Modeling and Multivariate Analysis for the Behavioral Sciences shows students how to apply statistical methods to behavioral science data in a sensible manner. Assuming some familiarity with introductory statistics, the book analyzes a host of real-world data to provide useful answers to real-life issues. The author begins by exploring the types and design of behavioral studies. He also explains how models are used in the analysis of data. After describing graphical methods, such as scatterplot matrices, the text covers simple linear regression, locally weighted regression, multip
Directory of Open Access Journals (Sweden)
Atousa Fakherpour
2018-01-01
Background and Aims: Although spinal anaesthesia (SA) is nowadays the preferred anaesthesia technique for caesarean section (CS), it is associated with considerable haemodynamic effects, such as maternal hypotension. This study aimed to evaluate a wide range of variables (related to parturient and anaesthesia techniques) associated with the incidence of different degrees of SA-induced hypotension during elective CS. Methods: This prospective study was conducted on 511 mother–infant pairs, in which the mother underwent elective CS under SA. The data were collected through a preset proforma containing three parts related to the parturient, anaesthetic techniques and a table for recording maternal blood pressure. It was hypothesized that some maternal (such as age) and anaesthesia-related risk factors (such as block height) were associated with the occurrence of SA-induced hypotension during elective CS. Results: The incidence of mild, moderate and severe hypotension was 20%, 35% and 40%, respectively. Eventually, ten risk factors were found to be associated with hypotension, including age >35 years, body mass index ≥25 kg/m², 11–20 kg weight gain, gravidity ≥4, history of hypotension, baseline systolic blood pressure (SBP 100 beats/min in maternal modelling, fluid preloading ≥1000 ml, adding sufentanil to bupivacaine and sensory block height >T4 in anaesthesia-related modelling (P < 0.05). Conclusion: Age, body mass index, weight gain, gravidity, history of hypotension, baseline SBP and heart rate, fluid preloading, adding sufentanil to bupivacaine and sensory block height were the main risk factors identified in the study for SA-induced hypotension during CS.
Van Damme, Nele; Van den Bussche, Karen; De Meyer, Dorien; Van Hecke, Ann; Verhaeghe, Sofie; Beeckman, Dimitri
2017-10-01
The aim of this study was to identify characteristics independently associated with a higher risk of developing skin damage because of incontinence [incontinence-associated dermatitis (IAD) category 2] in nursing home residents. As part of a larger randomised controlled trial, IAD incidence was monitored for 1 month in a sample of 381 incontinent residents using a validated IAD Severity Categorisation Tool. Data on demographical, physical, functional and psychological characteristics were collected. The overall IAD incidence (category 1-2) was 30·0%, and 6% of the participants developed skin damage (IAD category 2). Residents who developed IAD category 2 were less mobile [odds ratio (OR) 2·72, 95% confidence interval (CI) 1·06-6·94], had more friction and shear issues (OR 2·54; 95% CI 1·02-6·33) and had more erythema due to incontinence (OR 3·02; 95% CI 1·04-8·73) before IAD category 2 occurrence. Care providers should give full attention to risk factors to both detect residents at risk for IAD development and to start prevention in time. © 2016 Medicalhelplines.com Inc and John Wiley & Sons Ltd.
Energy Technology Data Exchange (ETDEWEB)
Lima, Reginaldo Agapito de [Centro Universitario de Itajuba, MG (Brazil)], email: reginaldo_agapito@yahoo.com.br; Ribeiro Junior, Leopoldo Uberto [Voltalia Energia do Brasil, Sao Paulo, SP (Brazil)], email: leopoldo_junior@yahoo.com.br
2010-07-01
For the implementation of a small hydropower plant (SHP), the dam is the main structure, and its sizing represents 30% to 50% of the overall cost of the civil works. It is therefore very important to have a fast, didactic and accurate tool for preparing a budget, one that also allows a quantitative analysis of the costs inherent in the civil construction of concrete dams for small hydropower plants. To this end, the multivariate regression tool is very important, as it allows preliminary costs of concrete dams to be established quickly and correctly, even if approximately, which eases budgeting and guides feasibility decisions on selecting or discarding new head alternatives. (author)
Introduction to multivariate discrimination
Kégl, Balázs
2013-07-01
Multivariate discrimination or classification is one of the best-studied problems in machine learning, with a plethora of well-tested and well-performing algorithms. There are also several good general textbooks [1-9] on the subject written for an average engineering, computer science, or statistics graduate student; most of them are also accessible to an average physics student with some background in computer science and statistics. Hence, instead of writing a generic introduction, we concentrate here on relating the subject to the practicing experimental physicist. After a short introduction on the basic setup (Section 1) we delve into the practical issues of complexity regularization, model selection, and hyperparameter optimization (Section 2), since it is this step that makes high-complexity non-parametric fitting so different from low-dimensional parametric fitting. To emphasize that this issue is not restricted to classification, we illustrate the concept on a low-dimensional but non-parametric regression example (Section 2.1). Section 3 describes the common algorithmic-statistical formal framework that unifies the main families of multivariate classification algorithms. We explain here the large-margin principle that partly explains why these algorithms work. Section 4 is devoted to the description of the three main (families of) classification algorithms: neural networks, the support vector machine, and AdaBoost. We do not go into the algorithmic details; the goal is to give an overview of the form of the functions these methods learn and of the objective functions they optimize. Besides their technical description, we also make an attempt to put these algorithms into a socio-historical context. We then briefly describe some rather heterogeneous applications to illustrate the pattern recognition pipeline and to show how widespread the use of these methods is (Section 5). We conclude the chapter with three essentially open research problems that are either
Al-Khatib, Issam A; Abu Fkhidah, Ismail; Khatib, Jumana I; Kontogianni, Stamatia
2016-03-01
Forecasting of hospital solid waste generation is a critical challenge for future planning. The methodology proposed in this article was applied to the composition and generation rate of solid waste in hospital units, in order to validate the results and secure the outcomes of the management plan in national hospitals. A set of three multiple-variable regression models has been derived for estimating the daily total hospital waste, general hospital waste, and total hazardous waste as a function of number of inpatients, number of total patients, and number of beds. The application of several key indicators and validation procedures indicates the high significance and reliability of the developed models in predicting the hospital solid waste of any hospital. Methodology data were drawn from the existing scientific literature. Useful raw data were also retrieved from international organisations and from the personnel of the investigated hospitals. The primary generation outcomes are compared with those of other local hospitals and of hospitals in other countries. The main outcome, the results of the developed models, is presented and analysed thoroughly. The goal is for this model to act as leverage in the discussions among governmental authorities on the implementation of a national plan for safe hospital waste management in Palestine. © The Author(s) 2016.
Grégoire, G.
2014-12-01
Logistic regression was originally intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and the odds ratio, which are presented in the introduction of the chapter. When the observations are obtained individually, we speak of binary logistic regression. When they are grouped, the logistic regression is said to be binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihood ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effect, individual effect, selection of variables to build a model, measure of the fitness of the model, prediction of new values… . The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is, polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
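As a small illustration of the odds-ratio interpretation and of maximum likelihood fitting in the binary case, the sketch below fits a one-covariate model by Newton-Raphson. It is written in Python rather than the R used in the chapter, the function name `fit_logistic` is illustrative, and the 2x2 data are hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Maximum likelihood fit by Newton-Raphson; X must already contain an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))     # fitted event probabilities
        grad = X.T @ (y - p)                    # score vector
        hess = X.T @ (X * (p * (1.0 - p))[:, None])   # observed information matrix
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Hypothetical data: 50 unexposed subjects (10 events) and 50 exposed subjects (25 events)
x = np.array([0] * 50 + [1] * 50)
y = np.array([1] * 10 + [0] * 40 + [1] * 25 + [0] * 25)
X = np.column_stack([np.ones_like(x, dtype=float), x])
beta = fit_logistic(X, y)
odds_ratio = np.exp(beta[1])
```

With a single binary covariate the model is saturated, so the exponentiated slope reproduces the odds ratio of the 2x2 table: odds of 25/25 among the exposed against 10/40 among the unexposed, a ratio of 4.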
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power per kilogram of a certain radioactive isotope with the polynomial regression method, the paper first demonstrates the broad usage of polynomial functions and derives their parameters by ordinary least squares estimation. A significance test for the polynomial regression function is then derived from the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and the significance test of the polynomial function are applied to the decay heating power of the isotope per kilogram, in accordance with the authors' own work. (authors)
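The similarity the abstract relies on is that a polynomial in x is linear in its coefficients, so the powers of x can be treated as the columns of a multiple linear regression. A minimal sketch under that reading follows; the overall F statistic shown is the standard one for multiple linear regression and is assumed, not confirmed, to match the test derived in the paper. The function name and the simulated data are illustrative and are not the decay heat measurements.

```python
import numpy as np

def polynomial_fit_with_f(x, y, degree):
    """OLS polynomial fit plus the overall F statistic of the regression."""
    # Columns 1, x, x^2, ..., x^degree: the model is linear in the coefficients
    X = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    sse = float(np.sum((y - fitted) ** 2))          # residual sum of squares
    ssr = float(np.sum((fitted - y.mean()) ** 2))   # regression sum of squares
    df_model, df_resid = degree, len(x) - degree - 1
    f_stat = (ssr / df_model) / (sse / df_resid)
    return coef, f_stat, (df_model, df_resid)

# Simulated example (hypothetical data): quadratic trend with small noise
x = np.linspace(0.0, 4.0, 30)
rng = np.random.default_rng(1)
y = 1.0 + 2.0 * x - 0.5 * x**2 + rng.normal(scale=0.01, size=x.size)
coef, f_stat, dof = polynomial_fit_with_f(x, y, degree=2)
```

The F statistic would be compared with the critical value of the F distribution on the returned degrees of freedom; that lookup is left out to keep the sketch dependency-free beyond NumPy.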
Applied multivariate statistics with R
Zelterman, Daniel
2015-01-01
This book brings the power of multivariate statistics to graduate-level practitioners, making these analytical methods accessible without lengthy mathematical derivations. Using the open source, shareware program R, Professor Zelterman demonstrates the process and outcomes for a wide array of multivariate statistical applications. Chapters cover graphical displays, linear algebra, univariate, bivariate and multivariate normal distributions, factor methods, linear regression, discrimination and classification, clustering, time series models, and additional methods. Zelterman uses practical examples from diverse disciplines to welcome readers from a variety of academic specialties. Those with backgrounds in statistics will learn new methods while they review more familiar topics. Chapters include exercises, real data sets, and R implementations. The data are interesting, real-world topics, particularly from health and biology-related contexts. As an example of the approach, the text examines a sample from the B...
Bang, Casper N; Devereux, Richard B; Okin, Peter M
2014-01-01
Cornell product criteria, Sokolow-Lyon voltage criteria and electrocardiographic (ECG) strain (secondary ST-T abnormalities) are markers for left ventricular hypertrophy (LVH) and adverse prognosis in population studies. However, the relationship of regression of ECG LVH and strain during antihypertensive therapy to cardiovascular (CV) risk was unclear before the Losartan Intervention for Endpoint Reduction in Hypertension (LIFE) study. We reviewed findings on ECG LVH regression and strain over time in 9193 hypertensive patients with ECG LVH at baseline enrolled in the LIFE study. The composite endpoint of CV death, nonfatal MI, or stroke occurred in 1096 patients during 4.8±0.9 years of follow-up. In Cox multivariable models adjusting for randomized treatment, known risk factors including in-treatment blood pressure, and for severity of ECG LVH by Cornell product and Sokolow-Lyon voltage, baseline ECG strain was associated with a 33% higher risk of the LIFE composite endpoint (HR 1.33, 95% CI [1.11-1.59]). Development of new ECG strain between baseline and year 1 was associated with a 2-fold increased risk of the composite endpoint (HR 2.05, 95% CI [1.51-2.78]), whereas the risk associated with regression or persistence of ECG strain was attenuated and no longer statistically significant (both p>0.05). After controlling for treatment with losartan or atenolol, for baseline Framingham risk score, Cornell product, and Sokolow-Lyon voltage, and for baseline and in-treatment systolic and diastolic blood pressure, 1 standard deviation (SD) lower in-treatment Cornell product was associated with a 14.5% decrease in the composite endpoint (HR 0.86, 95% CI [0.82-0.90]). In a parallel analysis, 1 SD lower in-treatment Sokolow-Lyon voltage was associated with a 16.6% decrease in the composite endpoint (HR 0.83, 95% CI [0.78-0.88]). The LIFE study shows that evaluation of both baseline and in-study ECG LVH defined by Cornell product criteria, Sokolow-Lyon voltage criteria or
Directory of Open Access Journals (Sweden)
Igor K. Kochanenko
2013-01-01
Procedures for constructing a regression curve by the criterion of least fractals, i.e. the greatest probability of the sums of powers of the least deviations of measured intensities from their modelled values, are justified. The exponent is defined as the fractal dimension of a time series. The difference between the results of the justified method and those of the method of least squares is quantitatively estimated.
Sunspot Cycle Prediction Using Multivariate Regression and Binary ...
Indian Academy of Sciences (India)
49
engineering decision making. In the present study, the sunspot cycle prediction has been carried out by a hybrid model which employs ...... 6) Dikpati M, De Toma G and Gilman P A 2006 Predicting the strength of solar cycle 24 using a flux-transport dynamo-based tool; Geophys Res lett. 33(5). L05102. 7) Drecher P E, Little ...
Sunspot Cycle Prediction Using Multivariate Regression and Binary ...
Indian Academy of Sciences (India)
49
The flare index correlates well with various parameters of the solar activity. ... Neural networks. In climatological method (Osherovich et al 2008; Wang et al 2002), forecasts assume that the future of a system can be determined from the ... Finally neural network forecasts (Quassim et al 2007; Parsapoor et al 2015) are.
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable) is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables) and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables) also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to represent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
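The complex-number encoding described in this abstract can be sketched in a few lines. This is a minimal illustration, not the author's formulation: the data are synthetic, the function names `fit_complex` and `fit_real_isomorphic` and the coefficient values are made up for the example, and only planar (2-D) vectors are considered. It fits the complex model directly and again through an equivalent real-valued block system, which is one way to read the "isomorphism" mentioned above.

```python
import numpy as np

def fit_complex(Z, y):
    """Least-squares estimate of complex coefficients b in y ≈ Z @ b."""
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return b

def fit_real_isomorphic(Z, y):
    """Same fit after mapping the complex model to an equivalent real one."""
    # a + ib acts on real pairs as the 2x2 block [[a, -b], [b, a]].
    A = np.block([[Z.real, -Z.imag],
                  [Z.imag,  Z.real]])
    rhs = np.concatenate([y.real, y.imag])
    coef, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    k = Z.shape[1]
    return coef[:k] + 1j * coef[k:]

rng = np.random.default_rng(0)
n = 200
# Two explanatory planar vectors per observation, stored as x + i*y.
Z = rng.normal(size=(n, 2)) + 1j * rng.normal(size=(n, 2))
b_true = np.array([0.8 + 0.3j, -0.5 + 1.2j])  # hypothetical vector coefficients
noise = 0.05 * (rng.normal(size=n) + 1j * rng.normal(size=n))
y = Z @ b_true + noise

b_complex = fit_complex(Z, y)
b_real = fit_real_isomorphic(Z, y)
```

Both routes minimise the same sum of squared real and imaginary residuals, so the two estimates should agree up to rounding; each estimated coefficient is itself a vector (a rotation plus a scaling), which is the property the abstract highlights.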
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
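A rough sketch of the idea of adapting the input metric by minimising a cross-validation error follows. It is an assumption-laden simplification, not the authors' algorithm: it uses Nadaraya-Watson regression with a Gaussian kernel, a diagonal metric, leave-one-out error, and a coarse grid search in place of a proper optimiser. The data and the names `loo_error` and `best_scales` are invented for illustration.

```python
import numpy as np
from itertools import product

def loo_error(X, y, scales):
    """Leave-one-out squared error of Nadaraya-Watson regression
    with a Gaussian kernel and a diagonal input metric."""
    Xs = X * scales                                   # rescale each input dimension
    d2 = ((Xs[:, None, :] - Xs[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-0.5 * d2)
    np.fill_diagonal(W, 0.0)                          # exclude each point from its own prediction
    pred = (W @ y) / np.maximum(W.sum(axis=1), 1e-12)
    return float(np.mean((y - pred) ** 2))

rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(150, 2))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=150)  # only dimension 0 carries signal

# Candidate per-dimension scales; a scale of 0 switches a dimension off.
grid = [0.0, 1.0, 3.0, 10.0]
best_scales = min(product(grid, repeat=2),
                  key=lambda s: loo_error(X, y, np.array(s)))
isotropic_err = loo_error(X, y, np.array([3.0, 3.0]))
adapted_err = loo_error(X, y, np.array(best_scales))
```

Because the isotropic setting is one of the grid candidates, the adapted metric can only match or improve on it in leave-one-out error. On data like these the search would be expected to shrink the scale of the irrelevant dimension, which is the variable-selection effect the abstract refers to.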
Constipation and Incident CKD.
Sumida, Keiichi; Molnar, Miklos Z; Potukuchi, Praveen K; Thomas, Fridtjof; Lu, Jun Ling; Matsushita, Kunihiro; Yamagata, Kunihiro; Kalantar-Zadeh, Kamyar; Kovesdy, Csaba P
2017-04-01
Constipation is one of the most prevalent conditions in primary care settings and increases the risk of cardiovascular disease, potentially through processes mediated by altered gut microbiota. However, little is known about the association of constipation with CKD. In a nationwide cohort of 3,504,732 United States veterans with an eGFR ≥60 ml/min per 1.73 m², we examined the association of constipation status and severity (absent, mild, or moderate/severe), defined using diagnostic codes and laxative use, with incident CKD, incident ESRD, and change in eGFR in Cox models (for time-to-event analyses) and multinomial logistic regression models (for change in eGFR). Among patients, the mean (SD) age was 60.0 (14.1) years old; 93.2% of patients were men, and 24.7% were diabetic. After multivariable adjustments, compared with patients without constipation, patients with constipation had higher incidence rates of CKD (hazard ratio, 1.13; 95% confidence interval [95% CI], 1.11 to 1.14) and ESRD (hazard ratio, 1.09; 95% CI, 1.01 to 1.18) and faster eGFR decline (multinomial odds ratios for eGFR slope constipation associated with an incrementally higher risk for each renal outcome. In conclusion, constipation status and severity associate with higher risk of incident CKD and ESRD and with progressive eGFR decline, independent of known risk factors. Further studies should elucidate the underlying mechanisms. Copyright © 2017 by the American Society of Nephrology.
Directory of Open Access Journals (Sweden)
Susana de Paula Risso
2011-06-01
retrieved data obtained from the Brazilian Birth and Death Certificates of neonates born to mothers living in São José dos Campos, Brazil, from 2003 up to 2004. Variables associated with neonatal death were analyzed by multivariate analysis using the Cox model. Independent variables were: maternal age, maternal educational level, number of previous stillbirths, number of children alive in the family, single or multiple pregnancy, gestation length, type of delivery, sex, birth weight, 1st and 5th minute Apgar scores. Significance was set at p<0.05. RESULTS: There were 131 deaths up to the 28th day after birth during the study period. Results were expressed in relative risk (RR) and 95% confidence intervals (CI). Gestational age <37 weeks (RR 6.92; 95%CI 3.64-13.17), 5th minute Apgar score <7 (RR 3.14; 95%CI 1.95-5.04), 1st minute Apgar score <7 (RR 3.48; CI 2.17-5.60) and low birth weight (RR 4.49; 95%CI 3.36-8.53) were associated with neonatal death in the final model. CONCLUSIONS: Variables associated with neonatal death in São José dos Campos, Brazil, are related to quality of health care during prenatal and perinatal periods.
Estrogen receptor polymorphisms and incident dementia: the prospective 3C study.
Ryan, Joanne; Carrière, Isabelle; Carcaillon, Laure; Dartigues, Jean-Francois; Auriacombe, Sophie; Rouaud, Olivier; Berr, Claudine; Ritchie, Karen; Scarabin, Pierre-Yves; Ancelin, Marie-Laure
2014-01-01
BACKGROUND: Genetic variation in the estrogen receptor (ESR) may be associated with the incidence of Alzheimer's disease (AD), but this association could be modified by genetic and environmental factors. METHODS: The association between five ESR α (ESR1) and β (ESR2) polymorphisms with 7-year dementia incidence was examined among 6959 older men and women from the Three City Study using multivariate-adjusted Cox regression models with delayed entry. Gender, the apolipop...
Continuous multivariate exponential extension
International Nuclear Information System (INIS)
Block, H.W.
1975-01-01
The Freund-Weinman multivariate exponential extension is generalized to the case of nonidentically distributed marginal distributions. A fatal shock model is given for the resulting distribution. Results in the bivariate case and the concept of constant multivariate hazard rate lead to a continuous distribution related to the multivariate exponential distribution (MVE) of Marshall and Olkin. This distribution is shown to be a special case of the extended Freund-Weinman distribution. A generalization of the bivariate model of Proschan and Sullo leads to a distribution which contains both the extended Freund-Weinman distribution and the MVE.
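For readers unfamiliar with fatal shock models, the sketch below simulates the classical bivariate Marshall-Olkin exponential distribution referred to above, not the extended Freund-Weinman distribution of this report. The rate values and the function name `sample_marshall_olkin` are illustrative assumptions.

```python
import numpy as np

def sample_marshall_olkin(lam1, lam2, lam12, size, rng):
    """Draw (X, Y) from the bivariate Marshall-Olkin exponential distribution."""
    z1 = rng.exponential(1.0 / lam1, size)    # shock fatal to component 1 only
    z2 = rng.exponential(1.0 / lam2, size)    # shock fatal to component 2 only
    z12 = rng.exponential(1.0 / lam12, size)  # common shock fatal to both
    return np.minimum(z1, z12), np.minimum(z2, z12)

rng = np.random.default_rng(2)
lam1, lam2, lam12 = 1.0, 2.0, 0.5             # hypothetical shock rates
x, y = sample_marshall_olkin(lam1, lam2, lam12, 200_000, rng)

mean_x = x.mean()            # theory: 1 / (lam1 + lam12)
tie_prob = np.mean(x == y)   # theory: lam12 / (lam1 + lam2 + lam12)
```

The positive probability of exact ties, X = Y, comes from the common shock and is what makes the Marshall-Olkin distribution not absolutely continuous; the abstract's interest in a continuous relative of the MVE can be read against that background.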
Methods of Multivariate Analysis
Rencher, Alvin C
2012-01-01
Praise for the Second Edition "This book is a systematic, well-written, well-organized text on multivariate analysis packed with intuition and insight . . . There is much practical wisdom in this book that is hard to find elsewhere."-IIE Transactions Filled with new and timely content, Methods of Multivariate Analysis, Third Edition provides examples and exercises based on more than sixty real data sets from a wide variety of scientific fields. It takes a "methods" approach to the subject, placing an emphasis on how students and practitioners can employ multivariate analysis in real-life sit
Multivariate Time Series Search
National Aeronautics and Space Administration — Multivariate Time-Series (MTS) are ubiquitous, and are generated in areas as disparate as sensor recordings in aerospace systems, music and video streams, medical...
DEFF Research Database (Denmark)
Silvennoinen, Annastiina; Teräsvirta, Timo
This article contains a review of multivariate GARCH models. Most common GARCH models are presented and their properties considered. This also includes nonparametric and semiparametric models. Existing specification and misspecification tests are discussed. Finally, there is an empirical example...
Multivariate data analysis of 2 DE data
DEFF Research Database (Denmark)
Wulff, Tune; Jokumsen, Alfred; Jessen, Flemming
achieved by 2-DE. Protein spots, which individually or in combination with other spots varied according to hypoxia were found by multivariate data analysis (partial least squares regression) on group scaled data (normalised spot volumes) followed by selection of significant spots by jack-knifing. Tandem...
Prediction of longitudinal dispersion coefficient using multivariate ...
Indian Academy of Sciences (India)
and sn is sinuosity. The Buckingham theory was applied as dimensional analysis approach to derive effective dimensionless parameter on DL. Derived ...... 2015 Estimation of scour depth below free overfall spill- ways using multivariate adaptive regression splines and artificial neural networks; Engineering Applications of.
A MULTIVARIATE ANALYSIS OF CROATIAN COUNTIES ENTREPRENEURSHIP
Directory of Open Access Journals (Sweden)
Elza Jurun
2012-12-01
This paper focuses on a multivariate analysis of entrepreneurship in Croatian counties. The complete database available from official statistical institutions at the national and regional level is used. Modern econometric methodology, ranging from comparative analysis via multiple regression to multivariate cluster analysis, is applied, together with an analysis of successful or inefficacious entrepreneurship measured by indicators of efficiency, profitability and productivity. The comparative analysis covers the years 2004 and 2010. In the multiple regression model, the accelerators of socio-economic development (number of entrepreneur investors, investment in fixed assets, and current assets ratio) are analytically filtered from twenty-six independent variables as the variables with the dominant influence on GDP per capita in 2010, the dependent variable. The results of the multivariate cluster analysis of the twenty-one Croatian counties are also interpreted in terms of the three Croatian NUTS 2 regions according to the European nomenclature of regional territorial division.
International Nuclear Information System (INIS)
Prybutok, V.R.
1995-01-01
Risk associated with power generation must be identified to make intelligent choices between alternate power technologies. Radionuclide air stack emissions for a single coal plant and a single nuclear plant are used to compute the single plant leukemia incidence risk and total industry leukemia incidence risk. Leukemia incidence is the response variable as a function of radionuclide bone dose for the six proposed dose response curves considered. During normal operation a coal plant has higher radionuclide emissions than a nuclear plant and the coal industry has a higher leukemia incidence risk than the nuclear industry, unless a nuclear accident occurs. Variation of nuclear accident size allows quantification of the impact of accidents on the total industry leukemia incidence risk comparison. The leukemia incidence risk is quantified as the number of accidents of a given size for the nuclear industry leukemia incidence risk to equal the coal industry leukemia incidence risk. The general linear model is used to develop equations that relate the accident frequency required for equal industry risks to the magnitude of the nuclear emission. Exploratory data analysis revealed that the relationship between the natural log of accident number versus the natural log of accident size is linear. (Author)
Applied multivariate statistical analysis
Härdle, Wolfgang Karl
2015-01-01
Focusing on high-dimensional applications, this 4th edition presents the tools and concepts used in multivariate data analysis in a style that is also accessible for non-mathematicians and practitioners. It surveys the basic principles and emphasizes both exploratory and inferential statistics; a new chapter on Variable Selection (Lasso, SCAD and Elastic Net) has also been added. All chapters include practical exercises that highlight applications in different multivariate data analysis fields: in quantitative financial studies, where the joint dynamics of assets are observed; in medicine, where recorded observations of subjects in different locations form the basis for reliable diagnoses and medication; and in quantitative marketing, where consumers’ preferences are collected in order to construct models of consumer behavior. All of these examples involve high to ultra-high dimensions and represent a number of major fields in big data analysis. The fourth edition of this book on Applied Multivariate ...
Multivariate bubbles and antibubbles
Fry, John
2014-08-01
In this paper we develop models for multivariate financial bubbles and antibubbles based on statistical physics. In particular, we extend a rich set of univariate models to higher dimensions. Changes in market regime can be explicitly shown to represent a phase transition from random to deterministic behaviour in prices. Moreover, our multivariate models are able to capture some of the contagious effects that occur during such episodes. We are able to show that declining lending quality helped fuel a bubble in the US stock market prior to 2008. Further, our approach offers interesting insights into the spatial development of UK house prices.
Directory of Open Access Journals (Sweden)
Philippe Michel
The study objectives were to describe the incidence and the nature of patient safety incidents (PSIs) in primary care general practice settings, and to explore the association between these incidents and practice or organizational characteristics. GPs, randomly selected from a national influenza surveillance network (n = 800) across France, prospectively reported any incidents observed each day over a one-week period between May and July 2013. An incident was an event or circumstance that could have resulted, or did result, in harm to a patient, which the GP would not wish to recur. The primary outcome was the incidence of PSIs, which was determined by counting reports per total number of patient encounters. Reports were categorized using existing taxonomies. The association with practice and organizational characteristics was calculated using a negative binomial regression model. 127 GPs (participation rate 79%) reported 317 incidents, of which 270 were judged a posteriori to be preventable, among 12,348 encounters. 77% had no consequences for the patient. The incidence of reported PSIs was 26 per 1000 patient encounters per week (95% CI [23‰-28‰]). Incidents were three times more frequently related to the organization of healthcare than to knowledge and skills of health professionals, and especially to the workflow in the GPs' offices and to the communication between providers and with patients. Among GP characteristics, three were related with an increased incidence in the final multivariable model: length of consultation higher than 15 minutes, method of receiving radiological results (by fax compared to paper or email), and being in a multidisciplinary clinic compared with sole practitioners. Patient safety incidents (PSIs) occurred on average once every two days in the sampled GPs and 2% of them were associated with a definite possibility for harm. Studying the association between organizational features of general practices and PSIs remains a
DEFF Research Database (Denmark)
Hansen, Michael Adsetts Edberg
Interest in statistical methodology is increasing so rapidly in the astronomical community that accessible introductory material in this area is long overdue. This book fills the gap by providing a presentation of the most useful techniques in multivariate statistics. A wide-ranging annotated set...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Vol. 53, No. 3 (2017), pp. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OECD field: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Skopina, Maria; Protasov, Vladimir
2016-01-01
This book presents a systematic study of multivariate wavelet frames with matrix dilation, in particular, orthogonal and bi-orthogonal bases, which are a special case of frames. Further, it provides algorithmic methods for the construction of dual and tight wavelet frames with a desirable approximation order, namely compactly supported wavelet frames, which are commonly required by engineers. It particularly focuses on methods of constructing them. Wavelet bases and frames are actively used in numerous applications such as audio and graphic signal processing, compression and transmission of information. They are especially useful in image recovery from incomplete observed data due to the redundancy of frame systems. The construction of multivariate wavelet frames, especially bases, with desirable properties remains a challenging problem as although a general scheme of construction is well known, its practical implementation in the multidimensional setting is difficult. Another important feature of wavelet is ...
Multivariate calculus and geometry
Dineen, Seán
2014-01-01
Multivariate calculus can be understood best by combining geometric insight, intuitive arguments, detailed explanations and mathematical reasoning. This textbook has successfully followed this programme. It additionally provides a solid description of the basic concepts, via familiar examples, which are then tested in technically demanding situations. In this new edition the introductory chapter and two of the chapters on the geometry of surfaces have been revised. Some exercises have been replaced and others provided with expanded solutions. Familiarity with partial derivatives and a course in linear algebra are essential prerequisites for readers of this book. Multivariate Calculus and Geometry is aimed primarily at higher level undergraduates in the mathematical sciences. The inclusion of many practical examples involving problems of several variables will appeal to mathematics, science and engineering students.
Multivariate rational data fitting
Cuyt, Annie; Verdonk, Brigitte
1992-12-01
Sections 1 and 2 discuss the advantages of an object-oriented implementation, combined with higher floating-point arithmetic, of the algorithms available for multivariate data fitting using rational functions. Section 1 will in particular explain what we mean by "higher arithmetic". Section 2 will concentrate on the concepts of "object orientation". In sections 3 and 4 we shall describe the generality of the data structure that can be dealt with: due to some new results virtually every data set is acceptable right now, with possible coalescence of coordinates or points. In order to solve the multivariate rational interpolation problem the data sets are fed to different algorithms depending on the structure of the interpolation points in the n-variate space.
Multivariate Statistical Process Control
DEFF Research Database (Denmark)
Kulahci, Murat
2013-01-01
As sensor and computer technology continues to improve, it becomes a normal occurrence that we are confronted with high-dimensional data sets. As in many areas of industrial statistics, this brings forth various challenges in statistical process control (SPC) and monitoring for which the aim...... is to identify the “out-of-control” state of a process using control charts in order to reduce the excessive variation caused by so-called assignable causes. In practice, the most common method of monitoring multivariate data is through a statistic akin to Hotelling's T². For high dimensional data with excessive...... in conjunction with image data are plagued with various challenges beyond the usual ones encountered in current applications. In this presentation we will introduce the basic ideas of SPC and the multivariate control charts commonly used in industry. We will further discuss the challenges the practitioners...
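As a concrete illustration of the Hotelling-type statistic mentioned in this abstract, the sketch below computes T² for new observations against an in-control reference sample. It is a minimal example on simulated data; the function name `hotelling_t2`, the covariance values, and the two test points are assumptions made for the illustration, and no control limit is derived.

```python
import numpy as np

def hotelling_t2(reference, new_obs):
    """T² distance of each new observation from the reference mean,
    measured in the metric of the reference covariance matrix."""
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    diff = np.atleast_2d(new_obs) - mean
    # Solve cov @ s = diff' instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)

rng = np.random.default_rng(3)
cov_true = np.array([[1.0, 0.8],
                     [0.8, 1.0]])              # two strongly correlated variables
reference = rng.multivariate_normal([0.0, 0.0], cov_true, size=500)

follows_pattern = np.array([1.0, 1.0])         # consistent with the correlation
breaks_pattern = np.array([1.0, -1.0])         # same marginal size, wrong direction
t2_in, t2_out = hotelling_t2(reference, np.vstack([follows_pattern, breaks_pattern]))
```

Both points are one unit from the mean in each variable, so separate univariate charts would treat them alike; T² separates them because it accounts for the correlation structure. In practice the statistic is compared against a control limit, typically based on an F or chi-square distribution.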
Intelligent multivariate process supervision
International Nuclear Information System (INIS)
Visuri, Pertti.
1986-01-01
This thesis addresses the difficulties encountered in managing large amounts of data in supervisory control of complex systems. Some previous alarm and disturbance analysis concepts are reviewed and a method for improving the supervision of complex systems is presented. The method, called multivariate supervision, is based on adding low level intelligence to the process control system. By using several measured variables linked together by means of deductive logic, the system can take into account the overall state of the supervised system. Thus, it can present to the operators fewer messages with higher information content than the conventional control systems which are based on independent processing of each variable. In addition, the multivariate method contains a special information presentation concept for improving the man-machine interface. (author)
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole; Hansen, Peter Reinhard; Lunde, Asger
We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement noise of certain types and can also handle non-synchronous trading. It is the first estimator...... returns measured over 5 or 10 minutes intervals. We show the new estimator is substantially more precise....
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Hansen, Peter Reinhard; Lunde, Asger
2011-01-01
We propose a multivariate realised kernel to estimate the ex-post covariation of log-prices. We show this new consistent estimator is guaranteed to be positive semi-definite and is robust to measurement error of certain types and can also handle non-synchronous trading. It is the first estimator...... returns measured over 5 or 10 min intervals. We show that the new estimator is substantially more precise....
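The general shape of a realised kernel can be sketched as a kernel-weighted sum of realised autocovariance matrices. The code below is a simplified illustration under stated assumptions: returns are already synchronised (the refresh-time sampling and end-point treatment of the paper are omitted), the Parzen weight function is used, and the bandwidth convention h/(H+1) and the simulated data are choices made for this example.

```python
import numpy as np

def parzen(x):
    """Parzen weight function on [0, 1]; zero outside."""
    x = abs(x)
    if x <= 0.5:
        return 1.0 - 6.0 * x**2 + 6.0 * x**3
    if x <= 1.0:
        return 2.0 * (1.0 - x) ** 3
    return 0.0

def realised_kernel(returns, H):
    """Kernel-weighted sum of realised autocovariances.
    returns: (n, d) array of synchronised high-frequency returns."""
    K = returns.T @ returns                        # lag 0: realised covariance
    for h in range(1, H + 1):
        gamma_h = returns[h:].T @ returns[:-h]     # sum_j x_j x_{j-h}'
        K += parzen(h / (H + 1)) * (gamma_h + gamma_h.T)
    return K

rng = np.random.default_rng(7)
n, d = 2000, 2
true_cov = np.array([[1.0, 0.5],
                     [0.5, 2.0]])                  # hypothetical daily covariance
efficient = rng.multivariate_normal(np.zeros(d), true_cov / n, size=n)
noise = 0.01 * rng.normal(size=(n + 1, d))         # microstructure noise in prices
returns = efficient + np.diff(noise, axis=0)       # observed noisy returns

rk = realised_kernel(returns, H=30)
rcov = returns.T @ returns                         # plain realised covariance, for comparison
```

With a positive-definite weight function such as Parzen, the weighted sum can be written as a quadratic form in a positive semi-definite Toeplitz matrix, which is why the result stays positive semi-definite. The plain realised covariance `rcov` is inflated by the noise term, while the kernel weights are intended to offset it.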
Multivariate interval-censored survival data
DEFF Research Database (Denmark)
Hougaard, Philip
2014-01-01
, derived from the L and R points. Asymptotic results are simple for the former and complicated for the latter. This paper is a review describing the extension to multivariate data, like eruption times for teeth examined at visits to the dentist. Parametric models extend easily to multivariate data. However......-parametric model for the marginal distribution. These three models are compared and discussed. Furthermore, extension to regression models is considered. The semi-parametric approach may be sensible in many cases, as it is more flexible than the parametric models, and it avoids some technical difficulties...
Some Simple Procedures for Handling Missing Data in Multivariate Analysis
Frane, James W.
1976-01-01
Several procedures are outlined for replacing missing values in multivariate analyses by regression values obtained in various ways, and for adjusting coefficients (such as factor score coefficients) when data are missing. None of the procedures are complex or expensive. (Author)
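One simple form of the regression replacement described here can be sketched as follows. This is an illustrative assumption rather than a reproduction of the paper's procedures: a single column has missing values, the other columns are complete, and the missing entries are filled with ordinary least-squares predictions. The function name `regression_impute` and the simulated data are hypothetical.

```python
import numpy as np

def regression_impute(X, target):
    """Fill NaNs in column `target` with least-squares predictions
    from the remaining columns (assumed complete)."""
    X = X.copy()
    others = [j for j in range(X.shape[1]) if j != target]
    missing = np.isnan(X[:, target])
    # Fit intercept + slopes on the rows where the target is observed.
    A = np.column_stack([np.ones((~missing).sum()), X[~missing][:, others]])
    beta, *_ = np.linalg.lstsq(A, X[~missing, target], rcond=None)
    # Predict the target for the rows where it is missing.
    B = np.column_stack([np.ones(missing.sum()), X[missing][:, others]])
    X[missing, target] = B @ beta
    return X

rng = np.random.default_rng(4)
full = rng.normal(size=(300, 3))
full[:, 2] = 1.0 + 2.0 * full[:, 0] - full[:, 1] + 0.1 * rng.normal(size=300)

data = full.copy()
holes = rng.choice(300, size=40, replace=False)
data[holes, 2] = np.nan                            # knock out 40 values

filled = regression_impute(data, target=2)
rmse = np.sqrt(np.mean((filled[holes, 2] - full[holes, 2]) ** 2))
```

Imputed values lie exactly on the fitted regression surface, so variances computed from the filled data are biased downward; that is one reason adjusted coefficients, as the abstract mentions, can be needed afterwards.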
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Control Multivariable por Desacoplo
Directory of Open Access Journals (Sweden)
Fernando Morilla
2013-01-01
Abstract: Interaction between variables is an inherent characteristic of multivariable processes that complicates their operation and the design of their control systems. The decoupling control paradigm groups a set of methodologies that have traditionally been aimed at eliminating or reducing interaction, and that some researchers have recently reoriented toward solving a problem as complex as multivariable control. Part of the material described in this article is well known in the process control field, but most of it results from several years of research by the authors, who have prioritized generalizing the problem, finding solutions that are easy to implement, and combining elementary PID control blocks. This combination of interests means that perfect decoupling cannot always be achieved, but a considerable reduction of interaction can be achieved at the basic level of the control pyramid, to the benefit of other control systems at higher hierarchical levels. The article summarizes all the basic aspects of decoupling control and its application to two representative processes: an experimental plant of four coupled tanks and a 4×4 model of an experimental heating, ventilation and air conditioning system.
Energy Technology Data Exchange (ETDEWEB)
Crawfis, R.A.
1996-03-01
This paper presents a new technique for representing multivalued data sets defined on an integer lattice. It extends the state-of-the-art in volume rendering to include nonhomogeneous volume representations. That is, volume rendering of materials with very fine detail (e.g. translucent granite) within a voxel. Multivariate volume rendering is achieved by introducing controlled amounts of noise within the volume representation. Varying the local amount of noise within the volume is used to represent a separate scalar variable. The technique can also be used in image synthesis to create more realistic clouds and fog.
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press.
Precision Index in the Multivariate Context
Czech Academy of Sciences Publication Activity Database
Šiman, Miroslav
2014-01-01
Vol. 43, No. 2 (2014), pp. 377-387 ISSN 0361-0926 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords: data depth * multivariate quantile * process capability index * precision index * regression quantile Subject RIV: BA - General Mathematics Impact factor: 0.274, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/siman-0425059.pdf
Multivariable calculus with applications
Lax, Peter D
2017-01-01
This text in multivariable calculus fosters comprehension through meaningful explanations. Written with students in mathematics, the physical sciences, and engineering in mind, it extends concepts from single variable calculus such as derivative, integral, and important theorems to partial derivatives, multiple integrals, Stokes’ and divergence theorems. Students with a background in single variable calculus are guided through a variety of problem solving techniques and practice problems. Examples from the physical sciences are utilized to highlight the essential relationship between calculus and modern science. The symbiotic relationship between science and mathematics is shown by deriving and discussing several conservation laws, and vector calculus is utilized to describe a number of physical theories via partial differential equations. Students will learn that mathematics is the language that enables scientific ideas to be precisely formulated and that science is a source for the development of mathemat...
Analog multivariate counting analyzers
Nikitin, A V; Armstrong, T P
2003-01-01
Characterizing rates of occurrence of various features of a signal is of great importance in numerous types of physical measurements. Such signal features can be defined as certain discrete coincidence events, e.g. crossings of a signal with a given threshold, or occurrence of extrema of a certain amplitude. We describe measuring rates of such events by means of analog multivariate counting analyzers. Given a continuous scalar or multicomponent (vector) input signal, an analog counting analyzer outputs a continuous signal with the instantaneous magnitude equal to the rate of occurrence of certain coincidence events. The analog nature of the proposed analyzers allows us to reformulate many problems of the traditional counting measurements, and cast them in a form which is readily addressed by methods of differential calculus rather than by algebraic or logical means of digital signal processing. Analog counting analyzers can be easily implemented in discrete or integrated electronic circuits, do not suffer fro...
Walton, Joseph M.; And Others
1978-01-01
Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
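As a minimal illustration of the idea in this abstract (not the authors' own computation), the sketch below fits least squares and ridge regression to two nearly collinear regressors. The data, the penalty value `lam`, and the helper names `ridge_2d` and `sse` are assumptions chosen only for illustration.

```python
def ridge_2d(x1, x2, y, lam):
    """Solve (X'X + lam*I) b = X'y for two regressors without an intercept."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


def sse(b, x1, x2, y):
    """Residual sum of squares for coefficients b = (b1, b2)."""
    return sum((c - b[0] * a - b[1] * d) ** 2 for a, d, c in zip(x1, x2, y))


# Hypothetical data: x2 is almost a copy of x1, so the regressors are intercorrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.01, 1.98, 3.02, 3.97, 5.01]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_ols = ridge_2d(x1, x2, y, 0.0)    # lam = 0 reduces to ordinary least squares
b_ridge = ridge_2d(x1, x2, y, 1.0)  # lam = 1.0 is an arbitrary illustrative penalty

print("OLS  :", b_ols, "SSE:", sse(b_ols, x1, x2, y))
print("ridge:", b_ridge, "SSE:", sse(b_ridge, x1, x2, y))
```

The ridge coefficients are shrunk toward zero, while the in-sample residual sum of squares cannot fall below the least squares value. This is one way to see why the apparent fit of a ridge solution on the same data is no better than that of least squares.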
Multivariate analysis of hydrophobic descriptors
Directory of Open Access Journals (Sweden)
Stefan Dove
2014-04-01
Multivariate approaches like principal component analysis (PCA) are powerful tools to investigate hydrophobic descriptors and to discriminate between intrinsic hydrophobicity and polar contributions such as hydrogen bonds and other electronic effects. PCA of log P values measured for 37 solutes in eight solvent-water systems and of hydrophobic octanol-water substituent constants π for 25 meta- and para-substituents from seven phenyl series were performed (re-analysis of previous work). In both cases, the descriptors are reproduced within experimental errors by two principal components, an intrinsic hydrophobic component and a second component accounting for differences between the systems due to electronic interactions. Underlying effects were identified by multiple linear regression analysis. Log P values depend on the water solubility of the solvents and hydrogen bonding capabilities of both the solute and the solvents. Results indicate different impacts of hydrogen bonds in nonpolar and polar solvent-water systems on log P and their dependence on isotropic and hydrated surface areas. In case of the π-values, the second component (loadings and scores) correlates with electronic substituent constants. More detailed analysis of the data as π-values of disubstituted benzenes XPhY has led to extended symmetric bilinear Hammett-type models relating interaction increments to cross products πX σY, πY σX and σX σY which are mainly due to mutual effects on hydrogen-bonds with octanol.
Practical multivariate analysis
Afifi, Abdelmonem; Clark, Virginia A
2011-01-01
""First of all, it is very easy to read. … The authors manage to introduce and (at least partially) explain even quite complex concepts, e.g. eigenvalues, in an easy and pedagogical way that I suppose is attractive to readers without deeper statistical knowledge. The text is also sprinkled with references for those who want to probe deeper into a certain topic. Secondly, I personally find the book's emphasis on practical data handling very appealing. … Thirdly, the book gives very nice coverage of regression analysis. … this is a nicely written book that gives a good overview of a large number
Effects of shoulder dystocia training on the incidence of brachial plexus injury.
Inglis, Steven R; Feier, Nikolaus; Chetiyaar, Jyothi B; Naylor, Margaret H; Sumersille, Melanie; Cervellione, Kelly L; Predanic, Mladen
2011-04-01
We sought to determine whether implementation of shoulder dystocia training reduces the incidence of obstetric brachial plexus injury (OBPI). After implementing training for maternity staff, the incidence of OBPI was compared between pretraining and posttraining periods using both univariate and multivariate analyses in deliveries complicated by shoulder dystocia. The overall incidence of OBPI in vaginal deliveries decreased from 0.40% pretraining to 0.14% posttraining (P shoulder dystocia dropped from 30% to 10.67% posttraining (P shoulder dystocia training remained associated with reduced OBPI (P = .02) after logistic regression analysis. OBPI remained less in the posttraining period (P = .01), even after excluding all neonates with birthweights >2 SD above the mean. Shoulder dystocia training was associated with a lower incidence of OBPI and the incidence of OBPI in births complicated by shoulder dystocia. Copyright © 2011 Mosby, Inc. All rights reserved.
Estimation of National Colorectal-Cancer Incidence Using Claims Databases
Directory of Open Access Journals (Sweden)
C. Quantin
2012-01-01
Background. The aim of the study was to assess the accuracy of the colorectal-cancer incidence estimated from administrative data. Methods. We selected potential incident colorectal-cancer cases in 2004-2005 French administrative data, using two alternative algorithms. The first was based only on diagnostic and procedure codes, whereas the second considered the past history of the patient. Results of both methods were assessed against two corresponding local cancer registries, acting as “gold standards.” We then constructed a multivariable regression model to estimate the corrected total number of incident colorectal-cancer cases from the whole national administrative database. Results. The first algorithm provided an estimated local incidence very close to that given by the regional registries (646 versus 645 incident cases) and had good sensitivity and positive predictive values (about 75% for both). The second algorithm overestimated the incidence by about 50% and had a poor positive predictive value of about 60%. The estimation of national incidence obtained by the first algorithm differed from that observed in 14 registries by only 2.34%. Conclusion. This study shows the usefulness of administrative databases for countries with no national cancer registry and suggests a method for correcting the estimates provided by these data.
Data fusion in multivariate calibration transfer.
Ni, Wangdong; Brown, Steven D; Man, Ruilin
2010-02-28
We report the use of stacked partial least-squares regression and stacked dual-domain regression analysis with four commonly used techniques for calibration transfer to improve predictive performance from transferred multivariate calibration models. The predictive performance from three conventional calibration transfer methods, piecewise direct standardization (PDS), orthogonal signal correction (OSC) and model updating (MUP), requiring standards measured on both instruments, was significantly improved from data fusion either by stacking of wavelet scales or by stacking of spectral intervals, as demonstrated by transfer of calibrations developed on near-infrared spectra of synthetic gasoline. Stacking did not produce as significant an improvement for calibration transfer using a finite impulse response (FIR) filter, but application of SPLS regression to FIR-transferred spectra improves predictive performance of the transferred model. Copyright 2010 Elsevier B.V. All rights reserved.
Multivariate methods and forecasting with IBM SPSS statistics
Aljandali, Abdulkader
2017-01-01
This is the second of a two-part guide to quantitative analysis using the IBM SPSS Statistics software package; this volume focuses on multivariate statistical methods and advanced forecasting techniques. More often than not, regression models involve more than one independent variable. For example, forecasting methods are commonly applied to aggregates such as inflation rates, unemployment, exchange rates, etc., that have complex relationships with determining variables. This book introduces multivariate regression models and provides examples to help understand theory underpinning the model. The book presents the fundamentals of multivariate regression and then moves on to examine several related techniques that have application in business-orientated fields such as logistic and multinomial regression. Forecasting tools such as the Box-Jenkins approach to time series modeling are introduced, as well as exponential smoothing and naïve techniques. This part also covers hot topics such as Factor Analysis, Dis...
Judy, Gregory D; Mosaly, Prithima R; Mazur, Lukasz M; Tracton, Gregg; Marks, Lawrence B; Chera, Bhishamjit S
2017-08-01
To identify factors associated with a near-miss or safety incident (NMSI) in patients undergoing radiotherapy and identify common root causes of NMSIs and their relationship with incident severity. We retrospectively studied NMSIs filed between October 2014 and April 2016. We extracted patient-, treatment-, and disease-specific data from patients with an NMSI (n = 200; incident group) and a similar group of control patients (n = 200) matched in time, without an NMSI. A root cause and incident severity were determined for each NMSI. Univariable and multivariable analyses were performed to determine which specific factors were contributing to NMSIs. Multivariable logistic regression was used to determine root causes of NMSIs and their relationship with incident severity. NMSIs were associated with the following factors: head and neck sites (odds ratio [OR], 5.2; P = .01), image-guided intensity-modulated radiotherapy (OR, 3; P = .009), daily imaging (OR, 7; P importance of a strong reporting system to support a safety culture.
Prospective surveillance of multivariate spatial disease data
Corberán-Vallet, A
2012-01-01
Surveillance systems are often focused on more than one disease within a predefined area. On those occasions when outbreaks of disease are likely to be correlated, the use of multivariate surveillance techniques integrating information from multiple diseases allows us to improve the sensitivity and timeliness of outbreak detection. In this article, we present an extension of the surveillance conditional predictive ordinate to monitor multivariate spatial disease data. The proposed surveillance technique, which is defined for each small area and time period as the conditional predictive distribution of those counts of disease higher than expected given the data observed up to the previous time period, alerts us to both small areas of increased disease incidence and the diseases causing the alarm within each area. We investigate its performance within the framework of Bayesian hierarchical Poisson models using a simulation study. An application to diseases of the respiratory system in South Carolina is finally presented. PMID:22534429
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Multivariate statistics exercises and solutions
Härdle, Wolfgang Karl
2015-01-01
The authors present tools and concepts of multivariate data analysis by means of exercises and their solutions. The first part is devoted to graphical techniques. The second part deals with multivariate random variables and presents the derivation of estimators and tests for various practical situations. The last part introduces a wide variety of exercises in applied multivariate data analysis. The book demonstrates the application of simple calculus and basic multivariate methods in real life situations. It contains altogether more than 250 solved exercises which can assist a university teacher in setting up a modern multivariate analysis course. All computer-based exercises are available in the R language. All R codes and data sets may be downloaded via the quantlet download center www.quantlet.org or via the Springer webpage. For interactive display of low-dimensional projections of a multivariate data set, we recommend GGobi.
Boehmer, Ulrike; Miao, Xiaopeng; Maxwell, Nancy I; Ozonoff, Al
2014-03-26
Risk factors for breast, colorectal, and lung cancer are known to be more common among lesbian, gay, and bisexual (LGB) individuals, suggesting they may be more likely to develop these cancers. Our objective was to determine differences in cancer incidence by sexual orientation, using sexual orientation data aggregated at the county level. Data on cancer incidence were obtained from the California Cancer Registry and data on sexual orientation were obtained from the California Health Interview Survey, from which a measure of age-specific LGB population density by county was calculated. Using multivariable Poisson regression models, the association between the age-race-stratified incident rate of breast, lung and colorectal cancer in each county and LGB population density was examined, with race, age group and poverty as covariates. Among men, bisexual population density was associated with lower incidence of lung cancer and with higher incidence of colorectal cancer. Among women, lesbian population density was associated with lower incidence of lung and colorectal cancer and with higher incidence of breast cancer; bisexual population density was associated with higher incidence of lung and colorectal cancer and with lower incidence of breast cancer. These study findings clearly document links between county-level LGB population density and cancer incidence, illuminating an important public health disparity.
Atmospheric conditions, lunar phases, and childbirth: a multivariate analysis
Ochiai, Angela Megumi; Gonçalves, Fabio Luiz Teixeira; Ambrizzi, Tercio; Florentino, Lucia Cristina; Wei, Chang Yi; Soares, Alda Valeria Neves; De Araujo, Natalucia Matos; Gualda, Dulce Maria Rosa
2012-07-01
Our objective was to assess extrinsic influences upon childbirth. In a cohort of 1,826 days containing 17,417 childbirths, among them 13,252 spontaneous labor admissions, we studied the influence of environment upon the high incidence of labor (defined by 75th percentile or higher), analyzed by logistic regression. The predictors of high labor admission included increases in outdoor temperature (odds ratio: 1.742, P = 0.045, 95%CI: 1.011 to 3.001), and decreases in atmospheric pressure (odds ratio: 1.269, P = 0.029, 95%CI: 1.055 to 1.483). In contrast, increases in tidal range were associated with a lower probability of high admission (odds ratio: 0.762, P = 0.030, 95%CI: 0.515 to 0.999). Lunar phase was not a predictor of high labor admission (P = 0.339). Using multivariate analysis, increases in temperature and decreases in atmospheric pressure predicted high labor admission, and increases of tidal range, as a measurement of the lunar gravitational force, predicted a lower probability of high admission.
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: "This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable." -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Nonparametric modal regression
Chen, Yen-Chi; Genovese, Christopher R.; Tibshirani, Ryan J.; Wasserman, Larry
2016-01-01
Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method, and propose techniques for constructing confidence sets and prediction sets. The latter...
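To make the contrast between the conditional mean and a conditional mode concrete, here is a rough sketch under stated assumptions. It evaluates a Gaussian-kernel joint density estimate on a grid and returns only the single highest mode at a given x, whereas the abstract concerns all local modes. The data, the bandwidth `h`, the grid, and the function names are hypothetical and are not taken from the paper.

```python
import math


def gauss(u):
    """Unnormalized Gaussian kernel; the constant does not affect the argmax."""
    return math.exp(-0.5 * u * u)


def conditional_mode(x0, xs, ys, h, grid):
    """Grid value of y that maximizes the joint kernel density estimate at x = x0."""
    def density(y0):
        return sum(gauss((x0 - x) / h) * gauss((y0 - y) / h)
                   for x, y in zip(xs, ys))
    return max(grid, key=density)


# Hypothetical sample: most responses sit near 1, with one far-away value at 5.
xs = [0.0, 0.1, -0.1, 0.05, 0.0, 0.0]
ys = [1.0, 1.1, 0.9, 1.0, 5.0, 1.05]
grid = [i * 0.1 for i in range(61)]  # candidate y values from 0.0 to 6.0

mode_at_zero = conditional_mode(0.0, xs, ys, h=0.3, grid=grid)
mean_y = sum(ys) / len(ys)
print("conditional mode near x = 0:", mode_at_zero)
print("sample mean of y:", mean_y)
```

The mean is pulled toward the isolated value at 5, while the mode stays with the bulk of the data near 1.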
Flexible survival regression modelling
DEFF Research Database (Denmark)
Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben
2009-01-01
Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...
Model Checking Multivariate State Rewards
DEFF Research Database (Denmark)
Nielsen, Bo Friis; Nielson, Flemming; Nielson, Hanne Riis
2010-01-01
We consider continuous stochastic logics with state rewards that are interpreted over continuous time Markov chains. We show how results from multivariate phase type distributions can be used to obtain higher-order moments for multivariate state rewards (including covariance). We also generalise...
Processing data collected from radiometric experiments by multivariate technique
International Nuclear Information System (INIS)
Urbanski, P.; Kowalska, E.; Machaj, B.; Jakowiuk, A.
2005-01-01
Multivariate techniques applied for processing data collected from radiometric experiments can provide more efficient extraction of the information contained in the spectra. Several techniques are considered: (i) multivariate calibration using Partial Least Square Regression and Artificial Neural Network, (ii) standardization of the spectra, (iii) smoothing of collected spectra, where the autocorrelation function and bootstrap were used for the assessment of the processed data, (iv) image processing using Principal Component Analysis. Application of these techniques is illustrated on examples of some industrial applications. (author)
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based...
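The principle can be sketched with the standard check (pinball) loss: the constant that minimizes the summed check loss at level `tau` is a sample quantile, just as the mean minimizes squared error. The example below is a hedged illustration with made-up data and hypothetical function names; it is not the linear model or the labor market application of the abstract.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau*u for u >= 0 and (tau - 1)*u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def best_constant(y, tau):
    """Observed value that minimizes the total check loss at level tau."""
    return min(y, key=lambda q: sum(check_loss(v - q, tau) for v in y))


# Hypothetical sample with one extreme value.
y = [1, 2, 3, 4, 100]

q10 = best_constant(y, 0.1)
q50 = best_constant(y, 0.5)
q90 = best_constant(y, 0.9)
mean_y = sum(y) / len(y)
print("tau=0.1:", q10, "tau=0.5:", q50, "tau=0.9:", q90, "mean:", mean_y)
```

Different values of `tau` pick out different parts of the distribution, which is what allows quantile regression to show effects that a mean model can hide.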
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
Distributed Monitoring of the R2 Statistic for Linear Regression
National Aeronautics and Space Administration — The problem of monitoring a multivariate linear regression model is relevant in studying the evolving relationship between a set of input variables (features) and...
A primer of multivariate statistics
Harris, Richard J
2014-01-01
Drawing upon more than 30 years of experience in working with statistics, Dr. Richard J. Harris has updated A Primer of Multivariate Statistics to provide a model of balance between how-to and why. This classic text covers multivariate techniques with a taste of latent variable approaches. Throughout the book there is a focus on the importance of describing and testing one's interpretations of the emergent variables that are produced by multivariate analysis. This edition retains its conversational writing style while focusing on classical techniques. The book gives the reader a feel for why
Incidence of Self-Reported Diabetes in New York City, 2002, 2004, and 2008
Chamany, Shadi; Driver, Cynthia R.; Kerker, Bonnie; Silver, Lynn
2012-01-01
Introduction Prevalence and incidence of diabetes among adults are increasing in the United States. The purpose of this study was to estimate the incidence of self-reported diabetes in New York City, examine factors associated with diabetes incidence, and estimate changes in the incidence over time. Methods We used data from the New York City Community Health Survey in 2002, 2004, and 2008 to estimate the age-adjusted incidence of self-reported diabetes among 24,384 adults aged 18 years or older. Multiple logistic regression analysis was performed to examine factors associated with incident diabetes. Results Survey results indicated that the age-adjusted incidence of diabetes per 1,000 population was 9.4 in 2002, 11.9 in 2004, and 8.6 in 2008. In multivariable-adjusted analysis, diabetes incidence was significantly associated with being aged 45 or older, being black or Hispanic, being overweight or obese, and having less than a high school diploma. Conclusion Our results suggest that the incidence of diabetes in New York City may be stabilizing. Age, black race, Hispanic ethnicity, elevated body mass index, and low educational attainment are risk factors for diabetes. Large-scale implementation of prevention efforts addressing obesity and sedentary lifestyle and targeting racial/ethnic minority groups and those with low educational attainment are essential to control diabetes in New York City. PMID:22698175
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts "...an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models...highly recommend[ed]...for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Multivariate covariance generalized linear models
DEFF Research Database (Denmark)
Bonat, W. H.; Jørgensen, Bent
2016-01-01
We propose a general framework for non-normal multivariate data analysis called multivariate covariance generalized linear models, designed to handle multivariate response variables, along with a wide range of temporal and spatial correlation structures defined in terms of a covariance link function combined with a matrix linear predictor involving known matrices. The method is motivated by three data examples that are not easily handled by existing methods. The first example concerns multivariate count data, the second involves response variables of mixed types, combined with repeated measures and longitudinal structures, and the third involves a spatiotemporal analysis of rainfall data. The models take non-normality into account in the conventional way by means of a variance function, and the mean structure is modelled by means of a link function and a linear predictor. The models...
P.M.C. de Boer (Paul); C.M. Hafner (Christian)
2005-01-01
We argue in this paper that general ridge (GR) regression implies no major complication compared with simple ridge regression. We introduce a generalization of an explicit GR estimator derived by Hemmerle and by Teekens and de Boer and show that this estimator, which is more
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2 %). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95 % CI = 3.076-219.747). The middle-aged and elderly have less incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.
Drongelen AW van; Roszek B; Hilbers-Modderman ESM; Kallewaard M; Wassenaar C; LGM
2002-01-01
This RIVM study was performed to gain insight into wheelchair-related incidents with powered and manual wheelchairs reported to the USA FDA, the British MDA and the Dutch Center for Quality and Usability Research of Technical Aids (KBOH). The data in the databases do not indicate that incidents with
On directional multiple-output quantile regression
Czech Academy of Sciences Publication Activity Database
Paindaveine, D.; Šiman, Miroslav
2011-01-01
Vol. 102, No. 2 (2011), pp. 193-212 ISSN 0047-259X R&D Projects: GA MŠk(CZ) 1M06047 Grant - others: Commission EC(BE) Fonds National de la Recherche Scientifique Institutional research plan: CEZ:AV0Z10750506 Keywords: multivariate quantile * quantile regression * multiple-output regression * halfspace depth * portfolio optimization * value-at-risk Subject RIV: BA - General Mathematics Impact factor: 0.879, year: 2011 http://library.utia.cas.cz/separaty/2011/SI/siman-0364128.pdf
Weisberg, Sanford
2013-01-01
Praise for the Third Edition "...this is an excellent book which could easily be used as a course text..." -International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
An assessment on the use of bivariate, multivariate and soft ...
Indian Academy of Sciences (India)
Conditional probability (CP), logistic regression (LR) and artificial neural networks (ANN) models representing the bivariate, multivariate and soft computing techniques were used in GIS based collapse susceptibility mapping in an area from Sivas basin (Turkey). Collapse-related factors, directly or indirectly related to the ...
Directional quantile regression in Octave (and MATLAB)
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2016-01-01
Vol. 52, No. 1 (2016), pp. 28-51 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: quantile regression * multivariate quantile * depth contour * Matlab Subject RIV: IN - Informatics, Computer Science Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/bocek-0458380.pdf
Directory of Open Access Journals (Sweden)
2016-12-01
This paper is on data analysis strategy in a complex, multidimensional, and dynamic domain. The focus is on the use of data mining techniques to explore the importance of multivariate structures, using climate variables which influence climate change. Techniques involved in a data mining exercise vary according to the data structures. The multivariate analysis strategy considered here involved choosing an appropriate tool to analyze a process. Factor analysis is introduced into the data mining technique in order to reveal the influencing impacts of the factors involved as well as to resolve the multicollinearity effect among the variables. The temporal nature and multidimensionality of the target variables are revealed in the model using multidimensional regression estimates. The strategy of integrating several statistical techniques, using climate variables in Nigeria, was employed. An R2 of 0.518 was obtained from the ordinary least squares regression analysis carried out, and the test was not significant at the 5% level of significance. However, the factor analysis regression strategy gave a good fit with an R2 of 0.811, and the test was significant at the 5% level of significance. Based on this study, model building should go beyond the usual confirmatory data analysis (CDA); rather, it should be complemented with exploratory data analysis (EDA) in order to achieve a desired result.
Multivariate stochastic simulation with subjective multivariate normal distributions
P. J. Ince; J. Buongiorno
1991-01-01
In many applications of Monte Carlo simulation in forestry or forest products, it may be known that some variables are correlated. However, for simplicity, in most simulations it has been assumed that random variables are independently distributed. This report describes an alternative Monte Carlo simulation technique for subjectively assessed multivariate normal...
Grégoire, G.
2014-12-01
This chapter deals with multiple linear regression. That is, we investigate the situation where the mean of a variable depends linearly on a set of covariables. The noise is assumed to be Gaussian. We develop the least squares method to obtain the parameter estimators and estimates of their precision. This leads to the design of confidence intervals, prediction intervals, global tests, individual tests and, more generally, tests of submodels defined by linear constraints. Methods for model choice and variable selection, measures of the quality of the fit, the study of residuals, and diagnostic methods are presented. Finally, the identification of departures from the model's assumptions and ways to deal with these problems are addressed. A real data set is used to illustrate the methodology with the software R. Note that this chapter is intended to serve as a guide for other regression methods, like logistic regression or AFT models and Cox regression.
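The least squares step described above can be sketched by solving the normal equations X'X b = X'y directly. The chapter itself works in R; the Python below is only an illustrative sketch, and the data and the helper names `solve` and `ols` are assumptions, not material from the chapter.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ols(rows, y):
    """Least squares estimates (intercept first) from the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, -2, 0, 2]

beta = ols(rows, y)
print("estimated coefficients:", beta)
```

Because the illustrative data contain no noise, the estimates recover the generating coefficients up to rounding. With real data, the same estimates would feed the interval and test calculations the chapter describes.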
Glyph: Symbolic Regression Tools
Quade, Markus; Gout, Julien; Abel, Markus
2018-01-01
We present Glyph - a Python package for genetic programming based symbolic regression. Glyph is designed for use both in numerical simulations and in real-world experiments. For experimentalists, glyph-remote provides a separation of tasks: a ZeroMQ interface splits the genetic programming optimization task from the evaluation of an experimental (or numerical) run. Glyph can be accessed at http://github.com/ambrosys/glyph . Domain experts are able to employ symbolic regression in their ex...
Multivariate Matrix-Exponential Distributions
DEFF Research Database (Denmark)
Bladt, Mogens; Nielsen, Bo Friis
2010-01-01
be written as linear combinations of the elements in the exponential of a matrix. For this reason we shall refer to multivariate distributions with rational Laplace transform as multivariate matrix-exponential distributions (MVME). The marginal distributions of an MVME are univariate matrix-exponential distributions. We prove a characterization that states that a distribution is an MVME distribution if and only if all non-negative, non-null linear combinations of the coordinates have a univariate matrix-exponential distribution. This theorem is analogous to a well-known characterization theorem...
Pansharpening via sparse regression
Tang, Songze; Xiao, Liang; Liu, Pengfei; Huang, Lili; Zhou, Nan; Xu, Yang
2017-09-01
Pansharpening is an effective way to enhance the spatial resolution of a multispectral (MS) image by fusing it with a provided panchromatic image. Instead of restricting the coding coefficients of low-resolution (LR) and high-resolution (HR) images to be equal, we propose a pansharpening approach via sparse regression in which the relationship between sparse coefficients of HR and LR MS images is modeled by ridge regression and elastic-net regression simultaneously learning the corresponding dictionaries. The compact dictionaries are learned based on the sampled patch pairs from the high- and low-resolution images, which can greatly characterize the structural information of the LR MS and HR MS images. Later, taking the complex relationship between the coding coefficients of LR MS and HR MS images into account, the ridge regression is used to characterize the relationship of intrapatches. The elastic-net regression is employed to describe the relationship of interpatches. Thus, the HR MS image can be almost identically reconstructed by multiplying the HR dictionary and the calculated sparse coefficient vector with the learned regression relationship. The simulated and real experimental results illustrate that the proposed method outperforms several well-known methods, both quantitatively and perceptually.
Epidemiology of road traffic incidents in Peru 1973-2008: incidence, mortality, and fatality.
Directory of Open Access Journals (Sweden)
J Jaime Miranda
The epidemiological profile and trends of road traffic injuries (RTIs) in Peru have not been well-defined, though this is a necessary step to address this significant public health problem in Peru. The objective of this study was to determine trends of incidence, mortality, and fatality of RTIs in Peru during 1973-2008, as well as their relationship to population trends such as economic growth. Secondary aggregated databases were used to estimate incidence, mortality and fatality rate ratios (IRRs) of RTIs. These estimates were standardized to age groups and sex of the 2008 Peruvian population. Negative binomial regression and cubic spline curves were used for multivariable analysis. During the 35-year period there were 952,668 road traffic victims, injured or killed. The adjusted yearly incidence of RTIs increased by 3.59 (95% CI 2.43-5.31) on average. We did not observe any significant trends in the yearly mortality rate. The total adjusted yearly fatality rate decreased by 0.26 (95% CI 0.15-0.43), while among adults the fatality rate increased by 1.25 (95% CI 1.09-1.43). Models fitted with splines suggest that the incidence follows a bimodal curve and closely followed trends in the gross domestic product (GDP) per capita. The significant increasing incidence of RTIs in Peru affirms their growing threat to public health. A substantial improvement of information systems for RTIs is needed to create a more accurate epidemiologic profile of RTIs in Peru. This approach can be of use in other similar low and middle-income settings to inform about the local challenges posed by RTIs.
Epidemiology of Road Traffic Incidents in Peru 1973–2008: Incidence, Mortality, and Fatality
Miranda, J. Jaime; López-Rivera, Luis A.; Quistberg, D. Alex; Rosales-Mayor, Edmundo; Gianella, Camila; Paca-Palao, Ada; Luna, Diego; Huicho, Luis; Paca, Ada; Luis, López; Luna, Diego; Rosales, Edmundo; Best, Pablo; Best, Pablo; Egúsquiza, Miriam; Gianella, Camila; Lema, Claudia; Ludeña, Esperanza; Miranda, J. Jaime; Huicho, Luis
2014-01-01
Background: The epidemiological profile and trends of road traffic injuries (RTIs) in Peru have not been well-defined, though this is a necessary step to address this significant public health problem in Peru. The objective of this study was to determine trends of incidence, mortality, and fatality of RTIs in Peru during 1973–2008, as well as their relationship to population trends such as economic growth. Methods and Findings: Secondary aggregated databases were used to estimate incidence, mortality and fatality rate ratios (IRRs) of RTIs. These estimates were standardized to age groups and sex of the 2008 Peruvian population. Negative binomial regression and cubic spline curves were used for multivariable analysis. During the 35-year period there were 952,668 road traffic victims, injured or killed. The adjusted yearly incidence of RTIs increased by 3.59 (95% CI 2.43–5.31) on average. We did not observe any significant trends in the yearly mortality rate. The total adjusted yearly fatality rate decreased by 0.26 (95% CI 0.15–0.43), while among adults the fatality rate increased by 1.25 (95% CI 1.09–1.43). Models fitted with splines suggest that the incidence follows a bimodal curve and closely followed trends in the gross domestic product (GDP) per capita. Conclusions: The significant increasing incidence of RTIs in Peru affirms their growing threat to public health. A substantial improvement of information systems for RTIs is needed to create a more accurate epidemiologic profile of RTIs in Peru. This approach can be of use in other similar low and middle-income settings to inform about the local challenges posed by RTIs. PMID:24927195
Reduced Rank Ridge Regression and Its Kernel Extensions.
Mukherjee, Ashin; Zhu, Ji
2011-12-01
In multivariate linear regression, it is often assumed that the response matrix is intrinsically of lower rank. This could be because of the correlation structure among the prediction variables or the coefficient matrix being lower rank. To accommodate both, we propose a reduced rank ridge regression for multivariate linear regression. Specifically, we combine the ridge penalty with the reduced rank constraint on the coefficient matrix to come up with a computationally straightforward algorithm. Numerical studies indicate that the proposed method consistently outperforms relevant competitors. A novel extension of the proposed method to the reproducing kernel Hilbert space (RKHS) set-up is also developed.
The Multivariate Gaussian Probability Distribution
DEFF Research Database (Denmark)
Ahrendt, Peter
2005-01-01
This technical report intends to gather information about the multivariate Gaussian distribution that was previously not (at least to my knowledge) to be found in one place and written as a reference manual. Additionally, some useful tips and tricks are collected that may be useful in practical...
A "Model" Multivariable Calculus Course.
Beckmann, Charlene E.; Schlicker, Steven J.
1999-01-01
Describes a rich, investigative approach to multivariable calculus. Introduces a project in which students construct physical models of surfaces that represent real-life applications of their choice. The models, along with student-selected datasets, serve as vehicles to study most of the concepts of the course from both continuous and discrete…
Mixture of Regression Models with Single-Index
Xiang, Sijia; Yao, Weixin
2016-01-01
In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the newly proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Local bilinear multiple-output quantile/depth regression
Czech Academy of Sciences Publication Activity Database
Hallin, M.; Lu, Z.; Paindaveine, D.; Šiman, Miroslav
2015-01-01
Vol. 21, No. 3 (2015), pp. 1435-1466 ISSN 1350-7265 R&D Projects: GA MŠk(CZ) 1M06047 Institutional support: RVO:67985556 Keywords: conditional depth * growth chart * halfspace depth * local bilinear regression * multivariate quantile * quantile regression * regression depth Subject RIV: BA - General Mathematics Impact factor: 1.372, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/siman-0446857.pdf
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is, however, a different and therefore new estimator. It allows for both linear and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question.
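For reference, the usual quantile regression estimator mentioned above minimises the check (pinball) loss. The sketch below shows only that standard loss in the simplest intercept-only case, where the minimiser is a sample quantile; it does not implement the minimax deviance estimator of the abstract, and the data and function names are hypothetical.

```python
def check_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau in (0, 1)."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_check_loss(ys, q, tau):
    """Sum of check losses when every observation is predicted by the constant q."""
    return sum(check_loss(y - q, tau) for y in ys)


# Hypothetical data with one large outlier.
ys = [1.0, 2.0, 3.0, 4.0, 100.0]

# Search over the observed values, which always contain a minimiser
# of this piecewise-linear objective.
best_median = min(ys, key=lambda q: total_check_loss(ys, q, 0.5))
best_upper = min(ys, key=lambda q: total_check_loss(ys, q, 0.9))
print(best_median, best_upper)
```

Positive residuals are weighted by tau and negative ones by 1 - tau, which is why the fitted value moves from the median towards the upper tail as tau grows.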
Practical Session: Logistic Regression
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate logistic regression. It investigates the different risk factors for the onset of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). The example is based on data given in the file evans.txt, available from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
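To make the kind of fit used in such an exercise concrete, here is a minimal sketch of logistic regression with an intercept and one risk factor, estimated by Newton-Raphson. It does not use the evans.txt data: the counts below are hypothetical, and `fit_logistic` is an illustrative name, not a function from R or from the cited book.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, n_iter=25):
    """Fit P(y = 1 | x) = sigmoid(b0 + b1 * x) by Newton-Raphson on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        # g: gradient of the log-likelihood; h: negated Hessian (information matrix).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system h * delta = g in closed form.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical cohort: 10 unexposed subjects with 2 cases, 10 exposed with 5 cases.
xs = [0] * 10 + [1] * 10
ys = [1] * 2 + [0] * 8 + [1] * 5 + [0] * 5
b0, b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)
print(round(odds_ratio, 3))
```

With a single binary risk factor the fitted odds ratio equals the one computed directly from the 2x2 table, here (5/5) / (2/8) = 4, which gives a quick check on the fit.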
Sex work and HIV incidence among people who inject drugs.
Kerr, Thomas; Shannon, Kate; Ti, Lianping; Strathdee, Steffanie; Hayashi, Kanna; Nguyen, Paul; Montaner, Julio; Wood, Evan
2016-02-20
Although the global burden of HIV infection among sex workers (SW) has been well recognized, HIV-related risks among sex workers who inject drugs (SW-IDU) have received less attention. We investigated the relationship between sex work and HIV incidence among people who inject drugs (IDU) in a Canadian setting. Prospective cohort study. Using Kaplan-Meier methods and the extended Cox regression, we compared HIV incidence among SW-IDU and non-SW-IDU in Vancouver, Canada, after adjusting for potential confounders. Between 1996 and 2012, 1647 participants were included in the study, including 512 (31.1%) IDU engaged in sex work. At 5 years the HIV cumulative incidence was higher among SW-IDU in comparison to other IDU (12 vs. 7%, P = 0.001). In unadjusted Cox regression analyses, HIV incidence among SW-IDU was also elevated [relative hazard: 1.69; 95% confidence interval (CI): 1.13-2.53]. However, in a multivariable analysis, sex work did not remain associated with HIV infection (adjusted relative hazard: 0.74; 95% CI: 0.45-1.20), with cocaine injection appearing to account for the elevated risk for HIV infection among SW-IDU. These data suggest that local SW-IDU have elevated rates of HIV infection. However, our exploration of risk factors among SW-IDU demonstrated that drug use patterns and environmental factors, rather than sexual risks, may explain the elevated HIV incidence among SW-IDU locally. Our findings highlight the need for social and structural interventions, including increased access to harm reduction programs and addiction treatment.
Software Regression Verification
2013-12-11
of recursive procedures. Acta Informatica, 45(6):403–439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verification. Technical Report...functions. Therefore, we need to redefine m-term. – Mutual termination. If either function f or function f ′ (or both) is nondeterministic, then their
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuisance parameters, the Jacobian transformation is an
International Nuclear Information System (INIS)
Francois, P.
1996-01-01
We undertook a study programme at the end of 1991. To start with, we performed some exploratory studies aimed at learning some preliminary lessons on this type of analysis: assessment of the interest of probabilistic incident analysis; possibility of using PSA scenarios; skills and resources required. At the same time, EPN created a working group whose assignment was to define a new approach for analysis of incidents on NPPs. This working group gave thought to both aspects of Operating Feedback that EPN wished to improve: analysis of significant incidents; analysis of potential consequences. We took part in the work of this group, and for the second aspect, we proposed a method based on an adaptation of the event-tree method in order to establish a link between existing PSA models and actual incidents. Since PSA models provide an exhaustive database of accident scenarios applicable to the two most common types of units in France, they are obviously of interest for this sort of analysis. With this method we performed some incident analyses, and at the same time explored some methods employed abroad, particularly ASP (Accident Sequence Precursor, a method used by the NRC). Early in 1994 EDF began a systematic analysis programme. The first, transient phase will set up methods and an organizational structure. 7 figs
Sparse Linear Identifiable Multivariate Modeling
DEFF Research Database (Denmark)
Henao, Ricardo; Winther, Ole
2011-01-01
In this paper we consider sparse and identifiable linear latent variable (factor) and linear Bayesian network models for parsimonious analysis of multivariate data. We propose a computationally efficient method for joint parameter and model inference, and model comparison. It consists of a fully...... Bayesian hierarchy for sparse models using slab and spike priors (two-component δ-function and continuous mixtures), non-Gaussian latent factors and a stochastic search over the ordering of the variables. The framework, which we call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated...... computational complexity. We attribute this mainly to the stochastic search strategy used, and to parsimony (sparsity and identifiability), which is an explicit part of the model. We propose two extensions to the basic i.i.d. linear framework: non-linear dependence on observed variables, called SNIM (Sparse Non-linear...
Bayesian nonlinear regression for large small problems
Chakraborty, Sounak
2012-07-01
Statistical modeling and inference problems with sample sizes substantially smaller than the number of available covariates are challenging. This is known as the large p small n problem. Furthermore, the problem is more complicated when we have multiple correlated responses. We develop multivariate nonlinear regression models in this setup for accurate prediction. In this paper, we introduce a full Bayesian support vector regression model with Vapnik's ε-insensitive loss function, based on reproducing kernel Hilbert spaces (RKHS) under the multivariate correlated response setup. This provides a full probabilistic description of the support vector machine (SVM) rather than an algorithm for fitting purposes. We have also introduced a multivariate version of the relevance vector machine (RVM). Instead of the original treatment of the RVM relying on the use of type II maximum likelihood estimates of the hyper-parameters, we put a prior on the hyper-parameters and use a Markov chain Monte Carlo technique for computation. We have also proposed an empirical Bayes method for our RVM and SVM. Our methods are illustrated with a prediction problem in near-infrared (NIR) spectroscopy. A simulation study is also undertaken to check the prediction accuracy of our models. © 2012 Elsevier Inc.
Descriptor Learning via Supervised Manifold Regularization for Multioutput Regression.
Zhen, Xiantong; Yu, Mengyang; Islam, Ali; Bhaduri, Mousumi; Chan, Ian; Li, Shuo
2017-09-01
Multioutput regression has recently shown great ability to solve challenging problems in both computer vision and medical image analysis. However, due to the huge image variability and ambiguity, it is fundamentally challenging to handle the highly complex input-target relationship of multioutput regression, especially with indiscriminate high-dimensional representations. In this paper, we propose a novel supervised descriptor learning (SDL) algorithm for multioutput regression, which can establish discriminative and compact feature representations to improve the multivariate estimation performance. The SDL is formulated as generalized low-rank approximations of matrices with a supervised manifold regularization. The SDL is able to simultaneously extract discriminative features closely related to multivariate targets and remove irrelevant and redundant information by transforming raw features into a new low-dimensional space aligned to targets. The resulting discriminative yet compact descriptor largely reduces the variability and ambiguity for multioutput regression, which enables more accurate and efficient multivariate estimation. We conduct extensive evaluation of the proposed SDL on both synthetic data and real-world multioutput regression tasks for both computer vision and medical image analysis. Experimental results have shown that the proposed SDL can achieve high multivariate estimation accuracy on all tasks and largely outperforms state-of-the-art algorithms. Our method establishes a novel SDL framework for multioutput regression, which can be widely used to boost the performance in different applications.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition: a separate chapter on Bayesian methods; complete revision of the chapter on estimation; a major example from the field of near infrared spectroscopy; more emphasis on cross-validation; greater focus on bootstrapping; stochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible; software available on the Internet for implementing many of the algorithms presented; more examples. Subset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Muscular strength and incident hypertension in normotensive and prehypertensive men.
Maslow, Andréa L; Sui, Xuemei; Colabianchi, Natalie; Hussey, Jim; Blair, Steven N
2010-02-01
The protective effects of cardiorespiratory fitness (CRF) on hypertension (HTN) are well known; however, the association between muscular strength and incidence of HTN has yet to be examined. This study evaluated the strength-HTN association with and without accounting for CRF. Participants were 4147 men (age = 20-82 yr) in the Aerobics Center Longitudinal Study for whom an age-specific composite muscular strength score was computed from measures of a one-repetition maximal leg and a one-repetition maximal bench press. CRF was quantified by maximal treadmill exercise test time in minutes. Cox proportional hazards regression analysis was used to estimate hazard ratios (HR) and 95% confidence intervals of incident HTN events according to exposure categories. During a mean follow-up of 19 yr, there were 503 incident HTN cases. Multivariable-adjusted (excluding CRF) HR of HTN in normotensive men comparing middle- and high-strength thirds to the lowest third were not significant at 1.17 and 0.84, respectively. Multivariable-adjusted (excluding CRF) HR of HTN in baseline prehypertensive men comparing middle- and high-strength thirds to the lowest third were significant at 0.73 and 0.72 (P = 0.01 each), respectively. The association between muscular strength and incidence of HTN in baseline prehypertensive men was no longer significant after control for CRF (P = 0.26). The study indicated that middle and high levels of muscular strength were associated with a reduced risk of HTN in prehypertensive men only. However, this relationship was no longer significant after controlling for CRF.
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Information regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
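The basic step behind a tree-structured rule can be sketched as a search for the single threshold on one feature that most reduces class impurity. The example below uses the Gini impurity as the splitting criterion; it is a minimal illustration with hypothetical data and function names, not the monograph's full procedure (no recursion, pruning, or surrogate splits).

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(xs, labels):
    """Return (threshold, weighted impurity) of the best rule 'x <= threshold'."""
    pairs = sorted(zip(xs, labels))
    n = len(pairs)
    best = (None, gini(labels))  # no split unless impurity strictly improves
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical one-feature data set with two classes.
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
labels = ["a", "a", "a", "b", "b", "a"]
threshold, impurity = best_split(xs, labels)
print(threshold, round(impurity, 4))
```

A full tree would apply the same search recursively to the left and right subsets and over every available feature.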
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine) to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding (the two numbers used to represent the two possible states of the variables) might differ. Common coding choices are (zero, one) and (minus one, plus one). Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
. There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes....
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … . -Annette J. Dobson, Biometric...
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package.
On generalized elliptical quantiles in the nonlinear quantile regression setup
Czech Academy of Sciences Publication Activity Database
Hlubinka, D.; Šiman, Miroslav
2015-01-01
Vol. 24, No. 2 (2015), pp. 249-264 ISSN 1133-0686 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords: multivariate quantile * elliptical quantile * quantile regression * multivariate statistical inference * portfolio optimization Subject RIV: BA - General Mathematics Impact factor: 1.207, year: 2015 http://library.utia.cas.cz/separaty/2014/SI/siman-0434510.pdf
Ridge Regression: A Regression Procedure for Analyzing Correlated Independent Variables
Rakow, Ernest A.
1978-01-01
Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
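The effect described above can be illustrated with a minimal sketch that solves the ridge normal equations (X^T X + lam * I) beta = X^T y for two nearly identical predictors. The data, the penalty value lam = 1.0, and the helper names are hypothetical and are not taken from the paper; the data are centred so that no intercept is needed.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small square a)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ridge(X, y, lam):
    """Ridge coefficients from (X^T X + lam * I) beta = X^T y; lam = 0 gives least squares."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam
    return solve(xtx, xty)


# Hypothetical centred data: the two predictors are almost identical.
X = [[-2.0, -2.1], [-1.0, -0.9], [0.0, 0.1], [1.0, 0.9], [2.0, 2.0]]
y = [-4.2, -1.8, 0.3, 1.7, 4.0]
beta_ols = ridge(X, y, 0.0)
beta_ridge = ridge(X, y, 1.0)
print([round(b, 3) for b in beta_ols], [round(b, 3) for b in beta_ridge])
```

In this toy case least squares assigns coefficients of opposite sign to the two near-duplicate predictors, while the penalised fit gives them similar, smaller values; in practice the penalty is usually chosen by cross-validation or a ridge trace rather than fixed in advance.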
Likelihood estimators for multivariate extremes
Huser, Raphaël
2015-11-17
The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.
Essentials of multivariate data analysis
Spencer, Neil H
2013-01-01
"… this text provides an overview at an introductory level of several methods in multivariate data analysis. It contains in-depth examples from one data set woven throughout the text, and a free [Excel] Add-In to perform the analyses in Excel, with step-by-step instructions provided for each technique. … could be used as a text (possibly supplemental) for courses in other fields where researchers wish to apply these methods without delving too deeply into the underlying statistics." -The American Statistician, February 2015
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Stelzer, Robert
2011-01-01
parameter is restricted to normal matrices and especially to strictly negative definite ones. For finite variation Lévy bases we are able to give conditions for supOU processes to have locally bounded càdlàg paths of finite variation and to show an analogue of the stochastic differential equation of OU-type processes, which has been suggested in [2] in the univariate case. Finally, as an important special case, we introduce positive semi-definite supOU processes, and we discuss the relevance of multivariate supOU processes in applications.
Aspects of multivariate statistical theory
Muirhead, Robb J
2009-01-01
The Wiley-Interscience Paperback Series consists of selected books that have been made more accessible to consumers in an effort to increase global appeal and general circulation. With these new unabridged softcover volumes, Wiley hopes to extend the lives of these works by making them available to future generations of statisticians, mathematicians, and scientists. ". . . the wealth of material on statistics concerning the multivariate normal distribution is quite exceptional. As such it is a very useful source of information for the general statistician and a must for anyone wanting to pen
Directory of Open Access Journals (Sweden)
BI Rongxin
2015-02-01
Objective To investigate the incidence and risk factors for endoscopic retrograde cholangiopancreatography (ERCP)-related adverse events in patients with primary sclerosing cholangitis (PSC). Methods This study included 72 patients who were diagnosed with PSC by magnetic resonance cholangiopancreatography and underwent ERCP in the Third Hospital of Xingtai City from December 2009 to December 2013. The incidence of postoperative adverse events within 30 d after ERCP was monitored and recorded. Univariate and multivariate logistic regression analyses were used to analyze the risk factors for ERCP-related adverse events in PSC patients. Results The success rate of ERCP was 94.4% (68/72). Among all adverse events, the incidences of pancreatitis and biliary tract infection were highest (6.94% and 4.17%, respectively), while the incidence of perforation was lowest (1.38%). Univariate logistic regression analysis showed that the risk of adverse events was significantly higher in patients who underwent cholangiopancreatography and sphincterotomy than in those not undergoing these procedures (OR=13.642, P=0.017; OR=7.381, P=0.000); guide wire insertion and cholangiopancreatography also increased the incidence of adverse reactions (OR=8.042, P=0.000; OR=2.651, P=0.032). Multivariate logistic regression analysis showed that guide wire insertion (OR=4.547, 95%CI: 1.076-12.543) and biliary sphincterotomy (OR=5.023, 95%CI: 2.643-18.321) are associated with the incidence of ERCP-related adverse events. Conclusion Sphincterotomy and guide wire insertion can increase the risk of adverse events in PSC patients after ERCP.
Increased incidence and prevalence of psoriasis in multiple sclerosis.
Marrie, Ruth Ann; Patten, Scott B; Tremlett, Helen; Wolfson, Christina; Leung, Stella; Fisk, John D
2017-04-01
Psoriasis and multiple sclerosis (MS) share some risk factors, and fumarates are effective disease-modifying therapies for both psoriasis and MS, suggesting a common pathogenesis. However, findings regarding the occurrence of psoriasis in the MS population are inconsistent. We aimed to estimate the incidence and prevalence of psoriasis in the MS population versus a matched cohort from the general population. We used population-based administrative data from the Canadian province of Manitoba to identify 4911 persons with MS and 23,274 age-, sex- and geographically-matched controls aged 20 years and older. We developed case definitions for psoriasis using ICD-9/10 codes and prescription claims. These case definitions were compared to self-reported psoriasis diagnoses. The preferred definition was applied to estimate the incidence and prevalence of psoriasis over the period 1998-2008. We used multivariable Cox regression to estimate the risk of psoriasis in the MS population at the individual level, adjusting for sex, age at the index date, socioeconomic status and physician visits. In 2008, the crude incidence of psoriasis per 100,000 person-years was 466.7 (95%CI: 266.8-758.0) in the MS population, and 221.3 in the matched population (95%CI: 158.1-301.4). The crude prevalence of psoriasis per 100,000 persons was 4666.1 (95%CI: 3985.2-5429.9) in the MS population, and 3313.5 (95%CI: 3057.4-3585.3) in the matched population. The incidence and prevalence of psoriasis rose slightly over time. After adjusting for sex, age at the index date, socioeconomic status and physician visits, the risk of incident psoriasis was 54% higher in the MS population (HR 1.54; 95%CI: 1.07-2.24). Psoriasis incidence and prevalence are higher in the MS population than in the matched population. Copyright © 2017. Published by Elsevier B.V.
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
Incident alopecia areata and vitiligo in adult women with atopic dermatitis: Nurses' Health Study 2.
Drucker, A M; Thompson, J M; Li, W-Q; Cho, E; Li, T; Guttman-Yassky, E; Qureshi, A A
2017-05-01
We aimed to determine the risk of alopecia areata (AA) and vitiligo associated with atopic dermatitis (AD) in a large cohort of US women, the Nurses' Health Study 2. We used logistic regression to calculate age- and multivariate-adjusted odds ratios to determine the risk of incident AA and vitiligo associated with AD diagnosed in or before 2009. A total of 87 406 and 87 447 participants were included in the AA and vitiligo analyses, respectively. A history of AD in 2009 was reported in 11% of participants. There were 147 incident cases of AA and 98 incident cases of vitiligo over 2 years of follow-up. AD was associated with increased risk of developing AA (OR 1.80, 95% CI 1.18-2.76) and vitiligo (OR 2.14, 95% CI 1.29-3.54) in multivariate models. In this study of US women, AD was associated with increased risk of incident vitiligo and AA in adulthood. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Sparse reduced-rank regression with covariance estimation
Chen, Lisha
2014-12-08
Improving the predicting performance of the multiple response regression compared with separate linear regressions is a challenging question. On the one hand, it is desirable to seek model parsimony when facing a large number of parameters. On the other hand, for certain applications it is necessary to take into account the general covariance structure for the errors of the regression model. We assume a reduced-rank regression model and work with the likelihood function with general error covariance to achieve both objectives. In addition we propose to select relevant variables for reduced-rank regression by using a sparsity-inducing penalty, and to estimate the error covariance matrix simultaneously by using a similar penalty on the precision matrix. We develop a numerical algorithm to solve the penalized regression problem. In a simulation study and real data analysis, the new method is compared with two recent methods for multivariate regression and exhibits competitive performance in prediction and variable selection.
Lecture notes on ridge regression
van Wieringen, Wessel N.
2015-01-01
The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmenting the loss function with a penalty (i.e. a function of the regression coefficients). The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspects of ridge regression are reviewed, e.g. moments, mean squared error, its equivalence to co...
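The penalized fit described in this abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the lecture notes: the function name `ridge_fit`, the penalty values and the simulated data are all assumptions made for the example.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge estimate: minimizes ||y - X b||^2 + lam * ||b||^2."""
    p = X.shape[1]
    # The penalty adds lam to the diagonal of X'X, so the system is
    # solvable for any lam > 0 even when X'X itself is singular.
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Simulated high-dimensional data (hypothetical): more covariates than observations.
rng = np.random.default_rng(0)
n, p = 20, 50
X = rng.normal(size=(n, p))
y = X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n)

rank_xtx = np.linalg.matrix_rank(X.T @ X)   # at most n, so X'X is not invertible
beta_small = ridge_fit(X, y, lam=0.1)       # light shrinkage
beta_large = ridge_fit(X, y, lam=100.0)     # heavy shrinkage
```

Because `rank_xtx` cannot exceed `n`, ordinary least squares has no unique solution here, whereas both ridge fits are well defined; the larger penalty pulls the coefficient vector closer to zero.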
Directory of Open Access Journals (Sweden)
Liyun Su
2010-01-01
obtaining the point spread function (PSF) parameter, an iterative Wiener filter is adopted to complete the restoration. We experimentally illustrate its performance on simulated data and a real blurred image. Results show that the proposed PSF parameter estimation technique and the image restoration method are effective.
Tomás-Rodríguez, María I; Palazón-Bru, Antonio; Martínez-St John, Damian R J; Navarro-Cremades, Felipe; Toledo-Marhuenda, José V; Gil-Guillén, Vicente F
2017-04-01
In the literature about primary dysmenorrhea (PD), either a pain gradient has been studied just in women with PD or pain was assessed as a binary variable (presence or absence). Accordingly, we decided to carry out a study in young women to determine possible factors associated with intense pain. A cross-sectional observational study. A Spanish University in 2016. A total of 306 women, aged 18-30 years. A questionnaire was filled in by the participants to assess associated factors with dysmenorrhoea. Our outcome measure was the Andersch and Milsom scale (grades 0 to 3): grade 0 (menstruation is not painful and daily activity is unaffected), grade 1 (menstruation is painful but seldom inhibits normal activity, analgesics are seldom required, and mild pain), grade 2 (daily activity affected, analgesics required and give relief so that absence from work or school is unusual, and moderate pain), and grade 3 (activity clearly inhibited, poor effect of analgesics, vegetative symptoms and severe pain). Factors significantly associated with more extreme pain: a higher menstrual flow (odds ratio [OR], 2.11; P < .001), a worse quality of life (OR, 0.97; P < .001) and use of medication for PD (OR, 8.22; P < .001). We determined factors associated with extreme pain in PD in a novel way. Further studies are required to corroborate our results. Copyright © 2016 North American Society for Pediatric and Adolescent Gynecology. Published by Elsevier Inc. All rights reserved.
Terjung, B.; Bogsch, F.; Klein, R.; Söhne, J.; Reichel, C.; Wasmuth, J.-C.; Beuers, U.; Sauerbruch, T.; Spengler, U.
2004-01-01
INTRODUCTION: Antineutrophil cytoplasmic antibodies (atypical p-ANCA) are detected at high prevalence in sera from patients with autoimmune hepatitis (AIH), but their diagnostic relevance for AIH has not been systematically evaluated so far. METHODS: Here, we studied sera from 357 patients with
CRD and beyond: multivariable regression models to predict severity of hazelnut allergy
Datema, Mareen R.; van Ree, Ronald; Asero, Riccardo; Barreales, Laura; Belohlavkova, Simona; de Blay, Frédéric; Clausen, Michael; Dubakiene, Ruta; Fernández-Perez, Cristina; Fritsche, Philipp; Gislason, David; Hoffmann-Sommergruber, Karin; Jedrzejczak-Czechowicz, Monika; Jongejan, Laurian; Knulst, André C.; Kowalski, Marek; Kralimarkova, Tanya Z.; Le, Thuy-My; Lidholm, Jonas; Papadopoulos, Nikolaos G.; Popov, Todor A.; del Prado, Nayade; Purohit, Ashok; Reig, Isabel; Seneviratne, Suranjith L.; Sinaniotis, Athanassios; Versteeg, Serge A.; Vieths, Stefan; Zwinderman, A. H.; Clare Mills, E. N.; Fernández-Rivas, Montserrat; Ballmer-Weber, Barbara
2017-01-01
Component-resolved diagnosis (CRD) has revealed significant associations between IgE against individual allergens and severity of hazelnut allergy. Less attention has been given to combining them with clinical factors in predicting severity. To analyze associations between severity and sensitization
National Research Council Canada - National Science Library
Aberg, P
2001-01-01
... before and after application of chemicals on volar forearms of volunteers. Tegobetaine and sodium lauryl sulphate were used to induce the irritations. The spectra were filtered using orthogonal signal correction (OSC...
Veale, A.J.; Xie, Sheng Quan; Anderson, Iain Alexander
2017-01-01
Wearable exoskeletons and soft robots require actuators with muscle-like compliance. These actuators can benefit from the robust and effective interaction that biological muscles' compliance enables them to have in the uncertainty of the real world. Fluidic muscles are compliant but difficult to
Directory of Open Access Journals (Sweden)
Jiabin Chen
2015-08-01
Conclusion: Differentiation between borderline and invasive ovarian tumors can be achieved using a model based on the following criteria: menopausal status; cancer antigen 125 level; and ultrasound parameters. The model is helpful to oncologists and patients in the initial evaluation phase of ovarian tumors.
Mehta, Supriya D.; Moses, Stephen; Parker, Corette B.; Agot, Kawango; Maclean, Ian; Bailey, Robert C.
2013-01-01
Objective We assessed the protective effect of medical male circumcision (MMC) against HIV, herpes simplex virus type 2 (HSV-2), and genital ulcer disease (GUD) incidence. Design Two thousand, seven hundred and eighty-seven men aged 18–24 years living in Kisumu, Kenya were randomly assigned to circumcision (n = 1391) or delayed circumcision (n = 1393) and assessed by HIV and HSV-2 testing and medical examinations during follow-ups at 1, 3, 6, 12, 18, and 24 months. Methods Cox regression estimated the risk ratio of each outcome (incident HIV, GUD, HSV-2) for circumcision status and multivariable models estimated HIV risk associated with HSV-2, GUD, and circumcision status as time-varying covariates. Results HIV incidence was 1.42 per 100 person-years. Circumcision was 62% protective against HIV [risk ratio = 0.38; 95% confidence interval (CI) 0.22–0.67] and did not change when controlling for HSV-2 and GUD (risk ratio = 0.39; 95% CI 0.23–0.69). GUD incidence was halved among circumcised men (risk ratio = 0.52; 95% CI 0.37–0.73). HSV-2 incidence did not differ by circumcision status (risk ratio = 0.94; 95% CI 0.70–1.25). In the multivariable model, HIV seroconversions were tripled (risk ratio = 3.44; 95% CI 1.52–7.80) among men with incident HSV-2 and seven times greater (risk ratio = 6.98; 95% CI 3.50–13.9) for men with GUD. Conclusion Contrary to findings from the South African and Ugandan trials, the protective effect of MMC against HIV was independent of GUD and HSV-2, and MMC had no effect on HSV-2 incidence. Determining the causes of GUD is necessary to reduce associated HIV risk and to understand how circumcision confers protection against GUD and HIV. PMID: 22382150
The incidence of adjacent segment disease after lumbar discectomy: A study of 751 patients.
Bydon, Mohamad; Macki, Mohamed; Kerezoudis, Panagiotis; Sciubba, Daniel M; Wolinsky, Jean-Paul; Witham, Timothy F; Gokaslan, Ziya L; Bydon, Ali
2017-01-01
The objective of this study is to determine the incidence and prognostic factors of adjacent segment disease (ASD) following first-time lumbar discectomy (LD). We retrospectively reviewed all neurosurgical patients who underwent first-time LD for degenerative lumbar disease from 1990 to 2012. ASD was defined as a clinical and radiographic progression of degenerative spinal disease that required surgical decompression (with or without fusion) at the level above or below the index discectomy. Adjusted odds ratios were calculated from multivariable logistic regression controlling for sex and age, as well as postoperative sensory deficit, motor deficit, back pain, neurogenic claudication, and radiculopathy. Of the 751 patients who underwent single-level LD, the cumulative reoperation rate for degenerative spinal disease was 10.79%. The incidence of ASD requiring reoperation was 4% over 3.11 years. More specifically, the incidence of adjacent level discectomy was 1.86% over 3.45 years. The annualized reoperation rate for ASD was 1.35% (1.35 ASD reoperations per 100 person-years). The 63.33% incidence of cranial ASD requiring reoperation was statistically significantly higher than the 40.00% incidence of caudal ASD requiring reoperation. Following multivariable logistic regression, the strongest (and only) statistically significant predictor of ASD requiring reoperation was lower extremity radiculopathy after the index discectomy operation (OR=14.23, p<0.001). In the first series on ASD following first-time LD without fusion, the rate of reoperation for ASD was 4% and the cumulative reoperation rate 10.79%. Rostral ASD is more common than caudal ASD and lower extremity radiculopathy is the strongest predictor of ASD. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
Multivariate approaches in plant science
DEFF Research Database (Denmark)
Gottlieb, D.M.; Schultz, J.; Bruun, Susanne Wrang
2004-01-01
... is to encircle the identity of proteins of interest. However, the overall relation between proteins must also be explained. Classical proteomics consist of separation and characterization, based on two-dimensional electrophoresis, trypsin digestion, mass spectrometry and database searching. Characterization includes labor intensive work in order to manage, handle and analyze data. The field of classical proteomics should therefore be extended to also include handling of large datasets in an objective way. The separation obtained by two-dimensional electrophoresis and mass spectrometry gives rise to huge amount of data. We present a multivariate approach to the handling of data in proteomics with the advantage that protein patterns can be spotted at an early stage and consequently the proteins selected for sequencing can be selected intelligently. These methods can also be applied to other data...
Simulation of multivariate diffusion bridges
DEFF Research Database (Denmark)
Bladt, Mogens; Finch, Samuel; Sørensen, Michael
We propose simple methods for multivariate diffusion bridge simulation, which plays a fundamental role in simulation-based likelihood and Bayesian inference for stochastic differential equations. By a novel application of classical coupling methods, the new approach generalizes a previously proposed simulation method for one-dimensional bridges to the multivariate setting. First a method of simulating approximate, but often very accurate, diffusion bridges is proposed. These approximate bridges are used as proposal for easily implementable MCMC algorithms that produce exact diffusion bridges. The new method is much more generally applicable than previous methods. Another advantage is that the new method works well for diffusion bridges in long intervals because the computational complexity of the method is linear in the length of the interval. In a simulation study the new method performs well...
DEFF Research Database (Denmark)
Barndorff-Nielsen, Ole Eiler; Stelzer, Robert
Univariate superpositions of Ornstein-Uhlenbeck (OU) type processes, called supOU processes, provide a class of continuous time processes capable of exhibiting long memory behaviour. This paper introduces multivariate supOU processes and gives conditions for their existence and finiteness of moments. Moreover, the second order moment structure is explicitly calculated, and examples exhibit the possibility of long range dependence. Our supOU processes are defined via homogeneous and factorisable Lévy bases. We show that the behaviour of supOU processes is particularly nice when the mean reversion parameter is restricted to normal matrices and especially to strictly negative definite ones. For finite variation Lévy bases we are able to give conditions for supOU processes to have locally bounded càdlàg paths of finite variation and to show an analogue of the stochastic differential equation...
Multivariate methods for particle identification
Visan, Cosmin
2013-01-01
The purpose of this project was to evaluate several multivariate methods in order to determine which one, if any, offers better results in Particle Identification (PID) than a simple n$\sigma$ cut on the response of the ALICE PID detectors. The particles considered in the analysis were Pions, Kaons and Protons and the detectors used were TPC and TOF. When used with the same input n$\sigma$ variables, the results show similar performance between the Rectangular Cuts Optimization method and the simple n$\sigma$ cuts. The MLP and BDT methods show poor results for certain ranges of momentum. The KNN method is the best performing, showing similar results for Pions and Protons as the Cuts method, and better results for Kaons. The extension of the methods to include additional input variables leads to poor results, related to instabilities still to be investigated.
Acoustic multivariate condition monitoring - AMCM
Energy Technology Data Exchange (ETDEWEB)
Rosenhave, P.E. [Vestfold College, Maritime Dept., Toensberg (Norway)
1997-12-31
In Norway, Vestfold College, Maritime Department presents new opportunities for non-invasive, on- or off-line acoustic monitoring of rotating machinery such as off-shore pumps and diesel engines. New developments within acoustic sensor technology coupled with chemometric data analysis of complex signals now allow condition monitoring of hitherto unavailable flexibility and diagnostic specificity. Chemometrics paired with existing knowledge yields a new and powerful tool for condition monitoring. By the use of multivariate techniques and acoustics it is possible to quantify wear and tear as well as predict the performance of working components in complex machinery. This presentation describes the AMCM method and one result of a feasibility study conducted onboard the LPG/C `Norgas Mariner` owned by Norwegian Gas Carriers as (NGC), Oslo. (orig.) 6 refs.
Incidence and Persistence of Major Depressive Disorder Among People Living with HIV in Uganda.
Kinyanda, Eugene; Weiss, Helen A; Levin, Jonathan; Nakasujja, Noeline; Birabwa, Harriet; Nakku, Juliet; Mpango, Richard; Grosskurth, Heiner; Seedat, Soraya; Araya, Ricardo; Patel, Vikram
2017-06-01
Data on the course of major depressive disorder (MDD) among people living with HIV (PLWH) are needed to inform refinement of screening and interventions for MDD. This paper describes the incidence and persistence rate of MDD in PLWH in Uganda. 1099 ART-naïve PLWH attending HIV clinics in Uganda were followed up for 12 months. MDD was assessed using the DSM IV based Mini-International Neuropsychiatric Interview with a prevalence for MDD at baseline of 14.0 % (95 % CI 11.7-16.3 %) reported. Multivariable logistic regression was used to determine predictors of incident and persistent MDD. Cumulative incidence of MDD was 6.1 per 100 person-years (95 % CI 4.6-7.8) with significant independent predictors of study site, higher baseline depression scores and increased stress. Persistence of MDD was 24.6 % (95 % CI 17.9-32.5 %) with independent significant predictors of study site, higher baseline depression scores, and increased weight. Risks of incident and persistent MDD observed in this study were high. Potentially modifiable factors of elevated baseline depressive scores and stress (only for incident MDD) were important predictors of incident and persistent MDD.
Elliptical multiple-output quantile regression and convex optimization
Czech Academy of Sciences Publication Activity Database
Hallin, M.; Šiman, Miroslav
2016-01-01
Vol. 109, No. 1 (2016), pp. 232-237 ISSN 0167-7152 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : quantile regression * elliptical quantile * multivariate quantile * multiple-output regression Subject RIV: BA - General Mathematics Impact factor: 0.540, year: 2016 http://library.utia.cas.cz/separaty/2016/SI/siman-0458243.pdf
Calculating a Stepwise Ridge Regression.
Morris, John D.
1986-01-01
Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…
Incidence, 10-year recidivism rate and prognostic factors for cholesteatoma.
Britze, A; Møller, M L; Ovesen, T
2017-04-01
Cholesteatoma patients have a high risk of recurrence with complications, and knowledge exchange is a prerequisite for improving treatment. This study aimed to apply appropriate statistics to provide meaningful and transferable results from cholesteatoma surgery, to highlight independent prognostic factors, and to assess the incidence rate. Incidence rates were assessed for the district of Aarhus, Denmark. From 147 patients operated on mainly with canal wall up mastoidectomies for debuting cholesteatomas, 10-year Kaplan-Meier recidivism rates were calculated and independent prognostic factors for the recidivism were identified by Cox multivariate regression analyses. Incidence rate was 6.8 per 100 000 per year. The 10-year cumulative recidivism rate was 0.44 (95 per cent confidence interval, 0.37-0.53). Independent prognostic factors for the recidivism were: age below 15 years (hazard ratio = 2.2; p > z = 0.002), cholesteatoma localised to the mastoid (hazard ratio = 1.7; p > z = 0.04), stapes erosion (hazard ratio = 1.9; p > z = 0.02) and incus erosion (hazard ratio = 1.9; p > z = 0.04). The recidivism rate is influenced by several factors that are important to observe, both in the clinic and when comparing results from surgery.
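The Kaplan-Meier recidivism rates mentioned in this abstract rest on the product-limit estimator, which can be sketched as follows. This is an illustrative implementation only: the function name `kaplan_meier` and the follow-up data are hypothetical and are not taken from the study.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return distinct event times and the Kaplan-Meier survival estimate at each.

    times  : follow-up time for each subject
    events : 1 if the event (e.g. recidivism) was observed, 0 if censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                # still under observation just before t
        d = np.sum((times == t) & (events == 1))    # events occurring exactly at t
        s *= 1.0 - d / at_risk                      # product-limit update
        surv.append(s)
    return event_times, np.array(surv)

# Hypothetical follow-up data in years (not from the study).
times = [1, 2, 2, 3, 5, 6, 8, 10, 10, 10]
events = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]
t, s = kaplan_meier(times, events)
cumulative_recidivism = 1.0 - s   # cumulative event rate at each event time
```

Censored subjects leave the risk set without counting as events, which is why the estimate differs from the crude proportion of observed events.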
Factors Associated with Incidence of Induced Abortion in Hamedan, Iran.
Hosseini, Hatam; Erfani, Amir; Nojomi, Marzieh
2017-05-01
There is limited reliable information on abortion in Iran, where abortion is illegal and many women of reproductive age seek clandestine abortion to end their unintended pregnancy. This study aims to examine the determinants of induced abortion in the city of Hamedan, Iran. The study utilizes recent data from the 2015 Hamedan Survey of Fertility, conducted in a representative sample of 3,000 married women aged 15-49 years in the city of Hamedan, Iran. Binary logistic regression models are used to examine factors associated with the incidence of abortion. Overall, 3.8% of respondents reported having had an induced abortion in their life. Multivariate results showed that the incidence of abortion was strongly associated with women's education, type of contraceptive and family income level, after controlling for confounding factors. Women using long-acting contraceptive methods, those educated under high school diploma or postsecondary education, and those with high level of income were more likely to report having an induced abortion. The high incidence of abortion among less or more educated women and those with high income level signifies unmet family planning needs among these women, which must be addressed by focused reproductive health and family planning programs.
Sexually transmitted infection incidence among adolescents in Ireland.
Davoren, Martin P; Hayes, Kevin; Horgan, Mary; Shiely, Frances
2014-10-01
The burden of sexually transmitted infections (STIs) rests with young people, yet in Ireland there has been very little research into this population. The purpose of this study was to determine the incidence rate and establish risk factors that predict STI occurrence among adolescents in Ireland. Routine diagnostic, demographic and behavioural data from first-time visits to three screening centres in the southwest of Ireland were obtained. Univariate and multivariable logistic regression models were used to assess risk factors that predict STI occurrence among adolescents. A total of 2784 first-time patients, aged 13-19 years, received 3475 diagnoses between January 1999 and September 2009; 1168 (42%) of adolescents had notifiable STIs. The incidence rate of STIs is 225/100 000 person-years. Univariate analysis identified eligible risk factors (pIreland. The proportion of notifications among those aged under 20 years is increasing. These data illustrate the significance of age, condom use and number of sexual partners as risk factors for STI diagnosis. Furthermore, providing data for the first time, we report on the high incidence rate of STIs among adolescents in Ireland. The high levels of risk-taking behaviour and STI acquisition are highlighted and suggest that there is a need for an integrated public health approach to combat this phenomenon in the adolescent population. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Multivariate strategies in functional magnetic resonance imaging
DEFF Research Database (Denmark)
Hansen, Lars Kai
2007-01-01
We discuss aspects of multivariate fMRI modeling, including the statistical evaluation of multivariate models and means for dimensional reduction. In a case study we analyze linear and non-linear dimensional reduction tools in the context of a `mind reading' predictive multivariate fMRI model....
Multivariate Exponential Autoregressive and Autoregressive Moving ...
African Journals Online (AJOL)
Autoregressive (AR) and autoregressive moving average (ARMA) processes with multivariate exponential (ME) distribution are presented and discussed. The theory of positive dependence is used to show that in many cases, multivariate exponential autoregressive (MEAR) and multivariate autoregressive moving average ...
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study considers widely used indicators of the predictive power of the generalized linear model (GLM), which often have restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient in terms of bias and root mean square error (RMSE).
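As a minimal sketch of the traditional (unmodified) quantity described in this abstract, the regression correlation coefficient can be read as the Pearson correlation between Y and the fitted E(Y|X). The example below assumes a single predictor, a log link, and a small made-up data set; the function names are hypothetical, and the modified coefficient proposed by the authors is not reproduced because the abstract does not give its formula.

```python
import math

def fit_poisson(x, y, iters=100, tol=1e-12):
    """Fit log E[Y|x] = b0 + b1*x by Newton-Raphson (Fisher scoring)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector of the Poisson log-likelihood
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # 2x2 Fisher information matrix
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    sab = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    saa = sum((ai - ma) ** 2 for ai in a)
    sbb = sum((bi - mb) ** 2 for bi in b)
    return sab / math.sqrt(saa * sbb)

# Made-up illustrative data (not from the study)
x = [0, 1, 2, 3, 4, 5]
y = [1, 2, 2, 5, 7, 12]
b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
r = pearson(y, fitted)  # traditional regression correlation coefficient
```

With an intercept in the model, the fitted means sum to the observed counts at convergence, which is a quick sanity check on the fit.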
Multinomial logistic regression ensembles.
Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J
2013-05-01
This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models the proposed method can handle a huge database without a constraint needed for analyzing high-dimensional data, and the random partition can improve the prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance including overall prediction accuracy, sensitivity, and specificity for each category is evaluated on two real data sets and simulation data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to a single multinomial logit model and it shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and random multinomial logit model.
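The partitioning idea in this abstract can be sketched as follows. This is only an illustration of splitting predictors into mutually exclusive random subsets and combining per-subset class probabilities; the helper names are hypothetical, combining by simple averaging is an assumption (the abstract does not state the combination rule), and the base multinomial logit fit itself is omitted.

```python
import random

def random_partition(n_features, n_subsets, seed=0):
    """Split feature indices 0..n_features-1 into mutually exclusive subsets."""
    rng = random.Random(seed)
    idx = list(range(n_features))
    rng.shuffle(idx)
    # Deal the shuffled indices round-robin so subset sizes differ by at most one
    return [sorted(idx[i::n_subsets]) for i in range(n_subsets)]

def average_probabilities(prob_lists):
    """Combine class-probability vectors from the base models (assumed rule: mean)."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]

subsets = random_partition(n_features=10, n_subsets=3, seed=42)
# Hypothetical outputs of three base classifiers for one observation
combined = average_probabilities([[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.6, 0.1, 0.3]])
```

Each base model would be trained only on the columns listed in its subset, so no variable selection is needed within a subset.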
Energy Technology Data Exchange (ETDEWEB)
Hall, Matthew D. [Department of Radiation Oncology, City of Hope National Medical Center, Duarte, California (United States); Schultheiss, Timothy E., E-mail: schultheiss@coh.org [Department of Radiation Oncology, City of Hope National Medical Center, Duarte, California (United States); Smith, David D. [Division of Biostatistics, City of Hope National Medical Center, Duarte, California (United States); Nguyen, Khanh H. [Department of Radiation Oncology, City of Hope National Medical Center, Duarte, California (United States); Department of Radiation Oncology, Bayhealth Cancer Center, Dover, Delaware (United States); Wong, Jeffrey Y.C. [Department of Radiation Oncology, City of Hope National Medical Center, Duarte, California (United States)
2015-01-01
Purpose/Objective(s): To perform a meta-regression on published data and to model the 5-year probability of cataract development after hematopoietic stem cell transplantation (HSCT) with and without total body irradiation (TBI). Methods and Materials: Eligible studies reporting cataract incidence after HSCT with TBI were identified by a PubMed search. Seventeen publications provided complete information on radiation dose schedule, fractionation, dose rate, and actuarial cataract incidence. Chemotherapy-only regimens were included as zero radiation dose regimens. Multivariate meta-regression with a weighted generalized linear model was used to model the 5-year cataract incidence and contributory factors. Results: Data from 1386 patients in 21 series were included for analysis. TBI was administered to a total dose of 0 to 15.75 Gy with single or fractionated schedules with a dose rate of 0.04 to 0.16 Gy/min. Factors significantly associated with 5-year cataract incidence were dose, dose times dose per fraction (D•dpf), pediatric versus adult status, and the absence of an ophthalmologist as an author. Dose rate, graft versus host disease, steroid use, hyperfractionation, and number of fractions were not significant. Five-fold internal cross-validation showed a model validity of 83% ± 8%. Regression diagnostics showed no evidence of lack-of-fit and no patterns in the studentized residuals. The α/β ratio from the linear quadratic model, estimated as the ratio of the coefficients for dose and D•dpf, was 0.76 Gy (95% confidence interval [CI], 0.05-1.55). The odds ratio for pediatric patients was 2.8 (95% CI, 1.7-4.6) relative to adults. Conclusions: Dose, D•dpf, pediatric status, and regimented follow-up care by an ophthalmologist were predictive of 5-year cataract incidence after HSCT. The low α/β ratio indicates the importance of fractionation in reducing cataracts. Dose rate effects have been observed in single institution studies but not in the
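The α/β estimate described above is a ratio of two regression coefficients. The sketch below shows that arithmetic with hypothetical coefficient values chosen only so that the ratio equals the reported 0.76 Gy; the biologically effective dose (BED) formula is the standard linear-quadratic expression and is an added illustration, not something taken from the abstract. The two dose schedules are also hypothetical.

```python
def alpha_beta_ratio(coef_dose, coef_dose_x_dpf):
    """alpha/beta as the ratio of the dose and dose*dose-per-fraction coefficients."""
    return coef_dose / coef_dose_x_dpf

def biologically_effective_dose(total_dose, dose_per_fraction, ab_ratio):
    """Standard linear-quadratic BED = D * (1 + d / (alpha/beta))."""
    return total_dose * (1.0 + dose_per_fraction / ab_ratio)

# Hypothetical coefficients (assumption), picked to reproduce 0.76 Gy
ab = alpha_beta_ratio(0.038, 0.050)

# Hypothetical schedules: 12 Gy in 2 Gy fractions versus 10 Gy in a single fraction
bed_fractionated = biologically_effective_dose(12.0, 2.0, ab)
bed_single = biologically_effective_dose(10.0, 10.0, ab)
```

With a low α/β ratio, the single-fraction schedule has a much larger BED despite its lower total dose, which is consistent with the abstract's remark on the importance of fractionation.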
Assessing risk factors for periodontitis using regression
Lobo Pereira, J. A.; Ferreira, Maria Cristina; Oliveira, Teresa
2013-10-01
Multivariate statistical analysis is indispensable to assess the associations and interactions between different factors and the risk of periodontitis. Among others, regression analysis is a statistical technique widely used in healthcare to investigate and model the relationship between variables. In our work we study the impact of socio-demographic, medical and behavioral factors on periodontal health. Using linear and logistic regression models, we can assess the relevance, as risk factors for periodontitis, of the following independent variables (IVs): Age, Gender, Diabetic Status, Education, Smoking status and Plaque Index. The multiple linear regression analysis model was built to evaluate the influence of IVs on mean Attachment Loss (AL). Thus, the regression coefficients will be obtained along with the respective p-values from the significance tests. The classification of a case (individual) adopted in the logistic model was the extent of the destruction of periodontal tissues defined by an Attachment Loss greater than or equal to 4 mm in 25% (AL≥4mm/≥25%) of sites surveyed. The association measures include the Odds Ratios together with the corresponding 95% confidence intervals.
Lee, Tsair-Fwu; Liou, Ming-Hsiang; Huang, Yu-Jie; Chao, Pei-Ju; Ting, Hui-Min; Lee, Hsiao-Yi
2014-01-01
To predict the incidence of moderate-to-severe patient-reported xerostomia among head and neck squamous cell carcinoma (HNSCC) and nasopharyngeal carcinoma (NPC) patients treated with intensity-modulated radiotherapy (IMRT). Multivariable normal tissue complication probability (NTCP) models were developed by using quality of life questionnaire datasets from 152 patients with HNSCC and 84 patients with NPC. The primary endpoint was defined as moderate-to-severe xerostomia after IMRT. The numbers of predictive factors for a multivariable logistic regression model were determined using the least absolute shrinkage and selection operator (LASSO) with bootstrapping technique. Four predictive models were obtained by LASSO with the smallest number of factors while preserving predictive value with higher AUC performance. For all models, the dosimetric factors for the mean dose given to the contralateral and ipsilateral parotid gland were selected as the most significant predictors. These were followed by different clinical and socio-economic factors, namely age, financial status, T stage, and education, selected for the different models. The predicted incidence of xerostomia for HNSCC and NPC patients can be improved by using multivariable logistic regression models with LASSO technique. The predictive model developed in HNSCC cannot be generalized to the NPC cohort treated with IMRT without validation and vice versa. PMID:25163814
Nishijima, Takeshi; Teruya, Katsuji; Shibata, Satoshi; Yanagawa, Yasuaki; Kobayashi, Taiichiro; Mizushima, Daisuke; Aoki, Takahiro; Kinai, Ei; Yazaki, Hirohisa; Tsukada, Kunihisa; Genka, Ikumi; Kikuchi, Yoshimi; Oka, Shinichi; Gatanaga, Hiroyuki
2016-01-01
Background The epidemiology of incident syphilis infection among HIV-1-infected men who have sex with men (MSM) largely remains unknown. Methods The incidence and risk factors for incident syphilis (positive TPHA and RPR ≥ 1:8) among HIV-1-infected MSM who visited a large HIV clinic in Tokyo for the first time between 2008 and 2013 were determined, using clinical data and stored blood samples taken every three months for screening and determination of the date of incident syphilis. Poisson regression compared the incidence of syphilis at different observation periods. Results Of 885 HIV-1-infected MSM with baseline data, 34% either presented with active syphilis at baseline (21%) or became infected with syphilis during follow-up (13%). After excluding 214 patients (MSM with syphilis at baseline (n = 190) and no follow-up syphilis test (n = 24)), of 671 men, 112 (17%) developed incident syphilis with an incidence of 43.7/1,000 person-years [95% CI, 36.5–52.3]. The incidence decreased slightly during observation period although the trend was not significant (2008–2009: 48.2/1,000 person-years, 2010–2011: 51.1/1,000 person-years, 2012–2013: 42.6/1,000 person-years, 2014 to 2015: 37.9/1,000 person-years, p = 0.315). Multivariable analysis identified young age (40, HR 4.0, 95%CI 2.22–7.18, p<0.001), history of syphilis at baseline (HR 3.0, 95%CI 2.03–4.47, p<0.001), positive anti-amoeba antibody (HR 1.8, 95%CI 1.17–2.68, p = 0.006), and high baseline CD4 count (CD4 ≥350 /μL versus CD4 <200, HR 1.6, 95%CI 1.00–2.53, p = 0.050) as risk factors for incident syphilis. Incidence of syphilis was particularly high among young patients (age <33 years: 60.1/1,000 person-years). Interestingly, 37% of patients with incident syphilis were asymptomatic. Conclusions Although incidence of syphilis did not increase during the observation period, it was high among HIV-1-infected MSM, especially among young HIV-1-infected MSM and those with history of syphilis, in Tokyo. Regular screening for syphilis needs to be strictly applied to this population. PMID:27992604
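The incidence figures in this abstract follow the usual events-per-person-time arithmetic. The sketch below recomputes a rate per 1,000 person-years with an approximate log-scale (Wald) 95% confidence interval. The person-years value is an assumption, back-calculated from the reported 112 events and 43.7/1,000 person-years, and the interval method is a common approximation that may differ from the one the authors used.

```python
import math

def incidence_rate_ci(events, person_years, per=1000.0, z=1.96):
    """Incidence rate and approximate log-scale CI, scaled to `per` person-years."""
    rate = events / person_years
    se_log = 1.0 / math.sqrt(events)  # SE of log(rate) under a Poisson count
    lower = rate * math.exp(-z * se_log)
    upper = rate * math.exp(z * se_log)
    return rate * per, lower * per, upper * per

# 112 incident cases; 2563 person-years is back-calculated (assumption)
rate, lower, upper = incidence_rate_ci(events=112, person_years=2563.0)
```

The approximate interval lands close to the reported 36.5 to 52.3 per 1,000 person-years; small differences are expected because exact Poisson limits are not symmetric on the log scale.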
Flexible competing risks regression modeling and goodness-of-fit
DEFF Research Database (Denmark)
Scheike, Thomas; Zhang, Mei-Jie
2008-01-01
In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause-specific hazards. Another recent approach is to directly model the cumulative incidence by a proportional model (Fine and Gray, J Am Stat Assoc 94:496-509, 1999), and then obtain direct estimates of how covariates influence the cumulative incidence curve. We consider a simple and flexible class of regression...
Pinto, Guido; Beltrán-Sánchez, Hiram
2015-01-01
To prospectively assess the relationship between overweight/obesity and incidence of type 2 diabetes mellitus (T2DM) among Mexicans aged 50+, assessing effects of age, genetic predisposition, education, physical activity, and place of residence. The Mexican Health and Aging Study (MHAS) was used to prospectively follow respondents free of diabetes in 2001 who became diabetic by 2012. Multivariate random effects logistic regression was used to assess covariate effects on the incidence of T2DM. Obese or overweight individuals at baseline (2001) were about 3 and 2 times, respectively, significantly more likely to become diabetic by 2012. Genetic predisposition increases the risk of diabetes by about three times compared to those with no family history of diabetes. Overweight/obesity and genetic predisposition are the primary drivers of diabetes incidence among Mexican older adults. Reducing body weight and having access to health care may ameliorate the disease burden of T2DM.
Multivariate Analysis in Nuclear Physics
Désesquelles, P.
Nuclear physics deals more and more with experiments involving a large number of parameters. The analysis of such experiments requires well adapted statistical techniques. Multivariate analysis techniques represent the experimental events as points in the multidimensional space of the physical variables. One aim is to treat experimental information as a whole. This formalism permits the simultaneous study of the structures of the event cloud and of the correlations between the variables. Principal Component Analysis (PCA) is concerned with the determination of the so-called principal variables, linear combinations of the primary physical variables, which represent the maximum information. Correspondence Analysis visualises, on a 2D diagram, the correlations between the modalities of qualitative variables. The goal of Discriminant Analysis is to discriminate between different types of events, that is, to assign them to a family. The last part of the work is devoted to a global protocol, involving PCA, for the comparison of experimental data with data generated by a simulation code. Lecture given at the Joliot-Curie summer school on nuclear physics (Maubuisson, France, Sept. 1994).
Incidence of scleritis and episcleritis: results from the Pacific Ocular Inflammation Study.
Homayounfar, Gelareh; Nardone, Natalie; Borkar, Durga S; Tham, Vivien M; Porco, Travis C; Enanoria, Wayne T A; Parker, John V; Vinoya, Aleli C; Uchida, Aileen; Acharya, Nisha R
2013-10-01
To ascertain the incidence of scleritis and episcleritis in a Hawaiian population and describe variations by age, sex, and race. Retrospective, population-based cohort study. All electronic medical records for enrollees in Kaiser Permanente Hawaii (n = 217,061) from January 1, 2006 to December 31, 2007 were searched for International Classification of Diseases, 9th Edition (ICD-9) codes associated with ocular inflammation. Chart review was conducted to verify a clinical diagnosis of scleritis or episcleritis. Confirmed cases were used to calculate incidence rates per 100,000 person-years. Ninety-five percent confidence intervals (CI) were calculated for each incidence rate, including age-, sex-, and race-specific rates, using bias-corrected Poisson regression. To assess for confounding, a multivariate analysis adjusting for age, sex, and race was also performed. Of 217,061 eligible patients, 17 incident scleritis cases and 93 incident episcleritis cases were confirmed. The overall incidence rates of scleritis and episcleritis were 4.1 (95% CI: 2.6-6.6) and 21.7 (95% CI: 17.7-26.5) cases per 100,000 person-years, respectively. Women were overrepresented among scleritis patients (P = .049). Pacific Islanders were the most underrepresented racial group among cases of scleritis and episcleritis (P = .006, P = .001). Blacks had the highest incidence of scleritis (P = .004). These results provide a population-based estimate of the incidence of scleritis and episcleritis in a diverse population and highlight differences in patients' demographic characteristics. Differences in incidence by sex and race raise questions about genetic and environmental influences on the development of these conditions. Copyright © 2013 Elsevier Inc. All rights reserved.
Plas, Matthijs; Hemmer, Patrick H J; Been, Lukas B; van Ginkel, Robert J; de Bock, Geertruida H; van Leeuwen, Barbara L
2018-02-01
The incidence of delirium in patients after cytoreduction surgery-hyperthermic intraperitoneal chemotherapy (CRS-HIPEC), and the baseline characteristics associated with it, were investigated. The study was conducted among a consecutive series of prospectively included patients who underwent CRS-HIPEC at the University Medical Center Groningen, Groningen, the Netherlands, between February 2006 and January 2015. A chart-based instrument for delirium during hospitalization was used to identify patients with symptoms of delirium who were not diagnosed by a psychiatrist during admission. Uni- and multivariate logistic regression analyses were performed. Data of 136 patients were included in the analysis. Median age was 60 years (range: 18-76) and 50 (37%) patients were male. During hospitalization, 38 (28%) patients were diagnosed with delirium. Factors that differed significantly between the patients with and without delirium by univariate analysis were included in multivariate analysis. Multivariate analysis showed that after adjustment for age and complications other than delirium, having three or more organs resected and the CRP serum levels were independent predictors for delirium (OR: 3.97; 95% CI 1.24-12.76; OR: 1.01; 95% CI 1-1.01, respectively). This report shows a 28% incidence of delirium after CRS-HIPEC and suggests a role for systemic inflammation in the development of postoperative delirium. © 2017 Wiley Periodicals, Inc.
Directory of Open Access Journals (Sweden)
Vladimir Revicky
2011-01-01
Background/Aims. The aim of the study was to establish the effect of obesity on the incidence of bladder injury or urinary retention following the tension-free vaginal tape (TVT) procedure. Methods. This was a retrospective cohort study based at the Norfolk and Norwich University Hospital in the UK. The study population included 342 cases of TVT procedures. The incidence of bladder injury was 4.7% (16/342). The rate of urinary retention was 9% (31/342). Body mass index (BMI), age, type of analgesia, concomitant prolapse repair, and previous surgery were the factors studied. Univariate analysis was performed to establish a relationship between BMI and complications, followed by a multivariable regression analysis to adjust for age, concomitant surgery, type of analgesia, and previous surgery. Results. Neither univariate analysis nor multivariate regression analysis revealed any statistically significant influence of obesity on the incidence of bladder injury or urinary retention. Unadjusted odds ratios and adjusted odds ratios for bladder injury and urinary retention by BMI groups were OR 1.7296 CI 0.4818–6.2097; OR 1.3745 CI 0.5718–3.3043 and adj. OR 2.885 CI 0.603–13.8; adj. OR 1.299 CI 0.502–3.365. Conclusion. Obesity does not appear to influence the rate of bladder injury or urinary retention following the TVT procedure.
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations and facilitates search for minimum order of linear-regression model fitting set of data satisfactorily.
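One way to search for the minimum model order without refitting from scratch is to orthogonalize each new regressor against the ones already included and update the residual in place. The sketch below uses that Gram-Schmidt style update; it is an assumption made for illustration and is not claimed to be the recursion given in the brief. The function names and the test data are hypothetical.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def order_recursive_fit(columns, y, tol):
    """Add regressors one at a time, reusing earlier work; stop when RSS <= tol.

    Returns (order, rss_history), where order counts the regressors tried.
    """
    basis = []          # orthogonalized regressors accepted so far
    resid = list(y)     # current residual vector
    history = []
    for col in columns:
        q = list(col)
        for b in basis:                       # remove components along earlier regressors
            c = dot(q, b) / dot(b, b)
            q = [qi - c * bi for qi, bi in zip(q, b)]
        if dot(q, q) > 1e-12:                 # skip regressors that add nothing new
            c = dot(resid, q) / dot(q, q)
            resid = [ri - c * qi for ri, qi in zip(resid, q)]
            basis.append(q)
        history.append(dot(resid, resid))     # residual sum of squares at this order
        if history[-1] <= tol:
            break
    return len(history), history

# Made-up data lying exactly on a quadratic: y = 1 + 2x + 0.5x^2
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 0.5 * x * x for x in xs]
cols = [[x ** p for x in xs] for p in range(4)]   # 1, x, x^2, x^3
order, rss = order_recursive_fit(cols, ys, tol=1e-9)
```

Each step costs only the projections onto the existing basis, so raising the order does not repeat the work done for lower orders.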
Voxelwise multivariate analysis of multimodality magnetic resonance imaging.
Naylor, Melissa G; Cardenas, Valerie A; Tosun, Duygu; Schuff, Norbert; Weiner, Michael; Schwartzman, Armin
2014-03-01
Most brain magnetic resonance imaging (MRI) studies concentrate on a single MRI contrast or modality, frequently structural MRI. By performing an integrated analysis of several modalities, such as structural, perfusion-weighted, and diffusion-weighted MRI, new insights may be attained to better understand the underlying processes of brain diseases. We compare two voxelwise approaches: (1) fitting multiple univariate models, one for each outcome and then adjusting for multiple comparisons among the outcomes and (2) fitting a multivariate model. In both cases, adjustment for multiple comparisons is performed over all voxels jointly to account for the search over the brain. The multivariate model is able to account for the multiple comparisons over outcomes without assuming independence because the covariance structure between modalities is estimated. Simulations show that the multivariate approach is more powerful when the outcomes are correlated and, even when the outcomes are independent, the multivariate approach is just as powerful or more powerful when at least two outcomes are dependent on predictors in the model. However, multiple univariate regressions with Bonferroni correction remain a desirable alternative in some circumstances. To illustrate the power of each approach, we analyze a case control study of Alzheimer's disease, in which data from three MRI modalities are available. Copyright © 2013 Wiley Periodicals, Inc.
Multivariate refined composite multiscale entropy analysis
Energy Technology Data Exchange (ETDEWEB)
Humeau-Heurtier, Anne, E-mail: anne.humeau@univ-angers.fr
2016-04-01
Multiscale entropy (MSE) has become a prevailing method to quantify signal complexity. MSE relies on sample entropy. However, MSE may yield imprecise complexity estimation at large scales, because sample entropy does not give precise estimation of entropy when short signals are processed. A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. Nevertheless, RCMSE is for univariate signals only. The simultaneous analysis of multi-channel (multivariate) data often outperforms studies based on univariate signals. We therefore introduce an extension of RCMSE to multivariate data. Applications of multivariate RCMSE to simulated processes reveal its better performance compared with the standard multivariate MSE. - Highlights: • Multiscale entropy quantifies data complexity but may be inaccurate at large scale. • A refined composite multiscale entropy (RCMSE) has therefore recently been proposed. • Nevertheless, RCMSE is adapted to univariate time series only. • We herein introduce an extension of RCMSE to multivariate data. • It shows better performance than the standard multivariate multiscale entropy.
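For orientation, the univariate RCMSE that this abstract builds on can be sketched as below: the signal is coarse-grained at every offset for a given scale, template matches of length m and m+1 are pooled across offsets, and the entropy is the negative log of their ratio. This is a simplified illustration under stated assumptions (absolute tolerance r supplied by the caller, matches counted with a less-than-or-equal rule); the multivariate extension introduced in the paper is not reproduced.

```python
import math
import random

def coarse_grain(x, scale, offset=0):
    """Average consecutive non-overlapping windows of length `scale`, starting at `offset`."""
    n = (len(x) - offset) // scale
    return [sum(x[offset + i * scale: offset + (i + 1) * scale]) / scale for i in range(n)]

def match_counts(x, m, r):
    """Count template pairs (i < j) matching at length m and at length m + 1."""
    n = len(x) - m              # same number of templates for both lengths
    b = a = 0
    for i in range(n):
        for j in range(i + 1, n):
            if max(abs(x[i + k] - x[j + k]) for k in range(m)) <= r:
                b += 1
                if abs(x[i + m] - x[j + m]) <= r:
                    a += 1
    return b, a

def rcmse(x, scale, m=2, r=0.15):
    """Refined composite multiscale entropy at one scale (univariate sketch)."""
    total_b = total_a = 0
    for k in range(scale):      # pool match counts over all coarse-graining offsets
        b, a = match_counts(coarse_grain(x, scale, k), m, r)
        total_b += b
        total_a += a
    if total_a == 0 or total_b == 0:
        return float("nan")     # undefined when no matches are found
    return -math.log(total_a / total_b)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]
periodic = [0.0, 1.0] * 50
```

Pooling counts before taking the logarithm is what reduces the chance of an undefined value at large scales, where each coarse-grained series is short.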
Dimensionality Reduction via Regression in Hyperspectral Imagery
Laparra, Valero; Malo, Jesus; Camps-Valls, Gustau
2015-09-01
This paper introduces a new unsupervised method for dimensionality reduction via regression (DRR). The algorithm belongs to the family of invertible transforms that generalize Principal Component Analysis (PCA) by using curvilinear instead of linear features. DRR identifies the nonlinear features through multivariate regression to ensure the reduction in redundancy between the PCA coefficients, the reduction of the variance of the scores, and the reduction in the reconstruction error. More importantly, unlike other nonlinear dimensionality reduction methods, the invertibility, volume-preservation, and straightforward out-of-sample extension make DRR interpretable and easy to apply. The properties of DRR enable learning a broader class of data manifolds than the recently proposed Non-linear Principal Components Analysis (NLPCA) and Principal Polynomial Analysis (PPA). We illustrate the performance of the representation in reducing the dimensionality of remote sensing data. In particular, we tackle two common problems: processing very high dimensional spectral information such as in hyperspectral image sounding data, and dealing with spatial-spectral image patches of multispectral images. Both settings pose collinearity and ill-determination problems. The expressive power of the features is assessed in terms of truncation error, estimating atmospheric variables, and surface land cover classification error. Results show that DRR outperforms linear PCA and recently proposed invertible extensions based on neural networks (NLPCA) and univariate regressions (PPA).
On logistic regression analysis of dichotomized responses.
Lu, Kaifeng
2017-01-01
We study the properties of treatment effect estimate in terms of odds ratio at the study end point from logistic regression model adjusting for the baseline value when the underlying continuous repeated measurements follow a multivariate normal distribution. Compared with the analysis that does not adjust for the baseline value, the adjusted analysis produces a larger treatment effect as well as a larger standard error. However, the increase in standard error is more than offset by the increase in treatment effect so that the adjusted analysis is more powerful than the unadjusted analysis for detecting the treatment effect. On the other hand, the true adjusted odds ratio implied by the normal distribution of the underlying continuous variable is a function of the baseline value and hence is unlikely to be able to be adequately represented by a single value of adjusted odds ratio from the logistic regression model. In contrast, the risk difference function derived from the logistic regression model provides a reasonable approximation to the true risk difference function implied by the normal distribution of the underlying continuous variable over the range of the baseline distribution. We show that different metrics of treatment effect have similar statistical power when evaluated at the baseline mean. Copyright © 2016 John Wiley & Sons, Ltd.
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM) for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted) regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
A Bayesian Approach for Identifying Multivariate Differences Between Groups.
Sverchkov, Yuriy; Cooper, Gregory F
2015-10-01
We present a novel approach to the problem of detecting multivariate statistical differences across groups of data. The need to compare data in a multivariate manner arises naturally in observational studies, randomized trials, comparative effectiveness research, abnormality and anomaly detection scenarios, and other application areas. In such comparisons, it is of interest to identify statistical differences across the groups being compared. The approach we present in this paper addresses this issue by constructing statistical models that describe the groups being compared and using a decomposable Bayesian Dirichlet score of the models to identify variables that behave statistically differently between the groups. In our evaluation, the new method performed significantly better than logistic lasso regression in identifying differences in a variety of datasets under a variety of conditions.
Directory of Open Access Journals (Sweden)
Takeshi Nishijima
The epidemiology of incident syphilis infection among HIV-1-infected men who have sex with men (MSM) largely remains unknown. The incidence and risk factors for incident syphilis (positive TPHA and RPR ≥ 1:8) among HIV-1-infected MSM who visited a large HIV clinic in Tokyo for the first time between 2008 and 2013 were determined, using clinical data and stored blood samples taken every three months for screening and determination of the date of incident syphilis. Poisson regression compared the incidence of syphilis at different observation periods. Of 885 HIV-1-infected MSM with baseline data, 34% either presented with active syphilis at baseline (21%) or became infected with syphilis during follow-up (13%). After excluding 214 patients (MSM with syphilis at baseline (n = 190) and no follow-up syphilis test (n = 24)), of 671 men, 112 (17%) developed incident syphilis with an incidence of 43.7/1,000 person-years [95% CI, 36.5-52.3]. The incidence decreased slightly during observation period although the trend was not significant (2008-2009: 48.2/1,000 person-years, 2010-2011: 51.1/1,000 person-years, 2012-2013: 42.6/1,000 person-years, 2014 to 2015: 37.9/1,000 person-years, p = 0.315). Multivariable analysis identified young age (40, HR 4.0, 95%CI 2.22-7.18, p<0.001), history of syphilis at baseline (HR 3.0, 95%CI 2.03-4.47, p<0.001), positive anti-amoeba antibody (HR 1.8, 95%CI 1.17-2.68, p = 0.006), and high baseline CD4 count (CD4 ≥350 /μL versus CD4 <200, HR 1.6, 95%CI 1.00-2.53, p = 0.050) as risk factors for incident syphilis. Incidence of syphilis was particularly high among young patients (age <33 years: 60.1/1,000 person-years). Interestingly, 37% of patients with incident syphilis were asymptomatic. Although incidence of syphilis did not increase during the observation period, it was high among HIV-1-infected MSM, especially among young HIV-1-infected MSM and those with history of syphilis, in Tokyo. Regular screening for syphilis needs to be
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
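The five slope estimators named in this abstract can be sketched from the sample moments Sxx, Syy and Sxy as below. The formulas are the standard closed forms for these lines as best recalled; the uncertainty estimates discussed in the paper are omitted, and the function name and data are hypothetical.

```python
import math

def five_slopes(x, y):
    """Slopes of OLS(Y|X), OLS(X|Y), OLS bisector, orthogonal and reduced major-axis lines."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sign = 1.0 if sxy >= 0 else -1.0
    b_yx = sxy / sxx                                   # OLS regression of Y on X
    b_xy = syy / sxy                                   # OLS regression of X on Y, as dy/dx
    b_bis = (b_yx * b_xy - 1.0
             + math.sqrt((1.0 + b_yx ** 2) * (1.0 + b_xy ** 2))) / (b_yx + b_xy)
    b_orth = ((syy - sxx) + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    b_rma = sign * math.sqrt(syy / sxx)                # reduced major axis
    slopes = {"Y|X": b_yx, "X|Y": b_xy, "bisector": b_bis,
              "orthogonal": b_orth, "rma": b_rma}
    # Every line passes through the centroid, so intercept = my - slope * mx
    return {name: (b, my - b * mx) for name, b in slopes.items()}

# Made-up bivariate data with positive correlation
xd = [1.0, 2.0, 3.0, 4.0, 5.0]
yd = [1.2, 1.9, 3.2, 3.8, 5.1]
fits = five_slopes(xd, yd)
```

The bisector slope equals the tangent of the mean of the two OLS line angles, which is why it treats X and Y symmetrically.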
Kowalkowski, Marc A.; Kramer, Jennifer R.; Richardson, Peter R.; Suteria, Insia; Chiao, Elizabeth Y.
2015-01-01
Background. Kaposi sarcoma (KS) incidence has decreased since combination antiretroviral therapy (cART). However, effects of cART type and duration on KS remain difficult to interpret secondary to KS-associated immune reconstitution inflammatory syndrome (IRIS). Methods. We performed a retrospective study of Veterans Affairs Human Immunodeficiency Virus Clinical Case Registry data from 1985 to 2010. We analyzed the relationship between cART regimens and KS using multivariable Poisson regression, stratified or adjusted for timing around cART initiation. KS was identified by ≥1 inpatient or ≥2 outpatient International Classification of Diseases, Ninth Revision codes (176.0–9). Percent of cART on specific regimen and total duration on specific regimen were examined. Results. There were 341 KS cases among 25 529 HIV-infected male veterans (incidence rate = 2.02/1000 person-years). Stratified by years after starting cART, every additional 10% time on boosted protease inhibitors (BPIs) was associated with reduced KS incidence in the third year of cART (incidence rate ratio [IRR] = 0.79; 95% confidence interval [CI], .69–.90). Months on BPIs was associated with lower KS incidence (P = .02). KS incidence was lower at 12–23 (IRR = 0.47; 95% CI, .23–.95) and ≥36 (IRR = 0.14; 95% CI, .02–1.00) months on BPIs compared with <6 months. Longer duration on other regimens was not associated with decreased KS incidence. Conclusions. Lower KS incidence was observed with longer BPI use, after accounting for potential IRIS and other factors. Future research should evaluate newer cART regimens and long-term benefits of PI-based cART on KS in other cohorts and prospective studies. PMID:25586682
Kotanchek, Mark E.; Vladislavleva, Ekaterina Y.; Smits, Guido F.
In this chapter we illustrate a framework based on symbolic regression to generate and sharpen the questions about the nature of the underlying system and provide additional context and understanding based on multi-variate numeric data.
Incidence of Stingers in Young Rugby Players.
Kawasaki, Takayuki; Ota, Chihiro; Yoneda, Takeshi; Maki, Nobukazu; Urayama, Shingo; Nagao, Masashi; Nagayama, Masataka; Kaketa, Takefumi; Takazawa, Yuji; Kaneko, Kazuo
2015-11-01
A stinger is a type of neurapraxia of the cervical roots or brachial plexus and represents a reversible peripheral nerve injury. The incidence of and major risk factors for stingers among young rugby players remain uninvestigated. To investigate the incidence, symptoms, and intrinsic risk factors for stingers in elite rugby union teams of young players. Descriptive epidemiology study. A total of 569 male rugby players, including 358 players from 7 high school teams and 211 players from 2 university teams, were investigated using self-administered preseason and postseason questionnaires. The prevalence of a history of stingers was 33.9% (95% CI, 30.3-37.9), and 20.9% (119/569) of players experienced at least 1 episode of a stinger during the season (34.2 [95% CI, 26.2-42.1] events per 1000 player-hours of match exposure). The reinjury rate for stingers per season was 37.3% (95% CI, 30.4-44.2). Using the multivariate Poisson regression method, a history of stingers in the previous season and the grade and position of the player were found to be risk factors for stingers during the current season. The mean severity of injury was 2.9 days, with 79.3% (191/241) of the players not losing any time from playing after sustaining a stinger injury and 5.8% (14/241) of the players needing more than 14 days to recover. The most frequent symptom was numbness in the unilateral upper extremity, and the most severe symptom was weakness of grasping (mean severity, 6 days). A logistic regression analysis indicated that a history of stingers in the previous season and an injury with more than 3 symptoms, especially motor weakness, were correlated with the severity of injury. Young rugby players with a history of stingers have a significantly high rate of repeat injuries. Although nearly 80% of the players experienced only minimal (0-1 day) time loss injuries, neurological deficits sometimes last beyond 1 month. A history of stingers was identified to be the strongest risk factor for
Directory of Open Access Journals (Sweden)
Robert Y. L. Zee
2018-01-01
Recent studies have demonstrated the importance of endoplasmic reticulum aminopeptidase (ERAP) in blood pressure (BP) homeostasis. To date, no large prospective, genetic–epidemiological data are available on genetic variation within ERAP and hypertension risk. The association of 45 genetic variants of ERAP1 and ERAP2 was investigated in 17,255 Caucasian female participants from the Women’s Genome Health Study. All subjects were free of hypertension at baseline. During an 18-year follow-up period, 10,216 incident hypertensive cases were identified. Multivariable linear, logistic, and Cox regression analyses were performed to assess the relationship of genotypes with baseline BP levels, BP progression at 48 months, and incident hypertension assuming an additive genetic model. Linear regression analyses showed associations of four tSNPs (ERAP1: rs27524; ERAP2: rs3733904, rs4869315, and rs2549782; all p<0.05) with baseline systolic BP levels. Three tSNPs (ERAP1: rs27851, rs27429, and rs34736; all p<0.05) were associated with baseline diastolic BP levels. Multivariable logistic regression analysis showed that ERAP1 rs27772 was associated with BP progression at 48 months (p=0.0366). Multivariable Cox regression analysis showed an association of three tSNPs (ERAP1: rs469783 and rs10050860; ERAP2: rs2927615; all p<0.05) with risk of incident hypertension. Analyses of dbGaP for genotype–phenotype association and GTEx Portal for gene expression quantitative trait loci revealed five tSNPs with differential association of BP and nine tSNPs with lower ERAP1 and ERAP2 mRNA expression levels, respectively. The present study suggests that ERAP1 and ERAP2 gene variation may be useful for risk assessment of BP progression and the development of hypertension.
Condon, John R; Zhang, Xiaohua; Dempsey, Karen; Garling, Lindy; Guthridge, Steven
2016-11-21
To assess trends in cancer incidence and survival for Indigenous and non-Indigenous Australians in the Northern Territory. Retrospective analysis of population-based cancer registration data. New cancer diagnoses in the NT, 1991-2012. Age-adjusted incidence rates; rate ratios comparing incidence in NT Indigenous and non-Indigenous populations with that for other Australians; 5-year survival; multivariable Poisson regression of excess mortality. The incidence of most cancers in the NT non-Indigenous population was similar to that for other Australians. For the NT Indigenous population, the incidence of cancer at several sites was much higher (v other Australians: lung, 84% higher; head and neck, 325% higher; liver, 366% higher; cervix, 120% higher). With the exception of cervical cancer (65% decrease), incidence rates in the Indigenous population did not fall between 1991-1996 and 2007-2012. The incidence of several other cancers (breast, bowel, prostate, melanoma) was much lower in 1991-1996 than for other Australians, but had increased markedly by 2007-2012 (breast, 274% increase; bowel, 120% increase; prostate, 116% increase). Five-year survival was lower for NT Indigenous than for NT non-Indigenous patients, but had increased for both populations between 1991-2000 and 2001-2010. The incidence of several cancers that were formerly less common in NT Indigenous people has increased, without a concomitant reduction in the incidence of higher incidence cancers (several of which are smoking-related). The excess burden of cancer in this population will persist until lifestyle risks are mitigated, particularly by reducing the extraordinarily high prevalence of smoking.
Multivariate Marshall and Olkin Exponential Minification Process ...
African Journals Online (AJOL)
A stationary bivariate minification process with bivariate Marshall-Olkin exponential distribution that was earlier studied by Miroslav et al [15] is in this paper extended to a multivariate minification process with the multivariate Marshall and Olkin exponential distribution as its stationary marginal distribution. The innovation and the ...
Multivariate multiscale entropy of financial markets
Lu, Yunfan; Wang, Jun
2017-11-01
In the current practice of quantifying the dynamical properties of complex phenomena in financial market systems, multivariate financial time series have attracted wide attention. In this work, considering the shortcomings and limitations of univariate multiscale entropy in analyzing multivariate time series, the multivariate multiscale sample entropy (MMSE), which can evaluate the complexity in multiple data channels over different timescales, is applied to quantify the complexity of financial markets. Its effectiveness and advantages are verified by numerical simulations with two well-known synthetic noise signals. For the first time, the complexity of four generated trivariate return series for each stock trading hour in China stock markets is quantified thanks to the interdisciplinary application of this method. We find that the complexity of the trivariate return series in each hour shows a significant decreasing trend as the stock trading time progresses. Further, the shuffled multivariate return series and the absolute multivariate return series are also analyzed. As another new attempt, the complexity of global stock markets (Asia, Europe and America) is quantified by analyzing their multivariate returns. Finally, we utilize the multivariate multiscale entropy to assess the relative complexity of normalized multivariate return volatility series with different degrees.
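Multiscale entropy methods evaluate complexity "over different timescales" by first coarse-graining the series at each scale. The sketch below illustrates only that first step for a multichannel series, assuming the usual non-overlapping window average; the function name `coarse_grain` is hypothetical, and the multivariate sample entropy that would then be computed on each coarse-grained series is not implemented here.

```python
def coarse_grain(series, scale):
    """Average non-overlapping windows of length `scale` in every channel.

    `series` is a list of channels, each a list of numbers of equal length.
    Trailing samples that do not fill a complete window are dropped.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    coarse = []
    for channel in series:
        n_windows = len(channel) // scale
        coarse.append([
            sum(channel[j * scale:(j + 1) * scale]) / scale
            for j in range(n_windows)
        ])
    return coarse


# Two channels of length 5 at scale 2: the fifth sample is dropped.
result = coarse_grain([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]], 2)
```

At scale 1 the series is returned unchanged, so the entropy at the first scale is that of the original data.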
Handbook of univariate and multivariate data analysis with IBM SPSS
Ho, Robert
2013-01-01
Using the same accessible, hands-on approach as its best-selling predecessor, the Handbook of Univariate and Multivariate Data Analysis with IBM SPSS, Second Edition explains how to apply statistical tests to experimental findings, identify the assumptions underlying the tests, and interpret the findings. This second edition now covers more topics and has been updated with the SPSS statistical package for Windows. New to the Second Edition: three new chapters on multiple discriminant analysis, logistic regression, and canonical correlation; new section on how to deal with missing data; coverage of te
Application of Multivariate Analysis Tools to Industrial Scale Fermentation Data
DEFF Research Database (Denmark)
Mears, Lisa; Nørregård, Rasmus; Stocks, Stuart M.
The analysis of batch process data can provide insight into the process operation, and there is a vast amount of historical data available for data mining. Empirical modelling utilising this data is desirable where there is a lack of understanding regarding the underlying process (Formenti et al....... application of multivariate methods to industrial scale process data to cover these considerations....... prediction error of 7.6%. The success of the final regression model was heavily dependent on the decisions made in the pre-processing stages, where the issues of different batch lengths, different measurement intervals, and variable scaling are considered. Therefore a methodology is presented for future...
Fall, Tove; Hamlin, Helene Hansson; Hedhammar, Ake; Kämpe, Olle; Egenvall, Agneta
2007-01-01
Canine diabetes mellitus (DM) is a common endocrinopathy with an unclear etiology. For a better understanding of the underlying mechanisms, there is a need for comprehensive epidemiologic studies. Earlier studies have shown that the risk of disease is higher in certain dog breeds. Incidence, age of onset, survival and sex proportion of DM vary by breed. Data from a cohort of 182,087 insured dogs aged 5-12 years accounting for 652,898 dog-years at risk were studied retrospectively. Incidence rates by sex, breed, and geography were calculated with exact denominators. Age-specific incidence and survival after 1st DM claim were computed with Cox's regression and Kaplan-Meier survival function. Multivariable survival analysis was performed for the outcome diagnosis of DM with age, sex, and geography tested as fixed effects, previous endocrine or pancreatic diseases tested as time-dependent covariates, and breed tested as a random effect. The mean age at 1st insurance claim for the 860 DM dogs (72% females) was 8.6 years. The incidence of DM was 13 cases per 10,000 dog-years at risk. Australian Terriers, Samoyeds, Swedish Elkhounds, and Swedish Lapphunds were found to have the highest incidence. The proportion of females with DM varied significantly among breeds. Swedish Elkhounds, Beagles, Norwegian Elkhounds, and Border Collies that developed DM were almost exclusively females. The multivariable model showed that breed, previous hyperadrenocorticism, and female sex were risk factors for developing DM. Median survival time was 57 days after 1st claim. Excluding the 223 dogs that died within 1 day, the median survival time was 2 years after 1st claim of DM. The significant breed-specific sex and age differences shown in this study indicate that genetic variation could make breeds more or less susceptible to different types of DM.
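The overall incidence rate quoted in this abstract can be checked directly from the reported counts. The sketch below is a worked arithmetic example only; the helper name `incidence_rate` is illustrative, and it assumes that all 860 cases and all 652,898 dog-years belong to the same denominator, which the abstract implies but does not state explicitly.

```python
def incidence_rate(cases, time_at_risk, per=10_000):
    """Crude incidence rate: cases per `per` units of time at risk."""
    return cases / time_at_risk * per


# Counts taken from the abstract: 860 DM dogs, 652,898 dog-years at risk.
rate = incidence_rate(860, 652_898)
print(round(rate, 1))  # about 13.2 per 10,000 dog-years
```

The result agrees with the reported 13 cases per 10,000 dog-years at risk after rounding to a whole number.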
TMVA (Toolkit for Multivariate Analysis) new architectures design and implementation.
Zapata Mesa, Omar Andres
2016-01-01
Toolkit for Multivariate Analysis (TMVA) is a package in ROOT providing machine learning algorithms for classification and regression of events in detectors. In TMVA, we are developing new high-level algorithms to perform multivariate analysis, such as cross validation, hyperparameter optimization, and variable importance. Almost all of these algorithms are computationally expensive and designed to process huge amounts of data, so it is very important to implement them with parallel computing technologies to reduce processing times.
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
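Quantile regression minimizes the check (pinball) loss, which is piecewise linear in the residual; writing each residual as the difference of its positive and negative parts is what turns the fit into a linear program that a simplex method can solve. The sketch below shows only the loss and its simplest consequence, namely that the best constant fit is a sample quantile. It is not the time-adaptive algorithm described in the abstract, and the function names are illustrative.

```python
def pinball_loss(residual, tau):
    """Check loss: weight tau for positive residuals, (1 - tau) for negative."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(q, values, tau):
    """Sum of check losses when every observation is predicted by q."""
    return sum(pinball_loss(v - q, tau) for v in values)


def best_constant(values, tau):
    """Constant minimizing the total check loss.

    A minimizer is always attained at one of the observations, so it is
    enough to search over the data points themselves.
    """
    return min(sorted(values), key=lambda q: total_loss(q, values, tau))


data = [1, 2, 3, 4, 100]
median_fit = best_constant(data, 0.5)  # the sample median, unaffected by the outlier
upper_fit = best_constant(data, 0.9)   # a high quantile, pulled to the largest value
```

Replacing the constant `q` with a linear or spline predictor gives the regression version of the same objective.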
Bootstrap Enhanced Penalized Regression for Variable Selection with Neuroimaging Data.
Abram, Samantha V; Helwig, Nathaniel E; Moodie, Craig A; DeYoung, Colin G; MacDonald, Angus W; Waller, Niels G
2016-01-01
Recent advances in fMRI research highlight the use of multivariate methods for examining whole-brain connectivity. Complementary data-driven methods are needed for determining the subset of predictors related to individual differences. Although commonly used for this purpose, ordinary least squares (OLS) regression may not be ideal due to multi-collinearity and over-fitting issues. Penalized regression is a promising and underutilized alternative to OLS regression. In this paper, we propose a nonparametric bootstrap quantile (QNT) approach for variable selection with neuroimaging data. We use real and simulated data, as well as annotated R code, to demonstrate the benefits of our proposed method. Our results illustrate the practical potential of our proposed bootstrap QNT approach. Our real data example demonstrates how our method can be used to relate individual differences in neural network connectivity with an externalizing personality measure. Also, our simulation results reveal that the QNT method is effective under a variety of data conditions. Penalized regression yields more stable estimates and sparser models than OLS regression in situations with large numbers of highly correlated neural predictors. Our results demonstrate that penalized regression is a promising method for examining associations between neural predictors and clinically relevant traits or behaviors. These findings have important implications for the growing field of functional connectivity research, where multivariate methods produce numerous, highly correlated brain networks.
Bias-corrected quantile regression estimation of censored regression models
Cizek, Pavel; Sadikoglu, Serhan
2018-01-01
In this paper, an extension of the indirect inference methodology to semiparametric estimation is explored in the context of censored regression. Motivated by weak small-sample performance of the censored regression quantile estimator proposed by Powell (J Econom 32:143–155, 1986a), two- and
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models. This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensive description of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Ghanbarzadeh, Mitra; Aminghafari, Mina
2015-05-01
This article studies the prediction of periodically correlated processes using the wavelet transform and multivariate methods, with applications to climatological data. Periodically correlated processes can be reformulated as multivariate stationary processes. Considering this fact, two new prediction methods are proposed. In the first method, we use stepwise regression between the principal components of the multivariate stationary process and past wavelet coefficients of the process to get a prediction. In the second method, we propose its multivariate version without principal component analysis a priori. Also, we study a generalization of the prediction methods dealing with a deterministic trend using exponential smoothing. Finally, we illustrate the performance of the proposed methods on simulated and real climatological data (ozone amounts, flows of a river, solar radiation, and sea levels) compared with the multivariate autoregressive model. The proposed methods give good results, as we expected.
Early and Late Recurrent Epistaxis Admissions: Patterns of Incidence and Risk Factors.
Cohen, Oded; Shoffel-Havakuk, Hagit; Warman, Meir; Tzelnick, Sharon; Haimovich, Yaara; Kohlberg, Gavriel D; Halperin, Doron; Lahav, Yonatan
2017-09-01
Objective Epistaxis is a common complaint, yet few studies have focused on the incidence and risk factors of recurrent epistaxis. Our objective was to determine the patterns of incidence and risk factors for recurrent epistaxis admission (REA). Study Design Case series with chart review. Settings Single academic center. Subjects and Methods The medical records of patients admitted for epistaxis between 1999 and 2015 were reviewed. The follow-up period was defined as 3 years following initial admission. REAs were categorized as early (30 days) and late (31 days to 3 years) following initial admission. Logistic regression was used to identify potential predictors of REAs. Results A total of 653 patients were included. Eighty-six patients (14%) had REAs: 48 (7.5%) early and 38 (6.5%) late. Nonlinear incidence curve was demonstrated for both early and late REAs. Based on logistic regression, prior nasal surgery and anemia were independent risk factors for early REAs. According to multivariate analysis, thrombocytopenia was significantly associated with late REAs. Conclusion Early and late REAs demonstrate different risk predictors. Knowledge of such risk factors may help in risk stratification for this selected group of patients. All patients at risk should be advised on possible preventive measures. Patients at risk for early REA may benefit from a more proactive approach.
Rice consumption and cancer incidence in US men and women.
Zhang, Ran; Zhang, Xuehong; Wu, Kana; Wu, Hongyu; Sun, Qi; Hu, Frank B; Han, Jiali; Willett, Walter C; Giovannucci, Edward L
2016-02-01
While both the 2012 and 2014 Consumer Reports concerned arsenic levels in US rice, no previous study has evaluated long-term consumption of total rice, white rice and brown rice in relation to risk of developing cancers. We investigated this in the female Nurses' Health Study (1984-2010), and Nurses' Health Study II (1989-2009), and the male Health Professionals Follow-up Study (1986-2008), which included a total of 45,231 men and 160,408 women, free of cancer at baseline. Validated food frequency questionnaires were used to measure rice consumption at baseline and repeated almost every 4 years thereafter. We employed Cox proportional hazards regression model to estimate multivariable relative risks (RRs) and 95% confidence intervals (95% CIs). During up to 26 years of follow-up, we documented 31,655 incident cancer cases (10,833 in men and 20,822 in women). Age-adjusted results were similar to multivariable-adjusted results. Compared to participants with less than one serving per week, the multivariable RRs of overall cancer for individuals who ate at least five servings per week were 0.97 for total rice (95% CI: 0.85-1.07), 0.87 for white rice (95% CI: 0.75-1.01), and 1.17 for brown rice (95% CI: 0.90-1.26). Similar non-significant associations were observed for specific sites of cancers including prostate, breast, colon and rectum, melanoma, bladder, kidney, and lung. Additionally, the null associations were observed among European Americans and non-smokers, and were not modified by BMI. Long-term consumption of total rice, white rice or brown rice was not associated with risk of developing cancer in US men and women. © 2015 UICC.
The Multivariate Order Statistics for Exponential and Weibull Distributions
Directory of Open Access Journals (Sweden)
Mariyam Hafeez
2014-09-01
In this paper we derive the distribution of multivariate order statistics for the multivariate exponential and multivariate Weibull distributions. The moment expression for multivariate order statistics is also derived.
Poulsen, Kjeld; Cleal, Bryan; Willaing, Ingrid
2014-12-01
To investigate the extent and socioeconomic distribution of incident diabetes among the Danish working-age population. The Danish National Diabetes Register was linked with socioeconomic and population-based registers covering the entire population. We analysed the 12-year diabetes incidence using multivariate Poisson regression for 2,086,682 people, adjusting for gender, 10-year age groups, main population groups defined by country of origin, and seven socioeconomic groups: professionals, managers, technicians, workers skilled at basic level, unskilled workers, unemployed and pensioners. The crude 12-year incidence of diabetes was 5.8%. The saturated multivariate model, adjusted for gender, age, country of origin and socioeconomic status, showed a relative risk (RR) for diabetes incidence of 1.44 for males (reference: females), 3.95 for the age range of 50-59 years (reference: 30-39 years), 2.07 for unskilled workers (reference: professionals) and 2.15 for people from countries of 'non-Western origin' (reference: Danish origin). Diabetes incidence increases with age, male gender and low socioeconomic status, and is also higher among people from countries of 'non-Western origin'. The results indicate that an ageing workforce will substantially increase the proportion of workers with diabetes, especially among already vulnerable groups. © 2014 the Nordic Societies of Public Health.
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Illuminance Flow Estimation by Regression
Karlsson, S.M.; Pont, S.C.; Koenderink, J.J.; Zisserman, A.
2010-01-01
We investigate the estimation of illuminance flow using Histograms of Oriented Gradient features (HOGs). In a regression setting, we found for both ridge regression and support vector machines, that the optimal solution shows close resemblance to the gradient based structure tensor (also known as
Multivariate meta-analysis: Potential and promise
Jackson, Dan; Riley, Richard; White, Ian R
2011-01-01
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
Food Intake Patterns Associated With Incident Type 2 Diabetes
Liese, Angela D.; Weis, Kristina E.; Schulz, Mandy; Tooze, Janet A.
2009-01-01
OBJECTIVE—Markers of hemostasis and inflammation such as plasminogen activator inhibitor-1 (PAI-1) and fibrinogen have been associated with risk of type 2 diabetes. We aimed to identify food intake patterns influencing this pathway and evaluate their association with incident diabetes. RESEARCH DESIGN AND METHODS—The Insulin Resistance Atherosclerosis Study cohort included 880 middle-aged adults initially free of diabetes. At the 5-year follow-up, 144 individuals had developed diabetes. Usual dietary intake was ascertained with a 114-item food frequency questionnaire. Using reduced rank regression, we identified a food pattern maximizing the explained variation in PAI-1 and fibrinogen. Subsequently, the food pattern–diabetes association was evaluated using logistic regression. RESULTS—High intake of the food groups red meat, low-fiber bread and cereal, dried beans, fried potatoes, tomato vegetables, eggs, cheese, and cottage cheese and low intake of wine characterized the pattern, which was positively associated with both biomarkers. With increasing pattern score, the odds of diabetes increased significantly (Ptrend < 0.01). After multivariate adjustment, the odds ratio comparing extreme quartiles was 4.3 (95% CI 1.7–10.8). Adjustment for insulin sensitivity and secretion and other metabolic factors had little impact (4.9, 1.8–13.7). CONCLUSIONS—Our findings provide support for potential behavioral prevention strategies, as we identified a food intake pattern that was strongly related to PAI-1 and fibrinogen and independently predicted type 2 diabetes. PMID:19033409
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Multivariable control in nuclear power stations
International Nuclear Information System (INIS)
Parent, M.; McMorran, P.D.
1982-11-01
Multivariable methods have the potential to improve the control of large systems such as nuclear power stations. Linear-quadratic optimal control is a multivariable method based on the minimization of a cost function. A related technique leads to the Kalman filter for estimation of plant state from noisy measurements. A design program for optimal control and Kalman filtering has been developed as part of a computer-aided design package for multivariable control systems. The method is demonstrated on a model of a nuclear steam generator, and simulated results are presented
Exploratory multivariate analysis by example using R
Husson, Francois; Pages, Jerome
2010-01-01
Full of real-world case studies and practical advice, Exploratory Multivariate Analysis by Example Using R focuses on four fundamental methods of multivariate exploratory data analysis that are most suitable for applications. It covers principal component analysis (PCA) when variables are quantitative, correspondence analysis (CA) and multiple correspondence analysis (MCA) when variables are categorical, and hierarchical cluster analysis. The authors take a geometric point of view that provides a unified vision for exploring multivariate data tables. Within this framework, they present the prin
Multivariate statistical methods a first course
Marcoulides, George A
2014-01-01
Multivariate statistics refer to an assortment of statistical methods that have been developed to handle situations in which multiple variables or measures are involved. Any analysis of more than two variables or measures can loosely be considered a multivariate statistical analysis. An introductory text for students learning multivariate statistical methods for the first time, this book keeps mathematical details to a minimum while conveying the basic principles. One of the principal strategies used throughout the book--in addition to the presentation of actual data analyses--is poin
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated fundus examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Directory of Open Access Journals (Sweden)
Daniele Maria Pelissari
Although many studies have identified social conditions associated with tuberculosis, contextual and individual factors have rarely been analysed simultaneously. Consequently, we aimed to identify contextual and individual factors associated with tuberculosis incidence in the general population in Brazil in 2010. We also assessed whether household crowding mediates the association between socioeconomic determinants and tuberculosis incidence. Individual data of tuberculosis cases were obtained from 5,565 municipalities in Brazil in 2010 (last year of national census), and merged with contextual variables. The associations were evaluated in a multilevel analysis using negative binomial regression. After adjusting for individual factors (age, sex and race) and geographic region, the following contextual factors were associated with tuberculosis incidence rate: AIDS incidence rate [incidence rate ratio (IRR), 1.21; 95% confidence interval (CI), 1.18-1.24], unemployment rate (IRR, 1.16; 95% CI, 1.13-1.19), Gini coefficient (IRR, 1.05; 95% CI, 1.02-1.08), proportion of inmates (IRR, 1.11; 95% CI, 1.09-1.14), mean per capita household income (IRR, 0.94; 95% CI, 0.91-0.97) and primary care coverage (IRR, 0.94; 95% CI, 0.92-0.96). Inclusion of household crowding in the multivariate model led to a loss of the associations of both Gini coefficient and mean per capita household income. In conclusion, our findings suggest that income inequality and poverty, as determinants of tuberculosis incidence, can be mediated by household crowding. Moreover, the prison population can represent a potential social reservoir of tuberculosis in Brazil and should be addressed as a priority for disease control. Finally, the negative association between primary health coverage and tuberculosis incidence highlights the importance of this level of care as a strategy to control this disease.
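As a reading aid for the incidence rate ratios quoted above: in a log-link count model such as negative binomial regression, an IRR is the exponential of a regression coefficient, and its confidence interval is usually symmetric on the log scale. The sketch below is illustrative only and is not the authors' code; it assumes a Wald-type 95% interval (z = 1.96), and the function name `log_scale_summary` is made up for this example.

```python
import math

def log_scale_summary(irr, ci_low, ci_high, z=1.96):
    """Recover the log-scale coefficient and an approximate standard error
    from a reported rate ratio and its confidence interval.
    Assumes the interval is a Wald interval built on the log scale."""
    beta = math.log(irr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se

# AIDS incidence rate estimate quoted in the abstract: IRR 1.21 (95% CI 1.18-1.24)
beta, se = log_scale_summary(1.21, 1.18, 1.24)

# Rebuild the interval from beta and se as a consistency check
rebuilt = (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))
print(round(beta, 3), round(se, 4), tuple(round(v, 2) for v in rebuilt))
```

The same back-calculation applies to the other IRRs in the abstract; values below 1 (for example primary care coverage, 0.94) correspond to negative coefficients.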
Influence Of Demographic Factors And History Of Malaria With The Incidence Malaria In MORU PHC
Directory of Open Access Journals (Sweden)
Sudirman Manumpa
2017-01-01
Malaria morbidity at the Moru health center, measured by the Annual Parasite Incidence (API), was 16.9% in 2014. This figure was still high compared with the target of eliminating malaria in Indonesia (<1% by 2030). Malaria is more common in children aged 5 months to <12 years. These high rates of malaria lead to poverty and low learning achievement in children and, in pregnant women, to low birth weight in babies and death. The purpose of this study was to analyze the factors that influence the incidence of tertian malaria, Tropikana malaria, or combined Tropikana and tertian (mix) malaria at Moru PHC in Alor Southwestern sub-district, Alor Regency. This study used a cross-sectional design; the study population was all patients undergoing peripheral blood examination in the Moru PHC laboratory from June to October 2015. The sample comprised 173 respondents, selected by simple random sampling. Data were collected with a questionnaire and an observation sheet. The chi-square test showed that the factors influencing the incidence of malaria were socioeconomic status (sig 0.000) and education level (sig 0.001). In multivariate analysis with logistic regression, age of 5 months to <12 years (sig 0.025) and socioeconomic status (sig 0.000) influenced the incidence of malaria. The variables that affected the incidence of malaria were demographic factors such as age, education level, and socioeconomic status. It is advisable to harness swampland, thus improving the economic status of the community, and to build permanent houses. Keywords: incidence malaria, demographic factors, history of malaria
Extracting bb Higgs Decay Signals using Multivariate Techniques
Energy Technology Data Exchange (ETDEWEB)
Smith, W Clarke; /George Washington U. /SLAC
2012-08-28
For low-mass Higgs boson production at ATLAS at $\sqrt{s} = 7$ TeV, the hard subprocess $gg \to h^0 \to b\bar{b}$ dominates but is in turn drowned out by background. We seek to exploit the intrinsic few-MeV mass width of the Higgs boson to observe it above the background in $b\bar{b}$-dijet mass plots. The mass resolution of existing mass-reconstruction algorithms is insufficient for this purpose due to jet combinatorics, that is, the algorithms cannot identify every jet that results from $b\bar{b}$ Higgs decay. We combine these algorithms using the neural net (NN) and boosted regression tree (BDT) multivariate methods in an attempt to improve the mass resolution. Events involving $gg \to h^0 \to b\bar{b}$ are generated using Monte Carlo methods with Pythia and then the Toolkit for Multivariate Analysis (TMVA) is used to train and test NNs and BDTs. For a 120 GeV Standard Model Higgs boson, the $m_{h^0}$-reconstruction width is reduced from 8.6 to 6.5 GeV. Most importantly, however, the methods used here allow for more advanced $m_{h^0}$-reconstructions to be created in the future using multivariate methods.
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Background: Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile) scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods: We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC). Results: In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion: Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
Multivariate survival analysis and competing risks
Crowder, Martin J
2012-01-01
Multivariate Survival Analysis and Competing Risks introduces univariate survival analysis and extends it to the multivariate case. It covers competing risks and counting processes and provides many real-world examples, exercises, and R code. The text discusses survival data, survival distributions, frailty models, parametric methods, multivariate data and distributions, copulas, continuous failure, parametric likelihood inference, and non- and semi-parametric methods. There are many books covering survival analysis, but very few that cover the multivariate case in any depth. Written for a graduate-level audience in statistics/biostatistics, this book includes practical exercises and R code for the examples. The author is renowned for his clear writing style, and this book continues that trend. It is an excellent reference for graduate students and researchers looking for grounding in this burgeoning field of research.
An Introduction to Applied Multivariate Analysis
Raykov, Tenko
2008-01-01
Focuses on the core multivariate statistics topics that are of fundamental relevance for understanding the field. This book emphasizes the topics that are critical to those in the behavioral, social, and educational sciences.
Ellipsoidal prediction regions for multivariate uncertainty characterization
DEFF Research Database (Denmark)
Golestaneh, Faranak; Pinson, Pierre; Azizipanah-Abarghooee, Rasoul
2018-01-01
, for classes of decision-making problems based on robust, interval chance-constrained optimization, necessary inputs take the form of multivariate prediction regions rather than scenarios. The current literature is at a very primitive stage of characterizing multivariate prediction regions to be employed...... in these classes of optimization problems. To address this issue, we introduce a new class of multivariate forecasts which take the form of multivariate ellipsoids for non-Gaussian variables. We propose a data-driven systematic framework to readily generate and evaluate ellipsoidal prediction regions, with predefined...... probability guarantees and minimum conservativeness. A skill score is proposed for quantitative assessment of the quality of prediction ellipsoids. A set of experiments is used to illustrate the discrimination ability of the proposed scoring rule for potential misspecification of ellipsoidal prediction regions...
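To make the notion of an ellipsoidal prediction region concrete, the following minimal two-dimensional sketch tests whether a point falls inside an ellipse defined by a center, a shape matrix, and a nominal coverage level. It is not the authors' data-driven framework: the radius used here is the Gaussian (chi-square, 2 degrees of freedom) value, adopted purely as a stand-in assumption, and all names and numbers are hypothetical.

```python
import math

def mahalanobis_sq(point, center, shape):
    """Squared Mahalanobis distance of a 2-D point for a 2x2 shape matrix."""
    (a, b), (c, d) = shape
    det = a * d - b * c
    dx, dy = point[0] - center[0], point[1] - center[1]
    # Quadratic form with the explicit inverse (1/det) * [[d, -b], [-c, a]]
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

def in_ellipse(point, center, shape, coverage=0.9):
    """True if the point lies inside the ellipse with the given nominal coverage.
    Gaussian stand-in: in 2 dimensions the chi-square quantile is -2*ln(1 - coverage)."""
    radius_sq = -2.0 * math.log(1.0 - coverage)
    return mahalanobis_sq(point, center, shape) <= radius_sq

# Hypothetical forecast center and shape matrix (illustrative values only)
center = (0.0, 0.0)
shape = ((4.0, 1.0), (1.0, 2.0))
print(in_ellipse((1.0, 1.0), center, shape))   # point near the center
print(in_ellipse((5.0, 0.0), center, shape))   # point far along the first axis
```

For non-Gaussian variables, as in the abstract, the radius would have to be calibrated from data rather than taken from the chi-square distribution.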
Regression analysis of censored data using pseudo-observations
DEFF Research Database (Denmark)
Parner, Erik T.; Andersen, Per Kragh
2010-01-01
We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been...
Multivariable Feedback Control of Nuclear Reactors
Directory of Open Access Journals (Sweden)
Rune Moen
1982-07-01
Multivariable feedback control has been adapted for optimal control of the spatial power distribution in nuclear reactor cores. Two design techniques, based on the theory of automatic control, were developed: the State Variable Feedback (SVF) is an application of the linear optimal control theory, and the Multivariable Frequency Response (MFR) is based on a generalization of the traditional frequency response approach to control system design.
Simulating multivariate time series using flocking
Schruben, Lee W.; Singham, Dashi I.
2010-01-01
Refereed Conference Paper Notions from agent based modeling (ABM) can be used to simulate multivariate time series. An example is given using the ABM concept of flocking, which models the behaviors of birds (called boids) in a flock. A multivariate time series is mapped into the coordinates of a bounded orthotope. This represents the flight path of a boid. Other boids are generated that flock around this data boid. The coordinates of these new boids are mapped back to simulate replicates o...
Multivariate Analysis of Industrial Scale Fermentation Data
DEFF Research Database (Denmark)
Mears, Lisa; Nørregård, Rasmus; Stocks, Stuart M.
2015-01-01
of multivariate modelling were carried out using different data pre-processing and scaling methods in order to extract the trends from the industrial data set, obtained from a production process operating in Novozymes A/S. This data set poses challenges for data analysis, combining both online and offline......, with an average prediction error of 7.6%. A methodology is proposed for applying multivariate analysis to industrial scale batch process data....
Application of multivariate splines to discrete mathematics
Xu, Zhiqiang
2005-01-01
Using methods developed in multivariate splines, we present an explicit formula for discrete truncated powers, which are defined as the number of non-negative integer solutions of linear Diophantine equations. We further use the formula to study some classical problems in discrete mathematics as follows. First, we extend the partition function of integers in number theory. Second, we exploit the relation between the relative volume of convex polytopes and multivariate truncated powers and giv...
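The counting problem described above can be illustrated for a single equation. The sketch below counts the non-negative integer solutions of a_1 x_1 + ... + a_n x_n = b by dynamic programming; it is a brute-force illustration, not the explicit spline-based formula of the paper, and the function name `count_solutions` is invented for this example.

```python
def count_solutions(coeffs, target):
    """Number of non-negative integer vectors x with
    sum(coeffs[i] * x[i]) == target. Coefficients must be positive integers."""
    ways = [1] + [0] * target          # ways[s]: solutions reaching the sum s
    for a in coeffs:                   # introduce one variable at a time
        for s in range(a, target + 1):
            ways[s] += ways[s - a]     # use one more unit of this variable
    return ways[target]

# x1 + 2*x2 + 3*x3 = 6: partitions of 6 into parts of size at most 3
print(count_solutions([1, 2, 3], 6))
# With coefficients 1..n and target n this is the integer partition function p(n)
print(count_solutions(range(1, 11), 10))
```

The second call shows the link to the partition function mentioned in the abstract: taking coefficients 1, ..., n and target n counts the partitions of n.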
Patel, Achint; Singh, Dinesh; Bhatt, Parth; Thakkar, Badal; Akingbola, Olugbenga A; Srivastav, Sudesh K
2016-09-01
There are limited data regarding the incidence, trends, and outcomes of cerebral edema among patients with diabetic ketoacidosis (DKA). The NIS database was used from 2002 to 2012. Cases with a primary diagnosis of DKA were identified using International Classification of Diseases, Ninth Revision-Clinical Modification (ICD-9 CM) code 250.1x. Cerebral edema patients were identified using ICD-9 CM code 348.5. We compared the baseline characteristics of both groups to estimate differences using the χ² test, Student's t test, Wilcoxon rank-sum test, and survey regression depending on the distributions of variables. For trend analysis, the χ² test of trend for proportions was applied using the Cochran-Armitage test via the "trend" command in Statistical Analysis Software (SAS). Multivariate odds ratios were calculated. P value for cerebral edema were identified among 52 049 (weighted n = 246 925) DKA patients, which estimates the incidence of cerebral edema at 0.39%. The incidence of cerebral edema increased almost 2 times, from 0.34 in 2002 to 0.64 in 2012 (P cerebral edema. Our study shows that over the study period, the incidence of cerebral edema among DKA patients has increased. Patients with cerebral edema were found to have longer LOS and higher cost of hospitalization. © The Author(s) 2015.
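The trend test named in the abstract can be sketched in a few lines. The code below computes the Cochran-Armitage statistic for a linear trend in proportions across ordered groups, using the binomial-variance form and equally spaced scores by default. It is a minimal illustration, not the SAS procedure used in the study, and the yearly counts are hypothetical, not NIS data.

```python
import math

def cochran_armitage_z(cases, totals, scores=None):
    """Cochran-Armitage statistic for a linear trend in proportions.
    cases[i] / totals[i] is the observed proportion in ordered group i;
    scores default to 0, 1, 2, ... (equally spaced groups, e.g. years)."""
    if scores is None:
        scores = list(range(len(cases)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total
    # Numerator: score-weighted deviation of observed from expected cases
    t_stat = sum(s * (c - n * p_bar) for s, c, n in zip(scores, cases, totals))
    # Variance of the numerator under the null hypothesis of no trend
    s1 = sum(n * s for s, n in zip(scores, totals))
    s2 = sum(n * s * s for s, n in zip(scores, totals))
    var = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    return t_stat / math.sqrt(var)

def two_sided_p(z):
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical counts for four consecutive periods (NOT the study data)
cases = [10, 15, 22, 30]
totals = [3000, 3000, 3000, 3000]
z = cochran_armitage_z(cases, totals)
print(f"z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

A positive z indicates that the proportion rises with the group score; some software applies a small-sample variance correction, so values may differ slightly from packaged implementations.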
Prostate cancer incidence and agriculture practices in Georgia, 2000–2010
Welton, Michael; Robb, Sara W.; Shen, Ye; Guillebeau, Paul; Vena, John
2015-01-01
Background: Georgia has prostate cancer incidence rates consistently above the national average. A notable portion of Georgia's economy is rooted in agricultural production, and agricultural practices have been associated with an increased risk of prostate cancer. Methods: Statistical analyses considered county age-adjusted prostate cancer incidence rates as the outcome of interest and three agricultural variables (farmland as percent of county land, dollars spent per county acre on agriculture chemicals, and dollars spent per county acre on commercial fertilizers) as exposures of interest. Multivariate linear regression models were analyzed for each separately. Data were obtained from National Cancer Institute Surveillance, Epidemiology and End Results (SEER) 2000–2010, United States Department of Agriculture (USDA) 1987 Agriculture Survey, and 2010 US Census. Results: In counties with equal to or greater than Georgia counties' median percent African-American population (27%), dollars per acre spent on agriculture chemicals was significantly associated (P = 0.04) and dollars spent on commercial fertilizers was moderately associated (P = 0.07) with elevated prostate cancer incidence rates. There was no association between percent of county farmland and prostate cancer rates. Conclusion: This study identified associations between prostate cancer incidence rates, agriculture chemical expenditure, and commercial fertilizer expenditure in Georgia counties with a population comprising more than 27% African Americans. PMID:25785490
Panagioti, Maria; Blakeman, Thomas; Hann, Mark; Bower, Peter
2017-05-30
Increasing evidence suggests that patient safety is a serious concern for older patients with long-term conditions. Despite this, there is a lack of research on safety incidents encountered by this patient group. In this study, we sought to examine patient reports of safety incidents and factors associated with reports of safety incidents in older patients with long-term conditions. The baseline cross-sectional data from a longitudinal cohort study were analysed. Older patients (n=3378 aged 65 years and over) with a long-term condition registered in general practices were included in the study. The main outcome was patient-reported safety incidents including availability and appropriateness of medical tests and prescription of wrong types or doses of medication. Binary univariate and multivariate logistic regression analyses were undertaken to examine factors associated with patient-reported safety incidents. Safety incidents were reported by 11% of the patients. Four factors were significantly associated with patient-reported safety incidents in multivariate analyses. The experience of multiple long-term conditions (OR=1.09, 95% CI 1.05 to 1.13), a probable diagnosis of depression (OR=1.36, 95% CI 1.06 to 1.74) and greater relational continuity of care (OR=1.28, 95% CI 1.08 to 1.52) were associated with increased odds for patient-reported safety incidents. Perceived greater support and involvement in self-management was associated with lower odds for patient-reported safety incidents (OR=0.95, 95% CI 0.93 to 0.97). We found that older patients with multimorbidity and depression are more likely to report experiences of patient safety incidents. Improving perceived support and involvement of patients in their care may help prevent patient-reported safety incidents. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2017. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Multivariate return periods of sea storms for coastal erosion risk assessment
Directory of Open Access Journals (Sweden)
S. Corbella
2012-08-01
The erosion of a beach depends on various storm characteristics. Ideally, the risk associated with a storm would be described by a single multivariate return period that is also representative of the erosion risk, i.e. a 100 yr multivariate storm return period would cause a 100 yr erosion return period. Unfortunately, a specific probability level may be associated with numerous combinations of storm characteristics. These combinations, despite having the same multivariate probability, may cause very different erosion outcomes. This paper explores this ambiguity problem in the context of copula-based multivariate return periods, using a case study at Durban on the east coast of South Africa. Simulations were used to correlate multivariate return periods of historical events to return periods of estimated storm-induced erosion volumes. In addition, the relationship of the most-likely design event (Salvadori et al., 2011) to coastal erosion was investigated. It was found that the multivariate return periods for wave height and duration had the highest correlation to erosion return periods. The most-likely design event was found to be an inadequate design method in its current form. We explore the inclusion of conditions based on the physical realizability of wave events and the use of multivariate linear regression to relate storm parameters to erosion computed from a process-based model. Establishing a link between storm statistics and erosion consequences can resolve the ambiguity between multivariate storm return periods and associated erosion return periods.
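The ambiguity described above can be seen in a small bivariate calculation. The sketch below computes the two standard copula-based joint return periods ("OR": either variable exceeds its threshold; "AND": both do) for a pair of storm variables. The Gumbel-Hougaard copula, its parameter, and the one-year mean interarrival time are assumptions chosen for illustration; the abstract does not state which copula or parameters the study used.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 corresponds to independence."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def joint_return_periods(u, v, theta, mu=1.0):
    """Return the ('OR', 'AND') joint return periods in the unit of mu,
    where mu is the mean interarrival time between storm events."""
    c = gumbel_copula(u, v, theta)
    t_or = mu / (1.0 - c)            # at least one variable exceeds its threshold
    t_and = mu / (1.0 - u - v + c)   # both variables exceed their thresholds
    return t_or, t_and

# Each variable (e.g. wave height, duration) at its marginal 100 yr level
u = v = 0.99
for theta in (1.0, 2.0, 5.0):        # hypothetical dependence strengths
    t_or, t_and = joint_return_periods(u, v, theta)
    print(f"theta={theta}: OR = {t_or:.1f} yr, AND = {t_and:.1f} yr")
```

The same pair of marginal 100 yr values maps to very different joint return periods depending on the definition and the dependence strength, which is one face of the ambiguity the paper discusses.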
Spontaneous regression of metastases from malignant melanoma: a case report
DEFF Research Database (Denmark)
Kalialis, Louise V; Drzewiecki, Krzysztof T; Mohammadi, Mahin
2008-01-01
A case of a 61-year-old male with widespread metastatic melanoma is presented 5 years after complete spontaneous cure. Spontaneous regression occurred in cutaneous, pulmonary, hepatic and cerebral metastases. A review of the literature reveals seven cases of regression of cerebral metastases......; this report is the first to document complete spontaneous regression of cerebral metastases from malignant melanoma by means of computed tomography scans. Spontaneous regression is defined as the partial or complete disappearance of a malignant tumour in the absence of all treatment or in the presence...... of therapy, which is considered inadequate to exert a significant influence on neoplastic disease. The incidence of spontaneous regression of metastases from malignant melanoma is approximately one per 400 patients, and possible mechanisms include immunologic, endocrine, inflammatory and tumour nutritional...
Ban, J; Takao, Y; Okuno, Y; Mori, Y; Asada, H; Yamanishi, K; Iso, H
2017-04-01
Few studies have examined the impact of cigarette smoking on the risk for herpes zoster. The Shozu Herpes Zoster (SHEZ) Study is a community-based prospective cohort study over 3 years in Japan aiming to clarify the incidence and predictive and immunological factors for herpes zoster. We investigated the associations of smoking status with past history and incidence of herpes zoster. A total of 12 351 participants provided valid information on smoking status and past history of herpes zoster at the baseline survey. Smoking status was classified into three categories (current, former, never smoker), and if currently smoking, the number of cigarettes consumed per day was recorded. The participants were under active surveillance for first-ever incident herpes zoster for 3 years. We used a logistic regression model for the cross-sectional study on the association between smoking status and past history of herpes zoster, and a Cox proportional hazards regression model for the cohort study on the association with risk of incidence. The multivariable adjusted odds ratios (95% CI) of past history of herpes zoster for current vs. never smokers were 0·67 (0·54-0·80) for total subjects, 0·72 (0·56-0·93) for men and 0·65 (0·44-0·96) for women. The multivariable adjusted hazard ratios (95% CI) of incident herpes zoster for current vs. never smokers were 0·52 (0·33-0·81) for total subjects, 0·49 (0·29-0·83) for men and 0·52 (0·19-1·39) for women. Smoking status was inversely associated with the prevalence and incidence of herpes zoster in the general population of men and women aged ⩾50 years.
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, and which is faster and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all-vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression system. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, the single-equation regression model, is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series, and the autocorrelation patterns of regression disturbances. It also includes six case studies.
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Vol. 1, No. 4 (2011), pp. 25-28 ISSN 2045-3345 Grant - others: GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords: robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics, Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demand an understanding of more complex and sophisticated analytic procedures. The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Dimension Reduction Regression in R
Directory of Open Access Journals (Sweden)
Sanford Weisberg
2002-01-01
Regression is the study of the dependence of a response variable y on a collection of p predictors collected in x. In dimension reduction regression, we seek to find a few linear combinations β1x,...,βdx, such that all the information about the regression is contained in these linear combinations. If d is very small, perhaps one or two, then the regression problem can be summarized using simple graphics; for example, for d=1, the plot of y versus β1x contains all the regression information. When d=2, a 3D plot contains all the information. Several methods for estimating d and relevant functions of β1,..., βd have been suggested in the literature. In this paper, we describe an R package for three important dimension reduction methods: sliced inverse regression or sir, sliced average variance estimates, or save, and principal Hessian directions, or phd. The package is very general and flexible, and can be easily extended to include other methods of dimension reduction. It includes tests and estimates of the dimension d, estimates of the relevant information including β1,..., βd, and some useful graphical summaries as well.
Lee, Won Jae; Jo, Kyung-Il; Yeon, Je Young; Hong, Seung-Chyul; Kim, Jong-Soo
2015-04-01
Chronic subdural hematoma (CSDH) is a rare complication of unruptured aneurysm clipping surgery. The purpose of this study was to identify the incidence and risk factors of postoperative CSDH after surgical clipping for unruptured anterior circulation aneurysms. This retrospective study included 518 patients from a single tertiary institute from January 2008 to December 2013. CSDH was defined as subdural hemorrhage which needed surgical treatment. The degree of brain atrophy was estimated using the bicaudate ratio (BCR) index. We used uni- and multivariate analyses to identify risk factors correlated with CSDH. Sixteen (3.1%) patients experienced postoperative CSDH that required burr hole drainage surgery. In univariate analyses, male gender (p<0.001), size of aneurysm (p=0.030), higher BCR index (p=0.004), and the use of antithrombotic medication (p=0.006) were associated with postoperative CSDH. In multivariate analyses using logistic regression, male gender [odds ratio (OR) 4.037, range 1.287-12.688], high BCR index (OR 5.376, range 1.170-25.000), and the use of antithrombotic medication (OR 4.854, range 1.658-14.085) were associated with postoperative CSDH (p<0.05). Postoperative subdural fluid collection and arachnoid plasty did not show statistically significant differences in this study. The incidence of CSDH was 3.1% in unruptured anterior circulation aneurysm surgery. This study shows that male gender, degree of brain atrophy, and the use of antithrombotic medication were associated with postoperative CSDH.
Mariet, Anne-Sophie; Retel, Olivier; Avocat, Hélène; Serre, Anne; Schapman, Lucie; Schmitt, Marielle; Charron, Martine; Monnet, Elisabeth
2013-09-01
While several studies conducted on Lyme borreliosis (LB) risk in the United States showed an association with environmental characteristics, most European studies considered solely the effect of climate characteristics. The aims of this study were to estimate the incidence of erythema migrans (EM) in five regions of France and to analyze associations with several environmental characteristics of the place of residence. LB surveillance networks of general practitioners (GPs) were set up for a period of 2 years in five regions of France. Participating GPs reported all patients with EM during the study period. Data were pooled according to a standardized EM case definition. For each area with a participating GP, age-standardized incidence rates and ratios were estimated. Associations with altitude, indicators of landscape composition, and indicators of landscape configuration were tested with multivariate Poisson regression. Standardized estimated incidence rates of EM per 100,000 person-years were 8.8 [95% confidence interval (CI) 7.9-9.7] in Aquitaine, 40.0 (95% CI 36.4-43.6) in Limousin, 76.0 (95% CI 72.9-79.1) in the three participating départements of Rhône-Alpes, 46.1 (95% CI 43.0-49.2) in Franche-Comté, and 87.7 (95% CI 84.6-90.8) in Alsace. In multivariate analysis, age-adjusted incidence rates increased with altitude (p<0.0001) and decreased with forest patch density (p<0.0001). The marked variations in EM risk among the five regions were partly related to differences in landscape and environmental characteristics. The latter may point out potential risk areas and provide information for targeting preventive actions.
A General Framework for Multivariate Analysis with Optimal Scaling: The R Package aspect
Directory of Open Access Journals (Sweden)
Patrick Mair
2009-11-01
In a series of papers De Leeuw developed a general framework for multivariate analysis with optimal scaling. The basic idea of optimal scaling is to transform the observed variables (categories) in terms of quantifications. In the approach presented here the multivariate data are collected into a multivariable. An aspect of a multivariable is a function that is used to measure how well the multivariable satisfies some criterion. Basically we can think of two different families of aspects which unify many well-known multivariate methods: correlational aspects based on sums of correlations, eigenvalues and determinants, which unify multiple regression, path analysis, correspondence analysis, nonlinear PCA, etc.; and non-correlational aspects, which linearize bivariate regressions and can be used for SEM preprocessing with categorical data. Additionally, other aspects can be established that do not correspond to classical techniques at all. By means of the R package aspect we provide a unified majorization-based implementation of this methodology. Using various data examples we will show the flexibility of this approach and how the optimally scaled results can be represented using graphical tools provided by the package.
Multivariate Max-Stable Spatial Processes
Genton, Marc G.
2014-01-06
Analysis of spatial extremes is currently based on univariate processes. Max-stable processes allow the spatial dependence of extremes to be modelled and explicitly quantified; they are therefore widely adopted in applications. For a better understanding of extreme events of real processes, such as environmental phenomena, it may be useful to study several spatial variables simultaneously. To this end, we extend some theoretical results and applications of max-stable processes to the multivariate setting to analyze extreme events of several variables observed across space. In particular, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. Then, we define a Poisson process construction in the multivariate setting and introduce multivariate versions of the Smith Gaussian extreme-value, the Schlather extremal-Gaussian and extremal-t, and the Brown-Resnick models. Inferential aspects of those models based on composite likelihoods are developed. We present results of various Monte Carlo simulations and of an application to a dataset of summer daily temperature maxima and minima in Oklahoma, U.S.A., highlighting the utility of working with multivariate models in contrast to the univariate case. Based on joint work with Simone Padoan and Huiyan Sang.
DEFF Research Database (Denmark)
Houe, H.; Baker, J.C.; Maes, R.K.
1995-01-01
Based on 2 previous surveys on the occurrence of infection with bovine virus diarrhoea virus (BVDV) in Danish and Michigan dairy herds, the prevalence and incidence of the infection were compared. The presence of certain possible risk factors for the occurrence of infection in the 2 areas were...... purchased more than 40 animals within recent 3 1/2-4 years were significantly associated with presence of PI animals in the dairy herds (p = 0.01) when tested by the Mantel-Haenszel chi 2. Using multivariable logistic regression, the occurrence of PI animals was found to be significantly related...
A Multivariate Test of the Bott Hypothesis in an Urban Irish Setting
Gordon, Michael; Downing, Helen
1978-01-01
Using a sample of 686 married Irish women in Cork City the Bott hypothesis was tested, and the results of a multivariate regression analysis revealed that neither network connectedness nor the strength of the respondent's emotional ties to the network had any explanatory power. (Author)
Identification of Civil Engineering Structures using Multivariate ARMAV and RARMAV Models
DEFF Research Database (Denmark)
Kirkegaard, Poul Henning; Andersen, P.; Brincker, Rune
This paper presents how to make system identification of civil engineering structures using multivariate auto-regressive moving-average vector (ARMAV) models. Further, the ARMAV technique is extended to a recursive technique (RARMAV). The ARMAV model is used to identify measured stationary data....... The results show the usefulness of the approaches for identification of civil engineering structures excited by natural excitation...
Directory of Open Access Journals (Sweden)
Drzewiecki Wojciech
2016-12-01
In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas, both for the accuracy of imperviousness coverage evaluation at individual points in time and for the accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.
Age at menopause and incident heart failure: the Multi-Ethnic Study of Atherosclerosis.
Ebong, Imo A; Watson, Karol E; Goff, David C; Bluemke, David A; Srikanthan, Preethi; Horwich, Tamara; Bertoni, Alain G
2014-06-01
This study aims to evaluate the associations of early menopause (menopause occurring before age 45 years) and age at menopause with incident heart failure (HF) in postmenopausal women. We also explored the associations of early menopause and age at menopause with left ventricular (LV) measures of structure and function in postmenopausal women. We included 2,947 postmenopausal women, aged 45 to 84 years without known cardiovascular disease (2000-2002), from the Multi-Ethnic Study of Atherosclerosis. Cox proportional hazards models were used to examine the associations of early menopause and age at menopause with incident HF. In 2,123 postmenopausal women in whom cardiac magnetic resonance imaging was obtained at baseline, we explored the associations of early menopause and age at menopause with LV measures using multivariable linear regression. Across a median follow-up of 8.5 years, we observed 71 HF events. There were no significant interactions with ethnicity for incident HF (Pinteraction > 0.05). In adjusted analysis, early menopause was associated with an increased risk of incident HF (hazard ratio, 1.66; 95% CI, 1.01-2.73), whereas every 1-year increase in age at menopause was associated with a decreased risk of incident HF (hazard ratio, 0.96; 95% CI, 0.94-0.99). We observed significant interactions between early menopause and ethnicity for LV mass-to-volume ratio (LVMVR; Pinteraction = 0.02). In Chinese-American women, early menopause was associated with a higher LVMVR (+0.11; P = 0.0002), whereas every 1-year increase in age at menopause was associated with a lower LVMVR (-0.004; P = 0.04) at baseline. Older age at menopause is independently associated with a decreased risk of incident HF. Concentric LV remodeling, indicated by a higher LVMVR, is present in Chinese-American women who experienced early menopause at baseline.
Symptoms of delirium predict incident delirium in older long-term care residents.
Cole, Martin G; McCusker, Jane; Voyer, Philippe; Monette, Johanne; Champoux, Nathalie; Ciampi, Antonio; Vu, Minh; Dyachenko, Alina; Belzile, Eric
2013-06-01
Detection of long-term care (LTC) residents at risk of delirium may lead to prevention of this disorder. The primary objective of this study was to determine if the presence of one or more Confusion Assessment Method (CAM) core symptoms of delirium at baseline assessment predicts incident delirium. Secondary objectives were to determine if the number or the type of symptoms predict incident delirium. The study was a secondary analysis of data collected for a prospective study of delirium among older residents of seven LTC facilities in Montreal and Quebec City, Canada. The Mini-Mental State Exam (MMSE), CAM, Delirium Index (DI), Hierarchic Dementia Scale, Barthel Index, and Cornell Scale for Depression were completed at baseline. The MMSE, CAM, and DI were repeated weekly for six months. Multivariate Cox regression models were used to determine if baseline symptoms predict incident delirium. Of 273 residents, 40 (14.7%) developed incident delirium. Mean (SD) time to onset of delirium was 10.8 (7.4) weeks. When one or more CAM core symptoms were present at baseline, the Hazard Ratio (HR) for incident delirium was 3.5 (95% CI = 1.4, 8.9). The HRs for number of symptoms present ranged from 2.9 (95% CI = 1.0, 8.3) for one symptom to 3.8 (95% CI = 1.3, 11.0) for three symptoms. The HR for one type of symptom, fluctuation, was 2.2 (95% CI = 1.2, 4.2). The presence of CAM core symptoms at baseline assessment predicts incident delirium in older LTC residents. These findings have potentially important implications for clinical practice and research in LTC settings.
Multivariate analysis: A statistical approach for computations
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a type of multivariate statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval methods and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis proposes an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.
A MATLAB companion for multivariable calculus
Cooper, Jeffery
2001-01-01
Offering a concise collection of MATLAB programs and exercises to accompany a third semester course in multivariable calculus, A MATLAB Companion for Multivariable Calculus introduces simple numerical procedures such as numerical differentiation, numerical integration and Newton's method in several variables, thereby allowing students to tackle realistic problems. The many examples show students how to use MATLAB effectively and easily in many contexts. Numerous exercises in mathematics and applications areas are presented, graded from routine to more demanding projects requiring some programming. MATLAB M-files are provided on the Harcourt/Academic Press web site at http://www.harcourt-ap.com/matlab.html.
* Computer-oriented material that complements the essential topics in multivariable calculus
* Main ideas presented with examples of computations and graphics displays using MATLAB
* Numerous examples of short code in the text, which can be modified for use with the exercises
* MATLAB files are used to implem...
Simplicial band depth for multivariate functional data
López-Pintado, Sara
2014-03-05
We propose notions of simplicial band depth for multivariate functional data that extend the univariate functional band depth. The proposed simplicial band depths provide simple and natural criteria to measure the centrality of a trajectory within a sample of curves. Based on these depths, a sample of multivariate curves can be ordered from the center outward and order statistics can be defined. Properties of the proposed depths, such as invariance and consistency, can be established. A simulation study shows the robustness of this new definition of depth and the advantages of using a multivariate depth versus the marginal depths for detecting outliers. Real data examples from growth curves and signature data are used to illustrate the performance and usefulness of the proposed depths. © 2014 Springer-Verlag Berlin Heidelberg.
Multivariate generalized linear mixed models using R
Berridge, Damon Mark
2011-01-01
Multivariate Generalized Linear Mixed Models Using R presents robust and methodologically sound models for analyzing large and complex data sets, enabling readers to answer increasingly complex research questions. The book applies the principles of modeling to longitudinal data from panel and related studies via the Sabre software package in R. A Unified Framework for a Broad Class of Models The authors first discuss members of the family of generalized linear models, gradually adding complexity to the modeling framework by incorporating random effects. After reviewing the generalized linear model notation, they illustrate a range of random effects models, including three-level, multivariate, endpoint, event history, and state dependence models. They estimate the multivariate generalized linear mixed models (MGLMMs) using either standard or adaptive Gaussian quadrature. The authors also compare two-level fixed and random effects linear models. The appendices contain additional information on quadrature, model...
An architecture for implementation of multivariable controllers
DEFF Research Database (Denmark)
Niemann, Hans Henrik; Stoustrup, Jakob
1999-01-01
Niemann, H.; Stoustrup, J.; Dept. of Autom., Tech. Univ., Lyngby. This paper appears in: American Control Conference, 1999....... Proceedings of the 1999. Issue Date: 1999. Volume: 6. On page(s): 4029-4033 vol. 6. Location: San Diego, CA. Meeting Date: 02 Jun 1999-04 Jun 1999. Print ISBN: 0-7803-4990-3. References Cited: 7. INSPEC Accession Number: 6403075. Digital Object Identifier: 10.1109/ACC.1999.786292. Date of Current Version: 06...... August 2002. Abstract: An architecture for implementation of multivariable controllers is presented in this paper. The architecture is based on the Youla-Jabr-Bongiorno-Kucera parameterization of all stabilizing controllers. By using this architecture for implementation of multivariable controllers...
Theory of net analyte signal vectors in inverse regression
DEFF Research Database (Denmark)
Bro, R.; Andersen, Charlotte Møller
2003-01-01
The net analyte signal and the net analyte signal vector are useful measures in building and optimizing multivariate calibration models. In this paper a theory for their use in inverse regression is developed. The theory of net analyte signal was originally derived from classical least squares in spectral calibration where the responses of all pure analytes and interferents are assumed to be known. However, in chemometrics, inverse calibration models such as partial least squares regression are more abundant and several tools for calculating the net analyte signal in inverse regression models have...... recently suggested by Faber (Anal. Chem. 1998; 70: 5108-5110). A required correction of the net analyte signal in situations with negative predicted responses is also discussed. Copyright (C) 2004 John Wiley & Sons, Ltd.
Meng, Xiangfei; Brunet, Alain; Turecki, Gustavo; Liu, Aihua; D'Arcy, Carl; Caron, Jean
2017-01-01
Objective Few studies have examined the effect of risk factor modifications on depression incidence. This study was to explore psychosocial risk factors for depression and quantify the effect of risk factor modifications on depression incidence in a large-scale, longitudinal population-based study. Methods Data were from the Montreal Longitudinal Catchment Area study (N=2433). Multivariate modified Poisson regression was used to estimate relative risk (RR). Population attributable fractions were also used to estimate the potential impact of risk factor modifications on depression incidence. Results The cumulative incidence rate of major depressive disorder at the 2-year follow-up was 4.8%, and 6.6% at the 4-year follow-up. Being a younger adult, female, widowed, separated or divorced, Caucasian, poor, occasional drinker, having a family history of mental health problems, having less education and living in areas with higher unemployment rates and higher proportions of visible minorities, more cultural community centres and community organisations, were consistently associated with the increased risk of incident major depressive disorder. Although only 5.1% of the disease incidence was potentially attributable to occasional drinking (vs abstainers) at the 2-year follow-up, the attribution of occasional drinking doubled at the 4-year follow-up. A 10% reduction in the prevalence of occasional drinking in this population could potentially prevent half of incident cases. Conclusions Modifiable risk factors, both individual and societal, could be the targets for public depression prevention programmes. These programmes should also be gender-specific, as different risk factors have been identified for men and women. Public health preventions at individual levels could focus on the better management of occasional drinking, as it explained around 5%~10% of incident major depressive disorders. Neighbourhood characteristics could also be the target for public prevention
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighted sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure into a computer programme, which was used to unfold test spectra obtained by mixing some spectra from a library of arbitrarily chosen spectra and adding a noise component. (U.K.)
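The reduction described above, from spectrum resolution to multiple linear regression, can be illustrated with a minimal sketch. It uses ordinary least squares via the normal equations rather than the stepwise procedure the abstract mentions, and the two constituent spectra, the mixture weights, and the function names `solve` and `unmix` are invented for illustration only.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def unmix(measured, components):
    """Least-squares weights w minimising ||measured - sum_k w[k] * components[k]||^2."""
    k = len(components)
    # Normal equations: (S S^T) w = S y, with the component spectra as rows of S.
    gram = [[sum(p * q for p, q in zip(components[i], components[j]))
             for j in range(k)] for i in range(k)]
    rhs = [sum(p * y for p, y in zip(components[i], measured)) for i in range(k)]
    return solve(gram, rhs)


# Hypothetical library spectra (counts per channel) and a noise-free mixture.
spec_a = [1.0, 4.0, 2.0, 0.0, 0.0]
spec_b = [0.0, 1.0, 3.0, 5.0, 1.0]
mixture = [2.0 * a + 0.5 * b for a, b in zip(spec_a, spec_b)]

weights = unmix(mixture, [spec_a, spec_b])
```

With noise added to `mixture`, the recovered weights would only approximate the true ones; a stepwise variant would additionally decide which library spectra to include.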
Logistic regression for circular data
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
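A minimal sketch of the kind of model described above, under stated assumptions: the circular predictor is entered through its cosine and sine (one common way to build a linear-circular regression; the abstract does not give the exact parameterisation), and the maximum likelihood estimates are found by Newton-Raphson. The data and the function names are invented, and Python is used here in place of R.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_circular_logistic(angles, y, iterations=25):
    """Fit logit P(y=1) = b0 + b1*cos(theta) + b2*sin(theta) by Newton-Raphson."""
    rows = [(1.0, math.cos(t), math.sin(t)) for t in angles]
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, r)))) for r in rows]
        # Score vector X^T (y - p) and information matrix X^T W X, W = diag(p(1-p)).
        score = [sum(r[j] * (yi - pi) for r, yi, pi in zip(rows, y, p)) for j in range(3)]
        info = [[sum(r[j] * r[k] * pi * (1.0 - pi) for r, pi in zip(rows, p))
                 for k in range(3)] for j in range(3)]
        step = solve(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def predict(beta, theta):
    """Fitted probability of y = 1 at angle theta (radians)."""
    eta = beta[0] + beta[1] * math.cos(theta) + beta[2] * math.sin(theta)
    return 1.0 / (1.0 + math.exp(-eta))


# Invented data: 4 binary observations at each of 12 directions (every 30 degrees);
# counts[k] is the number of successes at direction 30*k degrees.
counts = [2, 3, 3, 4, 3, 3, 2, 1, 1, 0, 1, 1]
angles, y = [], []
for k, c in enumerate(counts):
    theta = math.radians(30 * k)
    angles += [theta] * 4
    y += [1] * c + [0] * (4 - c)

beta = fit_circular_logistic(angles, y)
```

Because the invented counts peak at 90 degrees and are symmetric about that direction, the fitted sine coefficient is positive while the intercept and cosine coefficient are close to zero.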
Bayesian variable selection in regression
Energy Technology Data Exchange (ETDEWEB)
Mitchell, T.J.; Beauchamp, J.J.
1987-01-01
This paper is concerned with the selection of subsets of "predictor" variables in a linear regression model for the prediction of a "dependent" variable. We take a Bayesian approach and assign a probability distribution to the dependent variable through a specification of prior distributions for the unknown parameters in the regression model. The appropriate posterior probabilities are derived for each submodel and methods are proposed for evaluating the family of prior distributions. Examples are given that show the application of the Bayesian methodology. 23 refs., 3 figs.
Millard, Heather R; Musani, Solomon K; Dibaba, Daniel T; Talegawkar, Sameera A; Taylor, Herman A; Tucker, Katherine L; Bidulescu, Aurelian
2018-02-01
Several mechanisms have been described through which dietary intake of choline and its derivative betaine may be associated in both directions with subclinical atherosclerosis. We assessed the association of dietary intake of choline and betaine with cardiovascular risk and markers of subclinical cardiovascular disease. Data from 3924 Jackson Heart Study (JHS) African-American participants with complete food frequency questionnaire at baseline and follow-up measurements of heart disease measures were used. Multivariable linear regression models were employed to assess associations between choline and betaine intake with carotid intima-media thickness, coronary artery calcium, abdominal aortic calcium and left ventricular mass. Cox proportional hazards regression models were used to estimate associations with time to incident coronary heart disease (CHD), ischemic stroke and cardiovascular disease (CVD). During an average nine years of follow-up, 124 incident CHD events, 75 incident stroke events and 153 incident CVD events were documented. In women, greater choline intake was associated with lower left ventricular mass (p = 0.0006 for trend across choline quartiles) and with abdominal aortic calcium score. Among all JHS participants, there was a statistically significant inverse association between dietary choline intake and incident stroke, β = -0.33 (p = 0.04). Betaine intake was associated with greater risk of incident CHD when comparing the third quartile of intake with the lowest quartile of intake (HR 1.89, 95 % CI 1.14, 3.15). Among our African-American participants, higher dietary choline intake was associated with a lower risk of incident ischemic stroke, and thus putative dietary benefits. Higher dietary betaine intake was associated with a nonlinear higher risk of incident CHD.
History of uterine leiomyomata and incidence of breast cancer.
Wise, Lauren A; Radin, Rose G; Rosenberg, Lynn; Adams-Campbell, Lucile; Palmer, Julie R
2015-10-01
Uterine leiomyomata (UL), benign tumors of the myometrium, are influenced by sex steroid hormones. A history of UL diagnosis has been associated with a higher risk of uterine malignancies. The relation between UL and breast cancer, another hormonally responsive cancer, has not been studied. We investigated the association between self-reported physician-diagnosed UL and incidence of breast cancer in the Black Women's Health Study, a prospective cohort study. We followed 57,747 participants without a history of breast cancer from 1995 to 2013. UL diagnoses were reported at baseline and biennially. Breast cancer was reported on biennial questionnaires and confirmed by pathology data from medical records or cancer registries. Cox regression was used to derive incidence rate ratios (IRRs) and 95 % confidence intervals (CI) and adjust for potential confounders. There were 2,276 incident cases of breast cancer (1,699 invasive, 394 in situ, and 183 unknown) during 879,672 person-years of follow-up. The multivariable IRR for the overall association between history of UL and breast cancer incidence was 0.99 (95 % CI 0.90-1.08), with similar results for ER + (IRR = 1.03) and ER - breast cancer (IRR = 1.05). IRRs for early diagnosis of UL (before age 30) were slightly above 1.0, with IRRs of 1.14 (95 % CI 0.99-1.31) for overall breast cancer, 1.14 (95 % CI 0.93-1.40) for ER + breast cancer, and 1.20 (95 % CI 0.89-1.61) for ER - breast cancer. IRRs for early diagnosis of UL were elevated for breast cancer diagnosed before 40 years of age (IRR = 1.39, 95 % CI 0.97-1.99) and premenopausal breast cancer (IRR = 1.26, 95 % CI 1.01-1.58). No consistent patterns in risk were observed across estrogen receptor subtypes, and IRRs did not differ appreciably within strata of BMI, female hormone use, mammography recency, or family history of breast cancer. The present study of US black women suggests that a history of UL diagnosis is unrelated to the incidence of breast cancer overall. The
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Regression of lumbar disk herniation
Directory of Open Access Journals (Sweden)
G. Yu Evzikov
2015-01-01
Compression of the spinal nerve root, giving rise to pain and sensory and motor disorders in the area of its innervation, is the most vivid manifestation of a herniated intervertebral disk. Different treatment modalities for these conditions, including neurosurgery, are discussed. There has been recent evidence that disk herniation can regress spontaneously. The paper describes a female patient with a large lateralized disc extrusion that caused compression of the S1 nerve root, leading to obvious myotonic and radicular syndrome. Magnetic resonance imaging showed that the clinical manifestations of discogenic radiculopathy, as well as the myotonic syndrome and morphological changes, had completely regressed 8 months later. The likely mechanism is inflammation-induced resorption of a large herniated disk fragment, which agrees with the data available in the literature. A decision to perform neurosurgery, for which the patient had indications, was made during her first consultation. After regression of discogenic radiculopathy, there was only moderate pain caused by musculoskeletal diseases (facet syndrome, piriformis syndrome) that were successfully eliminated by minimally invasive techniques.
Ridge Regression for Interactive Models.
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are…
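The centering step mentioned above can be illustrated with a short sketch: in a simple interactive model with a product term, the correlation between a linear term and the product is largely an artifact of the variables' means, and centering removes that part. The data below are invented and chosen so that the effect is exact; real data would typically show a reduction, not a correlation of zero.

```python
import math


def corr(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


# Invented predictors for an interactive model y = b0 + b1*x1 + b2*x2 + b3*x1*x2.
x1 = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
x2 = [5.0, 7.0, 6.0, 9.0, 8.0, 10.0]

# Uncentered: the product term is highly correlated with x1.
raw_product = [a * b for a, b in zip(x1, x2)]
r_raw = corr(x1, raw_product)

# Centered: subtract each mean before forming the product.
m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)
x1c = [a - m1 for a in x1]
x2c = [b - m2 for b in x2]
centered_product = [a * b for a, b in zip(x1c, x2c)]
r_centered = corr(x1c, centered_product)
```

Any collinearity that remains after centering is the "essential" part, which is where a technique such as ridge regression would come into play.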
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Vol. 17, No. 4 (2015), pp. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords: Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Barra, Lillian J; Pope, Janet E; Hitchon, Carol; Boire, Gilles; Schieir, Orit; Lin, Daming; Thorne, Carter J; Tin, Diane; Keystone, Edward C; Haraoui, Boulos; Jamal, Shahin; Bykerk, Vivian P
2017-05-01
RA is associated with an increased risk of cardiovascular events (CVEs). The objective was to estimate independent effects of RA autoantibodies on incident CVEs in patients with early RA. Patients were enrolled in the Canadian Early Inflammatory Arthritis Cohort, a prospective multicentre inception cohort. Incident CVEs, including acute coronary syndromes and cerebrovascular events, were self-reported by the patient and partially validated by medical chart review. Seropositive status was defined as either RF or ACPA positive. Multivariable Cox proportional hazards survival analysis was used to estimate the effects of seropositive status on incident CVEs, controlling for RA clinical variables and traditional cardiovascular risk factors. A total of 2626 patients were included: the mean symptom duration at diagnosis was 6.3 months (s.d. 4.6), the mean age was 53 years (s.d. 15), 72% were female and 86% met classification criteria for RA. Forty-six incident CVEs occurred over 6483 person-years [incidence rate 7.1/1000 person-years (95% confidence interval 5.3, 9.4)]. The CVE rate did not differ in seropositive vs seronegative subjects and seropositivity was not associated with incident CVEs in multivariable Cox regression models. Baseline covariates independently associated with incident CVEs were older age, a history of hypertension and a longer duration of RA symptoms prior to diagnosis. The rate of CVEs early in the course of inflammatory arthritis was low; however, delays in the diagnosis of arthritis increased the rate of CVEs. Hypertension was the strongest independent risk factor for CVEs. Results support early aggressive management of RA disease activity and co-morbidities to prevent severe complications. © The Author 2017. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com
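The incidence rate quoted above follows directly from the event count and the person-time at risk. The sketch below reproduces that arithmetic. The confidence interval uses a log-normal approximation, which is an assumption: the abstract does not say how its interval was computed, so the bounds obtained here may differ slightly from the published ones. The function name `incidence_rate` is illustrative.

```python
import math


def incidence_rate(events, person_years, per=1000.0, z=1.96):
    """Rate per `per` person-years with an approximate 95% CI on the log scale."""
    rate = events / person_years * per
    # For a Poisson count, the standard error of log(rate) is about 1/sqrt(events).
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


# Figures taken from the abstract: 46 incident CVEs over 6483 person-years.
rate, low, high = incidence_rate(46, 6483)
```

Rounded to one decimal place, `rate` matches the reported 7.1 per 1000 person-years.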
Multivariate mixtures of Erlangs for density estimation under censoring
Verbelen, R.; Antonio, K.; Claeskens, G.
2016-01-01
Multivariate mixtures of Erlang distributions form a versatile, yet analytically tractable, class of distributions making them suitable for multivariate density estimation. We present a flexible and effective fitting procedure for multivariate mixtures of Erlangs, which iteratively uses the EM
Skewness for multivariate distributions: two approaches
Avérous, Jean; Meste, Michel
1997-01-01
This paper presents two approaches for qualitative, quantitative and comparative concepts of skewness to be defined with respect to the spatial median for multivariate distributions. They extend the known quantile-based notions defined for real distributions. The main tool for such extensions consists of a family of central parts that provide suitable generalizations of the real interquantile intervals.
Multivariate Analysis of Industrial Scale Fermentation Data
DEFF Research Database (Denmark)
Mears, Lisa; Nørregård, Rasmus; Stocks, Stuart M.
2015-01-01
Multivariate analysis allows process understanding to be gained from the vast and complex datasets recorded from fermentation processes, however the application of such techniques to this field can be limited by the data pre-processing requirements and data handling. In this work many iterations...
Ranking multivariate GARCH models by problem dimension
M. Caporin (Massimiliano); M.J. McAleer (Michael)
2010-01-01
In the last 15 years, several Multivariate GARCH (MGARCH) models have appeared in the literature. The two most widely known and used are the Scalar BEKK model of Engle and Kroner (1995) and Ding and Engle (2001), and the DCC model of Engle (2002). Some recent research has begun to
Irreducible multivariate polynomials obtained from polynomials in ...
Indian Academy of Sciences (India)
over K(X)? We provided some methods to construct irreducible multivariate polynomials over an arbitrary field, starting from arbitrary irreducible polynomials in fewer variables, of which we mention the following two results: Theorem A. If we write an irreducible polynomial f ∈ K[X] as a sum of polynomials a0,..., an ∈ K[X] ...
The value of multivariate model sophistication
DEFF Research Database (Denmark)
Rombouts, Jeroen; Stentoft, Lars; Violante, Francesco
2014-01-01
We assess the predictive accuracies of a large number of multivariate volatility models in terms of pricing options on the Dow Jones Industrial Average. We measure the value of model sophistication in terms of dollar losses by considering a set of 444 multivariate models that differ in their specification of the conditional variance, conditional correlation, innovation distribution, and estimation approach. All of the models belong to the dynamic conditional correlation class, which is particularly suitable because it allows consistent estimations of the risk neutral dynamics with a manageable ... In addition to investigating the value of model sophistication in terms of dollar losses directly, we also use the model confidence set approach to statistically infer the set of models that delivers the best pricing performances.
Multivariate linear models and repeated measurements revisited
DEFF Research Database (Denmark)
Dalgaard, Peter
2009-01-01
Methods for generalized analysis of variance based on multivariate normal theory have been known for many years. In a repeated measurements context, it is most often of interest to consider transformed responses, typically within-subject contrasts or averages. Efficiency considerations lead to s... method involving differences between orthogonal projections onto subspaces generated by within-subject models.
Irreducible multivariate polynomials obtained from polynomials in ...
Indian Academy of Sciences (India)
We provided some methods to construct irreducible multivariate polynomials over an arbitrary field, starting from arbitrary irreducible polynomials in fewer variables, of which we mention the following two results: Theorem A. If we write an irreducible polynomial f ∈ K[X] as a sum of polynomials a0,..., an ∈ K[X] with deg a0 ...
Visualization of Multivariate Athlete Performance Data
Telea, A.; Hillerin, P. de; Valeanu, V.
2007-01-01
We present a set of visualization methods for the analysis of multivariate data recorded from the measurement of the performance of athletes during training. We use a modified training device to measure the force, acceleration, displacement, and speed of the athlete’s feet and arms while performing
MBIS: multivariate Bayesian image segmentation tool.
Esteban, Oscar; Wollny, Gert; Gorthi, Subrahmanyam; Ledesma-Carbayo, María-J; Thiran, Jean-Philippe; Santos, Andrés; Bach-Cuadra, Meritxell
2014-07-01
We present MBIS (Multivariate Bayesian Image Segmentation tool), a clustering tool based on the mixture of multivariate normal distributions model. MBIS supports multichannel bias field correction based on a B-spline model. A second methodological novelty is the inclusion of graph-cuts optimization for the stationary anisotropic hidden Markov random field model. Along with MBIS, we release an evaluation framework that contains three different experiments on multi-site data. We first validate the accuracy of segmentation and the estimated bias field for each channel. MBIS outperforms a widely used segmentation tool in a cross-comparison evaluation. The second experiment demonstrates the robustness of results on atlas-free segmentation of two image sets from scan-rescan protocols on 21 healthy subjects. Multivariate segmentation is more replicable than the monospectral counterpart on T1-weighted images. Finally, we provide a third experiment to illustrate how MBIS can be used in a large-scale study of tissue volume change with increasing age in 584 healthy subjects. This last result is meaningful as multivariate segmentation performs robustly without the need for prior knowledge. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Makino, Takahiro; Honda, Hirotsugu; Fujiwara, Hiroyasu; Yoshikawa, Hideki; Yonenobu, Kazuo; Kaito, Takashi
2018-01-01
A retrospective review of prospectively collected data. To investigate the incidence of radiographic and symptomatic adjacent segment disease (ASD) and identify possible risk factors for ASD after posterior lumbar interbody fusion (PLIF) with minimum disc distraction by selecting low-height interbody cages. Excessive disc space distraction is reportedly 1 of the risk factors for ASD after PLIF; however, the incidence and other risk factors of ASD after PLIF with minimum disc distraction remain unclear. Forty-one consecutive patients who underwent PLIF at L4-L5 and were postoperatively followed up for a minimum of 2 years were included. The height and shape (box or bullet shape) of interbody cages was determined according to the disc height and morphology of the intervertebral space assessed on preoperative computed tomography scans to avoid excessive distraction. The incidence of radiographic and symptomatic ASD was evaluated and all demographic and radiographic parameters were compared between patients with and without ASD. Multivariate logistic regression analysis was performed to identify risk factors for ASD among the variables with P < .20 in univariate analysis. The overall incidence of ASD was 12.2% (5/41 patients): radiographic ASD, 7.3% (3 patients); symptomatic ASD, 4.9% (2 patients). Multivariate analysis revealed preoperative retrolisthesis of L3 on extension as the sole risk factor for ASD after PLIF with minimum disc distraction (odds ratio, 2.13; 95% confidence interval, 1.00-4.05; P = .049). The incidence of ASD in this study was lower than that of ASD in our previous study about PLIF with distraction of disc space (12.2% vs. 31.8%). Minimum disc distraction by selection of low-height interbody cages is a simple and effective method to prevent ASD at the surgeons' discretion, although preexisting retrolisthesis at the adjacent upper segment should be taken into consideration. Copyright © 2017 The Authors. Published by Wolters Kluwer Health
DEFF Research Database (Denmark)
Bladt, Mogens; Nielsen, Bo Friis
2012-01-01
Numerous definitions of multivariate exponential and gamma distributions can be retrieved from the literature [4]. These distributions belong to the class of Multivariate Matrix-Exponential Distributions (MVME) whenever their joint Laplace transform is a rational function. The majority of these ... Laplace transform. In a longer perspective stochastic and statistical analysis for MVME will in particular apply to any of the previously defined distributions. Multivariate gamma distributions have been used in a variety of fields like hydrology, [11], [10], [6], space (wind modeling) [9] reliability [3...
Handicap 5 years after stroke in the North East Melbourne Stroke Incidence Study.
Gall, Seana L; Dewey, Helen M; Sturm, Jonathan W; Macdonell, Richard A L; Thrift, Amanda G
2009-01-01
Handicap is rarely comprehensively examined after stroke. We examined handicap among 5-year stroke survivors from an 'ideal' stroke incidence study. Survivors were assessed with the London Handicap Scale [LHS, score range: 0 (greatest handicap) to 100 (least handicap)]. Multivariable regression was used to examine demographic, risk and stroke-related factors associated with handicap. 351 of 441 (80%) survivors were assessed. Those assessed were more often Australian born than those not assessed (p handicap was present for physical independence and occupation/leisure items. Handicap was associated with older age, manual occupations, smoking, initial stroke severity, recurrent stroke and mood disorders. Reducing recurrent stroke, through better risk factor management, is likely to reduce handicap. The association between handicap and mood disorders, which are potentially modifiable, warrants further investigation. Copyright (c) 2008 S. Karger AG, Basel.
Multiple imputation with multivariate imputation by chained equation (MICE) package.
Zhang, Zhongheng
2016-01-01
Multiple imputation (MI) is an advanced technique for handling missing values. It is superior to single imputation in that it takes into account uncertainty in missing value imputation. However, MI is underutilized in the medical literature due to lack of familiarity and computational challenges. The article provides a step-by-step approach to performing MI using the R multivariate imputation by chained equation (MICE) package. The procedure first imputes m complete datasets by calling the mice() function. Statistical analyses such as univariate analysis and regression models can then be performed within each dataset by calling the with() function, which sets the environment for statistical analysis. Lastly, the results obtained from each analysis are combined using the pool() function.
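The pooling step in the abstract above can be illustrated with Rubin's rules, the standard way to combine per-dataset estimates after multiple imputation. The following is a minimal Python sketch of that combination only; it is not the R mice API, and the function name `pool_rubin` and the example numbers are assumptions made for illustration.

```python
import math

def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = sum(estimates) / m                               # pooled point estimate
    within = sum(variances) / m                              # within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1 + 1 / m) * between                   # total variance
    return q_bar, total, math.sqrt(total)

# Hypothetical coefficients from m = 3 imputed datasets
pooled, total_var, pooled_se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what single imputation ignores, which is why the pooled standard error is larger than the average per-dataset standard error.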
Multivariate data analysis of enzyme production for hydrolysis purposes
DEFF Research Database (Denmark)
Schmidt, A.S.; Suhr, K.I.
1999-01-01
Data from enzyme production experiments were analysed using different multivariate methods. The data set comprised 12 objects (3 fungi (Aspergillus oryzae, Aspergillus fumigatus, Trichoderma reesei) grown on 4 substrates (lenzing and/or wet-oxidised xylan)) and 12 variables (pH, biomass, 7 enzyme activities (xylanase, xylosidase, arabinosidase, cellulase, acetyl xylan esterase, glucuronidase, feruloyl esterase) and 3 hydrolysis efficiencies (reducing sugars at 3 different enzyme loadings)). Principal component analysis (PCA) proved to be an efficient method to obtain an overview of the structure in the data - possibly combined with analysis of variance (ANOVA). Partial least squares regression (PLSR) showed a clear connection between the two different data matrices (the fermentation variables and the hydrolysis variables). Hence, PLSR was suitable for prediction purposes. The hydrolysis...
The Multivariate Generalised von Mises Distribution: Inference and Applications
DEFF Research Database (Denmark)
Navarro, Alexandre Khae Wu; Frellsen, Jes; Turner, Richard
2017-01-01
Circular variables arise in a multitude of data-modelling contexts ranging from robotics to the social sciences, but they have been largely overlooked by the machine learning community. This paper partially redresses this imbalance by extending some standard probabilistic modelling tools to the c...-torus. Previously proposed multivariate circular distributions are shown to be special cases of this construction. Second, we introduce a new probabilistic model for circular regression inspired by Gaussian Processes, and a method for probabilistic Principal Component Analysis with circular hidden variables...
Kernel regression for fMRI pattern prediction.
Chu, Carlton; Ni, Yizhao; Tan, Geoffrey; Saunders, Craig J; Ashburner, John
2011-05-15
This paper introduces two kernel-based regression schemes to decode or predict brain states from functional brain scans as part of the Pittsburgh Brain Activity Interpretation Competition (PBAIC) 2007, in which our team was awarded first place. Our procedure involved image realignment, spatial smoothing, detrending of low-frequency drifts, and application of multivariate linear and non-linear kernel regression methods: namely kernel ridge regression (KRR) and relevance vector regression (RVR). RVR is based on a Bayesian framework, which automatically determines a sparse solution through maximization of marginal likelihood. KRR is the dual-form formulation of ridge regression, which solves regression problems with high dimensional data in a computationally efficient way. Feature selection based on prior knowledge about human brain function was also used. Post-processing by constrained deconvolution and re-convolution was used to furnish the prediction. This paper also contains a detailed description of how prior knowledge was used to fine tune predictions of specific "feature ratings," which we believe is one of the key factors in our prediction accuracy. The impact of pre-processing was also evaluated, demonstrating that different pre-processing may lead to significantly different accuracies. Although the original work was aimed at the PBAIC, many techniques described in this paper can be generally applied to any fMRI decoding works to increase the prediction accuracy. Published by Elsevier Inc.
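The statement that KRR is the dual form of ridge regression can be checked numerically. The sketch below is an illustration under assumed toy data, not the authors' pipeline: it fits ridge regression once in the dual (an n × n kernel system) and once in the primal (a d × d system) with a linear kernel, and the two give the same predictions. The names `krr_fit`, `krr_predict`, and the penalty `lam` are hypothetical.

```python
import numpy as np

def krr_fit(K, y, lam):
    """Dual ridge coefficients: solve (K + lam * I) alpha = y."""
    n = K.shape[0]
    return np.linalg.solve(K + lam * np.eye(n), y)

def krr_predict(K_test, alpha):
    """Predict from the kernel between test and training points."""
    return K_test @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))       # 20 samples, 50 features (n << d)
y = rng.normal(size=20)
X_new = rng.normal(size=(5, 50))
lam = 0.1

# Dual form with a linear kernel: only a 20 x 20 system is solved
alpha = krr_fit(X @ X.T, y, lam)
pred_dual = krr_predict(X_new @ X.T, alpha)

# Primal ridge regression: a 50 x 50 system
w = np.linalg.solve(X.T @ X + lam * np.eye(50), X.T @ y)
pred_primal = X_new @ w
```

When the number of features greatly exceeds the number of scans, as is typical for voxel data, the dual system is the smaller one, which is the computational advantage the abstract refers to.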
Regression Methods for Virtual Metrology of Layer Thickness in Chemical Vapor Deposition
DEFF Research Database (Denmark)
Purwins, Hendrik; Barak, Bernd; Nagi, Ahmed
2014-01-01
The quality of wafer production in semiconductor manufacturing cannot always be monitored by a costly physical measurement. Instead of measuring a quantity directly, it can be predicted by a regression method (Virtual Metrology). In this paper, a survey on regression methods is given to predict ... predictive variable alone, the 3 most predictive variables, an expert selection, and full set. The following regression methods are compared: Simple Linear Regression, Multiple Linear Regression, Partial Least Square Regression, and Ridge Linear Regression utilizing the Partial Least Square Estimate algorithm, and Support Vector Regression (SVR). On a test set, SVR outperforms the other methods by a large margin, being more robust towards changes in the production conditions. The method performs better on high-dimensional multivariate input data than on the most predictive variables alone. Process...
Bohl, Daniel D; Ahn, Junyoung; Rossi, Vincent J; Tabaraee, Ehsan; Grauer, Jonathan N; Singh, Kern
2016-03-01
Postoperative pneumonia has important clinical consequences for both patients and the health-care system. Few studies have examined pneumonia following anterior cervical decompression and fusion (ACDF) procedures. This study aimed to determine the incidence and risk factors for development of pneumonia following ACDF procedures. A retrospective cohort study of data collected prospectively by the American College of Surgeons National Surgical Quality Improvement Program was carried out. This study comprised 11,353 patients undergoing ACDF procedures during 2011-2013. The primary outcome was diagnosis of pneumonia in the first 30 postoperative days. Independent risk factors for the development of pneumonia were identified using multivariate regression. Readmission rates were compared between patients who did and did not develop pneumonia using multivariate regression that adjusted for all demographic, comorbidity, and procedural characteristics. The incidence of pneumonia was 0.45% (95% confidence interval=0.33%-0.57%). In the multivariate analysis, independent risk factors for the development of pneumonia were greater age (prisk [RR]=5.3, ppneumonia following discharge had a higher readmission rate than other patients (72.7% vs. 2.4%, adjusted RR=24.5, ppneumonia. Pneumonia occurs in approximately 1 in 200 patients following ACDF procedures. Patients who are older, are functionally dependent, or have chronic obstructive pulmonary disease are at greater risk. These patients should be counseled, monitored, and targeted with preventative interventions accordingly. Greater operative duration is also an independent risk factor. Approximately three in four patients who develop pneumonia following hospitalization for ACDF procedures are readmitted. This elevated readmission rate has implications for bundled payments and hospital performance reports. Copyright © 2016 Elsevier Inc. All rights reserved.
Montgomery County of Maryland — This dataset contains the monthly summary data indicating incident occurred in each fire station response area. The summary data is the incident count broken down by...
Police Incident Reports Written
Town of Chapel Hill, North Carolina — This table contains incident reports filed with the Chapel Hill Police Department. Multiple incidents may have been reported at the same time. The most serious...
Takase, Hiroyuki; Sugiura, Tomonori; Kimura, Genjiro; Ohte, Nobuyuki; Dohi, Yasuaki
2015-01-01
Background Although there is a close relationship between dietary sodium and hypertension, the concept that persons with relatively high dietary sodium are at increased risk of developing hypertension compared with those with relatively low dietary sodium has not been studied intensively in a cohort. Methods and Results We conducted an observational study to investigate whether dietary sodium intake predicts future blood pressure and the onset of hypertension in the general population. Individual sodium intake was estimated by calculating 24-hour urinary sodium excretion from spot urine in 4523 normotensive participants who visited our hospital for a health checkup. After a baseline examination, they were followed for a median of 1143 days, with the end point being development of hypertension. During the follow-up period, hypertension developed in 1027 participants (22.7%). The risk of developing hypertension was higher in those with higher rather than lower sodium intake (hazard ratio 1.25, 95% CI 1.04 to 1.50). In multivariate Cox proportional hazards regression analysis, baseline sodium intake and the yearly change in sodium intake during the follow-up period (as continuous variables) correlated with the incidence of hypertension. Furthermore, both the yearly increase in sodium intake and baseline sodium intake showed significant correlations with the yearly increase in systolic blood pressure in multivariate regression analysis after adjustment for possible risk factors. Conclusions Both relatively high levels of dietary sodium intake and gradual increases in dietary sodium are associated with future increases in blood pressure and the incidence of hypertension in the Japanese general population. PMID:26224048
Takase, Hiroyuki; Sugiura, Tomonori; Kimura, Genjiro; Ohte, Nobuyuki; Dohi, Yasuaki
2015-07-29
Although there is a close relationship between dietary sodium and hypertension, the concept that persons with relatively high dietary sodium are at increased risk of developing hypertension compared with those with relatively low dietary sodium has not been studied intensively in a cohort. We conducted an observational study to investigate whether dietary sodium intake predicts future blood pressure and the onset of hypertension in the general population. Individual sodium intake was estimated by calculating 24-hour urinary sodium excretion from spot urine in 4523 normotensive participants who visited our hospital for a health checkup. After a baseline examination, they were followed for a median of 1143 days, with the end point being development of hypertension. During the follow-up period, hypertension developed in 1027 participants (22.7%). The risk of developing hypertension was higher in those with higher rather than lower sodium intake (hazard ratio 1.25, 95% CI 1.04 to 1.50). In multivariate Cox proportional hazards regression analysis, baseline sodium intake and the yearly change in sodium intake during the follow-up period (as continuous variables) correlated with the incidence of hypertension. Furthermore, both the yearly increase in sodium intake and baseline sodium intake showed significant correlations with the yearly increase in systolic blood pressure in multivariate regression analysis after adjustment for possible risk factors. Both relatively high levels of dietary sodium intake and gradual increases in dietary sodium are associated with future increases in blood pressure and the incidence of hypertension in the Japanese general population. © 2015 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley Blackwell.
Lim, Dong Hui; Shin, Dong Hoon; Han, Gyule; Chung, Eui Sang; Chung, Tae Young
2017-08-01
In the present study, the incidence and risk factors of lens-iris diaphragm retropulsion syndrome (LIDRS) were evaluated. Patients who underwent cataract surgery using phacoemulsification between June 2014 and December 2014 were included in the study. The preoperative ocular biometric and intraoperative surgical parameters were examined. The incidence of LIDRS and various risk factors were analyzed using an independent t-test, Pearson's chi-square test, and univariable and multivariable logistic regression analyses. Among 124 eyes of 124 patients, 100 (80.6%) had no LIDRS and 24 (19.4%) had LIDRS. LIDRS occurred in 13 of 31 vitrectomized eyes (41.9%) and 11 of 93 non-vitrectomized eyes (11.8%). Based on univariable analysis, age (odds ratio [OR], 0.920; p = 0.001), vitrectomized eye (OR, 5.038; p = 0.001), spherical equivalent (OR, 0.778; p < 0.001), axial length (OR, 1.716; p < 0.001), anterior chamber depth (OR, 3.328; p = 0.037), and 3.0 mm vs. 2.2 mm incision size (OR, 4.964; p = 0.001) were statistically significant risk factors associated with the development of LIDRS. Conditional multivariable logistic regression showed that vitrectomized eye (OR, 3.865; 95% confidence interval [CI], 1.201 to 12.436; p = 0.023), long axial length (OR, 1.709; 95% CI, 1.264 to 2.310; p = 0.001), and 3.0 vs. 2.2 mm incision size (OR, 3.571; 95% CI, 1.120 to 11.393; p = 0.031) were significant independent risk factors associated with LIDRS. LIDRS is a relatively common occurrence and was found to be associated with vitrectomized eye, long axial length, and larger incision size. Evaluating risk factors prior to cataract surgery can help reduce associated morbidity.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Active set support vector regression.
Musicant, David R; Feinberg, Alexander
2004-03-01
This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
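The role of the Sherman-Morrison-Woodbury formula can be shown on a generic system of the form (I/ν + H Hᵀ) x = b, where H has one row per data point and few columns. This is a minimal sketch of the identity only, not the ASVR algorithm; the symbols `H`, `nu`, and `b` and the function name `smw_solve` are assumptions chosen for illustration.

```python
import numpy as np

def smw_solve(H, nu, b):
    """Solve (I/nu + H @ H.T) x = b using only a k x k linear solve.

    Uses the identity
    (I/nu + H H^T)^{-1} = nu * (I - H (I/nu + H^T H)^{-1} H^T).
    """
    k = H.shape[1]
    small = np.eye(k) / nu + H.T @ H            # k x k instead of m x m
    return nu * (b - H @ np.linalg.solve(small, H.T @ b))

rng = np.random.default_rng(1)
m, k = 200, 5                                   # many points, few input dimensions
H = rng.normal(size=(m, k))
b = rng.normal(size=m)
nu = 2.0

x_fast = smw_solve(H, nu, b)
x_direct = np.linalg.solve(np.eye(m) / nu + H @ H.T, b)  # m x m reference solve
```

The direct solve scales with the number of points, while the rewritten one scales with the input dimension, which matches the abstract's remark that a much smaller matrix is inverted at each step.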
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
This thesis is the first comprehensive research work conducted on the Beirut based TV station, an important representative of the post-2011 generation of Arab satellite news media. The launch of al-Mayadeen in June 2012 was closely linked to the political developments across the Arab world ... members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media- and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...
AUTISTIC EPILEPTIFORM REGRESSION (A REVIEW)
Directory of Open Access Journals (Sweden)
L. Yu. Glukhova
2012-01-01
The author presents a review of the current scientific literature devoted to autistic epileptiform regression, a special form of autistic disorder characterized by the development of severe communicative disorders in children as a result of continuous prolonged epileptiform activity on EEG. This condition was described by R.F. Tuchman and I. Rapin in 1997. The author describes aspects of the pathogenesis, clinical picture, and diagnosis of this disorder, including the peculiar EEG anomalies (benign epileptiform patterns of childhood) with a high index of epileptiform activity, especially during sleep. Special attention is given to approaches to the treatment of autistic epileptiform regression. The efficacy of valproates, corticosteroid hormones, and antiepileptic drugs of other groups is considered.
Polynomial Regressions and Nonsense Inference
Directory of Open Access Journals (Sweden)
Daniel Ventosa-Santaulària
2013-11-01
Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, to mention just a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips’ results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340) by proving that an inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.
Spontaneous regression of colon cancer.
Kihara, Kyoichi; Fujita, Shin; Ohshiro, Taihei; Yamamoto, Seiichiro; Sekine, Shigeki
2015-01-01
A case of spontaneous regression of transverse colon cancer is reported. A 64-year-old man was diagnosed as having cancer of the transverse colon at a local hospital. Initial and second colonoscopy examinations revealed a typical cancer of the transverse colon, which was diagnosed as moderately differentiated adenocarcinoma. The patient underwent right hemicolectomy 6 weeks after the initial colonoscopy. The resected specimen showed only a scar at the tumor site, and no cancerous tissue was proven histologically. The patient is alive with no evidence of recurrence 1 year after surgery. Although an antitumor immune response is the most likely explanation, the exact nature of the phenomenon was unclear. We describe this rare case and review the literature pertaining to spontaneous regression of colorectal cancer. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Cruyff, M.; Böckenholt, U.; van der Heijden, P.G.M.; Frank, L.E.
2016-01-01
In survey research, it is often problematic to ask people sensitive questions because they may refuse to answer or they may provide a socially desirable answer that does not reveal their true status on the sensitive question. To solve this problem Warner (1965) proposed randomized response (RR).
Incidence of and Risk Factors for Free Bowel Perforation in Patients with Crohn's Disease.
Kim, Jong Wook; Lee, Ho-Su; Ye, Byong Duk; Yang, Suk-Kyun; Hwang, Sung Wook; Park, Sang Hyoung; Yang, Dong-Hoon; Kim, Kyung-Jo; Byeon, Jeong-Sik; Myung, Seung-Jae; Yoon, Yong Sik; Yu, Chang Sik; Kim, Jin-Ho
2017-06-01
Incidence of and risk factors for intestinal free perforation (FP) in patients with Crohn's disease (CD) are not established. To establish the rate of and risk factors for FP in a large cohort of CD patients. Medical records of CD patients who visited Asan Medical Center from June 1989 to December 2012 were reviewed. After matching the FP patients to controls (1:4) by gender, year, and age at CD diagnosis, and disease location, their clinical characteristics were compared using conditional logistic regression analysis. Among 2043 patients who were included in our study cohort, 44 patients (2.15%) developed FP over a median follow-up period of 79.8 months (interquartile range 37.3-124.6), with an incidence of 3.18 per 1000 person-years [95% confidence interval (CI) 2.37-4.28]. All 44 patients underwent emergency surgery, and eight patients underwent reoperation within 12 months (8/44, 18.2%). Multivariable-adjusted analysis revealed that anti-TNF therapy [odds ratio (OR), 3.73; 95% CI 1.19-11.69; p = 0.024] was associated with an increased risk of FP. The incidence of FP in a large cohort of Korean CD patients was 2.15%, which was similar to that in Western reports. Anti-TNF therapy may be a risk factor for FP.
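An incidence rate per 1000 person-years and an approximate confidence interval can be computed as sketched below. The abstract does not state how its interval was obtained or the total person-years of follow-up, so two assumptions are made here: the person-years are back-calculated from the reported 44 events and 3.18 per 1000 person-years, and the interval uses the common normal approximation on the log rate. Under those assumptions the result lands close to the reported 2.37 to 4.28.

```python
import math

def incidence_rate_ci(events, person_years, per=1000.0, z=1.96):
    """Incidence rate per `per` person-years with a log-scale Wald interval.

    The standard error of log(rate) is approximated by 1 / sqrt(events).
    Returns (rate, lower bound, upper bound).
    """
    rate = events / person_years * per
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor

# Assumption: person-years back-calculated from the reported figures
person_years = 44 / 3.18 * 1000.0
rate, lower, upper = incidence_rate_ci(44, person_years)
print(f"{rate:.2f} per 1000 person-years (95% CI {lower:.2f} to {upper:.2f})")
```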
Reproductive factors and incidence of endometrial cancer in U.S. black women.
Sponholtz, Todd R; Palmer, Julie R; Rosenberg, Lynn; Hatch, Elizabeth E; Adams-Campbell, Lucile L; Wise, Lauren A
2017-06-01
Previous studies have shown that reproductive history is a strong determinant of endometrial cancer risk among white women. Less is known about how reproductive history affects endometrial cancer risk among black women, whose incidence and mortality differ from white women. We investigated the associations of age at menarche, parity, timing of births, and menopausal age with endometrial cancer in the Black Women's Health Study, a prospective cohort study. Every 2 years from 1995 to 2013, 47,555 participants with intact uteri at baseline in 1995 completed questionnaires on reproductive and medical history, and lifestyle factors. Self-reported cases of endometrial cancer were confirmed by medical record, cancer registry, or death certificate when available. Cox proportional hazards regression was used to estimate multivariable incidence rate ratios (IRR) and 95% confidence intervals (CI). During 689,501 person-years of follow-up, we identified 300 incident cases of endometrial cancer. The strongest associations with endometrial cancer were found for early age at menarche (black women were generally consistent with those in studies of white women.
Incident Information Management Tool
Pejovic, Vladimir
2015-01-01
Flaws of current incident information management at CMS and CERN are discussed. A new data model for a future incident database is proposed and briefly described. A recently developed draft version of a GIS-based tool for incident tracking is presented.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh
2015-12-01
Binary phenotypes commonly arise due to multiple underlying quantitative precursors and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al. []), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with the genotype-level test MultiPhen's. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE
Risk factors for radiotherapy incidents: a single institutional experience.
Ishiyama, Hiromichi; Shuto, Nobuaki; Terazaki, Tsuyoshi; Noda, Shigetoshi; Ishigami, Minoru; Yogo, Katsunori; Hayakawa, Kazushige
2018-01-30
We aimed to analyze risk factors for incidents occurring during the practice of external beam radiotherapy (EBRT) at a single Japanese center. Treatment data for EBRT from June 2014 to March 2017 were collected. Data from incident reports submitted during this period were reviewed. Near-miss cases were not included. Risk factors for incidents, including patient characteristics and treatment-related factors, were explored using uni- and multivariate analyses. Factors contributing to each incident were also retrospectively categorized according to the recommendations of the American Association of Physicists in Medicine (AAPM). A total of 2887 patients were treated during the study period, and 26 incidents occurred (0.90% per patient). Previous history of radiotherapy and large fraction size were identified as risk factors for incidents by univariate analysis. Only previous history of radiotherapy was detected as a risk factor in multivariate analysis. Identified categories of contributing factors were human behavior (50.0%), communication (40.6%), and technical (9.4%). The incident rate of EBRT was 0.90% per patient in our institution. Previous history of radiotherapy and large fraction size were detected as risk factors for incidents. Human behavior and communication errors were identified as contributing factors for most incidents. Copyright © 2018 American Association of Medical Dosimetrists. Published by Elsevier Inc. All rights reserved.
McMillan, Matthew W; Lehnus, Kristina S
2018-01-01
To identify factors contributing to the development of anaesthetic safety incidents. Prospective, descriptive, voluntary reporting audit of safety incidents with subsequent systems analysis. All animals anaesthetized in a multispecies veterinary teaching hospital from November 2014 to October 2016. Peri-anaesthetic incidents that risked or caused unnecessary harm to an animal were reported by anaesthetists alongside animal morbidity and mortality data. A modified systems analysis framework was used to identify contributing factors from the following categories: Animal and Owner, Task and Technology, Individual, Team, Work Environmental, and Organizational and Management. The outcome was graded using a simple descriptive scale. Data were analysed using Pearson's Chi-Square test for association and univariable and multivariable logistic regression analysis. Totally, 3379 anaesthetics were performed during the audit period. Of these, 174 incident reports were analysed, 163 of which impacted safe veterinary care and 26 incidents were considered to have had major or catastrophic outcomes. Incident outcome was believed to have been limited by anaesthetist intervention in 104 (63.8%) cases. Various factors were identified as: Individual in 123 (70.7%), Team in 108 (62.1%), Organizational and Management in 94 (54.0%), Task and Technology in 80 (46.0%), Work Environmental in 53 (30.5%) and Animal and Owner in 36 (20.7%) incidents. Individual factors were rarely seen in isolation. Significant associations were identified between Experience and Supervision, X 2 (1, n=174)=54177, p=0.001, Failure to follow a standard operating procedure and Task Management, X 2 (2, n=174)=11318, p=0.001, and Staffing and Poor Scheduling, X 2 (1, n=174)=36742, p=0.001. Animal Condition [odds ratio (OR)=16210, 95% confidence interval (CI)=5573-47147)] and anaesthetist Decision Making (OR=3437, 95% CI=1184-9974) were risk factors for catastrophic and major outcomes. Individual factors contribute
Design of multivariable controllers for robot manipulators
Seraji, H.
1986-01-01
The paper presents a simple method for the design of linear multivariable controllers for multi-link robot manipulators. The control scheme consists of multivariable feedforward and feedback controllers. The feedforward controller is the minimal inverse of the linearized model of robot dynamics and contains only proportional-double-derivative (PD2) terms. This controller ensures that the manipulator joint angles track any reference trajectories. The feedback controller is of proportional-integral-derivative (PID) type and achieves pole placement. This controller reduces any initial tracking error to zero as desired and also ensures that robust steady-state tracking of step-plus-exponential trajectories is achieved by the joint angles. The two controllers are independent of each other and are designed separately based on the linearized robot model and then integrated in the overall control scheme. The proposed scheme is simple and can be implemented for real-time control of robot manipulators.
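The pole-placement role of the PID feedback controller can be illustrated on a much simpler system than the one in the paper. The sketch below assumes a single joint modelled as a unit inertia (theta'' = u), which is a simplification introduced here and not the paper's multi-link model. For that plant the closed-loop characteristic polynomial is s^3 + kd*s^2 + kp*s + ki, so matching it to (s + 2)^3 gives kd = 6, kp = 12, ki = 8. The function name and the chosen pole location are illustrative.

```python
def simulate_pid(kp, ki, kd, target=1.0, dt=0.001, steps=10000):
    """Simulate PID control of a unit-inertia joint (theta'' = u)."""
    theta, omega = 0.0, 0.0       # joint angle and angular velocity
    integral = 0.0
    prev_error = target - theta   # avoids a derivative spike on the first step
    for _ in range(steps):
        error = target - theta
        integral += error * dt
        derivative = (error - prev_error) / dt
        u = kp * error + ki * integral + kd * derivative
        omega += u * dt           # semi-implicit Euler integration of the plant
        theta += omega * dt
        prev_error = error
    return theta

# Gains obtained by matching s^3 + kd*s^2 + kp*s + ki to (s + 2)^3.
final_angle = simulate_pid(kp=12.0, ki=8.0, kd=6.0)
print(f"final angle after 10 s: {final_angle:.4f}")
```

With all three poles at s = -2 the initial tracking error decays to zero, which is the behaviour the abstract attributes to the feedback controller. The feedforward PD2 term is not reproduced here.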
Advancing emotion theory with multivariate pattern classification.
Kragel, Philip A; LaBar, Kevin S
2014-04-01
Characterizing how activity in the central and autonomic nervous systems corresponds to distinct emotional states is one of the central goals of affective neuroscience. Despite the ease with which individuals label their own experiences, identifying specific autonomic and neural markers of emotions remains a challenge. Here we explore how multivariate pattern classification approaches offer an advantageous framework for identifying emotion specific biomarkers and for testing predictions of theoretical models of emotion. Based on initial studies using multivariate pattern classification, we suggest that central and autonomic nervous system activity can be reliably decoded into distinct emotional states. Finally, we consider future directions in applying pattern classification to understand the nature of emotion in the nervous system.
Directional outlyingness for multivariate functional data
Dai, Wenlin
2018-04-07
The direction of outlyingness is crucial to describing the centrality of multivariate functional data. Motivated by this idea, classical depth is generalized to directional outlyingness for functional data. Theoretical properties of functional directional outlyingness are investigated and the total outlyingness can be naturally decomposed into two parts: magnitude outlyingness and shape outlyingness which represent the centrality of a curve for magnitude and shape, respectively. This decomposition serves as a visualization tool for the centrality of curves. Furthermore, an outlier detection procedure is proposed based on functional directional outlyingness. This criterion applies to both univariate and multivariate curves and simulation studies show that it outperforms competing methods. Weather and electrocardiogram data demonstrate the practical application of our proposed framework.
Multivariate Approaches to Classification in Extragalactic Astronomy
Directory of Open Access Journals (Sweden)
Didier Fraix-Burnet
2015-08-01
Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.
Multivariate max-stable spatial processes
Genton, Marc G.
2015-02-11
Max-stable processes allow the spatial dependence of extremes to be modelled and quantified, so they are widely adopted in applications. For a better understanding of extremes, it may be useful to study several variables simultaneously. To this end, we study the maxima of independent replicates of multivariate processes, both in the Gaussian and Student-t cases. We define a Poisson process construction and introduce multivariate versions of the Smith Gaussian extreme-value, the Schlather extremal-Gaussian and extremal-t, and the Brown–Resnick models. We develop inference for the models based on composite likelihoods. We present results of Monte Carlo simulations and an application to daily maximum wind speed and wind gust.
Power Estimation in Multivariate Analysis of Variance
Directory of Open Access Journals (Sweden)
Jean François Allaire
2007-09-01
Power is often overlooked in designing multivariate studies for the simple reason that it is believed to be too complicated. In this paper, it is shown that power estimation in multivariate analysis of variance (MANOVA) can be approximated using an F distribution for the three popular statistics (Hotelling-Lawley trace, Pillai-Bartlett trace, Wilks' likelihood ratio). Consequently, the same procedure as in any statistical test can be used: computation of the critical F value, computation of the noncentrality parameter (as a function of the effect size), and finally estimation of power using a noncentral F distribution. Various numerical examples are provided to help readers understand and apply the method. Problems related to post hoc power estimation are discussed.
Multivariate approaches to classification in extragalactic astronomy
International Nuclear Information System (INIS)
Fraix-Burnet, Didier; Thuillard, Marc; Chattopadhyay, Asis K.
2015-01-01
Clustering objects into synthetic groups is a natural activity of any science. Astrophysics is not an exception and is now facing a deluge of data. For galaxies, the one-century old Hubble classification and the Hubble tuning fork are still largely in use, together with numerous mono- or bivariate classifications most often made by eye. However, a classification must be driven by the data, and sophisticated multivariate statistical tools are used more and more often. In this paper we review these different approaches in order to situate them in the general context of unsupervised and supervised learning. We insist on the astrophysical outcomes of these studies to show that multivariate analyses provide an obvious path toward a renewal of our classification of galaxies and are invaluable tools to investigate the physics and evolution of galaxies.
Climate variability, weather and enteric disease incidence in New Zealand: time series analysis.
Directory of Open Access Journals (Sweden)
Aparna Lal
BACKGROUND: Evaluating the influence of climate variability on enteric disease incidence may improve our ability to predict how climate change may affect these diseases. OBJECTIVES: To examine the associations between regional climate variability and enteric disease incidence in New Zealand. METHODS: Associations between monthly climate and enteric diseases (campylobacteriosis, salmonellosis, cryptosporidiosis, giardiasis) were investigated using Seasonal Auto Regressive Integrated Moving Average (SARIMA) models. RESULTS: No climatic factors were significantly associated with campylobacteriosis and giardiasis, with similar predictive power for univariate and multivariate models. Cryptosporidiosis was positively associated with average temperature of the previous month (β = 0.130, SE = 0.060, p < 0.01) and inversely related to the Southern Oscillation Index (SOI) two months previously (β = -0.008, SE = 0.004, p < 0.05). By contrast, salmonellosis was positively associated with temperature (β = 0.110, SE = 0.020, p < 0.001) of the current month and SOI of the current (β = 0.005, SE = 0.002, p < 0.050) and previous month (β = 0.005, SE = 0.002, p < 0.05). Forecasting accuracy of the multivariate models for cryptosporidiosis and salmonellosis was significantly higher. CONCLUSIONS: Although spatial heterogeneity in the observed patterns could not be assessed, these results suggest that temporally lagged relationships between climate variables and national communicable disease incidence data can contribute to disease prediction models and early warning systems.
International Nuclear Information System (INIS)
Bishop, Andrew J.; McDonald, Mark W.; Chang, Andrew L.; Esiashvili, Natia
2012-01-01
Purpose: To evaluate the incidence of infant brain tumors and survival outcomes by disease and treatment variables. Methods and Materials: The Surveillance, Epidemiology, and End Results (SEER) Program November 2008 submission database provided age-adjusted incidence rates and individual case information for primary brain tumors diagnosed between 1973 and 2006 in infants less than 12 months of age. Results: Between 1973 and 1986, the incidence of infant brain tumors increased from 16 to 40 cases per million (CPM), and from 1986 to 2006, the annual incidence rate averaged 35 CPM. Leading histologies by annual incidence in CPM were gliomas (13.8), medulloblastoma and primitive neuroectodermal tumors (6.6), and ependymomas (3.6). The annual incidence was higher in whites than in blacks (35.0 vs. 21.3 CPM). Infants with low-grade gliomas had the highest observed survival, and those with atypical teratoid rhabdoid tumors (ATRTs) or primary rhabdoid tumors of the brain had the lowest. Between 1979 and 1993, the annual rate of cases treated with radiation within the first 4 months from diagnosis declined from 20.5 CPM to <2 CPM. For infants with medulloblastoma, desmoplastic histology and treatment with both surgery and upfront radiation were associated with improved survival, but on multivariate regression, only combined surgery and radiation remained associated with improved survival, with a hazard ratio for death of 0.17 compared with surgery alone (p = 0.005). For ATRTs, those treated with surgery and upfront radiation had a 12-month survival of 100% compared with 24.4% for those treated with surgery alone (p = 0.016). For ependymomas survival was higher in patients treated in more recent decades (p = 0.001). Conclusion: The incidence of infant brain tumors has been stable since 1986. Survival outcomes varied markedly by histology. For infants with medulloblastoma and ATRTs, improved survival was observed in patients treated with both surgery and early radiation
Cardiorespiratory fitness, fatness and incident diabetes
DEFF Research Database (Denmark)
Holtermann, Andreas; Gyntelberg, Finn; Bauman, Adrian
2017-01-01
Aims Increases in prevalence have led to a diabetes pandemic. Obesity and low cardiorespiratory fitness (CRF) are considered to be central mechanisms. We investigated if the effect of CRF on diabetes risk was equivalent across levels of fatness among healthy men. Methods In total 4988 middle...... with diabetes incidence were estimated in multivariable Cox-models including conventional risk factors and social class. Diabetes incidence was assessed through a national register. Results During 44 years of follow-up, 518 (10.4%) incident cases of diabetes occurred. In the multi-adjusted model, the obese had......: 0.76–1.23). Conclusion High CRF has a stronger protective effect on diabetes among obese than among normal weight men, supporting the recommendation of fitness-enhancing physical activity for preventing diabetes among the obese.
On Multivariate Methods in Robust Econometrics
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2012-01-01
Roč. 21, č. 1 (2012), s. 69-82 ISSN 1210-0455 R&D Projects: GA MŠk(CZ) 1M06014 Institutional research plan: CEZ:AV0Z10300504 Keywords : least weighted squares * heteroscedasticity * multivariate statistics * model selection * diagnostics * computational aspects Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.561, year: 2012 http://www.vse.cz/pep/abstrakt.php?IDcl=411
The evolution of multivariate maternal effects.
Directory of Open Access Journals (Sweden)
Bram Kuijper
2014-04-01
There is a growing interest in predicting the social and ecological contexts that favor the evolution of maternal effects. Most predictions focus, however, on maternal effects that affect only a single character, whereas the evolution of maternal effects is poorly understood in the presence of suites of interacting traits. To overcome this, we simulate the evolution of multivariate maternal effects (captured by the matrix M) in a fluctuating environment. We find that the rate of environmental fluctuations has a substantial effect on the properties of M: in slowly changing environments, offspring are selected to have a multivariate phenotype roughly similar to the maternal phenotype, so that M is characterized by positive dominant eigenvalues; by contrast, rapidly changing environments favor Ms with dominant eigenvalues that are negative, as offspring favor a phenotype which substantially differs from the maternal phenotype. Moreover, when fluctuating selection on one maternal character is temporally delayed relative to selection on other traits, we find a striking pattern of cross-trait maternal effects in which maternal characters influence not only the same character in offspring, but also other offspring characters. Additionally, when selection on one character contains more stochastic noise relative to selection on other traits, large cross-trait maternal effects evolve from those maternal traits that experience the smallest amounts of noise. The presence of these cross-trait maternal effects shows that individual maternal effects cannot be studied in isolation, and that their study in a multivariate context may provide important insights about the nature of past selection. Our results call for more studies that measure multivariate maternal effects in wild populations.
Multivariable PID Controller For Robotic Manipulator
Seraji, Homayoun; Tarokh, Mahmoud
1990-01-01
Gains updated during operation to cope with changes in characteristics and loads. Conceptual multivariable controller for robotic manipulator includes proportional/derivative (PD) controller in inner feedback loop, and proportional/integral/derivative (PID) controller in outer feedback loop. PD controller places poles of transfer function (in Laplace-transform space) of control system for linearized mathematical model of dynamics of robot. PID controller tracks trajectory and decouples input and output.
Multivariate statistical assessment of coal properties
Czech Academy of Sciences Publication Activity Database
Klika, Z.; Serenčíšová, J.; Kožušníková, Alena; Kolomazník, I.; Študentová, S.; Vontorová, J.
2014-01-01
Roč. 128, č. 128 (2014), s. 119-127 ISSN 0378-3820 R&D Projects: GA MŠk ED2.1.00/03.0082 Institutional support: RVO:68145535 Keywords : coal properties * structural,chemical and petrographical properties * multivariate statistics Subject RIV: DH - Mining, incl. Coal Mining Impact factor: 3.352, year: 2014 http://dx.doi.org/10.1016/j.fuproc.2014.06.029
Modeling Covariance Breakdowns in Multivariate GARCH
Jin, Xin; Maheu, John M
2014-01-01
This paper proposes a flexible way of modeling dynamic heterogeneous covariance breakdowns in multivariate GARCH (MGARCH) models. During periods of normal market activity, volatility dynamics are governed by an MGARCH specification. A covariance breakdown is any significant temporary deviation of the conditional covariance matrix from its implied MGARCH dynamics. This is captured through a flexible stochastic component that allows for changes in the conditional variances, covariances and impl...
Determinants of LSIL Regression in Women from a Colombian Cohort
International Nuclear Information System (INIS)
Molano, Monica; Gonzalez, Mauricio; Gamboa, Oscar; Ortiz, Natasha; Luna, Joaquin; Hernandez, Gustavo; Posso, Hector; Murillo, Raul; Munoz, Nubia
2010-01-01
Objective: To analyze the role of Human Papillomavirus (HPV) and other risk factors in the regression of cervical lesions in women from the Bogota Cohort. Methods: 200 HPV positive women with abnormal cytology were included for regression analysis. The time of lesion regression was modeled using methods for interval censored survival time data. Median duration of total follow-up was 9 years. Results: 80 (40%) women were diagnosed with Atypical Squamous Cells of Undetermined Significance (ASCUS) or Atypical Glandular Cells of Undetermined Significance (AGUS) while 120 (60%) were diagnosed with Low Grade Squamous Intra-epithelial Lesions (LSIL). Globally, 40% of the lesions were still present at the first year of follow-up, while 1.5% were still present at the 5-year check-up. The multivariate model showed similar regression rates for lesions in women with ASCUS/AGUS and women with LSIL (HR = 0.82, 95% CI 0.59-1.12). Women infected with HR HPV types and those with mixed infections had lower regression rates for lesions than did women infected with LR types (HR = 0.526, 95% CI 0.33-0.84, for HR types and HR = 0.378, 95% CI 0.20-0.69, for mixed infections). Furthermore, women over 30 years had a higher lesion regression rate than did women under 30 years (HR = 1.53, 95% CI 1.03-2.27). The study showed that the median time for lesion regression was 9 months while the median time for HPV clearance was 12 months. Conclusions: In the studied population, the type of infection and the age of the women are critical factors for the regression of cervical lesions.
Control of wastewater using multivariate control chart
Nugraha, Jaka; Fatimah, Is; Prabowo, Rino Galang
2017-03-01
Wastewater treatment is a crucial process in industry because untreated or improperly treated wastewater may lead to problems affecting other parts of the environment. For many kinds of wastewater treatment, Biological Oxygen Demand (BOD), Chemical Oxygen Demand (COD), and Total Suspended Solids (TSS) are the usual parameters controlled as a standard. This paper reports the application of the multivariate individual Hotelling T2 control chart to wastewater treatment. The data were wastewater treatment records from PT. ICBP, East Java branch, and compliance with quality standards was assessed against East Java Governor Regulation No. 72 Year 2013 on Standards of Quality of Waste Water Industry and/or Other Business Activities. COD and TSS were each correlated with BOD, with correlation coefficients higher than 50%, and the influence of COD and TSS on BOD values was 82% and 1.9%, respectively. The multivariate individual Hotelling T2 control chart showed that BOD-COD and BOD-TSS each had one subgroup outside the control limits. Thus the process is not in multivariate control, although univariately BOD, COD, and TSS are each within the specified quality standards.
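How an individual Hotelling T2 chart flags an observation that looks acceptable on each variable separately can be shown with a small two-variable sketch. Everything numeric below is hypothetical: the in-control mean and covariance are standardized placeholder values, not the PT. ICBP data, and the control limit uses a chi-square approximation with known parameters (for two variables the upper quantile is exactly -2*ln(alpha)), which is an assumption rather than the limit used in the paper.

```python
import math

def hotelling_t2(x, mean, cov):
    """T^2 = (x - mean)' S^{-1} (x - mean) for a two-variable observation."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    # Closed-form 2x2 inverse applied to the deviation vector
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det

# Hypothetical in-control parameters for two standardized variables
# (for example BOD and COD), positively correlated.
mean = (0.0, 0.0)
cov = ((1.0, 0.5), (0.5, 1.0))

# Chi-square(2 df) upper limit at false-alarm rate alpha (assumed limit).
alpha = 0.0027
ucl = -2.0 * math.log(alpha)

for point in [(1.0, 1.0), (3.0, -3.0)]:
    t2 = hotelling_t2(point, mean, cov)
    status = "out of control" if t2 > ucl else "in control"
    print(point, round(t2, 2), status)
```

The second point moves against the assumed positive correlation, so its T2 value is large even though each coordinate is only three standard deviations from the mean. This is the kind of signal a set of univariate charts can miss.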
Directory of Open Access Journals (Sweden)
Villanueva-Martínez Manuel
2012-03-01
Abstract Background To analyze changes in incidence and outcomes of patients undergoing revision total hip arthroplasty (RTHA) over an 8-year study period in Spain. Methods We selected all surgical admissions in individuals aged ≥ 40 years who underwent RTHA (ICD-9-CM procedure code 81.53) between 2001 and 2008 from the Spanish National Hospital Discharge Database. Age- and sex-specific incidence rates, Charlson co-morbidity index, length of stay (LOS), costs and in-hospital mortality (IHM) were estimated for each year. Multivariate analyses were conducted to assess time trends. Results 32,280 discharges of patients (13,391 men/18,889 women) having undergone RTHA were identified. Overall crude incidence showed a small but significant increase from 20.2 to 21.8 RTHA per 100,000 inhabitants from 2001 to 2008 (p The incidence increased for men (17.7 to 19.8 in 2008) but did not vary for women (22.3 in 2001 and 22.2 in 2008). Greater increments were observed in patients older than 84 years and in the age group 75-84. In 2001, 19% of RTHA patients had a Charlson Index ≥ 1 and this proportion rose to 24.6% in 2008 (p The crude overall in-hospital mortality (IHM) increased from 1.16% in 2001 to 1.77% (p = 0.025) in 2008. For both sexes the risk of death was higher with age, with the highest mortality rates found among those aged 85 or over. After multivariate analysis no change was observed in IHM over time. The mean inflation adjusted cost per patient increased by 78.3%, from 9,375 to 16,715 Euros from 2001 to 2008. After controlling for possible confounders using Poisson regression models, we observed that the incidence of RTHA hospitalizations significantly increased for men and women over the period 2001 to 2008 (IRR 1.10, 95% CI 1.03-1.18 and 1.08, 95% CI 1.02-1.14 respectively). Conclusions The crude incidence of RTHA in Spain showed a small but significant increase from 2001 to 2008 with concomitant reductions in LOS, significant increase in co
Lange, Elizabeth M S; Segal, Scott; Pancaro, Carlo; Wong, Cynthia A; Grobman, William A; Russell, Gregory B; Toledo, Paloma
2017-12-01
Intrapartum maternal fever is associated with several adverse neonatal outcomes. Intrapartum fever can be infectious or inflammatory in etiology. Increases in interleukin 6 and other inflammatory markers are associated with maternal fever. Magnesium has been shown to attenuate interleukin 6-mediated fever in animal models. We hypothesized that parturients exposed to intrapartum magnesium would have a lower incidence of fever than nonexposed parturients. In this study, electronic medical record data from all deliveries at Northwestern Memorial Hospital (Chicago, Illinois) between 2007 and 2014 were evaluated. The primary outcome was intrapartum fever (temperature at or higher than 38.0°C). Factors associated with the development of maternal fever were evaluated using a multivariable logistic regression model. Propensity score matching was used to reduce potential bias from nonrandom selection of magnesium administration. Of the 58,541 women who met inclusion criteria, 5,924 (10.1%) developed intrapartum fever. Febrile parturients were more likely to be nulliparous, have used neuraxial analgesia, and have been delivered via cesarean section. The incidence of fever was lower in women exposed to magnesium (6.0%) than those who were not (10.2%). In multivariable logistic regression, women exposed to magnesium were less likely to develop a fever (adjusted odds ratio = 0.42 [95% CI, 0.31 to 0.58]). After propensity matching (N = 959 per group), the odds ratio of developing fever was lower in women who received magnesium therapy (odds ratio = 0.68 [95% CI, 0.48 to 0.98]). Magnesium may play a protective role against the development of intrapartum fever. Future work should further explore the association between magnesium dosing and the incidence of maternal fever.
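The gap between the raw incidences and the adjusted estimate is easier to see once the crude odds ratio is written out. The sketch below converts the two reported fever incidences (6.0% with magnesium, 10.2% without) into an unadjusted odds ratio; the function name is illustrative, and the calculation uses only the rounded percentages from the abstract, so it is an approximation of what the raw counts would give.

```python
def odds_ratio(p_exposed, p_unexposed):
    """Unadjusted odds ratio from two event proportions."""
    odds_exposed = p_exposed / (1 - p_exposed)
    odds_unexposed = p_unexposed / (1 - p_unexposed)
    return odds_exposed / odds_unexposed

# Reported fever incidence: 6.0% with magnesium, 10.2% without.
crude_or = odds_ratio(0.060, 0.102)
print(f"crude OR: {crude_or:.2f}")
```

The crude value of about 0.56 sits between the multivariable-adjusted estimate (0.42) and the propensity-matched estimate (0.68), which shows how much the choice of adjustment method moves the estimate.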
Liu, Yuxiu; Gao, Yufang; Wei, Lili; Chen, Weifen; Ma, Xiaoyan; Song, Lei
2015-01-01
Background Peripherally inserted central catheters (PICCs) are widely used in chemotherapy, but the reported PICC thrombosis incidence varies greatly, and risks of PICC thrombosis are not well defined. This study was to investigate the incidence and risk factors of PICC-related upper extremity vein thrombosis in cancer patients. Methods This was a prospective study conducted in two tertiary referral hospitals from May 2010 to February 2013. Cancer patients who were subject to PICC placement were enrolled and checked by Doppler ultrasound weekly for at least 1 month. Univariable and multivariable logistic regression analyses were applied for identification of risk factors. Results Three hundred and eleven cancer patients were enrolled in the study. One hundred and sixty (51.4%) developed PICC thrombosis, of which 87 (54.4%) cases were symptomatic. The mean time interval from PICC insertion to thrombosis onset was 11.04±5.538 days. The univariable logistic regression analysis showed that complications (odds ratio [OR] 1.686, P=0.032), less activity (OR 1.476, P=0.006), obesity (OR 3.148, P=0.000), and chemotherapy history (OR 3.405, P=0.030) were associated with PICC thrombosis. Multivariate analysis showed that less activity (OR 9.583, P=0.000) and obesity (OR 3.466, P=0.014) were significantly associated with PICC thrombosis. Conclusions The incidence of PICC thrombosis is relatively high, and nearly half are asymptomatic. Less activity and obesity are risk factors of PICC-related thrombosis. PMID:25673995
Regression analysis of censored data using pseudo-observations
DEFF Research Database (Denmark)
Parner, Erik T.; Andersen, Per Kragh
2010-01-01
We draw upon a series of articles in which a method based on pseudovalues is proposed for direct regression modeling of the survival function, the restricted mean, and the cumulative incidence function in competing risks with right-censored data. The models, once the pseudovalues have been...... computed, can be fit using standard generalized estimating equation software. Here we present Stata procedures for computing these pseudo-observations. An example from a bone marrow transplantation study is used to illustrate the method.
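The pseudo-observation idea can be sketched in a few lines for the survival function at a single time point. The code below is a plain Python illustration, not the Stata procedures the article presents: it computes the Kaplan-Meier estimate and then the jackknife pseudo-observation for each subject, n*S(t) - (n-1)*S_without_i(t). The function names and the small data set are hypothetical.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of S(t); events[i] is 1 for an event, 0 if censored."""
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei and ti <= t})
    for u in event_times:
        at_risk = sum(1 for ti in times if ti >= u)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == u and ei)
        surv *= 1 - deaths / at_risk
    return surv

def pseudo_observations(times, events, t):
    """Jackknife pseudo-observations for S(t), one per subject."""
    n = len(times)
    full = km_survival(times, events, t)
    pseudo = []
    for i in range(n):
        # Leave subject i out and re-estimate S(t)
        loo = km_survival(times[:i] + times[i + 1:], events[:i] + events[i + 1:], t)
        pseudo.append(n * full - (n - 1) * loo)
    return pseudo

# Hypothetical follow-up times (months) with some censored subjects.
times = [2, 3, 5, 7, 8, 11]
events = [1, 0, 1, 1, 0, 1]
print(pseudo_observations(times, events, t=6))
```

Without censoring, each pseudo-observation reduces to the indicator that the subject survived past t, which is why these values can stand in for the incompletely observed outcome in a standard generalized estimating equation fit.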
Incidence and predictors of difficult mask ventilation and intubation.
Shah, Prerana N; Sundaram, Vimal
2012-10-01
This study aimed to determine the incidence and predictors of difficult and impossible mask ventilation. Information such as age, snoring history, obstructive sleep apnea, dental and mandibular abnormalities, macroglossia, gradings such as SLUX, Mallampati, Cormack-Lehane, atlantooccipital extension, presence of beard or moustache, and mouth opening was collected. During mask ventilation, information related to ventilation and intubation was collected. All variables found to be significant in univariate analysis were entered into a multivariate logistic regression model to identify independent predictors of the measured outcome. Difficult mask ventilation (DMV) was observed in 30 male patients and 9 female patients. Of the 40 patients who had difficult intubation (DI), 7 patients had both DMV and DI, and 1 patient had impossible mask ventilation/intubation. Snoring was the lone significant risk factor for DMV. The risk factors identified for DI were snoring, retrognathia, micrognathia, macroglossia, short thick neck, Mallampati grade [III/IV], abnormal SLUX grade, Cormack-Lehane grade [II,III/IV], abnormal atlantooccipital extension grading, flexion/extension deformity of neck, protuberant teeth, cervical spine abnormality, mouth opening 26 kg/m(2). BMI > 26 kg/m(2) and atlantooccipital extension grade > 3 were independent risk factors for DI, and the presence of two of the variables gave a sensitivity and specificity of 43% and 99%, respectively, with a positive predictive value of 74%. The predictive score may lead to better anticipation of difficult airway management, potentially decreasing the morbidity and mortality resulting from hypoxia or anoxia with failed ventilation.
Sleep Duration as a Risk Factor for Diabetes Incidence in a Large US Sample
Gangwisch, James E.; Heymsfield, Steven B.; Boden-Albala, Bernadette; Buijs, Ruud M.; Kreier, Felix; Pickering, Thomas G.; Rundle, Andrew G.; Zammit, Gary K.; Malaspina, Dolores
2007-01-01
Study Objectives: To explore the relationship between sleep duration and diabetes incidence over an 8- to 10-year follow-up period in data from the First National Health and Nutrition Examination Survey (NHANES I). We hypothesized that prolonged short sleep duration is associated with diabetes and that obesity and hypertension act as partial mediators of this relationship. The increased load on the pancreas from insulin resistance induced by chronically short sleep durations can, over time, compromise β-cell function and lead to type 2 diabetes. No plausible mechanism has been identified by which long sleep duration could lead to diabetes. Design: Multivariate longitudinal analyses of the NHANES I using logistic regression models. Setting: Probability sample (n = 8992) of the noninstitutionalized population of the United States between 1982 and 1992. Participants: Subjects between the ages of 32 and 86 years. Measurements and Results: Between 1982 and 1992, 4.8% of the sample (n = 430) were determined by physician diagnosis, hospital record, or cause of death to be incident cases of diabetes. Subjects with sleep durations of 5 or fewer hours (odds ratio = 1.47, 95% confidence interval 1.03–2.09) and subjects with sleep durations of 9 or more hours (odds ratio = 1.52, 95% confidence interval 1.06–2.18) were significantly more likely to have incident diabetes over the follow-up period after controlling for covariates. Conclusions: Short sleep duration could be a significant risk factor for diabetes. The association between long sleep duration and diabetes incidence is more likely to be due to some unmeasured confounder such as poor sleep quality. Citation: Gangwisch JE; Heymsfield SB; Boden-Albala B; Buijs RM; Kreier F; Pickering TG; Rundle AG; Zammit GK; Malaspina D. Sleep duration as a risk factor for diabetes incidence in a large US sample. SLEEP 2007;30(12):1667-1673. PMID:18246976
High HIV incidence in a cohort of male injection drug users in Delhi, India.
Sarna, Avina; Saraswati, Lopamudra Ray; Sebastian, Mary; Sharma, Vartika; Madan, Ira; Lewis, Dean; Pulerwitz, Julie; Thior, Ibou; Tun, Waimar
2014-06-01
India has an estimated 177,000 injection drug users (IDU) with a national HIV prevalence of 7.14%. Reliable estimates of HIV incidence are not available for this population. We report HIV incidence in a cohort of male, HIV-negative IDUs recruited through peer referral, targeted outreach, and as walk-in clients in Delhi from May to October 2011. Fourth-generation antigen-antibody tests were used to diagnose new infections, and results were confirmed using Western blot tests. HIV incidence based on HIV seroconversion was calculated as number of events/person-years. Cox regression was used to identify significant (p<0.05) seroconversion predictors. A total of 2790 male HIV-negative IDUs were recruited at baseline; 67.4% (n=1880) returned for their first follow-up visit and 96% (n=1806) underwent HIV testing. Participants were followed for a median of 9.7 months. A total of 112 new HIV infections occurred over a cumulative 1398.5 person-years of follow-up, resulting in an incidence rate of 8.01 new infections/100 person-years (95% CI: 6.65-9.64); 74% of these participants reported risky injection practices in the past month. In multivariate analysis, moderate-high risk injection behaviors (Adjusted Hazard Ratio [AHR] 2.59; 95% CI 1.45-4.62) were associated with a higher risk of new infections. Male IDUs in Delhi continue to engage in unsafe injection practices, leading to high sero-incidence despite the availability of HIV prevention services offered through targeted intervention programs. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
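The incidence rate reported above follows directly from the stated counts (112 events over 1398.5 person-years). The short Python sketch below recomputes it. The abstract does not say how the confidence interval was derived; the sketch assumes a Wald interval on the log rate with standard error 1/sqrt(events), which is one common choice and is not confirmed by the source. The function name `incidence_rate_ci` is hypothetical.

```python
import math


def incidence_rate_ci(events, person_years, per=100.0, z=1.959964):
    """Incidence rate per `per` person-years with a log-scale Wald CI.

    Assumption: the event count is Poisson, so SE(log rate) = 1 / sqrt(events).
    """
    rate = events / person_years * per
    factor = math.exp(z / math.sqrt(events))
    return rate, rate / factor, rate * factor


rate, lower, upper = incidence_rate_ci(112, 1398.5)
print(f"{rate:.2f} per 100 person-years (95% CI {lower:.2f}-{upper:.2f})")
# prints "8.01 per 100 person-years (95% CI 6.65-9.64)"
```

Under this assumption the interval agrees with the published one to two decimals, although an exact Poisson interval would give slightly different limits.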
A prospective study of the incidence of asymptomatic pulp necrosis following crown preparation.
Kontakiotis, E G; Filippatos, C G; Stefopoulos, S; Tzanetakis, G N
2015-06-01
To determine the incidence of asymptomatic pulp necrosis following crown preparation as well as the positive predictive value of the electric pulp testing. A total of 120 teeth with healthy pulps scheduled to receive fixed crowns (experimental teeth) were included. Teeth were divided into two groups according to the preoperative crown condition (intact teeth and teeth with preoperative caries, restorations or crowns) and into four groups according to tooth type (maxillary anterior teeth, maxillary posterior teeth, mandibular anterior teeth and mandibular posterior teeth). Experimental and control teeth were submitted to electric pulp testing on three different occasions: before treatment commencement (stage 0), at the impression making session (stage 1) and just before the final cementation of the crown (stage 2). Teeth that were considered to contain necrotic pulps were submitted to root canal treatment. Upon access, absence of bleeding was considered as a confirmation of pulp necrosis. Data were analysed using bivariate (chi-square) and multivariate analysis (logistic regression). All reported probability values (P-values) were based on two-sided tests and compared to a significance level of 5%. The overall incidence of pulp necrosis was 9%. Intact teeth had a significantly lower incidence of pulp necrosis (5%) compared with preoperatively structurally compromised teeth (13%) [(OR: 9.113, P = 0.035)]. No significant differences were found amongst the four groups with regard to tooth type (P = 0.923). The positive predictive value of the electric pulp testing was 1.00. The incidence of asymptomatic pulp necrosis of teeth following crown preparation is noteworthy. The presence of preoperative caries, restorations or crowns of experimental teeth correlated with a significantly higher incidence of pulp necrosis. Electric pulp testing remains a useful diagnostic instrument for determining the pulp condition. © 2014 International Endodontic Journal. Published by
Proton pump inhibitors increase the incidence of bone fractures in hepatitis C patients.
Mello, Michael; Weideman, Rick A; Little, Bertis B; Weideman, Mark W; Cryer, Byron; Brown, Geri R
2012-09-01
While proton pump inhibitors (PPI) may increase the risk of bone fractures, the incidence of new bone fractures in a chronic hepatitis C virus (HCV) infected cohort, with or without PPI exposure, has not been explored. A retrospective cohort study of the incidence of bone fractures over 10 years in 9,437 HCV antibody positive patients in the Dallas VA Hepatitis C Registry was performed. The study endpoint was the incidence of verified new bone fractures per patient-years (pt-yrs) in PPI users compared to non-PPI users. PPI use was defined as those taking a PPI for ≥360 days. Pt-yrs of exposure for PPI users began on the first PPI prescription date, and pt-yrs of exposure for non-PPI users began with first date of any non-PPI prescription. For both HCV groups, the final date of patients' study duration was defined by end of PPI exposure, bone fracture occurrence, death or end of study evaluation period. Exclusion criteria included use of bone health modifying medications ≥30 days. Statistical differences in fracture incidence between groups were determined by multivariate regression analysis. Among the total study population analyzed (n = 2,573), 109 bone fractures occurred. Unadjusted bone fracture incidences were 13.99/1,000 pt-yrs vs. 5.86/1,000 pt-yrs in PPI and non-PPI users, respectively. The adjusted hazard ratio for new bone fractures was 3.87 (95 % CI 2.46-6.08) (p 1 year increased the risk of new bone fractures by more than threefold.
The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Quadrotor system identification using the multivariate multiplex b-spline
Visser, T.; De Visser, C.C.; Van Kampen, E.J.
2015-01-01
A novel method for aircraft system identification is presented that is based on a new multivariate spline type: the multivariate multiplex B-spline. The multivariate multiplex B-spline is a generalization of the recently introduced tensor-simplex B-spline. Multivariate multiplex splines obtain