WorldWideScience

Sample records for survey logistic regression

  1. Logistic regression.

    Science.gov (United States)

    Nick, Todd G; Campbell, Kathleen M

    2007-01-01

    The Medical Subject Headings (MeSH) thesaurus used by the National Library of Medicine defines logistic regression models as "statistical models which describe the relationship between a qualitative dependent variable (that is, one which can take only certain discrete values, such as the presence or absence of a disease) and an independent variable." Logistic regression models are used to study the effects of predictor variables on categorical outcomes. Normally the outcome is binary, such as presence or absence of disease (e.g., non-Hodgkin's lymphoma), in which case the model is called a binary logistic model. When there are multiple predictors (e.g., risk factors and treatments), the model is referred to as a multiple or multivariable logistic regression model, and it is one of the most frequently used statistical models in medical journals. In this chapter, we examine both simple and multiple binary logistic regression models and present related issues, including interaction, categorical predictor variables, continuous predictor variables, and goodness of fit.
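
    As a minimal sketch of the binary logistic model described in this record: the linear predictor is mapped to a probability through the logistic function, and exponentiating a coefficient gives an odds ratio. The coefficient values, the two predictors, and the function names below are illustrative assumptions, not taken from the chapter.

```python
import math

def logistic(z):
    """Logistic (inverse-logit) function: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(intercept, coefs, x):
    """P(outcome = 1 | x) under a binary logistic model."""
    linear_predictor = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return logistic(linear_predictor)

def odds_ratio(coef):
    """Odds ratio for a one-unit increase in the corresponding predictor."""
    return math.exp(coef)

# Hypothetical fitted coefficients: exposure (0/1) and age in years.
intercept = -2.0
coefs = [0.7, 0.03]

p_exposed = predicted_probability(intercept, coefs, [1, 50])
p_unexposed = predicted_probability(intercept, coefs, [0, 50])

print(f"P(disease | exposed, age 50)   = {p_exposed:.3f}")
print(f"P(disease | unexposed, age 50) = {p_unexposed:.3f}")
print(f"Odds ratio for exposure        = {odds_ratio(coefs[0]):.3f}")
```

    Holding age fixed, the ratio of the two odds equals the exponentiated exposure coefficient, which is why coefficients are usually reported as odds ratios.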

  2. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

    A new edition of the definitive guide to logistic regression modeling for health science and other applications. This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  3. Fungible weights in logistic regression.

    Science.gov (United States)

    Jones, Jeff A; Waller, Niels G

    2016-06-01

    In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights.

  4. Practical Session: Logistic Regression

    Science.gov (United States)

    Clausel, M.; Grégoire, G.

    2014-12-01

    An exercise is proposed to illustrate logistic regression. It investigates the different risk factors in the occurrence of coronary heart disease. The exercise was proposed in Chapter 5 of the book by D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010), and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data in the file evans.txt, available from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.

  5. [Understanding logistic regression].

    Science.gov (United States)

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models used in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors likely to influence it (explanatory variables). The choice of explanatory variables to include in the logistic regression model is based on prior knowledge of the disease pathophysiology and on the statistical association between the variable and the event, as measured by the odds ratio. The main steps of the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge.

  6. Comparison of logistic regression and artificial neural network in low back pain prediction: second national health survey.

    Science.gov (United States)

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    The purpose of this investigation was to compare empirically the predictive ability of an artificial neural network with that of logistic regression in the prediction of low back pain. Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 records and validated on a test set of 17295 records. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 input, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 log-likelihood criteria. The area under the ROC curve (SE), root mean square and -2 log-likelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 log-likelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant.
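
    The three comparison criteria named in this record can be sketched from predicted probabilities alone. The sketch assumes that "root mean square" means the root mean square error between the 0/1 outcome and the predicted probability, which the abstract does not state; the toy data and function names are hypothetical.

```python
import math

def auc(y_true, p_hat):
    """Area under the ROC curve via the Mann-Whitney statistic:
    the probability that a random positive case is scored above
    a random negative case (ties count one half)."""
    pos = [p for y, p in zip(y_true, p_hat) if y == 1]
    neg = [p for y, p in zip(y_true, p_hat) if y == 0]
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0
        for pp in pos
        for pn in neg
    )
    return wins / (len(pos) * len(neg))

def rmse(y_true, p_hat):
    """Root mean square error between outcomes and predicted probabilities."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true))

def minus_two_log_likelihood(y_true, p_hat):
    """-2 times the Bernoulli log-likelihood of the outcomes."""
    return -2.0 * sum(
        math.log(p) if y == 1 else math.log(1.0 - p)
        for y, p in zip(y_true, p_hat)
    )

# Hypothetical test-set outcomes and predicted probabilities.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]

print("AUC  :", auc(y, p))
print("RMSE :", round(rmse(y, p), 4))
print("-2LL :", round(minus_two_log_likelihood(y, p), 4))
```

    Higher AUC and lower RMSE and -2 log-likelihood indicate the better model, which is the direction of comparison used in the abstract.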

  7. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  8. Logistic regression: a brief primer.

    Science.gov (United States)

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). 
The resulting logistic regression model
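
    The events-per-variable rule of thumb mentioned in this record can be checked before fitting. This is a minimal sketch that assumes the common convention of counting the less frequent outcome as the "events"; the sample counts and the function name are hypothetical.

```python
def events_per_variable(outcomes, n_predictors):
    """Events per variable (EPV): the count of the less frequent outcome
    divided by the number of candidate predictors."""
    events = sum(1 for y in outcomes if y == 1)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_predictors

# Hypothetical sample: 30 events among 200 subjects, 4 candidate predictors.
outcomes = [1] * 30 + [0] * 170
epv = events_per_variable(outcomes, n_predictors=4)

print(f"EPV = {epv:.1f}")
if epv < 10:
    print("Below the commonly cited minimum of 10 events per covariate.")
```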

  9. Logistic regression for circular data

    Science.gov (United States)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
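
    The estimation procedure in this record can be sketched as follows. The sketch assumes the linear-circular form enters the angle through cosine and sine terms, logit(p) = b0 + b1 cos(theta) + b2 sin(theta); the abstract does not give the paper's exact parameterization, and the simulated data and all names are hypothetical.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def score(X, y, beta):
    """Gradient of the log-likelihood, X^T (y - p), and fitted probabilities."""
    p = [1.0 / (1.0 + math.exp(-sum(b * x for b, x in zip(beta, row)))) for row in X]
    grad = [sum(row[j] * (yi - pi) for row, yi, pi in zip(X, y, p)) for j in range(len(beta))]
    return grad, p

def fit_circular_logistic(theta, y, n_iter=25, tol=1e-10):
    """Maximum likelihood by Newton-Raphson for a binary response and a circular predictor."""
    X = [[1.0, math.cos(t), math.sin(t)] for t in theta]
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        grad, p = score(X, y, beta)
        # Observed information X^T W X with W = diag(p (1 - p)).
        info = [
            [sum(row[j] * row[k] * pi * (1.0 - pi) for row, pi in zip(X, p)) for k in range(3)]
            for j in range(3)
        ]
        step = solve(info, grad)  # Newton step: information^{-1} * score
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta

# Simulated data under assumed true coefficients (not the Toowoomba data).
random.seed(0)
true_beta = [-0.5, 1.0, 0.5]
theta = [random.uniform(0.0, 2.0 * math.pi) for _ in range(2000)]
y = []
for t in theta:
    eta = true_beta[0] + true_beta[1] * math.cos(t) + true_beta[2] * math.sin(t)
    y.append(1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)

beta_hat = fit_circular_logistic(theta, y)
print("estimated coefficients:", [round(b, 3) for b in beta_hat])
```

    Using both cosine and sine terms keeps the fitted probability periodic in the angle, so 0 and 2π give the same prediction.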

  10. Predicting Social Trust with Binary Logistic Regression

    Science.gov (United States)

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  11. Common pitfalls in statistical analysis: Logistic regression.

    Science.gov (United States)

    Ranganathan, Priya; Pramesh, C S; Aggarwal, Rakesh

    2017-01-01

    Logistic regression analysis is a statistical technique to evaluate the relationship between various predictor variables (either categorical or continuous) and an outcome which is binary (dichotomous). In this article, we discuss logistic regression analysis and the limitations of this technique.

  12. Standards for Standardized Logistic Regression Coefficients

    Science.gov (United States)

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  14. Logistic Regression for Evolving Data Streams Classification

    Institute of Scientific and Technical Information of China (English)

    YIN Zhi-wu; HUANG Shang-teng; XUE Gui-rong

    2007-01-01

    Logistic regression is a fast classifier and can achieve higher accuracy on small training data. Moreover, it can work on both discrete and continuous attributes with nonlinear patterns. Based on these properties of logistic regression, this paper proposes an algorithm, called evolutionary logistical regression classifier (ELRClass), to solve the classification of evolving data streams. The algorithm applies logistic regression repeatedly to a sliding window of samples in order to update the existing classifier, to keep the classifier if its performance deteriorates because of a burst of noise, or to construct a new classifier if a major concept drift is detected. Extensive experimental results demonstrate the effectiveness of this algorithm.

  15. Should metacognition be measured by logistic regression?

    Science.gov (United States)

    Rausch, Manuel; Zehetleitner, Michael

    2017-03-01

    Are logistic regression slopes suitable to quantify metacognitive sensitivity, i.e. the efficiency with which subjective reports differentiate between correct and incorrect task responses? We analytically show that logistic regression slopes are independent from rating criteria in one specific model of metacognition, which assumes (i) that rating decisions are based on sensory evidence generated independently of the sensory evidence used for primary task responses and (ii) that the distributions of evidence are logistic. Given a hierarchical model of metacognition, logistic regression slopes depend on rating criteria. According to all considered models, regression slopes depend on the primary task criterion. A reanalysis of previous data revealed that massive numbers of trials are required to distinguish between hierarchical and independent models with tolerable accuracy. It is argued that researchers who wish to use logistic regression as measure of metacognitive sensitivity need to control the primary task criterion and rating criteria.

  16. An Application on Multinomial Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Abdalla M El-Habil

    2012-03-01

    This study aims to identify an application of the multinomial logistic regression model, which is one of the important methods for categorical data analysis. This model deals with one nominal or ordinal response variable that has more than two categories. It has been applied in data analysis in many areas, for example health, social, behavioral, and educational research. To illustrate the model in practice, we used real data on physical violence against children from a survey of Youth 2003, which was conducted by the Palestinian Central Bureau of Statistics (PCBS). A segment of the population of children in the age group 10-14 years resident in Gaza governorate, of size 66,935, was selected, and the response variable consisted of four categories. Eighteen explanatory variables were used to build the primary multinomial logistic regression model. The model was tested through a set of statistical tests to ensure its appropriateness for the data. The model was also tested by randomly selecting two observations from the data and predicting the group into which each would be classified, given the values of the explanatory variables. We concluded that, by using the multinomial logistic regression model, we are able to define accurately the relationship between the group of explanatory variables and the response variable, identify the effect of each of the variables, and predict the classification of any individual case.

  17. Satellite rainfall retrieval by logistic regression

    Science.gov (United States)

    Chiu, Long S.

    1986-01-01

    The potential use of logistic regression in rainfall estimation from satellite measurements is investigated. Satellite measurements provide covariate information in terms of radiances from different remote sensors. The logistic regression technique can effectively accommodate many covariates and test their significance in the estimation. The outcome from the logistic model is the probability that the rainrate of a satellite pixel is above a certain threshold. By varying the thresholds, a rainrate histogram can be obtained, from which the mean and the variance can be estimated. A logistic model is developed and applied to rainfall data collected during GATE, using as covariates the fractional rain area and a radiance measurement which is deduced from a microwave temperature-rainrate relation. It is demonstrated that the fractional rain area is an important covariate in the model, consistent with the use of the so-called Area Time Integral in estimating total rain volume in other studies. To calibrate the logistic model, simulated rain fields generated by rainfield models with prescribed parameters are needed. A stringent test of the logistic model is its ability to recover the prescribed parameters of simulated rain fields. A rain field simulation model which preserves the fractional rain area and lognormality of rainrates as found in GATE is developed. A stochastic regression model of branching and immigration whose solutions are lognormally distributed in some asymptotic limits has also been developed.
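
    The step from threshold probabilities to a rainrate histogram can be sketched as a difference of exceedance probabilities. The thresholds and probabilities below are hypothetical, and the use of bin midpoints to approximate the moments is an assumption of this sketch, not a detail given in the abstract.

```python
def histogram_from_exceedance(thresholds, exceed_prob):
    """Convert P(R > t_k) at increasing thresholds t_k into bin
    probabilities P(t_k < R <= t_{k+1})."""
    bins = []
    for k in range(len(thresholds) - 1):
        mass = exceed_prob[k] - exceed_prob[k + 1]
        bins.append((thresholds[k], thresholds[k + 1], mass))
    return bins

def mean_and_variance(bins):
    """Approximate mean and variance using bin midpoints,
    renormalizing by the total mass covered by the bins."""
    total = sum(mass for _, _, mass in bins)
    mids = [(lo + hi) / 2.0 for lo, hi, _ in bins]
    mean = sum(mid * b[2] for mid, b in zip(mids, bins)) / total
    var = sum((mid - mean) ** 2 * b[2] for mid, b in zip(mids, bins)) / total
    return mean, var

# Hypothetical thresholds (mm/h) and model-predicted exceedance probabilities.
thresholds = [0.0, 1.0, 2.0, 4.0]
exceed_prob = [1.0, 0.5, 0.2, 0.0]

bins = histogram_from_exceedance(thresholds, exceed_prob)
mean, var = mean_and_variance(bins)
for lo, hi, mass in bins:
    print(f"{lo:>4} < R <= {hi:<4} probability {mass:.2f}")
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```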

  18. Interpreting parameters in the logistic regression model with random effects

    DEFF Research Database (Denmark)

    Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben

    2000-01-01

    interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects
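
    For orientation, the median odds ratio listed among these keywords is commonly computed from the variance of a normally distributed random intercept as exp(sqrt(2 * variance) * z), where z is the 75th percentile of the standard normal distribution. The sketch below uses that commonly cited definition; the variance values and the function name are hypothetical, and the record itself gives no formula.

```python
import math
from statistics import NormalDist

def median_odds_ratio(random_effect_variance):
    """Median odds ratio for a random-intercept logistic model with
    normally distributed random effects."""
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * random_effect_variance) * z75)

for variance in (0.0, 0.5, 1.0):
    print(f"variance = {variance:.1f} -> MOR = {median_odds_ratio(variance):.3f}")
```

    A value of 1 corresponds to no between-cluster variation; larger values indicate stronger heterogeneity on the odds ratio scale.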

  19. Logistic regression a self-learning text

    CERN Document Server

    Kleinbaum, David G

    1994-01-01

    This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained and designed to be used either in class or as a tool for self-study. It arises from the author's many years of experience teaching this material, and the notes on which it is based have been used extensively throughout the world.

  20. Leukemia prediction using sparse logistic regression.

    Directory of Open Access Journals (Sweden)

    Tapio Manninen

    We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data-driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an [Formula: see text] regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data driven and no prior biological knowledge is used. The described solution scored a 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a gold standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model performance and further improve and simplify our original method, showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression based model, we also present other classification models and compare their performance quantitatively. The key benefit of our prediction method compared to other solutions with similar performance is that our model uses only a small fraction of the flow cytometry measurements, making our solution highly economical.

  1. Logistic regression when binary predictor variables are highly correlated.

    Science.gov (United States)

    Barker, L; Brown, C

    Standard logistic regression can produce estimates having large mean square error when predictor variables are multicollinear. Ridge regression and principal components regression can reduce the impact of multicollinearity in ordinary least squares regression. Generalizations of these, applicable in the logistic regression framework, are alternatives to standard logistic regression. It is shown that estimates obtained via ridge and principal components logistic regression can have smaller mean square error than estimates obtained through standard logistic regression. Recommendations for choosing among standard, ridge and principal components logistic regression are developed. Published in 2001 by John Wiley & Sons, Ltd.

  2. Logistic Regression Applied to Seismic Discrimination

    Energy Technology Data Exchange (ETDEWEB)

    BG Amindan; DN Hagedorn

    1998-10-08

    The usefulness of logistic discrimination was examined in an effort to learn how it performs in a regional seismic setting. Logistic discrimination provides an easily understood method, works with user-defined models and few assumptions about the population distributions, and handles both continuous and discrete data. Seismic event measurements from a data set compiled by Los Alamos National Laboratory (LANL) of Chinese events recorded at station WMQ were used in this demonstration study. PNNL applied logistic regression techniques to the data. All possible combinations of the Lg and Pg measurements were tried, and a best-fit logistic model was created. The best combination of Lg and Pg frequencies for predicting the source of a seismic event (earthquake or explosion) used Lg_{3.0-6.0} and Pg_{3.0-6.0} as the predictor variables. A cross-validation test was run, which showed that this model was able to correctly predict 99.7% of earthquakes and 98.0% of explosions for this given data set. Two other models were identified that used Pg and Lg measurements from the 1.5 to 3.0 Hz frequency range. Although these other models did a good job of correctly predicting the earthquakes, they were not as effective at predicting the explosions. Two possible biases were discovered which affect the predicted probabilities for each outcome. The first bias was due to this being a case-control study: the sampling fractions caused a bias in the probabilities that were calculated using the models. The second bias is caused by a change in the proportions for each event. If at a later date the proportions (a priori probabilities) of explosions versus earthquakes change, this would cause a bias in the predicted probability for an event. When using logistic regression, the user needs to be aware of the possible biases and what effect they will have on the predicted probabilities.
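
    Both biases described in this record have standard adjustments on the log-odds scale, sketched below. Under outcome-dependent (case-control) sampling the fitted intercept is shifted by the log ratio of the sampling fractions, and a change in prior proportions rescales the odds by the ratio of new to old prior odds. The report's own correction is not given in the abstract; all numbers and names here are hypothetical.

```python
import math

def correct_intercept(intercept_fitted, frac_cases_sampled, frac_controls_sampled):
    """Remove the intercept shift induced by case-control sampling,
    where each argument is the fraction of that group that was sampled."""
    return intercept_fitted - math.log(frac_cases_sampled / frac_controls_sampled)

def adjust_probability_for_new_prior(p, old_prior, new_prior):
    """Re-express a predicted probability when the prior proportion of the
    positive class changes from old_prior to new_prior."""
    odds = p / (1.0 - p)
    odds *= (new_prior / (1.0 - new_prior)) / (old_prior / (1.0 - old_prior))
    return odds / (1.0 + odds)

# Hypothetical: all positive events sampled, but only 10% of negative events.
print("corrected intercept:", round(correct_intercept(1.2, 1.0, 0.1), 3))

# Hypothetical: model trained at 50% positives, deployed where 10% are expected.
print("adjusted probability:", round(adjust_probability_for_new_prior(0.5, 0.5, 0.1), 3))
```

    Only the intercept is affected by this kind of sampling; the slope coefficients, and hence the odds ratios, are unchanged.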

  3. Variable Selection in Logistic Regression Model

    Institute of Scientific and Technical Information of China (English)

    ZHANG Shangli; ZHANG Lili; QIU Kuanmin; LU Ying; CAI Baigen

    2015-01-01

    Variable selection is one of the most important problems in pattern recognition. In the linear regression model, many methods can solve this problem, such as the least absolute shrinkage and selection operator (LASSO) and many improved LASSO methods, but there are few variable selection methods for generalized linear models. We study the variable selection problem in the logistic regression model. We propose a new variable selection method, the logistic elastic net, and prove that it has the grouping effect, which means that strongly correlated predictors tend to be in or out of the model together. The logistic elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the LASSO is not a very satisfactory variable selection method when p is much larger than n. The advantage and effectiveness of this method are demonstrated by real leukemia data and a simulation study.

  6. Supporting Regularized Logistic Regression Privately and Efficiently

    Science.gov (United States)

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nevertheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc. PMID:27271738

  7. Jackknife bias reduction for polychotomous logistic regression.

    Science.gov (United States)

    Bull, S B; Greenwood, C M; Hauck, W W

    1997-03-15

    Despite theoretical and empirical evidence that the usual MLEs can be misleading in finite samples and some evidence that bias reduced estimates are less biased and more efficient, they have not seen a wide application in practice. One can obtain bias reduced estimates by jackknife methods, with or without full iteration, or by use of higher order terms in a Taylor series expansion of the log-likelihood to approximate asymptotic bias. We provide details of these methods for polychotomous logistic regression with a nominal categorical response. We conducted a Monte Carlo comparison of the jackknife and Taylor series estimates in moderate sample sizes in a general logistic regression setting, to investigate dichotomous and trichotomous responses and a mixture of correlated and uncorrelated binary and normal covariates. We found an approximate two-step jackknife and the Taylor series methods useful when the ratio of the number of observations to the number of parameters is greater than 15, but we cannot recommend the two-step and the fully iterated jackknife estimates when this ratio is less than 20, especially when there are large effects, binary covariates, or multicollinearity in the covariates.
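
    The fully iterated jackknife idea in this record can be illustrated on the simplest possible case. The sketch uses an intercept-only binary model, chosen because its maximum likelihood estimate has the closed form logit(mean of y); the paper itself treats polychotomous models with covariates, and the data and names below are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def mle_intercept(y):
    """MLE of the intercept in an intercept-only binary logistic model."""
    return logit(sum(y) / len(y))

def jackknife_bias_corrected(y, estimator):
    """Return (bias-corrected estimate, estimated bias) using
    leave-one-out re-estimation of the supplied estimator."""
    n = len(y)
    full = estimator(y)
    leave_one_out = [estimator(y[:i] + y[i + 1:]) for i in range(n)]
    loo_mean = sum(leave_one_out) / n
    bias = (n - 1) * (loo_mean - full)
    return full - bias, bias

# Hypothetical small sample: 3 events among 10 observations.
y = [1] * 3 + [0] * 7
full = mle_intercept(y)
corrected, bias = jackknife_bias_corrected(y, mle_intercept)

print(f"MLE = {full:.4f}, estimated bias = {bias:.4f}, corrected = {corrected:.4f}")
```

    In this toy sample the correction pulls the estimate toward zero, consistent with the tendency of logistic MLEs to overstate effects in small samples.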

  8. Logistic regression applied to natural hazards: rare event logistic regression with replications

    Directory of Open Access Journals (Sweden)

    M. Guns

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and overcomes some of the limitations of previous developments through robust variable selection. The technique was developed here for the analysis of landslide controlling factors, but the concept is widely applicable to statistical analyses of natural hazards.

  9. Logistic regression against a divergent Bayesian network

    Directory of Open Access Journals (Sweden)

    Noel Antonio Sánchez Trujillo

    2015-01-01

    This article is a discussion of two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data from a simulated example of a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which can take years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.

  10. Multiple logistic regression applied to soil survey in Rio Grande do Sul state, Brazil (Uso de regressões logísticas múltiplas para mapeamento digital de solos no Planalto Médio do RS)

    Directory of Open Access Journals (Sweden)

    Samuel Ribeiro Figueiredo

    2008-12-01

    hydrographic variables (distance to rivers, flow length, topographical wetness index, and stream power index). Multiple logistic regressions were established between the soil classes mapped on the basis of a traditional survey at a scale of 1:80,000 and the land variables calculated using the DEM. The regressions were used to calculate the probability of occurrence of each soil class. The final estimated soil map was drawn by assigning the soil class with the highest probability of occurrence to each cell. The general accuracy was evaluated at 58 % and the Kappa coefficient at 38 % in a comparison of the original soil map with the map estimated at the original scale. Simplifying the legend did little to increase the general accuracy of the map (general accuracy of 61 % and Kappa coefficient of 39 %). It was concluded that multiple logistic regressions have predictive potential as a tool for supervised soil mapping.

  11. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    Science.gov (United States)

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  13. Logistic Regression Model on Antenna Control Unit Autotracking Mode

    Science.gov (United States)

    2015-10-20

    412TW-PA-15240. Logistic Regression Model on Antenna Control Unit Autotracking Mode. Daniel T. Laird, Air Force Test Center, Edwards AFB, CA. ...alternative-hypothesis. This paper will present an antenna autotracking model using logistic regression modeling. This paper presents an example of

  14. Combining logistic regression and neural networks to create predictive models.

    OpenAIRE

    Spackman, K. A.

    1992-01-01

    Neural networks are being used widely in medicine and other areas to create predictive models from data. The statistical method that most closely parallels neural networks is logistic regression. This paper outlines some ways in which neural networks and logistic regression are similar, shows how a small modification of logistic regression can be used in the training of neural network models, and illustrates the use of this modification for variable selection and predictive model building wit...

  15. Personal, social, and game-related correlates of active and non-active gaming among Dutch gaming adolescents: survey-based multivariable, multilevel logistic regression analyses.

    Science.gov (United States)

    Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-04-04

    Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games (active games) seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P<...) ... games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; P<...) ... gaming and a slightly lower score on game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P<...) ... gaming (OR 3.3, CI 1.46-7.53; P=.004), and a more positive image of a non-active gamer (OR 2, CI 1.07-3.75; P=.03). Various factors were significantly associated with active gaming ≥1 h/wk and non-active gaming >7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non

  16. A logistic regression estimating function for spatial Gibbs point processes

    DEFF Research Database (Denmark)

    Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege

    We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...

  17. Spatial correlation in Bayesian logistic regression with misclassification

    DEFF Research Database (Denmark)

    Bihrmann, Kristine; Toft, Nils; Nielsen, Søren Saxmose

    2014-01-01

    Standard logistic regression assumes that the outcome is measured perfectly. In practice, this is often not the case, which could lead to biased estimates if not accounted for. This study presents Bayesian logistic regression with adjustment for misclassification of the outcome applied to data...

  18. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    Science.gov (United States)

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  19. SMOOTH TRANSITION LOGISTIC REGRESSION MODEL TREE

    OpenAIRE

    RODRIGO PINTO MOREIRA

    2008-01-01

    The main objective of this work is to adapt the STR-Tree model, which combines a Smooth Transition Regression model with Classification and Regression Tree (CART), so that it can be used for classification. To this end, some changes were made to its structural form and to its estimation. Because we are classifying binary dependent variables, the techniques employed in logistic regression are required; thus the estimation of the pa...

  20. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds. When the categories of the dependent variable are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation in the model is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.

  1. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    Directory of Open Access Journals (Sweden)

    Hong Wang

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  2. Large unbalanced credit scoring using Lasso-logistic regression ensemble.

    Science.gov (United States)

    Wang, Hong; Xu, Qingsong; Zhou, Lifeng

    2015-01-01

    Recently, various ensemble learning methods with different base classifiers have been proposed for credit scoring problems. However, for various reasons, there has been little research using logistic regression as the base classifier. In this paper, given large unbalanced data, we consider the plausibility of ensemble learning using regularized logistic regression as the base classifier to deal with credit scoring problems. In this research, the data is first balanced and diversified by clustering and bagging algorithms. Then we apply a Lasso-logistic regression learning ensemble to evaluate the credit risks. We show that the proposed algorithm outperforms popular credit scoring models such as decision tree, Lasso-logistic regression and random forests in terms of AUC and F-measure. We also provide two importance measures for the proposed model to identify important variables in the data.

  3. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    Science.gov (United States)

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  4. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy

    DEFF Research Database (Denmark)

    Merlo, Juan; Wagner, Philippe; Ghith, Nermin

    2016-01-01

    BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting...

  5. Logistic regression for risk factor modelling in stuttering research.

    Science.gov (United States)

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article, the reader will be able to: (a) summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) follow the steps in performing a logistic regression analysis; (c) describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) summarize its advantages over other techniques like estimation of group differences and simple regression.

  6. Estimating the exceedance probability of rain rate by logistic regression

    Science.gov (United States)

    Chiu, Long S.; Kedem, Benjamin

    1990-01-01

    Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.

  7. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh

    OpenAIRE

    Das Sumonkanti; Rahman Rajwanur M

    2011-01-01

    Background: The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition, instead of the traditional binary logistic regression (BLR) model, using data from the Bangladesh Demographic and Health Survey 2004. Methods: Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished (< -3.0), moderately undernourished (-3.0 to -2.01) and nourished (≥-2.0...

  8. Bayesian Lasso and multinomial logistic regression on GPU.

    Science.gov (United States)

    Češnovar, Rok; Štrumbelj, Erik

    2017-01-01

    We describe an efficient Bayesian parallel GPU implementation of two classic statistical models: the Lasso and multinomial logistic regression. We focus on parallelizing the key components: matrix multiplication, matrix inversion, and sampling from the full conditionals. Our GPU implementations of Bayesian Lasso and multinomial logistic regression achieve 100-fold speedups on mid-level and high-end GPUs. Substantial speedups of 25-fold can also be achieved on older and lower-end GPUs. Samplers are implemented in OpenCL and can be used on any type of GPU and other types of computational units, thereby being convenient and advantageous in practice compared to related work.

  9. Credit Scoring Model Hybridizing Artificial Intelligence with Logistic Regression

    Directory of Open Access Journals (Sweden)

    Han Lu

    2013-01-01

    Today the most commonly used techniques for credit scoring are artificial intelligence and statistics. In this paper, we propose a new way to combine these two kinds of models. Because logistic regression filters out variables with a high degree of correlation, the artificial intelligence models become less complex and converge faster, while the models hybridized with logistic regression are easier to explain in terms of statistical significance, which improves the performance of the artificial intelligence models. In experiments on the German data set, we find an interesting phenomenon with the support vector machine, which we define as 'dimensional interference', and cross-validation shows that the new method is of considerable help in credit scoring.

  10. Score normalization using logistic regression with expected parameters

    NARCIS (Netherlands)

    Aly, Robin

    2014-01-01

    State-of-the-art score normalization methods use generative models that rely on sometimes unrealistic assumptions. We propose a novel parameter estimation method for score normalization based on logistic regression. Experiments on the Gov2 and CluewebA collection indicate that our method is consiste

  11. Geographically Weighted Logistic Regression Applied to Credit Scoring Models

    Directory of Open Access Journals (Sweden)

    Pedro Henrique Melo Albuquerque

    This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC), granted to clients residing in the Distrito Federal (DF), to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR) techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters), leading to the conclusion that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated viability in using the GWLR technique to develop credit scoring models for the target population in the study.

  12. Detecting Differential Item Functioning Using Logistic Regression Procedures.

    Science.gov (United States)

    Swaminathan, Hariharan; Rogers, H. Jane

    1990-01-01

    A logistic regression model for characterizing differential item functioning (DIF) between two groups is presented. A distinction is drawn between uniform and nonuniform DIF in terms of model parameters. A statistic for testing the hypotheses of no DIF is developed, and simulation studies compare it with the Mantel-Haenszel procedure. (Author/TJH)
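
    For orientation, the model this abstract refers to is commonly written as below. This is a sketch in notation chosen here, not quoted from the article: u is the scored item response, θ the ability measure (often the total test score), and g a 0/1 group indicator.

```latex
\[
  \operatorname{logit} P(u = 1 \mid \theta, g)
  = \beta_0 + \beta_1 \theta + \beta_2 g + \beta_3 (\theta g)
\]
% Uniform DIF:     \beta_2 \neq 0 and \beta_3 = 0 (a constant shift between groups).
% Nonuniform DIF:  \beta_3 \neq 0 (the group difference varies with ability).
% No DIF:          H_0 : \beta_2 = \beta_3 = 0.
```

    Under this parameterization, the hypothesis of no DIF can be tested by comparing the full model with the reduced model that drops the group and interaction terms.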

  13. Classification of microarray data with penalized logistic regression

    Science.gov (United States)

    Eilers, Paul H. C.; Boer, Judith M.; van Ommen, Gert-Jan; van Houwelingen, Hans C.

    2001-06-01

    Classification of microarray data needs a firm statistical basis. In principle, logistic regression can provide it, modeling the probability of membership of a class with (transforms of) linear combinations of explanatory variables. However, classical logistic regression does not work for microarrays, because generally there will be far more variables than observations. One problem is multicollinearity: estimating equations become singular and have no unique and stable solution. A second problem is over-fitting: a model may fit well into a data set, but perform badly when used to classify new data. We propose penalized likelihood as a solution to both problems. The values of the regression coefficients are constrained in a similar way as in ridge regression. All variables play an equal role, there is no ad-hoc selection of most relevant or most expressed genes. The dimension of the resulting systems of equations is equal to the number of variables, and generally will be too large for most computers, but it can dramatically be reduced with the singular value decomposition of some matrices. The penalty is optimized with AIC (Akaike's Information Criterion), which essentially is a measure of prediction performance. We find that penalized logistic regression performs well on a public data set (the MIT ALL/AML data).
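
    The effect of the penalty can be sketched with a toy example. The Python code below is a minimal illustration, assuming a ridge-type penalty (lam / 2) * ||w||^2 added to the mean negative log-likelihood and plain gradient descent as the optimizer. It does not reproduce the authors' procedure (no singular value decomposition, no AIC-based choice of the penalty); the data, the function names, and the values of lam, lr and steps are all hypothetical.

```python
import math

# Hypothetical toy data with more variables (6) than observations (4).
# The first variable separates the classes perfectly, so the unpenalized
# maximum likelihood estimate does not exist (the weights keep growing).
X = [
    [ 1.0,  1.0, -1.0,  1.0, -1.0,  0.5],
    [ 1.0, -1.0,  1.0, -1.0,  1.0, -0.5],
    [-1.0,  1.0,  1.0, -1.0, -1.0,  0.5],
    [-1.0, -1.0, -1.0,  1.0,  1.0, -0.5],
]
y = [1, 1, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam, lr=0.1, steps=3000):
    """Gradient descent on mean negative log-likelihood + (lam / 2) * ||w||^2.

    The intercept b is left unpenalized. lam = 0 gives ordinary
    (unpenalized) logistic regression.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w = [lam * wj for wj in w]  # gradient of the penalty term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def norm(w):
    """Euclidean norm of the weight vector."""
    return math.sqrt(sum(v * v for v in w))


w_unpen, _ = fit_ridge_logistic(X, y, lam=0.0)
w_pen, _ = fit_ridge_logistic(X, y, lam=1.0)
print("unpenalized weight norm:", norm(w_unpen))
print("penalized weight norm:  ", norm(w_pen))
```

    In this sketch the unpenalized weights keep growing with the number of steps, while the penalized weights settle at a finite value, which is the stabilizing effect the abstract describes.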

  14. A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

    Science.gov (United States)

    Shen, Jianzhao; Gao, Sujuan

    2008-10-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither method solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.

  15. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    Science.gov (United States)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The

  16. Parameter Estimation for Improving Association Indicators in Binary Logistic Regression

    Directory of Open Access Journals (Sweden)

    Mahdi Bashiri

    2012-02-01

    The aim of this paper is to estimate binary logistic regression parameters by maximizing the log-likelihood function with improved association indicators. The parameter estimation steps are explained, measures of association are introduced, and their calculation is analyzed. In addition, new related indicators based on membership degree level are presented. Association measures reflect the number of success responses relative to failures in a given number of independent Bernoulli experiments. In parameter estimation, the values of existing indicators are not sensitive to the parameter values, whereas the proposed indicators are sensitive to the estimated parameters during the iterative procedure. The contribution of this study is therefore a new association indicator for binary logistic regression that is more sensitive to the estimated parameters when maximizing the log-likelihood in the iterative procedure.

  17. Sugarcane Land Classification with Satellite Imagery using Logistic Regression Model

    Science.gov (United States)

    Henry, F.; Herwindiati, D. E.; Mulyono, S.; Hendryli, J.

    2017-03-01

    This paper discusses the classification of sugarcane plantation area from Landsat-8 satellite imagery. The classification process uses the binary logistic regression method with time series data of the normalized difference vegetation index as input. The process is divided into two steps: training and classification. The purpose of the training step is to identify the best parameters of the regression model using the gradient descent algorithm. The best-fitting model can then be used to classify sugarcane and non-sugarcane areas. The experiment shows high accuracy and successfully maps the sugarcane plantation area, with a best result of Cohen’s Kappa 0.7833 (strong) and 89.167% accuracy.
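
    The two steps described in this abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the NDVI values are invented, each pixel has only two dates, and the learning rate, number of epochs and all function names are hypothetical choices.

```python
import math

# Hypothetical toy data: each row is a two-date NDVI series for one pixel.
# Label 1 = sugarcane, 0 = non-sugarcane. Values are invented for illustration.
X_train = [
    [0.30, 0.85], [0.35, 0.90], [0.25, 0.80], [0.30, 0.75],
    [0.30, 0.20], [0.40, 0.30], [0.20, 0.10], [0.35, 0.35],
]
y_train = [1, 1, 1, 1, 0, 0, 0, 0]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_score(x, w, b):
    """Linear predictor b + w . x for one observation."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_logistic_gd(X, y, lr=1.0, epochs=5000):
    """Training step: batch gradient descent on the mean negative log-likelihood."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(linear_score(xi, w, b)) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(X, w, b, threshold=0.5):
    """Classification step: label 1 when the fitted probability reaches the threshold."""
    return [1 if sigmoid(linear_score(xi, w, b)) >= threshold else 0 for xi in X]


def accuracy(y_true, y_pred):
    """Share of observations classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    p_chance = sum(
        (y_true.count(c) / n) * (y_pred.count(c) / n) for c in (0, 1)
    )
    if p_chance == 1.0:  # both label vectors contain a single, identical class
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


w, b = fit_logistic_gd(X_train, y_train)
y_pred = predict(X_train, w, b)
print("accuracy:", accuracy(y_train, y_pred))
print("kappa:", cohen_kappa(y_train, y_pred))
```

    The scores here are computed on the training pixels only; a realistic evaluation would use held-out pixels, as the reported kappa and accuracy figures presumably do.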

  18. On modified skew logistic regression model and its applications

    Directory of Open Access Journals (Sweden)

    C. Satheesh Kumar

    2015-12-01

    Here we consider a modified form of the logistic regression model useful for situations where the dependent variable is dichotomous in nature and the explanatory variables exhibit asymmetric and multimodal behaviour. The proposed model has been fitted to some real-life data sets using the method of maximum likelihood estimation, and its usefulness in certain medical applications is illustrated.

  19. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

    In this paper a new model validation procedure for a logistic regression model is presented. First, we briefly review different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

  20. APPLYING LOGISTIC REGRESSION MODEL TO THE EXAMINATION RESULTS DATA

    Directory of Open Access Journals (Sweden)

    Goutam Saha

    2011-01-01

    The binary logistic regression model is used to analyze the school examination results (scores) of 1002 students. The analysis is performed on the basis of the independent variables, viz. gender, medium of instruction, type of schools, category of schools, board of examinations and location of schools, where scores or marks are assumed to be dependent variables. The odds ratio analysis compares the scores obtained in two examinations, viz. matriculation and higher secondary.

  1. Diagnostic profiles of acute abdominal pain with multinomial logistic regression

    Directory of Open Access Journals (Sweden)

    Ohmann, Christian

    2007-07-01

    Purpose: Application of multinomial logistic regression for diagnostic support of acute abdominal pain, a diagnostic problem with many differential diagnoses. Methods: The analysis is based on a prospective data base with 2280 patients with acute abdominal pain, characterized by 87 variables from history and clinical examination and 12 differential diagnoses. Associations between single variables from history and clinical examination and the final diagnoses were investigated with multinomial logistic regression. Results: As an example, the results are presented for the variable rigidity. A statistically significant association was observed for generalized rigidity and the diagnoses appendicitis, bowel obstruction, pancreatitis, perforated ulcer, multiple and other diagnoses and for localized rigidity and appendicitis, diverticulitis, biliary disease and perforated ulcer. Diagnostic profiles were generated by summarizing the statistically significant associations. As an example the diagnostic profile of acute appendicitis is presented. Conclusions: Compared to alternative approaches (e.g. independent Bayes, loglinear model), multinomial logistic regression has advantages in supporting complex differential diagnostic problems, provided potential traps are avoided (e.g. α-error, interpretation of odds ratio).

  2. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    Science.gov (United States)

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  3. Prediction of siRNA potency using sparse logistic regression.

    Science.gov (United States)

    Hu, Wei; Hu, John

    2014-06-01

    RNA interference (RNAi) can modulate gene expression at post-transcriptional as well as transcriptional levels. Short interfering RNA (siRNA) serves as a trigger for the RNAi gene inhibition mechanism, and therefore is a crucial intermediate step in RNAi. There have been extensive studies to identify the sequence characteristics of potent siRNAs. One such study built a linear model using LASSO (Least Absolute Shrinkage and Selection Operator) to measure the contribution of each siRNA sequence feature. This model is simple and interpretable, but it requires a large number of nonzero weights. We have introduced a novel technique, sparse logistic regression, to build a linear model using single-position specific nucleotide compositions which has the same prediction accuracy as the linear model based on LASSO. The weights in our new model share the same general trend as those in the previous model, but have only 25 nonzero weights out of a total of 84 weights, a 54% reduction compared to the previous model. In contrast to the linear model based on LASSO, our model suggests that only a few positions are influential on the efficacy of the siRNA, which are the 5' and 3' ends and the seed region of siRNA sequences. We also employed sparse logistic regression to build a linear model using dual-position specific nucleotide compositions, a task LASSO is not able to accomplish well due to its high dimensional nature. Our results demonstrate the superiority of sparse logistic regression as a technique for both feature selection and regression over LASSO in the context of siRNA design.

  4. Robust Logistic Regression to Static Geometric Representation of Ratios

    Directory of Open Access Journals (Sweden)

    Alireza Bahiraie

    2009-01-01

    Full Text Available Problem statement: Some methodological problems concerning financial ratios such as non-proportionality, non-asymetricity, non-salacity were solved in this study and we presented a complementary technique for empirical analysis of financial ratios and bankruptcy risk. This new method would be a general methodological guideline associated with financial data and bankruptcy risk. Approach: We proposed the use of a new measure of risk, the Share Risk (SR) measure. We provided evidence of the extent to which changes in values of this index are associated with changes in each axis value and how this may alter our economic interpretation of changes in the patterns and directions. Our simple methodology provided a geometric illustration of the newly proposed risk measure and its transformation behavior. This study also employed the robust logit method, which extends the logit model by accounting for outliers. Results: Results showed that the new SR method obtained better numerical results compared to the common ratios approach. With respect to accuracy results, Logistic and Robust Logistic Regression Analysis illustrated that this new transformation (SR) produced statistically more accurate predictions and can be used as an alternative to common ratios. Additionally, the robust logit model outperformed the logit model in both approaches and was substantially superior to the logit method in predictions to assess sample forecast performances and regressions. Conclusion/Recommendations: This study presented a new perspective on the study of firm financial statements and bankruptcy. In this study, a new dimension to risk measurement and data representation, the Share Risk (SR) method, was proposed. With respect to forecast results, the robust logit method was substantially superior to the logit method. The use of the SR methodology for ratio analysis was strongly suggested, as it provided a conceptual and complementary methodological solution to many problems associated with the

  5. Semi-Supervised Additive Logistic Regression: A Gradient Descent Solution

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    This paper describes a semi-supervised regularized method for additive logistic regression. The graph regularization term of the combined functions is added to the original cost functional used in AdaBoost. This term constrains the learned function to be smooth on a graph. Then the gradient solution is computed with the advantage that the regularization parameter can be adaptively selected. Finally, the function step-size of each iteration can be computed using Newton-Raphson iteration. Experiments on benchmark data sets show that the algorithm gives better results than existing methods.

  6. MENENTUKAN PROBABILITAS QUALITAS LULUSAN PROGRAM STUDI MENGGUNAKAN LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    Maxsi Ary

    2016-03-01

    Full Text Available Abstract – Human resources (HR) are one of the success factors in the economic field, namely how to create qualified human resources that have skills and are highly competitive in global competition. The educational level of the existing labor force is still relatively low. The education structure of the Indonesian workforce is still dominated by basic education, at about 63.2%. The issue raised is to determine the probability that a study program is good or not by looking at several ratios of the number of graduates to the number of students per cohort and at the size of the class quota (large or small), using logistic regression models. Data were obtained from a search of study program student and graduate data in 2010. Data processing used SPSS. The analysis assesses model fit, and results are given for each model fit, starting with the hypothesis for assessing model fit, the -2LogL statistic, Cox and Snell's R Square, Hosmer and Lemeshow's Goodness of Fit Test, and the classification table. The results of the analysis, with SPSS as a tool, are aimed at measuring the quality of graduates of study programs at a university, college, or academy (good or not) based on the ratio of the number of graduates and class quotas. Keywords: Quota Class, Probability, Logistic Regression

  7. Logistic regression in estimates of femoral neck fracture by fall

    Directory of Open Access Journals (Sweden)

    Jaroslava Wendlová

    2010-04-01

    Full Text Available Jaroslava Wendlová, Derer’s University Hospital and Policlinic, Osteological Unit, Bratislava, Slovakia. Abstract: The latest methods in estimating the probability (absolute risk) of osteoporotic fractures include several logistic regression models, based on qualitative risk factors plus bone mineral density (BMD), and the probability estimate of fracture in the future. The Slovak logistic regression model, in contrast to other models, is created from quantitative variables of the proximal femur (in International System of Units) and estimates the probability of fracture by fall. Objectives: The first objective of this study was to order selected independent variables according to the intensity of their influence (statistical significance) upon the occurrence of values of the dependent variable: femur strength index (FSI). The second objective was to determine, using logistic regression, whether the odds of FSI acquiring a pathological value (femoral neck fracture by fall) increased or declined if the value of the variables (T–score total hip, BMI, alpha angle, theta angle and HAL) were raised by one unit. Patients and methods: Bone densitometer measurements using dual energy X–ray absorptiometry (DXA; Prodigy, Primo, GE, USA) of the left proximal femur were obtained from 3 216 East Slovak women with primary or secondary osteoporosis or osteopenia, aged 20–89 years (mean age 58.9; 95% CI: 58.42; 59.38). The following variables were measured: FSI, T-score total hip BMD, body mass index (BMI), as were the geometrical variables of proximal femur alpha angle (α angle), theta angle (θ angle), and hip axis length (HAL). Statistical analysis: Logistic regression was used to measure the influence of the independent variables (T-score total hip, alpha angle, theta angle, HAL, BMI) upon the dependent variable (FSI). Results: The order of independent variables according to the intensity of their influence (greatest to least) upon the occurrence of values of the
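
    In a fitted logistic regression, raising a predictor by one unit multiplies the odds of the outcome by exp(β), where β is that predictor's coefficient. This is the arithmetic behind asking whether the odds "increased or declined" per unit. A minimal sketch follows; the coefficient value is hypothetical and is not a result from the study above.

```python
import math


def odds_ratio(beta, delta=1.0):
    """Multiplicative change in the odds when a predictor rises by `delta` units."""
    return math.exp(beta * delta)


def probability_from_odds(odds):
    """Convert odds p / (1 - p) back to a probability p."""
    return odds / (1.0 + odds)


# Hypothetical coefficient: a negative beta means the odds decline
# as the predictor increases.
beta_example = -0.5
ratio = odds_ratio(beta_example)      # below 1, so the odds decline
baseline_odds = 1.0                   # odds of 1 correspond to p = 0.5
new_probability = probability_from_odds(baseline_odds * ratio)
```

    A ratio above 1 would mean the odds rise with the predictor; a ratio of exactly 1 means no association on the odds scale.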

  8. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    Science.gov (United States)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    A problem often encountered in logistic regression modeling is multicollinearity. Multicollinearity between explanatory variables results in biased parameter estimates and also leads to errors in classification. In general, stepwise regression is used to overcome multicollinearity in regression. Another method, which involves all variables in the prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were a part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluation of the logistic regression results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classification of the major class (using a contraceptive method), the selected model is CATPCA, since it raises the accuracy for the major class.
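
    The AUC used for evaluation above can be read as the share of (positive, negative) case pairs that a model ranks in the right order, so a reported 58.66% corresponds to 0.5866 on the 0 to 1 scale. The sketch below computes it directly from that definition; the labels and scores are hypothetical, and the quadratic pair loop is chosen for clarity rather than speed.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties in score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and predicted probabilities.
example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
example_auc = auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```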

  9. Cluster-localized sparse logistic regression for SNP data.

    Science.gov (United States)

    Binder, Harald; Müller, Tina; Schwender, Holger; Golka, Klaus; Steffens, Michael; Hengstler, Jan G; Ickstadt, Katja; Schumacher, Martin

    2012-08-14

    The task of analyzing high-dimensional single nucleotide polymorphism (SNP) data in a case-control design using multivariable techniques has only recently been tackled. While many available approaches investigate only main effects in a high-dimensional setting, we propose a more flexible technique, cluster-localized regression (CLR), based on localized logistic regression models, that allows different SNPs to have an effect for different groups of individuals. Separate multivariable regression models are fitted for the different groups of individuals by incorporating weights into componentwise boosting, which provides simultaneous variable selection, hence sparse fits. For model fitting, these groups of individuals are identified using a clustering approach, where each group may be defined via different SNPs. This allows for representing complex interaction patterns, such as compositional epistasis, that might not be detected by a single main effects model. In a simulation study, the CLR approach results in improved prediction performance, compared to the main effects approach, and identification of important SNPs in several scenarios. Improved prediction performance is also obtained for an application example considering urinary bladder cancer. Some of the identified SNPs are predictive for all individuals, while others are only relevant for a specific group. Together with the sets of SNPs that define the groups, potential interaction patterns are uncovered.

  10. Logistic Regression-HSMM-Based Heart Sound Segmentation.

    Science.gov (United States)

    Springer, David B; Tarassenko, Lionel; Clifford, Gari D

    2016-04-01

    The identification of the exact positions of the first and second heart sounds within a phonocardiogram (PCG), or heart sound segmentation, is an essential step in the automatic analysis of heart sound recordings, allowing for the classification of pathological events. While threshold-based segmentation methods have shown modest success, probabilistic models, such as hidden Markov models, have recently been shown to surpass the capabilities of previous methods. Segmentation performance is further improved when a priori information about the expected duration of the states is incorporated into the model, such as in a hidden semi-Markov model (HSMM). This paper addresses the problem of the accurate segmentation of the first and second heart sound within noisy real-world PCG recordings using an HSMM, extended with the use of logistic regression for emission probability estimation. In addition, we implement a modified Viterbi algorithm for decoding the most likely sequence of states, and evaluated this method on a large dataset of 10,172 s of PCG recorded from 112 patients (including 12,181 first and 11,627 second heart sounds). The proposed method achieved an average F1 score of 95.63 ± 0.85%, while the current state of the art achieved 86.28 ± 1.55% when evaluated on unseen test recordings. The greater discrimination between states afforded using logistic regression as opposed to the previous Gaussian distribution-based emission probability estimation as well as the use of an extended Viterbi algorithm allows this method to significantly outperform the current state-of-the-art method based on a two-sided paired t-test.

  11. Predictions of flood warning threshold exceedance computed with logistic regression

    Science.gov (United States)

    Diomede, Tommaso; Marsigli, Chiara; Stefania Tesini, Maria

    2017-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at different lead times (from +6h up to +42h). The aim of the study is to provide a valuable tool for the issue of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation ("state of the catchment"), (ii) the last 24 hours, which may be relevant for the current water level in the river ("state of the river"), and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs ("state of the atmosphere"). Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the current "state of the river" can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18-42 hours, or the previous 6-12h, according to the forecast lead time. The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over several catchments in the Emilia-Romagna Region, northern Italy, whose areas range from 100 to 1000 km². A statistical analysis in terms of false alarms, misses and related scores was carried out by using

  12. Exploring Public Perception of Paratransit Service Using Binomial Logistic Regression

    Directory of Open Access Journals (Sweden)

    Hisashi Kubota

    2007-01-01

    Full Text Available Knowledge of the market is a requirement for a successful provision of public transportation. This study aims to explore public perception of paratransit service, as represented by the user and non-user of paratransit. The analysis has been conducted based on the public’s response, by creating several binomial logistic regression models using the public perception of the quality of service, quality of car, quality of driver, and fare. These models illustrate the characteristics and important variables to establish whether the public will use more paratransit in the future once improvements will have been made. Moreover, several models are developed to explore public perception in order to find out whether they agree to the replacement of paratransit with other types of transportation modes. All models are well fitting. These models are able to explain the respondents’ characteristics and to reveal their actual perception of the operation of paratransit. This study provides a useful tool to know the market in greater depth.

  13. Actigraphy-based scratch detection using logistic regression.

    Science.gov (United States)

    Petersen, Johanna; Austin, Daniel; Sack, Robert; Hayes, Tamara L

    2013-03-01

    Incessant scratching as a result of diseases such as atopic dermatitis causes skin break down, poor sleep quality, and reduced quality of life for affected individuals. In order to develop more effective therapies, there is a need for objective measures to detect scratching. Wrist actigraphy, which detects wrist movements over time using micro-accelerometers, has shown great promise in detecting scratch because it is lightweight, usable in the home environment, can record longitudinally, and does not require any wires. However, current actigraphy-based scratch-detection methods are limited in their ability to discriminate scratch from other nighttime activities. Our previous work demonstrated the separability of scratch from both walking and restless sleep using a clustering technique which employed four features derived from the actigraphic data: number of accelerations above 0.01 gs, epoch variance, peak frequency, and autocorrelation value at one lag. In this paper, we extended these results by employing these same features as independent variables in a logistic regression model. This allows us to directly estimate the conditional probability of scratching for each epoch. Our approach outperforms competing actigraphy-based approaches and has both high sensitivity (0.96) and specificity (0.92) for identifying scratch as validated on experimental data collected from 12 healthy subjects. The model must still be fully validated on clinical data, but shows promise for applications to clinical trials and longitudinal studies of scratch.
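
    Sensitivity and specificity figures such as those quoted above come from thresholding the per-epoch probabilities and counting agreements with the reference labels. The following sketch shows that bookkeeping; the labels, probabilities and the 0.5 threshold are hypothetical and are not the study's data or settings.

```python
def sensitivity_specificity(labels, probabilities, threshold=0.5):
    """Return (sensitivity, specificity) after thresholding predicted probabilities."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical epochs: 1 = scratch, 0 = other activity.
epoch_labels = [1, 1, 1, 0, 0]
epoch_probabilities = [0.9, 0.8, 0.3, 0.2, 0.6]
sens, spec = sensitivity_specificity(epoch_labels, epoch_probabilities)
```

    Lowering the threshold trades specificity for sensitivity, which is why both numbers are usually reported together.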

  14. Using Logistic Regression to Identify Risk Factors Causing Rollover Collisions

    Directory of Open Access Journals (Sweden)

    Essam Dabbour

    2012-12-01

    Full Text Available Rollover collisions are among the most serious collisions that usually result in severe injuries or fatalities. In 2009, there were 8,732 fatal rollover collisions in the United States of America that resulted in the death of 9,833 persons. Those numbers represent approximately 28% and 29% of the total numbers of fatal collisions and fatalities, respectively. The main objective of this paper is to examine the impact of different risk factors that may contribute to this type of serious collisions to help develop countermeasures that limit them. To avoid the bias that may be caused by interactions among different drivers, this analysis focuses on rollover related to single-vehicle collisions so that the behavior of the driver of the collided vehicle can be analyzed more effectively. Logistic regression technique is utilized to analyze single-vehicle rollover collisions that occurred on state and interstate highways in the states of Ohio and Washington in 2009. The results obtained from this analysis have the potential to help decision makers identify different strategies to limit the severity of this type of collisions.

  15. Electronic Commerce Data Mining using Rough Set and Logistic Regression

    Directory of Open Access Journals (Sweden)

    Xiuli Li

    2014-05-01

    Full Text Available Electronic commerce (E-commerce) has gradually become the mainstream of business. There may be some unpredictable but frequent problems, such as delays in shipment and shipping errors, caused by E-commerce participants’ low efficiency. These problems will eventually have a negative impact on the participants’ business. Correct evaluation of the efficiency of E-commerce is an important way to improve operations. This paper introduces the knowledge discovery theory of data mining based on Rough Set Theory (RST) to deal with the vague and inaccurate information about the evaluation of suppliers and to mine the law knowledge that exists between input variables and adverse position. The output of RST is then used as the feature and is delivered to Logistic Regression (LR) to rank the products of an electronic commerce website. The proposed approach, termed RST-LR, is composed of attribute value discretization; filtration of minimum attribute sets; evaluation rules; calculation of the ranking accuracy; and the establishment of evaluation systems. We evaluated the proposed approach on a real-world dataset. The experimental results show that it achieves a high accuracy, and the rule has met the requirements of application

  16. Logistic Regression Models to Forecast Travelling Behaviour in Tripoli City

    Directory of Open Access Journals (Sweden)

    Amiruddin Ismail

    2011-01-01

    Full Text Available Transport modes are very important to the residents of Tripoli, Libya, for their daily trips. However, the total number of private cars and private transport vehicles, namely taxis and microbuses, on the road is increasing and causes many problems such as traffic congestion, accidents, and air and noise pollution. These problems in turn cause other travel-related phenomena such as trip delays, stress and frustration for motorists, which may affect the productivity and efficiency of both workers and students. Delay may also increase travel cost, as well as inefficiency in trip making, compared with public transport users in some Arab cities. Switching to public transport (PT) alternatives such as buses, light rail transit and underground trains could improve travel time and travel costs. A transport study has been carried out in Tripoli City Authority areas among car owners who live in areas with inadequate private transport and poor public transportation services. Relations between factors such as travel time, travel cost, trip purpose and parking cost have been analysed to answer the research questions. Logistic regression has been used to analyse the factors that influence users to switch their trip mode to public transport alternatives.

  17. Achieving Both Valid and Secure Logistic Regression Analysis on Aggregated Data from Different Private Sources

    CERN Document Server

    Hall, Rob; Fienberg, Stephen

    2011-01-01

    Preserving the privacy of individual databases when carrying out statistical calculations has a long history in statistics and has been the focus of much recent attention in machine learning. In this paper, we present a protocol for computing logistic regression when the data are held by separate parties without actually combining information sources by exploiting results from the literature on multi-party secure computation. We provide only the final result of the calculation compared with other methods that share intermediate values and thus present an opportunity for compromise of values in the combined database. Our paper has two themes: (1) the development of a secure protocol for computing the logistic parameters, and a demonstration of its performance in practice, and (2) an amended protocol that speeds up the computation of the logistic function. We illustrate the nature of the calculations and their accuracy using an extract of data from the Current Population Survey divided between two parties.

  18. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh

    Directory of Open Access Journals (Sweden)

    Das Sumonkanti

    2011-11-01

    Full Text Available Abstract. Background: The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition, instead of developing a traditional binary logistic regression (BLR) model, using the data of the Bangladesh Demographic and Health Survey 2004. Methods: Based on the weight-for-age anthropometric index (Z-score), child nutrition status is categorized into three groups: severely undernourished ( Results: All the models determine that age of child, birth interval, mothers' education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were the significant predictors of child malnutrition; however, results of PPOM were more precise than those of other models. Conclusion: These findings clearly justify that OLR models (POM and PPOM) are appropriate to find predictors of malnutrition instead of BLR models.
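
    For reference, the proportional odds model (POM) named above is usually written as a cumulative logit model. The notation below (category index j, cutpoints α_j, coefficient vector β) is a standard textbook form assumed here, not taken from the paper; some software uses the opposite sign for β.

```latex
\operatorname{logit} P(Y \le j \mid \mathbf{x})
  = \log \frac{P(Y \le j \mid \mathbf{x})}{P(Y > j \mid \mathbf{x})}
  = \alpha_j - \mathbf{x}^{\top} \boldsymbol{\beta},
  \qquad j = 1, \dots, J - 1 .
```

    The same β applies at every cutpoint j, which is the proportional odds assumption; a partial proportional odds model (PPOM) relaxes it by letting some coefficients vary with j.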

  19. Sample size determination for logistic regression on a logit-normal distribution.

    Science.gov (United States)

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.
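
    The logit-normal distribution mentioned above can be simulated by drawing a normal variate and passing it through the logistic function. The sketch below illustrates only that construction, not the authors' sample size method; the function name, parameter values and seed are illustrative assumptions.

```python
import math
import random


def sample_logit_normal(n, mu=0.0, sigma=1.0, seed=0):
    """Draw n values whose logit is Normal(mu, sigma**2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z = rng.gauss(mu, sigma)                   # normal draw on the logit scale
        draws.append(1.0 / (1.0 + math.exp(-z)))   # logistic transformation
    return draws


samples = sample_logit_normal(10000)
sample_mean = sum(samples) / len(samples)
```

    Every draw lies strictly between 0 and 1, and with `mu=0` the distribution is symmetric about 0.5.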

  20. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    Science.gov (United States)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.
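
    A ridge estimator for logistic regression adds an L2 penalty on the coefficients to the negative log-likelihood. The sketch below fits it by plain gradient descent on two perfectly collinear predictors; it is a toy illustration under assumed settings (no intercept, hypothetical data, `lam`, `lr` and `n_iter` chosen by hand), not the estimator or data of the article.

```python
import math


def fit_logistic_ridge(X, y, lam=0.0, lr=0.1, n_iter=500):
    """Logistic regression without intercept, with penalty lam / 2 * ||w||^2.

    Fitted by gradient descent from a zero start; lam=0 gives the
    unpenalized fit.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        resid = []
        for row, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            resid.append(1.0 / (1.0 + math.exp(-z)) - yi)
        # Gradient of the mean log-loss plus the ridge term lam * w.
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n + lam * w[j]
                for j in range(d)]
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Hypothetical data with two perfectly collinear predictors (identical columns).
x_values = [-2.0, -1.0, -1.0, 1.0, 1.0, 2.0]
X_collinear = [[x, x] for x in x_values]
y_collinear = [0, 0, 1, 0, 1, 1]

w_unpenalized = fit_logistic_ridge(X_collinear, y_collinear, lam=0.0)
w_ridge = fit_logistic_ridge(X_collinear, y_collinear, lam=1.0)
```

    With the penalty switched on, the coefficients are shrunk toward zero relative to the unpenalized fit and are shared evenly between the duplicated columns.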

  1. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    Science.gov (United States)

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…
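
    A logistic person response function of the kind described above can be sketched as a logistic curve in the item location, with a person-specific intercept and slope. The parameterization and the parameter values below are assumptions for illustration; the abstract does not give the exact form used by Reise (2000) or by the authors.

```python
import math


def person_response_probability(item_location, intercept, slope):
    """Logistic PRF: P(correct) for one person as a function of item location."""
    z = intercept + slope * item_location
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical person parameters; a negative slope means items with larger
# locations (harder items) are answered correctly less often.
locations = [-2.0, -1.0, 0.0, 1.0, 2.0]
prf_curve = [person_response_probability(loc, intercept=0.0, slope=-1.2)
             for loc in locations]
```

    Under this form, a slope near zero yields a flat curve, meaning the person's responses barely depend on item location, which is why the slope can serve as a person-fit measure.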

  3. Logistics Management In Nigeria: Some Survey Results | Ojadi ...

    African Journals Online (AJOL)

    Logistics Management In Nigeria: Some Survey Results. ... During the last few years the word logistics has become a more frequently used word in the business ... materials management and distribution processes into a logistics supply chain.

  4. A Comparative Study of Cox Regression vs. Log-Logistic ...

    African Journals Online (AJOL)

    Journal of Medical and Biomedical Sciences ... using non-parametric Cox model and parametric Log-logistic model, factors influencing survival of ... colorectal cancer referred to Taleghani Medical and Training Center of Tehran between 2001 ...

  5. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    Science.gov (United States)

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  6. What Are the Odds of that? A Primer on Understanding Logistic Regression

    Science.gov (United States)

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  7. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  8. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Science.gov (United States)

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  11. A Bayesian goodness of fit test and semiparametric generalization of logistic regression with measurement data.

    Science.gov (United States)

    Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E

    2013-06-01

    Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework.
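The link between a logistic sampling model for the continuous response and logistic regression of the dichotomized endpoint can be seen in a short derivation. This is a sketch under assumed notation (response $Y$, covariates $x$, coefficients $\beta$, scale $s$, cutoff $c$); none of these symbols comes from the abstract.

```latex
Y = x^\top\beta + s\,\varepsilon, \qquad \varepsilon \sim \mathrm{Logistic}(0,1)
\;\Longrightarrow\;
P(Y > c \mid x) = \frac{1}{1 + \exp\{-(x^\top\beta - c)/s\}},
\qquad
\operatorname{logit} P(Y > c \mid x) = \frac{x^\top\beta - c}{s}.
```

Under this assumption the cutoff $c$ shifts only the intercept, so the odds ratio for a unit change in a covariate $x_j$ is $\exp(\beta_j/s)$ at every cutoff. When the data are not logistic, this invariance generally fails, which is the proportional odds violation the abstract describes.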

  12. Application of ordinal logistic regression analysis in determining risk factors of child malnutrition in Bangladesh.

    Science.gov (United States)

    Das, Sumonkanti; Rahman, Rajwanur M

    2011-11-14

    The study attempts to develop an ordinal logistic regression (OLR) model to identify the determinants of child malnutrition instead of developing traditional binary logistic regression (BLR) model using the data of Bangladesh Demographic and Health Survey 2004. Based on weight-for-age anthropometric index (Z-score) child nutrition status is categorized into three groups-severely undernourished (malnutrition and severe malnutrition if the proportional odds assumption satisfies. The assumption is satisfied with low p-value (0.144) due to violation of the assumption for one co-variate. So partial proportional odds model (PPOM) and two BLR models have also been developed to check the applicability of the OLR model. Graphical test has also been adopted for checking the proportional odds assumption. All the models determine that age of child, birth interval, mothers' education, maternal nutrition, household wealth status, child feeding index, and incidence of fever, ARI & diarrhoea were the significant predictors of child malnutrition; however, results of PPOM were more precise than those of other models. These findings clearly justify that OLR models (POM and PPOM) are appropriate to find predictors of malnutrition instead of BLR models.

  13. GIS-based logistic regression method for landslide susceptibility mapping in regional scale

    Institute of Scientific and Technical Information of China (English)

    ZHU Lei; HUANG Jing-feng

    2006-01-01

    Landslide susceptibility mapping is a field of study that portrays the spatial distribution of future slope failure susceptibility. This paper reviews past methods for producing landslide susceptibility maps and divides these methods into three types. The logistic linear regression approach is further elaborated on by the crosstabs method, which is used to analyze the relationship between the categorical or binary response variable and one or more continuous, categorical, or binary explanatory variables derived from samples. It assigns coefficients objectively, and these serve as weights of the various factors under consideration, whereas expert opinion makes a great difference in heuristic approaches. Unlike the deterministic approach, it is well suited to the regional scale. In this study, double logistic regression is applied in the study area. The entire study area is first analyzed. The logistic regression equation showed that elevation and proximity to roads, rivers, and residential areas are the main factors triggering landslide occurrence in this area. The prediction accuracy of the first landslide susceptibility map was shown to be 80%. Along roads and residential areas, almost all areas fall in the high landslide susceptibility zone. Some non-landslide areas are incorrectly assigned to the high and medium landslide susceptibility zones. To improve on this, a second logistic regression was run in the high landslide susceptibility zone using landslide cells and non-landslide sample cells in this area. In the second logistic regression analysis, only engineering and geological conditions are important in these areas and are entered in the new logistic regression equation, indicating that only areas with unstable engineering and geological conditions are prone to landslides during large-scale engineering activity. Taking these two logistic regression results into account yields a new landslide susceptibility map. Double logistic regression analysis improved the non

  14. Predicting research use in a public health policy environment: results of a logistic regression analysis.

    Science.gov (United States)

    Zardo, Pauline; Collie, Alex

    2014-10-09

    Use of research evidence in public health policy decision-making is affected by a range of contextual factors operating at the individual, organisational and external levels. Context-specific research is needed to target and tailor research translation intervention design and implementation to ensure that factors affecting research in a specific context are addressed. Whilst such research is increasing, there remain relatively few studies that have quantitatively assessed the factors that predict research use in specific public health policy environments. A quantitative survey was designed and implemented within two public health policy agencies in the Australian state of Victoria. Binary logistic regression analyses were conducted on survey data provided by 372 participants. Univariate logistic regression analyses of 49 factors revealed 26 factors that significantly predicted research use independently. The 26 factors were then tested in a single model and five factors emerged as significant predictors of research over and above all other factors. The five key factors that significantly predicted research use were the following: relevance of research to day-to-day decision-making, skills for research use, internal prompts for use of research, intention to use research within the next 12 months and the agency for which the individual worked. These findings suggest that individual- and organisational-level factors are the critical factors to target in the design of interventions aiming to increase research use in this context. In particular, relevance of research and skills for research use would be necessary to target. The likelihood for research use increased 11- and 4-fold for those who rated highly on these factors. This study builds on previous research and contributes to the currently limited number of quantitative studies that examine use of research evidence in a large sample of public health policy and program decision-makers within a specific context. 

  15. Logistic regression models for polymorphic and antagonistic pleiotropic gene action on human aging and longevity

    DEFF Research Database (Denmark)

    Tan, Qihua; Bathum, L; Christiansen, L

    2003-01-01

    In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...

  16. Logistic Regression Analysis on Factors Affecting Adoption of Rice-Fish Farming in North Iran

    Institute of Scientific and Technical Information of China (English)

    Seyyed Ali NOORHOSSEINI-NIYAKI; Mohammad Sadegh ALLAHYARI

    2012-01-01

    We evaluated the factors influencing the adoption of rice-fish farming in the Tavalesh region near the Caspian Sea in northern Iran. We conducted a survey with open-ended questions. Data were collected from 184 respondents (61 adopters and 123 non-adopters) randomly sampled from selected villages and analyzed using logistic regression and multiresponse analysis. Family size, number of contacts with an extension agent, participation in extension-education activities, membership in social institutions and the presence of farm workers were the most important socioeconomic factors for the adoption of the rice-fish farming system. In addition, economic problems were the most common issue reported by adopters. Other issues such as lack of access to appropriate fish food, losses of fish, lack of access to high quality fish fingerlings, and dehydration and poor water quality were also important to a number of farmers.

  17. A simulation study of sample size for multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Moineddin Rahim

    2007-07-01

    Background: Many studies conducted in health and social sciences collect individual level data as outcome measures. Usually, such data have a hierarchical structure, with patients clustered within physicians, and physicians clustered within practices. Large survey data, including national surveys, have a hierarchical or clustered structure; respondents are naturally clustered in geographical units (e.g., health regions) and may be grouped into smaller units. Outcomes of interest in many fields not only reflect continuous measures, but also binary outcomes such as depression, presence or absence of a disease, and self-reported general health. In the framework of multilevel studies an important problem is calculating an adequate sample size that generates unbiased and accurate estimates. Methods: In this paper simulation studies are used to assess the effect of varying sample size at both the individual and group level on the accuracy of the estimates of the parameters and variance components of multilevel logistic regression models. In addition, the influence of prevalence of the outcome and the intra-class correlation coefficient (ICC) is examined. Results: The results show that the estimates of the fixed effect parameters are unbiased for 100 groups with group size of 50 or higher. The estimates of the variance covariance components are slightly biased even with 100 groups and group size of 50. The biases for both fixed and random effects are severe for group size of 5. The standard errors for fixed effect parameters are unbiased, while those for variance covariance components are underestimated. Results suggest that low prevalent events require larger sample sizes with at least a minimum of 100 groups and 50 individuals per group. Conclusion: We recommend using a minimum group size of 50 with at least 50 groups to produce valid estimates for multi-level logistic regression models. 
Group size should be adjusted under conditions where the prevalence
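For a random-intercept logistic model, the intra-class correlation coefficient mentioned above is commonly computed on the latent (logit) scale, where the individual-level residual variance is fixed at π²/3. The following is a minimal sketch of that calculation; the function name and the example variance are illustrative assumptions, not values from the study.

```python
import math


def latent_icc(random_intercept_variance: float) -> float:
    """ICC on the latent (logit) scale for a random-intercept logistic model."""
    if random_intercept_variance < 0:
        raise ValueError("variance must be non-negative")
    # Variance of the standard logistic distribution (level-1 residual).
    residual_variance = math.pi ** 2 / 3
    return random_intercept_variance / (random_intercept_variance + residual_variance)


# Hypothetical between-group variance, chosen only for illustration.
icc = latent_icc(0.5)
print(round(icc, 3))
```

A larger between-group variance gives a larger ICC, which is one reason simulation designs of this kind vary it alongside the number and size of groups.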

  18. Logistic regression models of factors influencing the location of bioenergy and biofuels plants

    Science.gov (United States)

    T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu

    2011-01-01

    Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...

  19. A hybrid model using logistic regression and wavelet transformation to detect traffic incidents

    Directory of Open Access Journals (Sweden)

    Shaurya Agarwal

    2016-07-01

    This research paper investigates a hybrid model using logistic regression with a wavelet-based feature extraction for detecting traffic incidents. A logistic regression model is suitable when the outcome can take only a limited number of values. For traffic incident detection, the outcome is limited to only two values, the presence or absence of an incident. The logistic regression model used in this study is a generalized linear model (GLM) with a binomial response and a logit link function. This paper presents a framework to use logistic regression and wavelet-based feature extraction for traffic incident detection. It investigates the effect of preprocessing data on the performance of incident detection models. Results of this study indicate that logistic regression along with wavelet-based feature extraction can be used effectively for incident detection by balancing the incident detection rate and the false alarm rate according to need. Logistic regression on raw data resulted in a maximum detection rate of 95.4% at the cost of a 14.5% false alarm rate, whereas the hybrid model achieved a maximum detection rate of 98.78% at the expense of a 6.5% false alarm rate. Results indicate that the proposed approach is practical and efficient; with future improvements, the proposed technique could become an effective tool for traffic incident detection.
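The balance between detection rate and false alarm rate described above comes from moving the probability cut-off applied to the model's output. The sketch below illustrates the idea on made-up scores. The function name, the data, and the definition of false alarm rate as false positives over all non-incident cases are assumptions for illustration; the paper may define the rate differently.

```python
def detection_and_false_alarm(probs, labels, threshold):
    """Return (detection rate, false alarm rate) for a probability cut-off."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        flagged = p >= threshold
        if flagged and y == 1:
            tp += 1
        elif flagged and y == 0:
            fp += 1
        elif not flagged and y == 1:
            fn += 1
        else:
            tn += 1
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_alarm_rate


# Hypothetical predicted incident probabilities and true labels (1 = incident).
probs = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

for cutoff in (0.7, 0.5, 0.15):
    print(cutoff, detection_and_false_alarm(probs, labels, cutoff))
```

Lowering the cut-off catches more incidents but raises more false alarms, so the operating point is a choice made according to need.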

  20. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    Science.gov (United States)

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
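Two of the evaluation measures listed above, the area under the ROC curve and Youden's index, are simple to compute by hand. The sketch below uses the pairwise-comparison form of the AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half). The function names and the data are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Hypothetical predicted risks and observed outcomes (1 = died).
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(roc_auc(scores, labels))
print(youden_index(0.8, 0.9))
```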

  1. Spatial modelling of periglacial phenomena in Deception Island (Maritime Antarctic): logistic regression and informative value method.

    Science.gov (United States)

    Melo, Raquel; Vieira, Gonçalo; Caselli, Alberto; Ramos, Miguel

    2010-05-01

    Field surveying during the austral summer of 2007/08 and the analysis of a QuickBird satellite image resulted in the production of a detailed geomorphological map of the Irizar and Crater Lake area in Deception Island (South Shetlands, Maritime Antarctic - 1:10 000) and allowed the analysis and spatial modelling of the geomorphological phenomena. The present study focuses on the analysis of the spatial distribution and characteristics of hummocky terrains, lag surfaces and nivation hollows, complemented by GIS spatial modelling intended to identify relevant controlling geographical factors. Models of the susceptibility of occurrence of these phenomena were created using two statistical methods: logistic regression, as a multivariate method, and the informative value, as a bivariate method. Success and prediction rate curves were used for model validation. The Area Under the Curve (AUC) was used to quantify the level of performance and prediction of the models and to allow the comparison between the two methods. Regarding the logistic regression method, the AUC showed a success rate of 71% for the lag surfaces, 81% for the hummocky terrains and 78% for the nivation hollows. The prediction rate was 72%, 68% and 71%, respectively. Concerning the informative value method, the success rate was 69% for the lag surfaces, 84% for the hummocky terrains and 78% for the nivation hollows, with corresponding prediction rates of 71%, 66% and 69%. The results were of very good quality and demonstrate the potential of the models to predict the influence of independent variables in the occurrence of the geomorphological phenomena and also the reliability of the data. Key-words: present-day geomorphological dynamics, detailed geomorphological mapping, GIS, spatial modelling, Deception Island, Antarctic.

  2. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    Science.gov (United States)

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The impact on the exhibits during certain pollution scenarios (environmental impact) was predicted by a mathematical model based on binary logistic regression; the model identifies, from a multitude of possible environmental parameters, those with a significant impact on exhibits and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest were used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was run on 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four have a significant effect upon exhibits, in the following order: O3>PM2.5>NO2>humidity, followed at a considerable distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decision process regarding the preventive preservation measures that should be implemented within the exhibition space.

  3. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    Science.gov (United States)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that the assumptions are valid, which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in the family and by the interaction between a student's ethnicity and routine meal intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
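The statement that the log odds is a linear combination of the predictors can be made concrete with a few lines of code. This is a minimal sketch; the intercept, the coefficient, and the single binary predictor (a family-history indicator) are hypothetical values chosen for illustration and are not estimates from the study.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


# Hypothetical model: one binary predictor with coefficient 1.2.
intercept, coefficients = -2.0, [1.2]
p_without = predicted_probability(intercept, coefficients, [0])
p_with = predicted_probability(intercept, coefficients, [1])

# The ratio of the two odds equals exp(coefficient), the odds ratio.
print(odds(p_with) / odds(p_without), math.exp(1.2))
```

Exponentiating a coefficient gives the odds ratio for a one-unit change in that predictor, which is how statements such as "the odds are higher for students with a family history" are read off a fitted model.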

  4. Comparison of artificial neural networks with logistic regression for detection of obesity.

    Science.gov (United States)

    Heydari, Seyed Taghi; Ayatollahi, Seyed Mohammad Taghi; Zare, Najaf

    2012-08-01

    Obesity is a common problem in nutrition, both in the developed and developing countries. The aim of this study was to classify obesity by artificial neural networks and logistic regression. This cross-sectional study comprised 414 healthy military personnel in southern Iran. All subjects completed questionnaires on their socio-economic status, and their anthropometric measures were taken by a trained nurse. Classification of obesity was done by artificial neural networks and logistic regression. The mean age±SD of participants was 34.4 ± 7.5 years. A total of 187 (45.2%) were obese. For logistic regression and neural networks, respectively, 80.2% and 81.2% of subjects were correctly classified, sensitivity was 80.2 and 79.7, and specificity was 81.9 and 83.7; the area under the Receiver-Operating Characteristic (ROC) curve was 0.888 and 0.884, and the Kappa statistic was 0.600 and 0.629. We conclude that neural networks and logistic regression were both good classifiers for obesity detection, and that they were not significantly different in classification.
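The comparison measures reported above (percent correctly classified, sensitivity, specificity and the Kappa statistic) all derive from a 2×2 confusion matrix. The sketch below shows one way to compute them. The function name and the cell counts are hypothetical and do not reproduce the study's data.

```python
def classification_summary(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from 2x2 counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1.0 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only.
summary = classification_summary(tp=40, fn=10, fp=5, tn=45)
print(summary)
```

Kappa corrects raw accuracy for chance agreement, which is why it can differ between two classifiers whose accuracy is nearly identical.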

  5. Combining the Performance Strengths of the Logistic Regression and Neural Network Models: A Medical Outcomes Approach

    Directory of Open Access Journals (Sweden)

    Wun Wong

    2003-01-01

    The assessment of medical outcomes is important in the effort to contain costs, streamline patient management, and codify medical practices. As such, it is necessary to develop predictive models that will make accurate predictions of these outcomes. The neural network methodology has often been shown to perform as well as, if not better than, the logistic regression methodology in terms of sample predictive performance. However, the logistic regression method is capable of providing an explanation regarding the relationship(s) between variables. This explanation is often crucial to understanding the clinical underpinnings of the disease process. Given the respective strengths of the methodologies in question, the combined use of a statistical (i.e., logistic regression) and machine learning (i.e., neural network) technology in the classification of medical outcomes is warranted under appropriate conditions. The study discusses these conditions and describes an approach for combining the strengths of the models.

  6. Fuzzy multinomial logistic regression analysis: A multi-objective programming approach

    Science.gov (United States)

    Abdalla, Hesham A.; El-Sayed, Amany A.; Hamed, Ramadan

    2017-05-01

    Parameter estimation for multinomial logistic regression is usually based on maximizing the likelihood function. For large well-balanced datasets, Maximum Likelihood (ML) estimation is a satisfactory approach. Unfortunately, ML can fail completely or at least produce poor results in terms of estimated probabilities and confidence intervals of parameters, especially for small datasets. In this study, a new approach based on fuzzy concepts is proposed to estimate the parameters of the multinomial logistic regression. The study assumes that the parameters of multinomial logistic regression are fuzzy. Based on the extension principle stated by Zadeh and Bárdossy's proposition, a multi-objective programming approach is suggested to estimate these fuzzy parameters. A simulation study is used to evaluate the performance of the new approach versus the Maximum Likelihood (ML) approach. Results show that the newly proposed model outperforms ML in cases of small datasets.

  7. Using the Logistic Regression model in supporting decisions of establishing marketing strategies

    Directory of Open Access Journals (Sweden)

    Cristinel CONSTANTIN

    2015-12-01

    This paper presents instrumental research on the use of the Logistic Regression model for data analysis in marketing research. Decision makers in different organisations need relevant information to support their decisions regarding marketing strategies. The data provided by marketing research can be processed in various ways, but multivariate data analysis models can enhance the utility of the information. Among these models is the Logistic Regression model, which is used for dichotomous variables. Our research explains the utility of this model and the interpretation of the resulting information, in order to help practitioners and researchers use it in their future investigations.

  8. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    Science.gov (United States)

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  9. No rationale for 1 variable per 10 events criterion for binary logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Maarten van Smeden

    2016-11-01

    Background: Ten events per variable (EPV) is a widely advocated minimal criterion for sample size considerations in logistic regression analysis. Of three previous simulation studies that examined this minimal EPV criterion, only one supports the use of a minimum of 10 EPV. In this paper, we examine the reasons for substantial differences between these extensive simulation studies. Methods: The current study uses Monte Carlo simulations to evaluate small sample bias, coverage of confidence intervals and mean square error of logit coefficients. Logistic regression models fitted by maximum likelihood and a modified estimation procedure, known as Firth’s correction, are compared. Results: The results show that besides EPV, the problems associated with low EPV depend on other factors such as the total sample size. It is also demonstrated that simulation results can be dominated by even a few simulated data sets for which the prediction of the outcome by the covariates is perfect (‘separation’). We reveal that different approaches for identifying and handling separation lead to substantially different simulation results. We further show that Firth’s correction can be used to improve the accuracy of regression coefficients and alleviate the problems associated with separation. Conclusions: The current evidence supporting EPV rules for binary logistic regression is weak. Given our findings, there is an urgent need for new research to provide guidance for supporting sample size considerations for binary logistic regression analysis.
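The two quantities at the centre of this abstract, events per variable and separation, can be illustrated with a short sketch. EPV is taken here as the size of the smaller outcome group divided by the number of candidate predictors, which is an assumed convention. The separation check covers only the simplest case, complete separation by a single covariate, whereas separation can also arise along a combination of covariates. All names and data are hypothetical.

```python
def events_per_variable(outcomes, n_predictors):
    """EPV: size of the smaller outcome group per candidate predictor."""
    events = sum(1 for y in outcomes if y == 1)
    non_events = len(outcomes) - events
    return min(events, non_events) / n_predictors


def completely_separated(x, outcomes):
    """True if a single covariate splits the two outcome groups perfectly."""
    group0 = [xi for xi, y in zip(x, outcomes) if y == 0]
    group1 = [xi for xi, y in zip(x, outcomes) if y == 1]
    if not group0 or not group1:
        return True  # only one outcome observed: nothing to estimate
    return max(group0) < min(group1) or max(group1) < min(group0)


# Hypothetical sample: 30 events, 70 non-events, 5 candidate predictors.
outcomes = [1] * 30 + [0] * 70
print(events_per_variable(outcomes, 5))

print(completely_separated([1, 2, 3, 4], [0, 0, 1, 1]))
print(completely_separated([1, 3, 2, 4], [0, 0, 1, 1]))
```

When separation occurs, ordinary maximum likelihood estimates diverge, which is why the way a simulation study detects and handles such data sets can change its conclusions.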

  10. [Evaluation of wall configuration ultrasonogrophicin diagnosis of thyroid small nodules using binary logistic regression].

    Science.gov (United States)

    Fu, Qiaomei; Wu, Pengxi; Ding, Yan

    2015-10-01

    To screen the sonographic features for the differential diagnosis of benign and malignant small thyroid nodules (≤ 1.0 cm) by logistic regression analysis, to establish a binary logistic regression model with the sonographic features as independent variables, and to investigate the value of the ultrasonographic wall configuration of nodules in the differential diagnosis of benign and malignant small thyroid nodules. A total of 208 thyroid nodules ≤ 1.0 cm in diameter in 190 patients were evaluated. By postoperative pathological examination or fine needle aspiration biopsy, 106 nodules were confirmed as benign and 102 as malignant. Ultrasonic features of the thyroid nodules were evaluated for the differential diagnosis of benign and malignant small thyroid nodules, with pathological diagnosis as the gold standard; a logistic model was obtained, and the odds ratios of the variables were compared. The margin of a thyroid nodule was classified as regular or irregular, and irregular margins were divided further into four subtypes: strip, triangular, antler and papillary. The border was classified as clear, fuzzy or both. The periphery was classified as having normal or abnormal echo. Calcification was classified as no calcification, microcalcification or non-microcalcification. Four statistically significant features were finally obtained by logistic regression analysis: margin, border, periphery and calcification. A formula was constructed by binary logistic regression analysis: probability of malignancy = $1/(1 + e^{-z})$, in which z = 5.026 × margin + 4.218 × border + 4.024 × periphery + 3.892 × calcification - 15.247. The odds ratio of margin was higher than those of the other independent variables. Logistic regression analysis indicates that the calcification, border, periphery, and especially margin of thyroid nodules are significant features for differentiating benign and malignant thyroid nodules. The margin score was more intuitive for the differentiation of
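The formula quoted in the abstract can be evaluated directly. The sketch below uses the published coefficients and intercept. The abstract does not say how each feature is scored, so the 0/1 coding used here (0 = benign-looking, 1 = suspicious) is purely an assumption for illustration, and the output should not be read as a clinical estimate.

```python
import math

# Coefficients and intercept as quoted in the abstract.
COEFFICIENTS = {
    "margin": 5.026,
    "border": 4.218,
    "periphery": 4.024,
    "calcification": 3.892,
}
INTERCEPT = -15.247


def malignancy_probability(scores):
    """Evaluate p = 1 / (1 + exp(-z)) for a dict of feature scores."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * scores[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# ASSUMED coding: every feature scored 0 or 1 (not stated in the abstract).
all_benign = {name: 0 for name in COEFFICIENTS}
all_suspicious = {name: 1 for name in COEFFICIENTS}
print(malignancy_probability(all_benign))
print(malignancy_probability(all_suspicious))
```

Because margin carries the largest coefficient, it also has the largest odds ratio, exp(5.026), consistent with the abstract's remark that margin outweighs the other features.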

  11. Analysing the forward premium anomaly using a Logistic Smooth Transition Regression model.

    OpenAIRE

    Sofiane Amri

    2008-01-01

    Several researchers have suggested that exchange rates may be characterized by nonlinear behaviour. This paper examines these nonlinearities and asymmetries and estimates a Logistic Smooth Transition Regression (LSTR) version of the Fama regression with the risk-adjusted forward premium as the transition variable. Results confirm the existence of nonlinear dynamics in the relationship between the spot exchange rate differential and the forward premium for all the currencies of the sample and for all maturities (three and...

  12. A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena

    DEFF Research Database (Denmark)

    Merlo, J; Chaix, B; Ohlsson, H

    2006-01-01

    STUDY OBJECTIVE: In social epidemiology, it is easy to compute and interpret measures of variation in multilevel linear regression, but technical difficulties exist in the case of logistic regression. The aim of this study was to present measures of variation appropriate for the logistic case...... in a didactic rather than a mathematical way. DESIGN AND PARTICIPANTS: Data were used from the health survey conducted in 2000 in the county of Scania, Sweden, which comprised 10 723 persons aged 18-80 years living in 60 areas. In conducting multilevel logistic regression, different techniques were applied...... propensity areas with the area educational level. The sorting out index was equal to 82%. CONCLUSION: Measures of variation in logistic regression should be promoted in social epidemiological and public health research as efficient means of quantifying the importance of the context of residence...

  13. A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena

    DEFF Research Database (Denmark)

    Merlo, J; Chaix, B; Ohlsson, H;

    2006-01-01

    STUDY OBJECTIVE: In social epidemiology, it is easy to compute and interpret measures of variation in multilevel linear regression, but technical difficulties exist in the case of logistic regression. The aim of this study was to present measures of variation appropriate for the logistic case...... in a didactic rather than a mathematical way. DESIGN AND PARTICIPANTS: Data were used from the health survey conducted in 2000 in the county of Scania, Sweden, which comprised 10 723 persons aged 18-80 years living in 60 areas. In conducting multilevel logistic regression, different techniques were applied...... propensity areas with the area educational level. The sorting out index was equal to 82%. CONCLUSION: Measures of variation in logistic regression should be promoted in social epidemiological and public health research as efficient means of quantifying the importance of the context of residence...

  14. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    Science.gov (United States)

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups according to a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have been published to date. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
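
    The "repeated partitioning of a sample into subgroups" described above can be illustrated with a single partitioning step of a classification tree. This is a minimal sketch, not the paper's procedure: the abstract says only "a certain criterion", so the Gini impurity used here is an assumed (though common) choice, and the variable names and toy data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 outcome labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Find the threshold on one predictor that minimises weighted Gini impurity.

    Returns (threshold, weighted_impurity); threshold is None if no split
    improves on the unsplit sample.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
        if impurity < best[1]:
            best = (threshold, impurity)
    return best


# Hypothetical toy data: a continuous predictor and a binary outcome.
predictor = [1, 2, 3, 4, 20, 25, 30, 40]
outcome = [1, 1, 1, 1, 0, 0, 0, 0]

threshold, impurity = best_split(predictor, outcome)
print(threshold, impurity)
```

    A full tree would apply `best_split` again within each resulting subgroup, over all candidate predictors, until a stopping rule is met.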

  15. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    Science.gov (United States)

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  16. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    Science.gov (United States)

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  17. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Science.gov (United States)

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  18. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  19. Large Scale Identification and Categorization of Protein Sequences Using Structured Logistic Regression

    DEFF Research Database (Denmark)

    Pedersen, Bjørn Panella; Ifrim, Georgiana; Liboriussen, Poul

    2014-01-01

    Abstract Background Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well...

  20. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    Science.gov (United States)

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  1. Fitting multistate transition models with autoregressive logistic regression : Supervised exercise in intermittent claudication

    NARCIS (Netherlands)

    de Vries, S O; Fidler, Vaclav; Kuipers, Wietze D; Hunink, Maria G M

    1998-01-01

    The purpose of this study was to develop a model that predicts the outcome of supervised exercise for intermittent claudication. The authors present an example of the use of autoregressive logistic regression for modeling observed longitudinal data. Data were collected from 329 participants in a six

  2. Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression

    Directory of Open Access Journals (Sweden)

    Li Jian

    2017-01-01

    Objective: to construct a multi-factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: logistic regression techniques were used to screen the risk factors for T2DM and to construct the risk prediction model. Results: the logistic regression equation of the risk prediction model for males was logit(P) = BMI × 0.735 + vegetables × (−0.671) + age × 0.838 + diastolic pressure × 0.296 + physical activity × (−2.287) + sleep × (−0.009) + smoking × 0.214; for females it was logit(P) = BMI × 1.979 + vegetables × (−0.292) + age × 1.355 + diastolic pressure × 0.522 + physical activity × (−2.287) + sleep × (−0.010). For males, the area under the ROC curve was 0.83, the sensitivity 0.72 and the specificity 0.86; for females, the area under the ROC curve was 0.84, the sensitivity 0.75 and the specificity 0.90. Conclusion: the model data come from a nested case-control study, the risk prediction model was established using the well-established logistic regression technique, and the model has high predictive sensitivity, specificity and stability.

  3. A note on Bayesian logistic regression for spatial exponential family Gibbs point processes

    OpenAIRE

    Rajala, Tuomas

    2014-01-01

    Recently, a very attractive logistic regression inference method for exponential family Gibbs spatial point processes was introduced. We combined it with the technique of quadratic tangential variational approximation and derived a new Bayesian technique for analysing spatial point patterns. The technique is described in detail, and demonstrated on numerical examples.

  4. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    Science.gov (United States)

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  5. Macrobenthic species response surfaces along estuarine gradients: prediction by logistic regression

    NARCIS (Netherlands)

    Ysebaert, T.; Meire, P.; Herman, P.M.J.; Verbeek, H.

    2002-01-01

    This study aims at contributing to the development of statistical models to predict macrobenthic species response to environmental conditions in estuarine ecosystems. Ecological response surfaces are derived for 10 estuarine macrobenthic species. Logistic regression is applied on a large data set, p

  6. Odds Ratio, Delta, ETS Classification, and Standardization Measures of DIF Magnitude for Binary Logistic Regression

    Science.gov (United States)

    Monahan, Patrick O.; McHorney, Colleen A.; Stump, Timothy E.; Perkins, Anthony J.

    2007-01-01

    Previous methodological and applied studies that used binary logistic regression (LR) for detection of differential item functioning (DIF) in dichotomously scored items either did not report an effect size or did not employ several useful measures of DIF magnitude derived from the LR model. Equations are provided for these effect size indices.…

  7. Comparison of a Bayesian Network with a Logistic Regression Model to Forecast IgA Nephropathy

    Directory of Open Access Journals (Sweden)

    Michel Ducher

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n=155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation.

  8. Comparison of a Bayesian network with a logistic regression model to forecast IgA nephropathy.

    Science.gov (United States)

    Ducher, Michel; Kalbacher, Emilie; Combarnous, François; Finaz de Vilaine, Jérome; McGregor, Brigitte; Fouque, Denis; Fauvel, Jean Pierre

    2013-01-01

    Models are increasingly used in clinical practice to improve the accuracy of diagnosis. The aim of our work was to compare a Bayesian network to logistic regression to forecast IgA nephropathy (IgAN) from simple clinical and biological criteria. Retrospectively, we pooled the results of all biopsies (n = 155) performed by nephrologists in a specialist clinical facility between 2002 and 2009. Two groups were constituted at random. The first subgroup was used to determine the parameters of the models adjusted to data by logistic regression or Bayesian network, and the second was used to compare the performances of the models using receiver operating characteristics (ROC) curves. IgAN was found (on pathology) in 44 patients. Areas under the ROC curves provided by both methods were highly significant but not different from each other. Based on the highest Youden indices, sensitivity reached (100% versus 67%) and specificity (73% versus 95%) using the Bayesian network and logistic regression, respectively. A Bayesian network is at least as efficient as logistic regression to estimate the probability of a patient suffering IgAN, using simple clinical and biological data obtained during consultation.
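
    The Youden index used above to pick the operating point is J = sensitivity + specificity − 1. As a rough illustration, the sketch below computes J from the sensitivity and specificity pairs reported in the abstract; the J values themselves are derived here and are not reported by the authors, and the function name is hypothetical.

```python
def youden_index(sensitivity, specificity):
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract (as fractions).
j_bayesian_network = youden_index(1.00, 0.73)
j_logistic_regression = youden_index(0.67, 0.95)

print(f"Bayesian network:    J = {j_bayesian_network:.2f}")
print(f"Logistic regression: J = {j_logistic_regression:.2f}")
```

    In practice J is computed at every candidate threshold along the ROC curve and the threshold with the largest J is retained, which is what "based on the highest Youden indices" refers to.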

  9. Using ROC curves to compare neural networks and logistic regression for modeling individual noncatastrophic tree mortality

    Science.gov (United States)

    Susan L. King

    2003-01-01

    The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...

  10. Logistic regression model and its application

    Institute of Scientific and Technical Information of China (English)

    常振海; 刘薇

    2012-01-01

    To improve the forecasting accuracy for a multinomial qualitative dependent variable using the logistic model, a three-category logistic model is established for actual statistical data on the basis of the binary logistic regression model. The significance of the independent variables is tested with the likelihood ratio test, and non-significant variables are removed. A linear regression function is determined for each category of the dependent variable, and the models are tested. The analysis results show that the logistic regression model has good predictive accuracy and practical value in regression analysis with a qualitative dependent variable.
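
    A three-category logistic model with one linear function per category can be sketched as follows. This is an assumed parameterization (a softmax over the three linear scores); the paper may instead fix one category as the baseline, and all weights, intercepts and inputs below are hypothetical.

```python
import math


def class_probabilities(x, weights, intercepts):
    """Turn one linear score per category into probabilities that sum to 1."""
    scores = [
        b + sum(w_i * x_i for w_i, x_i in zip(w, x))
        for w, b in zip(weights, intercepts)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# ASSUMPTION: hypothetical coefficients for three categories, two predictors.
weights = [[0.0, 0.0], [1.0, -1.0], [-1.0, 1.0]]
intercepts = [0.0, 0.0, 0.0]

probs = class_probabilities([2.0, 0.0], weights, intercepts)
predicted = probs.index(max(probs))
print(probs, predicted)
```

    The predicted category is simply the one with the largest probability, so prediction accuracy can be assessed by comparing `predicted` with the observed category.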

  11. The cross-validated AUC for MCP-logistic regression with high-dimensional data.

    Science.gov (United States)

    Jiang, Dingfeng; Huang, Jian; Zhang, Ying

    2013-10-01

    We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and to compare it with existing methods, including the Akaike information criterion (AIC), Bayesian information criterion (BIC) and Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.
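
    To make the AUC part of this criterion concrete, the sketch below computes the AUC as the fraction of (positive, negative) pairs ranked correctly, then averages it over held-out folds. It is a simplified illustration: averaging fold-wise AUCs is only one common definition and may differ from the authors' CV-AUC, the MCP fitting step is not shown, and the scores and labels are hypothetical.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_fold_auc(folds):
    """Average held-out AUC over (scores, labels) pairs, one pair per fold."""
    return sum(auc(scores, labels) for scores, labels in folds) / len(folds)


# Hypothetical held-out predictions from two cross-validation folds.
fold_1 = ([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
fold_2 = ([0.2, 0.9], [0, 1])

auc_1 = auc(*fold_1)
cv_auc = mean_fold_auc([fold_1, fold_2])
print(auc_1, cv_auc)
```

    For tuning, a quantity like `cv_auc` would be computed for each candidate penalty value and the value with the largest result kept.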

  12. Risk Factors of Falls in Community-Dwelling Older Adults: Logistic Regression Tree Analysis

    Science.gov (United States)

    Yamashita, Takashi; Noe, Douglas A.; Bailer, A. John

    2012-01-01

    Purpose of the Study: A novel logistic regression tree-based method was applied to identify fall risk factors and possible interaction effects of those risk factors. Design and Methods: A nationally representative sample of American older adults aged 65 years and older (N = 9,592) in the Health and Retirement Study 2004 and 2006 modules was used.…

  14. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    Science.gov (United States)

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  15. A Generalized Logistic Regression Procedure to Detect Differential Item Functioning among Multiple Groups

    Science.gov (United States)

    Magis, David; Raiche, Gilles; Beland, Sebastien; Gerard, Paul

    2011-01-01

    We present an extension of the logistic regression procedure to identify dichotomous differential item functioning (DIF) in the presence of more than two groups of respondents. Starting from the usual framework of a single focal group, we propose a general approach to estimate the item response functions in each group and to test for the presence…

  16. Predictors of Placement Stability at the State Level: The Use of Logistic Regression to Inform Practice

    Science.gov (United States)

    Courtney, Jon R.; Prophet, Retta

    2011-01-01

    Placement instability is often associated with a number of negative outcomes for children. To gain state level contextual knowledge of factors associated with placement stability/instability, logistic regression was applied to selected variables from the New Mexico Adoption and Foster Care Administrative Reporting System dataset. Predictors…

  17. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    Science.gov (United States)

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  18. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    Science.gov (United States)

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  19. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Science.gov (United States)

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  20. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    Science.gov (United States)

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  1. Risk stratification for prognosis in intracerebral hemorrhage: A decision tree model and logistic regression

    Directory of Open Access Journals (Sweden)

    Gang WU

    2016-01-01

    Objective  To analyze the risk factors for prognosis in intracerebral hemorrhage using a decision tree (classification and regression tree, CART) model and a logistic regression model. Methods  A CART model and a logistic regression model were established according to the risk factors for prognosis of patients with cerebral hemorrhage. The results of the two methods were compared. Results  Logistic regression analyses showed that hematoma volume (OR-value 0.953), initial Glasgow Coma Scale (GCS) score (OR-value 1.210), pulmonary infection (OR-value 0.295), and basal ganglia hemorrhage (OR-value 0.336) were the risk factors for the prognosis of cerebral hemorrhage. The results of CART analysis showed that volume of hematoma and initial GCS score were the main factors affecting the prognosis of cerebral hemorrhage. The performance of the two models in predicting the prognosis of cerebral hemorrhage was similar (Z-value 0.402, P=0.688). Conclusions  The CART model has a similar value to that of the logistic model in judging the prognosis of cerebral hemorrhage; it is characterized by analysis of the interactions between the risk factors, and it is more intuitive. DOI: 10.11855/j.issn.0577-7402.2015.12.13

  2. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    Science.gov (United States)

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. The objective was to explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of

  3. Use of Logistic Regression for Forecasting Short-Term Volcanic Activity

    Directory of Open Access Journals (Sweden)

    Mark T. Woods

    2012-08-01

    An algorithm that forecasts volcanic activity using an event tree decision making framework and logistic regression has been developed, characterized, and validated. The suite of empirical models that drives the system was derived from a sparse and geographically diverse dataset composed of source modeling results, volcano monitoring data, and historic information from analog volcanoes. Bootstrapping techniques were applied to the training dataset to allow for the estimation of robust logistic model coefficients. Probabilities generated from the logistic models increase with positive modeling results, escalating seismicity, and rising eruption frequency. Cross validation yielded a series of receiver operating characteristic curves with areas ranging between 0.78 and 0.81, indicating that the algorithm has good forecasting capabilities. Our results suggest that the logistic models are highly transportable and can compete with, and in some cases outperform, non-transportable empirical models trained with site specific information.

  4. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    As fraudulent financial statements by enterprises become increasingly serious, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study applies logistic regression, support vector machine, and decision tree methods to construct classification models and compares them. The study adopts financial and nonfinancial variables to assist in establishing the forecasting model. The research objects are companies that issued fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance (85.71%).

  5. Predictors of work injury in underground mines: an application of a logistic regression model

    Institute of Scientific and Technical Information of China (English)

    E S. Pau

    2009-01-01

    Mine accidents and injuries are complex and generally characterized by several factors, ranging from personal to technical and from technical to social characteristics. In this study, an attempt has been made to identify the various factors responsible for work related injuries in mines and to estimate the risk of work injury to mine workers. The prediction of work injury in mines was done by step-by-step multivariate logistic regression modeling with an application to case study mines in India. In total, 18 variables were considered in this study. Most of the variables are not directly quantifiable. Instruments were developed to quantify them through a questionnaire type survey. Underground mine workers were randomly selected for the survey. Responses from 300 participants were used for the analysis. Four variables, age, negative affectivity, job dissatisfaction, and physical hazards, bear significant discriminating power for risk of injury to the workers when cases and controls are compared in a multivariate situation while controlling for all the personal and socio-technical variables. The analysis reveals that negatively affected workers are 2.54 times more prone to injuries than the less negatively affected workers and that this factor is a more important risk factor for the case-study mines. Long term planning through identification of the negative individuals, proper counseling regarding the adverse effects of negative behaviors and special training is urgently required. Care should be taken for the aged and experienced workers in terms of their job responsibility and training requirements. Management should provide a friendly atmosphere during work to increase the confidence of the injury prone miners.

  6. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of applying the logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases of cross-applying the logistic regression coefficients to the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  7. MULTIPLE LOGISTIC REGRESSION MODEL TO PREDICT RISK FACTORS OF ORAL HEALTH DISEASES

    Directory of Open Access Journals (Sweden)

    Parameshwar V. Pandit

    2012-06-01

    Full Text Available Purpose: To analyze the dependence of oral health diseases, i.e., dental caries and periodontal disease, on a number of risk factors through the application of a logistic regression model. Method: The cross-sectional study involves a systematic random sample of 1760 individuals with permanent dentition aged between 18 and 40 years in Dharwad, Karnataka, India. Dharwad is situated in North Karnataka. The mean age was 34.26±7.28. The risk factors of dental caries and periodontal disease were established by a multiple logistic regression model using SPSS statistical software. Results: Factors such as frequency of brushing, timing of cleaning teeth and type of toothpaste are significant persistent predictors of dental caries and periodontal disease. For dental caries, the log likelihood value of the full model is -1013.1364 and Akaike’s Information Criterion (AIC) is 1.1752, compared with -1019.8106 and 1.1748, respectively, for the reduced regression model. For periodontal disease, the log likelihood value of the full model is -1085.7876 and the AIC is 1.2577, compared with -1019.8106 and 1.1748, respectively, for the reduced regression model. The area under the Receiver Operating Characteristic (ROC) curve for dental caries is 0.7509 (full model) and 0.7447 (reduced model); for periodontal disease it is 0.6128 (full model) and 0.5821 (reduced model). Conclusions: The frequency of brushing, timing of cleaning teeth and type of toothpaste are the main significant risk factors of dental caries and periodontal disease. The reduced logistic regression model fits slightly better than the full logistic regression model in identifying these risk factors for both dichotomous dental caries and periodontal disease.
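
    The model comparison in this abstract rests on AIC, which trades log likelihood against the number of fitted parameters. The sketch below is only an illustration: the reported AIC values near 1.17 look like AIC divided by the sample size, but the abstract does not say so, and the parameter counts (21 and 14) are hypothetical placeholders, not figures from the paper.

```python
def aic(log_lik, n_params):
    """Akaike's Information Criterion: -2 * log-likelihood + 2 * parameters."""
    return -2.0 * log_lik + 2.0 * n_params


def aic_per_obs(log_lik, n_params, n_obs):
    """AIC scaled by sample size (an assumed convention, not stated in the abstract)."""
    return aic(log_lik, n_params) / n_obs


N_OBS = 1760             # sample size given in the abstract
K_FULL, K_REDUCED = 21, 14   # hypothetical parameter counts, for illustration only

full = aic_per_obs(-1013.1364, K_FULL, N_OBS)        # dental caries, full model
reduced = aic_per_obs(-1019.8106, K_REDUCED, N_OBS)  # dental caries, reduced model

# The model with the smaller AIC is preferred.
preferred = "reduced" if reduced < full else "full"
print(f"full={full:.4f} reduced={reduced:.4f} preferred={preferred}")
```

    Under these assumed counts the reduced model has the marginally smaller value, which is the direction of the conclusion stated in the abstract.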

  8. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    Science.gov (United States)

    Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility using spatial analysis through the application of a statistical methodology based on GIS. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built for the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility map by comparing it with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to build the landslide susceptibility map using the artificial neural network along with logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.

  9. Using probabilities of enterococci exceedance and logistic regression to evaluate long term weekly beach monitoring data.

    Science.gov (United States)

    Aranda, Diana; Lopez, Jose V; Solo-Gabriele, Helena M; Fleisher, Jay M

    2016-02-01

    Recreational water quality surveillance involves comparing bacterial levels to set threshold values to determine beach closure. Bacterial levels can be predicted through models which are traditionally based upon multiple linear regression. The objective of this study was to evaluate exceedance probabilities, as opposed to bacterial levels, as an alternate method to express beach risk. Data were incorporated into a logistic regression for the purpose of identifying environmental parameters most closely correlated with exceedance probabilities. The analysis was based on 7,422 historical sample data points from the years 2000-2010 for 15 South Florida beach sample sites. Probability analyses showed which beaches in the dataset were most susceptible to exceedances. No yearly trends were observed nor were any relationships apparent with monthly rainfall or hurricanes. Results from logistic regression analyses found that among the environmental parameters evaluated, tide was most closely associated with exceedances, with exceedances 2.475 times more likely to occur at high tide compared to low tide. The logistic regression methodology proved useful for predicting future exceedances at a beach location in terms of probability and modeling water quality environmental parameters with dependence on a binary response. This methodology can be used by beach managers for allocating resources when sampling more than one beach.
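
    To make the tide result concrete, the sketch below converts the reported factor of 2.475 into a logistic regression coefficient and into a probability. It assumes that 2.475 is an odds ratio (the abstract says "times more likely" without naming the measure), and the 5% low-tide exceedance probability is a hypothetical value chosen only for illustration.

```python
import math


def prob_from_odds_ratio(p_ref, odds_ratio):
    """Probability in the comparison group, given the reference probability and an odds ratio."""
    odds_ref = p_ref / (1.0 - p_ref)
    odds_new = odds_ref * odds_ratio
    return odds_new / (1.0 + odds_new)


ODDS_RATIO_HIGH_TIDE = 2.475  # reported in the abstract; assumed to be an odds ratio
P_LOW_TIDE = 0.05             # hypothetical baseline exceedance probability

# In a logistic model the coefficient of a binary predictor is the log odds ratio.
beta_tide = math.log(ODDS_RATIO_HIGH_TIDE)
p_high_tide = prob_from_odds_ratio(P_LOW_TIDE, ODDS_RATIO_HIGH_TIDE)

print(f"coefficient = {beta_tide:.4f}")
print(f"P(exceedance | high tide) = {p_high_tide:.4f}")
```

    Note that an odds ratio of 2.475 does not multiply the probability itself by 2.475; the two are close only when the baseline probability is small.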

  10. Bayesian logistic regression in detection of gene–steroid interaction for cancer at PDLIM5 locus

    Indian Academy of Sciences (India)

    KE-SHENG WANG; DANIEL OWUSU; YUE PAN; CHANGCHUN XIE

    2016-06-01

    The PDZ and LIM domain 5 (PDLIM5) gene may play a role in cancer, bipolar disorder, major depression, alcohol dependence and schizophrenia; however, little is known about the interaction effect of steroid and PDLIM5 gene on cancer. This study examined 47 single-nucleotide polymorphisms (SNPs) within the PDLIM5 gene in the Marshfield sample with 716 cancer patients (any diagnosed cancer, excluding minor skin cancer) and 2848 noncancer controls. Multiple logistic regression model in PLINK software was used to examine the association of each SNP with cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4, was used to detect gene–steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P<0.05); in particular, SNP rs6532496 revealed the strongest association with cancer $(P=6.84\times10^{-3})$, while the next best signal was rs951613 $(P=7.46\times10^{-3})$. Classic logistic regression in PROC GENMOD showed that both rs6532496 and rs951613 revealed strong gene–steroid interaction effects (OR=2.18, 95% CI=1.31−3.63 with $P=2.9\times10^{-3}$ for rs6532496 and OR=2.07, 95% CI=1.24−3.45 with $P=5.43\times10^{-3}$ for rs951613, respectively). Results from Bayesian logistic regression showed stronger interaction effects (OR=2.26, 95% CI=1.2−3.38 for rs6532496 and OR=2.14, 95% CI=1.14−3.2 for rs951613, respectively). All the 12 SNPs associated with cancer revealed significant gene–steroid interaction effects (P<0.05), whereas 13 SNPs showed gene–steroid interaction effects without main effect on cancer. SNP rs4634230 revealed the strongest gene–steroid interaction effect (OR=2.49, 95% CI=1.5−4.13 with $P=4.0\times10^{-4}$ based on the classic logistic regression and OR=2.59, 95% CI=1.4−3.97 from Bayesian logistic regression, respectively). This study provides evidence of common genetic variants within the PDLIM5 gene and interactions between PDLIM5 gene

  11. Determination of sex using cephalo-facial dimensions by discriminant function and logistic regression equations

    Directory of Open Access Journals (Sweden)

    Twisha Shah

    2016-06-01

    Full Text Available The aim is to bring together the new anthropological techniques and knowledge about populations that are least known. The present study was performed on 901 healthy Gujarati volunteers (676 males, 225 females) within the age group of 21–50 years, with the aim of examining whether any correlation exists between sex and cephalofacial measures, namely maximum head length, maximum head breadth, bizygomatic breadth, bigonial diameter, morphological facial length, physiognomic facial length, biocular breadth and total cephalofacial height. Also, discriminant function and logistic regression methods were compared to find the best accuracy level for sex determination. Mean values of cephalofacial dimensions were higher in males than in females. The most reliable results were obtained by using logistic regression equations in males (92%) and discriminant function in females (80.9%). Our study conclusively establishes the existence of a definite, statistically significant sexual dimorphism in the Gujarati population using cephalo-facial dimensions.

  12. Comparison of Artificial Neural Networks and Logistic Regression Analysis in the Credit Risk Prediction

    Directory of Open Access Journals (Sweden)

    Hüseyin BUDAK

    2012-11-01

    Full Text Available Credit scoring is a vital topic for banks since there is a need to use limited financial sources more effectively. There are several credit scoring methods that are used by banks. One of them is to estimate whether a credit demanding customer’s repayment order will be regular or not. In this study, artificial neural networks and logistic regression analysis have been used to support the banks’ credit risk prediction and to estimate whether a credit demanding customer’s repayment order will be regular or not. The results of the study showed that the artificial neural network method is more reliable than logistic regression analysis in estimating a credit demanding customer’s repayment order.

  13. Nowcasting of Low-Visibility Procedure States with Ordered Logistic Regression at Vienna International Airport

    Science.gov (United States)

    Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim

    2017-04-01

    Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
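
    An ordered logistic regression produces one probability per ordered class from a single linear predictor and a set of increasing cutpoints. The sketch below shows that mechanism for four classes using the common cumulative logit form P(Y <= k) = sigmoid(c_k - eta); the cutpoints and the linear predictor value are hypothetical and are not taken from the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(eta, cutpoints):
    """Class probabilities under a cumulative logit model.

    eta       -- linear predictor (sum of coefficients times predictors)
    cutpoints -- strictly increasing thresholds c_0 < c_1 < ... (K-1 values for K classes)
    """
    # Cumulative probabilities P(Y <= k); the last class always closes at 1.
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    # Differences of consecutive cumulative probabilities give P(Y = k).
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical values: three cutpoints define four low-visibility classes.
cutpoints = [-1.0, 0.5, 2.0]
probs = ordered_logit_probs(eta=0.3, cutpoints=cutpoints)
for k, p in enumerate(probs):
    print(f"class {k}: {p:.3f}")
```

    Because only eta depends on the predictors, a larger eta shifts probability mass toward the higher classes while the cutpoints stay fixed.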

  14. Assessing the effects of different types of covariates for binary logistic regression

    Science.gov (United States)

    Hamid, Hamzah Abdul; Wah, Yap Bee; Xie, Xian-Jin; Rahman, Hezlin Aryani Abd

    2015-02-01

    It is well known that the type of data distribution in the independent variable(s) may affect many statistical procedures. This paper investigates and illustrates the effect of different types of covariates on the parameter estimation of a binary logistic regression model. A simulation study with different sample sizes and different types of covariates (uniform, normal, skewed) was carried out. Results showed that the parameters of the binary logistic regression model are severely overestimated when the sample size is less than 150 for covariates with normal and uniform distributions, while the parameters are underestimated when the covariate distribution is skewed. Parameter estimation improves for all types of covariates when the sample size is large, that is, at least 500.

  15. Logistic Regression for Prediction and Diagnosis of Bacterial Regrowth in Water Distribution System

    Institute of Scientific and Technical Information of China (English)

    DONG Lihua; ZHAO Xinhua; WU Qing; YANG You'an

    2009-01-01

    This paper focuses on the quantitative expression of bacterial regrowth in water distribution system. Considering public health risks of bacterial regrowth, the experiment was performed on a distribution system of selected area. Physical, chemical, and microbiological parameters such as turbidity, temperature, residual chlorine and pH were measured over a three-month period and correlation analysis was carried out. Combined with principal components analysis (PCA), a logistic regression model is developed to predict and diagnose bacterial regrowth and locate the zones with high risks of microbiology in the distribution system. The model gives the probability of bacterial regrowth with the number of heterotrophic plate counts as the binary response variable and three new principal components variables as the explanatory variables. The veracity of the logistic regression model was 90%, which meets the precision requirement of the model.

  16. Rock-profile correlations through logistic regression; Correlacao rocha-perfil atraves de regressao logistica

    Energy Technology Data Exchange (ETDEWEB)

    Castro, Wagner Barbosa de Mello

    1998-02-01

    Logistic regression models were generated starting from lithofacies described in cores and in well logs for two wells of Campos Basin. The main objective was to verify the applicability of the technique in reservoir geology. The models were used to estimate the occurrence of reservoir facies in the wells. Results obtained were compared to the results of a previous discriminant analysis with the objective of determining the accuracy of the two techniques as tools to estimate reservoir facies. Although discriminant analysis proved more accurate in estimating reservoir facies, the use of logistic regression should not be discarded. Its independence of the normal distribution hypothesis makes this technique, at least in theory, more robust than discriminant analysis. (author)

  17. The use of logistic regression to enhance risk assessment and decision making by mental health administrators.

    Science.gov (United States)

    Menditto, Anthony A; Linhorst, Donald M; Coleman, James C; Beck, Niels C

    2006-04-01

    Development of policies and procedures to contend with the risks presented by elopement, aggression, and suicidal behaviors are long-standing challenges for mental health administrators. Guidance in making such judgments can be obtained through the use of a multivariate statistical technique known as logistic regression. This procedure can be used to develop a predictive equation that is mathematically formulated to use the best combination of predictors, rather than considering just one factor at a time. This paper presents an overview of logistic regression and its utility in mental health administrative decision making. A case example of its application is presented using data on elopements from Missouri's long-term state psychiatric hospitals. Ultimately, the use of statistical prediction analyses tempered with differential qualitative weighting of classification errors can augment decision-making processes in a manner that provides guidance and flexibility while wrestling with the complex problem of risk assessment and decision making.

  18. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    Full Text Available We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First of all, a key indicators extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Secondly, on the basis of the linear weighted utilizing Fisher discriminant, a customer scoring model is established. And then, a customer rating model where the customer number of all ratings follows normal distribution is constructed. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default customer or nondefault customer. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating the qualified customers for the banks and the bond investors.

  19. Efficient methods for estimating constrained parameters with applications to lasso logistic regression.

    Science.gov (United States)

    Tian, Guo-Liang; Tang, Man-Lai; Fang, Hong-Bin; Tan, Ming

    2008-03-15

    Fitting logistic regression models is challenging when their parameters are restricted. In this article, we first develop a quadratic lower-bound (QLB) algorithm for optimization with box or linear inequality constraints and derive the fastest QLB algorithm corresponding to the smallest global majorization matrix. The proposed QLB algorithm is particularly suited to problems to which EM-type algorithms are not applicable (e.g., logistic, multinomial logistic, and Cox's proportional hazards models) while it retains the same EM ascent property and thus assures the monotonic convergence. Secondly, we generalize the QLB algorithm to penalized problems in which the penalty functions may not be totally differentiable. The proposed method thus provides an alternative algorithm for estimation in lasso logistic regression, where the convergence of the existing lasso algorithm is not generally ensured. Finally, by relaxing the ascent requirement, convergence speed can be further accelerated. We introduce a pseudo-Newton method that retains the simplicity of the QLB algorithm and the fast convergence of the Newton method. Theoretical justification and numerical examples show that the pseudo-Newton method is up to 71 (in terms of CPU time) or 107 (in terms of number of iterations) times faster than the fastest QLB algorithm and thus makes bootstrap variance estimation feasible. Simulations and comparisons are performed and three real examples (Down syndrome data, kyphosis data, and colon microarray data) are analyzed to illustrate the proposed methods.
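
    The ascent property mentioned above can be seen in the simplest quadratic lower-bound scheme for logistic regression. The sketch below is not the constrained or penalized algorithm of the article; it is the basic unconstrained version for a single coefficient without intercept, using the standard fact that the logistic curvature p(1 - p) never exceeds 1/4. The data are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def log_lik(beta, xs, ys):
    """Binomial log-likelihood of a one-coefficient logistic model (no intercept)."""
    return sum(y * beta * x - math.log1p(math.exp(beta * x)) for x, y in zip(xs, ys))


def gradient(beta, xs, ys):
    return sum(x * (y - sigmoid(beta * x)) for x, y in zip(xs, ys))


def qlb_step(beta, xs, ys):
    """One quadratic lower-bound update.

    The true curvature sum(x^2 * p * (1 - p)) is replaced by its global
    upper bound sum(x^2) / 4, so every step increases the log-likelihood.
    """
    curvature_bound = sum(x * x for x in xs) / 4.0
    return beta + gradient(beta, xs, ys) / curvature_bound


# Made-up, non-separable toy data.
xs = [-2.0, -1.0, -1.0, 0.0, 0.5, 1.0, 1.0, 2.0]
ys = [0, 0, 1, 0, 1, 1, 0, 1]

beta = 0.0
trace = [log_lik(beta, xs, ys)]
for _ in range(200):
    beta = qlb_step(beta, xs, ys)
    trace.append(log_lik(beta, xs, ys))

print(f"beta = {beta:.4f}, log-likelihood = {trace[-1]:.4f}")
```

    The fixed curvature bound is what makes each step cheap and monotone; the price is slower convergence than Newton's method, which is the gap the article's pseudo-Newton variant is described as closing.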

  20. Predicting Student Success on the Texas Chemistry STAAR Test: A Logistic Regression Analysis

    Science.gov (United States)

    Johnson, William L.; Johnson, Annabel M.; Johnson, Jared

    2012-01-01

    Background: The context is the new Texas STAAR end-of-course testing program. Purpose: The authors developed a logistic regression model to predict who would pass-or-fail the new Texas chemistry STAAR end-of-course exam. Setting: Robert E. Lee High School (5A) with an enrollment of 2700 students, Tyler, Texas. Date of the study was the 2011-2012…

  1. Artificial neural network, genetic algorithm, and logistic regression applications for predicting renal colic in emergency settings.

    Science.gov (United States)

    Eken, Cenker; Bilge, Ugur; Kartal, Mutlu; Eray, Oktay

    2009-06-03

    Logistic regression is the most common statistical model for processing multivariate data in the medical literature. Artificial intelligence models like an artificial neural network (ANN) and genetic algorithm (GA) may also be useful to interpret medical data. The purpose of this study was to perform artificial intelligence models on a medical data sheet and compare to logistic regression. ANN, GA, and logistic regression analysis were carried out on a data sheet of a previously published article regarding patients presenting to an emergency department with flank pain suspicious for renal colic. The study population was composed of 227 patients: 176 patients had a diagnosis of urinary stone, while 51 ultimately had no calculus. The GA found two decision rules in predicting urinary stones. Rule 1 consisted of being male, pain not spreading to back, and no fever. In rule 2, pelvicaliceal dilatation on bedside ultrasonography replaced no fever. ANN, GA rule 1, GA rule 2, and logistic regression had a sensitivity of 94.9, 67.6, 56.8, and 95.5%, a specificity of 78.4, 76.47, 86.3, and 47.1%, a positive likelihood ratio of 4.4, 2.9, 4.1, and 1.8, and a negative likelihood ratio of 0.06, 0.42, 0.5, and 0.09, respectively. The area under the curve was found to be 0.867, 0.720, 0.715, and 0.713 for all applications, respectively. Data mining techniques such as ANN and GA can be used for predicting renal colic in emergency settings and to constitute clinical decision rules. They may be an alternative to conventional multivariate analysis applications used in biostatistics.
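
    The likelihood ratios quoted above follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below recomputes them from the percentages in the abstract; small differences from the published figures can arise from rounding of the reported inputs.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity as reported in the abstract, converted to fractions.
reported = {
    "ANN": (0.949, 0.784),
    "GA rule 1": (0.676, 0.7647),
    "GA rule 2": (0.568, 0.863),
    "Logistic regression": (0.955, 0.471),
}

results = {name: likelihood_ratios(sens, spec) for name, (sens, spec) in reported.items()}
for name, (lr_pos, lr_neg) in results.items():
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```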

  2. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....

  3. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....

  4. A Logistic Regression Model for Predicting Axillary Lymph Node Metastases in Early Breast Carcinoma Patients

    Directory of Open Access Journals (Sweden)

    Jiaqing Zhang

    2012-07-01

    Full Text Available Nodal staging in breast cancer is a key predictor of prognosis. This paper presents the results of potential clinicopathological predictors of axillary lymph node involvement and develops an efficient prediction model to assist in predicting axillary lymph node metastases. Seventy patients with primary early breast cancer who underwent axillary dissection were evaluated. Univariate and multivariate logistic regression were performed to evaluate the association between clinicopathological factors and lymph node metastatic status. A logistic regression predictive model was built from 50 randomly selected patients; the model was also applied to the remaining 20 patients to assess its validity. Univariate analysis showed a significant relationship between lymph node involvement and absence of nm-23 (p = 0.010) and Kiss-1 (p = 0.001) expression. Absence of Kiss-1 remained significantly associated with positive axillary node status in the multivariate analysis (p = 0.018). Seven clinicopathological factors were involved in the multivariate logistic regression model: menopausal status, tumor size, ER, PR, HER2, nm-23 and Kiss-1. The model was accurate and discriminating, with an area under the receiver operating characteristic curve of 0.702 when applied to the validation group. Moreover, there is a need to discover more specific candidate proteins and molecular biology tools to select more variables, which should improve predictive accuracy.

  5. Regularized logistic regression with adjusted adaptive elastic net for gene selection in high dimensional cancer classification.

    Science.gov (United States)

    Algamal, Zakariya Yahya; Lee, Muhammad Hisyam

    2015-12-01

    Cancer classification and gene selection in high-dimensional data have been popular research topics in genetics and molecular biology. Recently, adaptive regularized logistic regression using the elastic net regularization, which is called the adaptive elastic net, has been successfully applied in high-dimensional cancer classification to tackle both estimating the gene coefficients and performing gene selection simultaneously. The adaptive elastic net originally used elastic net estimates as the initial weight; however, using this weight may not be preferable for two reasons. First, the elastic net estimator is biased in selecting genes. Second, it does not perform well when the pairwise correlations between variables are not high. Adjusted adaptive regularized logistic regression (AAElastic) is proposed to address these issues and encourage grouping effects simultaneously. The real data results indicate that AAElastic is significantly consistent in selecting genes compared to the other three competitor regularization methods. Additionally, the classification performance of AAElastic is comparable to the adaptive elastic net and better than other regularization methods. Thus, we can conclude that AAElastic is a reliable adaptive regularized logistic regression method in the field of high-dimensional cancer classification.

  6. Modified Logistic Regression Approaches to Eliminating the Impact of Response Styles on DIF Detection in Likert-Type Scales.

    Science.gov (United States)

    Chen, Hui-Fang; Jin, Kuan-Yu; Wang, Wen-Chung

    2017-01-01

    Extreme response styles (ERS) are prevalent in Likert- or rating-type data, but previous research has not adequately addressed their impact on differential item functioning (DIF) assessments. This study aimed to fill this knowledge gap and examined their influence on the performance of logistic regression (LR) approaches in DIF detection, including the ordinal logistic regression (OLR) and the logistic discriminant functional analysis (LDFA). Results indicated that both the standard OLR and LDFA yielded severely inflated false positive rates as the magnitude of the differences in ERS increased between two groups. This study proposed a class of modified LR approaches to eliminate the ERS effect on DIF assessment. These proposed modifications showed satisfactory control of false positive rates when no DIF items existed and yielded better control of false positive rates and more accurate true positive rates under DIF conditions than the conventional LR approaches did. In conclusion, the proposed modifications are recommended in survey research when there are multiple groups or cultural groups.

  7. A general framework for the use of logistic regression models in meta-analysis.

    Science.gov (United States)

    Simmonds, Mark C; Higgins, Julian Pt

    2016-12-01

    Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.

  8. Analysis of sparse data in logistic regression in medical research: A newer approach

    Directory of Open Access Journals (Sweden)

    S Devika

    2016-01-01

    Full Text Available Background and Objective: In the analysis of a dichotomous response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence intervals (CIs) (OR: >999.999, 95% CI: 999.999). In this paper, we addressed this issue by using the penalized logistic regression (PLR) method. Materials and Methods: Data from a case-control study on hyponatremia and hiccups conducted in Christian Medical College, Vellore, Tamil Nadu, India were used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation datasets were created with different sample sizes and with different numbers of covariates. Results: A total of 23 cases and 50 controls were used for the analysis with the ordinary and PLR methods. The main exposure variable, hyponatremia, was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males, and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led to an infinite OR with 95% CI (OR: >999.999, 95% CI: 999.999), whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48) using PLR. After adjusting for all the confounding variables, hyponatremia entailed a 7.9 (95% CI: 2.06, 38.86) times higher risk for the development of hiccups using PLR, whereas the conventional method overestimated the risk (OR: 10.76; 95% CI: 2.17, 53.41). The simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in small cell
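
    The gender result in this abstract is a zero-cell problem: no case was female, so the crude odds ratio divides by zero. The sketch below illustrates this with the counts given in the abstract and then applies the Haldane-Anscombe correction (adding 0.5 to every cell), a simple small-sample fix. This is a swapped-in illustration, not the penalized logistic regression used in the study, so it is not expected to reproduce the reported OR of 5.35.

```python
import math


def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls,
               correction=0.0):
    """Odds ratio from a 2x2 table; `correction` is added to every cell.

    Returns math.inf when the denominator is zero (a zero cell without correction).
    """
    a = exposed_cases + correction
    b = unexposed_cases + correction
    c = exposed_controls + correction
    d = unexposed_controls + correction
    denominator = b * c
    if denominator == 0:
        return math.inf
    return (a * d) / denominator


# Counts from the abstract: 23 cases (all male); 50 controls (46 male, 4 female).
male_cases, female_cases = 23, 0
male_controls, female_controls = 46, 4

crude = odds_ratio(male_cases, female_cases, male_controls, female_controls)
corrected = odds_ratio(male_cases, female_cases, male_controls, female_controls,
                       correction=0.5)

print(f"crude OR = {crude}")
print(f"Haldane-Anscombe corrected OR = {corrected:.2f}")
```

    Both approaches share the same idea: a small amount of shrinkage keeps the estimate finite when an ordinary maximum likelihood fit would run off to infinity.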

  9. Assessing Credit Default using Logistic Regression and Multiple Discriminant Analysis: Empirical Evidence from Bosnia and Herzegovina

    Directory of Open Access Journals (Sweden)

    Deni Memić

    2015-01-01

    Full Text Available This article aims to assess credit default prediction on the banking market in Bosnia and Herzegovina nationwide as well as in its constitutional entities (Federation of Bosnia and Herzegovina and Republika Srpska). The ability to classify companies into different predefined groups, or to find an appropriate tool that would replace human assessment in classifying companies into good and bad buckets, has been one of the main interests of risk management researchers for a long time. We investigated the possibility and accuracy of default prediction using the traditional statistical methods logistic regression (logit) and multiple discriminant analysis (MDA) and compared their predictive abilities. The results show that the created models have high predictive ability. For logit models, some variables are more influential on the default prediction than others. Return on assets (ROA) is statistically significant in all four periods prior to default, having very high regression coefficients, or high impact on the model's ability to predict default. Similar results are obtained for MDA models. It is also found that predictive ability differs between logistic regression and multiple discriminant analysis.

  10. Predicting smear negative pulmonary tuberculosis with classification trees and logistic regression: a cross-sectional study

    Directory of Open Access Journals (Sweden)

    Kritski Afrânio

    2006-02-01

    Full Text Available Abstract Background Smear negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.
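
    For readers unfamiliar with the two performance measures quoted in this abstract, a minimal sketch of how sensitivity and specificity are computed from confusion counts follows. The counts are hypothetical and chosen only so that the result falls inside the ranges reported above; they are not the study's data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true cases flagged by the model
    specificity = tn / (tn + fp)  # share of non-cases correctly cleared
    return sensitivity, specificity


# Hypothetical counts for 100 diseased and 100 non-diseased patients.
sens, spec = sensitivity_specificity(tp=71, fn=29, tn=58, fp=42)
print(sens, spec)
```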

  11. Comprehensible Predictive Modeling Using Regularized Logistic Regression and Comorbidity Based Features.

    Directory of Open Access Journals (Sweden)

    Gregor Stiglic

    Full Text Available Different studies have demonstrated the importance of comorbidities to better understand the origin and evolution of medical complications. This study focuses on improvement of the predictive model interpretability based on simple logical features representing comorbidities. We use group lasso based feature interaction discovery followed by a post-processing step, where simple logic terms are added. In the final step, we reduce the feature set by applying lasso logistic regression to obtain a compact set of non-zero coefficients that represent a more comprehensible predictive model. The effectiveness of the proposed approach was demonstrated on a pediatric hospital discharge dataset that was used to build a readmission risk estimation model. The evaluation of the proposed method demonstrates a reduction of the initial set of features in a regression model by 72%, with a slight improvement in the Area Under the ROC Curve metric from 0.763 (95% CI: 0.755-0.771) to 0.769 (95% CI: 0.761-0.777). Additionally, our results show improvement in comprehensibility of the final predictive model using simple comorbidity based terms for logistic regression.

  12. Comparing the importance of prognostic factors in Cox and logistic regression using SAS.

    Science.gov (United States)

    Heinze, Georg; Schemper, Michael

    2003-06-01

    Two SAS macro programs are presented that evaluate the relative importance of prognostic factors in the proportional hazards regression model and in the logistic regression model. The importance of a prognostic factor is quantified by the proportion of variation in the outcome attributable to this factor. For proportional hazards regression, the program %RELIMPCR uses the recently proposed measure V to calculate the proportion of explained variation (PEV). For the logistic model, the R(2) measure based on squared raw residuals is used by the program %RELIMPLR. Both programs are able to compute marginal and partial PEV, to compare PEVs of factors, of groups of factors, and even to compare PEVs of different models. The programs use a bootstrap resampling scheme to test differences of the PEVs of different factors. Confidence limits for P-values are provided. The programs further allow the computation of PEV to be based on models with shrunken or bias-corrected parameter estimates. The SAS macros are freely available at www.akh-wien.ac.at/imc/biometrie/relimp

  13. [Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].

    Science.gov (United States)

    Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan

    2015-01-01

    To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and STATA. One model contained a small number of covariates compared to the number of observed. The other contained a similar number of covariates as the number of observed. The two software packages produced similar Pearson residual estimates when the models contained a similar number of covariates as the number of observed, but the results differed when the number of observed was much greater than the number of covariates. The two software packages produce different results of Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
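
    The abstract does not say why the packages disagree. One possible source, offered here purely as an assumption, is whether residuals are computed per observation or per covariate pattern. The sketch below uses the standard Pearson residual for binomial data, (y - n*p) / sqrt(n*p*(1 - p)), and shows that the two levels of aggregation give different numbers for the same fitted probability. The fitted probability and outcomes are hypothetical.

```python
import math


def pearson_residual(successes, trials, fitted_prob):
    """Pearson residual for a binomial observation with `trials` trials."""
    expected = trials * fitted_prob
    variance = trials * fitted_prob * (1.0 - fitted_prob)
    return (successes - expected) / math.sqrt(variance)


# Three observations sharing one covariate pattern (hypothetical),
# each with fitted probability 0.2.
outcomes = [1, 0, 0]
per_case = [pearson_residual(y, 1, 0.2) for y in outcomes]
per_pattern = pearson_residual(sum(outcomes), len(outcomes), 0.2)

print(per_case, per_pattern)
```

    When every covariate pattern contains a single observation, the two calculations coincide; with few covariates and many observations, patterns repeat and the results diverge.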

  14. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    Science.gov (United States)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire

  15. Modeling of geogenic radon in Switzerland based on ordered logistic regression.

    Science.gov (United States)

    Kropat, Georg; Bochud, François; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien

    2017-01-01

    The estimation of the radon hazard of a future construction site should ideally be based on the geogenic radon potential (GRP), since this estimate is free of anthropogenic influences and building characteristics. The goal of this study was to evaluate terrestrial gamma dose rate (TGD), geology, fault lines and topsoil permeability as predictors for the creation of a GRP map based on logistic regression. Soil gas radon measurements (SRC) are more suited for the estimation of GRP than indoor radon measurements (IRC) since the former do not depend on ventilation and heating habits or building characteristics. However, SRC have only been measured at a few locations in Switzerland. In former studies a good correlation between spatial aggregates of IRC and SRC has been observed. We therefore used IRC measurements aggregated on a 10 km × 10 km grid to calibrate an ordered logistic regression model for geogenic radon potential (GRP). As predictors we took into account terrestrial gamma dose rate, regrouped geological units, fault line density and the permeability of the soil. The classification success rate of the model is 56% when all 4 predictor variables are included. Our results suggest that terrestrial gamma dose rate and regrouped geological units are more suited to model GRP than fault line density and soil permeability. Ordered logistic regression is a promising tool for the modeling of GRP maps due to its simplicity and fast computation time. Future studies should account for additional variables to improve the modeling of high radon hazard in the Jura Mountains of Switzerland. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
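
    As background for the ordered logistic regression mentioned in this abstract, the sketch below shows how class probabilities are obtained under the usual proportional-odds parameterization, P(Y <= k) = sigmoid(cutpoint_k - linear predictor). The cutpoints, the linear predictor value, and the three-class setup are hypothetical; the abstract does not give the fitted model.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(linear_predictor, cutpoints):
    """Class probabilities under a proportional-odds model.

    `cutpoints` must be increasing; K cutpoints give K + 1 ordered classes.
    """
    cumulative = [sigmoid(c - linear_predictor) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)  # P(Y = k) = P(Y <= k) - P(Y <= k - 1)
        previous = value
    return probs


# Hypothetical grid cell: three ordered hazard classes (low, medium, high).
probs = ordered_logit_probs(linear_predictor=0.4, cutpoints=[-1.0, 1.0])
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class)
```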

  16. glmnetLRC f/k/a lrc package: Logistic Regression Classification

    Energy Technology Data Exchange (ETDEWEB)

    2016-06-09

    Methods for fitting and predicting logistic regression classifiers (LRC) with an arbitrary loss function using elastic net or best subsets. This package adds additional model fitting features to the existing glmnet and bestglm R packages. This package was created to perform the analyses described in Amidan BG, Orton DJ, LaMarche BL, et al. 2014. Signatures for Mass Spectrometry Data Quality. Journal of Proteome Research. 13(4), 2215-2222. It makes the model fitting available in the glmnet and bestglm packages more general by identifying optimal model parameters via cross validation with a customizable loss function. It also identifies the optimal threshold for binary classification.
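
    The idea of choosing a classification threshold under a user-supplied loss can be sketched in a few lines. This is not the glmnetLRC API (the package is written in R); the function names, the loss weights, and the data are hypothetical, and the search below is done in-sample rather than by cross validation.

```python
def best_threshold(probabilities, labels, loss, candidates):
    """Return the candidate threshold with the smallest total loss."""
    def total_loss(threshold):
        return sum(loss(int(p >= threshold), y)
                   for p, y in zip(probabilities, labels))
    return min(candidates, key=total_loss)


def asymmetric_loss(predicted, actual):
    """Hypothetical loss: a missed positive costs 5, a false alarm costs 1."""
    if predicted == actual:
        return 0.0
    return 5.0 if actual == 1 else 1.0


probs = [0.1, 0.3, 0.35, 0.6, 0.8]      # predicted probabilities
labels = [0, 0, 1, 0, 1]                # observed classes
candidates = [0.2, 0.4, 0.5, 0.7]       # thresholds to try
threshold = best_threshold(probs, labels, asymmetric_loss, candidates)
print(threshold)
```

    Because missed positives are penalized more heavily than false alarms in this loss, the search favors a low threshold.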

  17. Semi-parametric estimation of random effects in a logistic regression model using conditional inference

    DEFF Research Database (Denmark)

    Petersen, Jørgen Holm

    2016-01-01

    This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied... For each term in the composite likelihood, a conditional likelihood is used that eliminates the influence of the random effects, which results in a composite conditional likelihood consisting of only one-dimensional integrals that may be solved numerically. Good properties of the resulting estimator...

  18. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Directory of Open Access Journals (Sweden)

    Lawrence Rudner

    2016-07-01

    Full Text Available In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows that the conclusion also applies to the probabilities estimated from short subtests of mental abilities and that small samples can yield excellent accuracy. The calculated Bayes probabilities can be used to provide meaningful examinee feedback regardless of whether the test was originally designed to be unidimensional.

  19. A logistic regression model of Coronary Artery Disease among Male Patients in Punjab

    Directory of Open Access Journals (Sweden)

    Sohail Chand

    2005-07-01

    Full Text Available This is a cross-sectional retrospective study of 308 male patients who presented for the first time for coronary angiography at the Punjab Institute of Cardiology. The mean age of the male patients was 50.97 ± 9.9. As the response variable, coronary artery disease (CAD), was binary, a logistic regression model was fitted to predict CAD with the help of significant risk factors. Age, chest pain, diabetes mellitus, smoking and lipids emerged as significant risk factors associated with CAD among the male population.

  20. Mining pharmacovigilance data using Bayesian logistic regression with James-Stein type shrinkage estimation.

    Science.gov (United States)

    An, Lihua; Fung, Karen Y; Krewski, Daniel

    2010-09-01

    Spontaneous adverse event reporting systems are widely used to identify adverse reactions to drugs following their introduction into the marketplace. In this article, a James-Stein type shrinkage estimation strategy was developed in a Bayesian logistic regression model to analyze pharmacovigilance data. This method is effective in detecting signals as it combines information and borrows strength across medically related adverse events. Computer simulation demonstrated that the shrinkage estimator is uniformly better than the maximum likelihood estimator in terms of mean squared error. This method was used to investigate the possible association of a series of diabetic drugs and the risk of cardiovascular events using data from the Canada Vigilance Online Database.

  1. Calculation of a Health Index of Oil-Paper Transformers Insulation with Binary Logistic Regression

    OpenAIRE

    Weijie Zuo; Haiwen Yuan; Yuwei Shang; Yingyi Liu; Tao Chen

    2016-01-01

    This paper presents a new method for calculating the insulation health index (HI) of oil-paper transformers rated under 110 kV to provide a snapshot of health condition using binary logistic regression. Oil breakdown voltage (BDV), total acidity of oil, 2-Furfuraldehyde content, and dissolved gas analysis (DGA) are singled out in this method as the input data for determining HI. A sample of transformers is used to test the proposed method. The results are compared with the results calculated ...

  2. Coordinate Descent Based Hierarchical Interactive Lasso Penalized Logistic Regression and Its Application to Classification Problems

    Directory of Open Access Journals (Sweden)

    Jin-Jia Wang

    2014-01-01

    Full Text Available We present the hierarchical interactive lasso penalized logistic regression using the coordinate descent algorithm based on the hierarchy theory and variables interactions. We define the interaction model based on the geometric algebra and hierarchical constraint conditions and then use the coordinate descent algorithm to solve for the coefficients of the hierarchical interactive lasso model. We provide the results of some experiments based on UCI datasets, Madelon datasets from NIPS2003, and daily activities of the elder. The experimental results show that the variable interactions and hierarchy contribute significantly to the classification. The hierarchical interactive lasso has the advantages of the lasso and interactive lasso.
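
    Coordinate descent for lasso-type penalties is usually built around the soft-thresholding operator, which shrinks a coefficient toward zero and sets it exactly to zero when it is small. The sketch below shows only that operator; the hierarchical constraints and interaction model of the paper are not implemented, and the input values and penalty are hypothetical.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical unpenalized coordinate updates and a penalty of 0.5.
shrunk = [soft_threshold(z, 0.5) for z in (2.0, -1.25, 0.3, -0.5)]
print(shrunk)
```

    Values whose magnitude does not exceed the penalty are set to zero, which is how the lasso drops variables from the model.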

  3. Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

    Science.gov (United States)

    Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu

    2014-01-01

    Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.

  4. Calculation of a Health Index of Oil-Paper Transformers Insulation with Binary Logistic Regression

    Directory of Open Access Journals (Sweden)

    Weijie Zuo

    2016-01-01

    Full Text Available This paper presents a new method for calculating the insulation health index (HI) of oil-paper transformers rated under 110 kV to provide a snapshot of health condition using binary logistic regression. Oil breakdown voltage (BDV), total acidity of oil, 2-Furfuraldehyde content, and dissolved gas analysis (DGA) are singled out in this method as the input data for determining HI. A sample of transformers is used to test the proposed method. The results are compared with the results calculated for the same set of transformers using fuzzy logic. The comparison results show that the proposed method is reliable and effective in evaluating transformer health condition.

  5. elrm: Software Implementing Exact-Like Inference for Logistic Regression Models

    Directory of Open Access Journals (Sweden)

    David Zamar

    2007-09-01

    Full Text Available Exact inference is based on the conditional distribution of the sufficient statistics for the parameters of interest given the observed values for the remaining sufficient statistics. Exact inference for logistic regression can be problematic when data sets are large and the support of the conditional distribution cannot be represented in memory. Additionally, these methods are not widely implemented except in commercial software packages such as LogXact and SAS. Therefore, we have developed elrm, software for R implementing (approximate) exact inference for binomial regression models from large data sets. We provide a description of the underlying statistical methods and illustrate the use of elrm with examples. We also evaluate elrm by comparing results with those obtained using other methods.

  6. Estimating the susceptibility of surface water in Texas to nonpoint-source contamination by use of logistic regression modeling

    Science.gov (United States)

    Battaglin, William A.; Ulery, Randy L.; Winterstein, Thomas; Welborn, Toby

    2003-01-01

    In the State of Texas, surface water (streams, canals, and reservoirs) and ground water are used as sources of public water supply. Surface-water sources of public water supply are susceptible to contamination from point and nonpoint sources. To help protect sources of drinking water and to aid water managers in designing protective yet cost-effective and risk-mitigated monitoring strategies, the Texas Commission on Environmental Quality and the U.S. Geological Survey developed procedures to assess the susceptibility of public water-supply source waters in Texas to the occurrence of 227 contaminants. One component of the assessments is the determination of susceptibility of surface-water sources to nonpoint-source contamination. To accomplish this, water-quality data at 323 monitoring sites were matched with geographic information system-derived watershed- characteristic data for the watersheds upstream from the sites. Logistic regression models then were developed to estimate the probability that a particular contaminant will exceed a threshold concentration specified by the Texas Commission on Environmental Quality. Logistic regression models were developed for 63 of the 227 contaminants. Of the remaining contaminants, 106 were not modeled because monitoring data were available at less than 10 percent of the monitoring sites; 29 were not modeled because there were less than 15 percent detections of the contaminant in the monitoring data; 27 were not modeled because of the lack of any monitoring data; and 2 were not modeled because threshold values were not specified.
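
    A fitted logistic regression model of the kind described here turns watershed characteristics into a probability that a contaminant exceeds its threshold concentration. The sketch below shows that calculation; the intercept, coefficients, and predictor names are hypothetical and are not taken from the report.

```python
import math


def exceedance_probability(intercept, coefficients, predictors):
    """Probability of exceeding the threshold under a logistic model."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical model with two watershed characteristics:
# percent cropland (40.0) and a soil-permeability index (0.5).
p = exceedance_probability(-2.0, [0.03, 1.2], [40.0, 0.5])
print(round(p, 3))
```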

  7. Simultaneous confidence bands for log-logistic regression with applications in risk assessment.

    Science.gov (United States)

    Kerns, Lucy X

    2017-05-01

    In risk assessment, it is often desired to make inferences on the low dose levels at which a specific benchmark risk is attained. Applications of simultaneous hyperbolic confidence bands for low-dose risk estimation with quantal data under different dose-response models (multistage, Abbott-adjusted Weibull, and Abbott-adjusted log-logistic models) have appeared in the literature. The use of simultaneous three-segment bands under the multistage model has also been proposed recently. In this article, we present explicit formulas for constructing asymptotic one-sided simultaneous hyperbolic and three-segment bands for the simple log-logistic regression model. We use the simultaneous construction to estimate upper hyperbolic and three-segment confidence bands on extra risk and to obtain lower limits on the benchmark dose by inverting the upper bands on risk under the Abbott-adjusted log-logistic model. Monte Carlo simulations evaluate the characteristics of the simultaneous limits. An example is given to illustrate the use of the proposed methods and to compare the two types of simultaneous limits at very low dose levels. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
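
    To make the terms extra risk and benchmark dose concrete, the sketch below assumes the common Abbott-adjusted log-logistic form R(d) = g + (1 - g) / (1 + exp(-(b0 + b1 * ln d))), for which the extra risk (R(d) - R(0)) / (1 - R(0)) no longer depends on the background rate g. The parameter values are hypothetical, and only a point estimate of the benchmark dose is computed; the simultaneous confidence bands and lower limits developed in the article are not reproduced.

```python
import math


def extra_risk(dose, beta0, beta1):
    """Extra risk over background under the assumed log-logistic model."""
    if dose <= 0:
        return 0.0
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * math.log(dose))))


def benchmark_dose(bmr, beta0, beta1):
    """Dose at which the extra risk equals the benchmark risk `bmr`.

    Assumes beta1 > 0, so that risk increases with dose.
    """
    logit = math.log(bmr / (1.0 - bmr))
    return math.exp((logit - beta0) / beta1)


# Hypothetical parameters and a 10% benchmark risk.
bmd = benchmark_dose(0.10, beta0=-3.0, beta1=1.5)
print(bmd, extra_risk(bmd, beta0=-3.0, beta1=1.5))
```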

  8. Controlling Type I Error Rates in Assessing DIF for Logistic Regression Method Combined with SIBTEST Regression Correction Procedure and DIF-Free-Then-DIF Strategy

    Science.gov (United States)

    Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung

    2014-01-01

    The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…

  10. Logistic regression function for detection of suspicious performance during baseline evaluations using concussion vital signs.

    Science.gov (United States)

    Hill, Benjamin David; Womble, Melissa N; Rohling, Martin L

    2015-01-01

    This study utilized logistic regression to determine whether performance patterns on Concussion Vital Signs (CVS) could differentiate known groups with either genuine or feigned performance. For the embedded measure development group (n = 174), clinical patients and undergraduate students categorized as feigning obtained significantly lower scores on the overall test battery mean for the CVS, Shipley-2 composite score, and California Verbal Learning Test-Second Edition subtests than did genuinely performing individuals. The final full model of 3 predictor variables (Verbal Memory immediate hits, Verbal Memory immediate correct passes, and Stroop Test complex reaction time correct) was significant and correctly classified individuals in their known group 83% of the time (sensitivity = .65; specificity = .97) in a mixed sample of young-adult clinical cases and simulators. The CVS logistic regression function was applied to a separate undergraduate college group (n = 378) that was asked to perform genuinely and identified 5% as having possibly feigned performance indicating a low false-positive rate. The failure rate was 11% and 16% at baseline cognitive testing in samples of high school and college athletes, respectively. These findings have particular relevance given the increasing use of computerized test batteries for baseline cognitive testing and return-to-play decisions after concussion.

  11. A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes.

    Science.gov (United States)

    Gayou, Olivier; Das, Shiva K; Zhou, Su-Min; Marks, Lawrence B; Parda, David S; Miften, Moyed

    2008-12-01

    A given outcome of radiotherapy treatment can be modeled by analyzing its correlation with a combination of dosimetric, physiological, biological, and clinical factors, through a logistic regression fit of a large patient population. The quality of the fit is measured by the combination of the predictive power of this particular set of factors and the statistical significance of the individual factors in the model. We developed a genetic algorithm (GA), in which a small sample of all the possible combinations of variables are fitted to the patient data. New models are derived from the best models, through crossover and mutation operations, and are in turn fitted. The process is repeated until the sample converges to the combination of factors that best predicts the outcome. The GA was tested on a data set that investigated the incidence of lung injury in NSCLC patients treated with 3DCRT. The GA identified a model with two variables as the best predictor of radiation pneumonitis: the V30 (p=0.048) and the ongoing use of tobacco at the time of referral (p=0.074). This two-variable model was confirmed as the best model by analyzing all possible combinations of factors. In conclusion, genetic algorithms provide a reliable and fast way to select significant factors in logistic regression analysis of large clinical studies.
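
    The crossover and mutation steps described in this abstract can be sketched on bit masks, where each bit says whether a candidate variable enters the model. This is a generic illustration, not the authors' implementation: the population size, mutation rate, and especially the fitness function (a toy stand-in for fitting a logistic model and scoring it) are assumptions.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two variable-inclusion masks."""
    point = rng.randrange(1, len(parent_a))
    return parent_a[:point] + parent_b[point:]


def mutate(mask, rate, rng):
    """Flip each inclusion bit independently with probability `rate`."""
    return [bit ^ 1 if rng.random() < rate else bit for bit in mask]


def evolve(fitness, n_variables, rng, population_size=20,
           generations=30, rate=0.05):
    """Keep the better half each generation and breed the rest from it."""
    population = [[rng.randint(0, 1) for _ in range(n_variables)]
                  for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rate, rng)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(mask):
    """Hypothetical score: variables 0 and 3 help, every variable costs 0.5."""
    return 2.0 * mask[0] + 2.0 * mask[3] - 0.5 * sum(mask)


best = evolve(toy_fitness, n_variables=6, rng=random.Random(0))
print(best, toy_fitness(best))
```

    In a real application the fitness call would fit the logistic model for the selected variables and return a measure of fit, which is far more expensive than the toy score used here.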

  12. Predictive market segmentation model: An application of logistic regression model and CHAID procedure

    Directory of Open Access Journals (Sweden)

    Soldić-Aleksić Jasna

    2009-01-01

    Full Text Available Market segmentation is one of the key concepts of modern marketing. The main goal of market segmentation is to create groups (segments) of customers that have similar characteristics, needs, wishes and/or similar behavior regarding the purchase of a concrete product/service. Companies can create a specific marketing plan for each of these segments and therefore gain short or long term competitive advantage on the market. Depending on the concrete marketing goal, different segmentation schemes and techniques may be applied. This paper presents a predictive market segmentation model based on the application of logistic regression model and CHAID analysis. The logistic regression model was used for the purpose of selecting the variables (from the initial pool of eleven variables) which are statistically significant for explaining the dependent variable. Selected variables were afterwards included in the CHAID procedure that generated the predictive market segmentation model. The model results are presented on a concrete empirical example in the following form: summary model results, CHAID tree, Gain chart, Index chart, risk and classification tables.

  13. Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood

    Directory of Open Access Journals (Sweden)

    Fang-Rong Yan

    2011-01-01

    Full Text Available The objective of the present study is to find out the quantitative relationship between progression of liver fibrosis and the levels of certain serum markers using a mathematical model. We provide the sparse logistic regression by using smoothly clipped absolute deviation (SCAD) penalized function to diagnose the liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the users with the precise probabilities of classification with the class information. In the simulative case and the experiment case, the proposed method is comparable to the stepwise linear discriminant analysis (SLDA) and the sparse logistic regression with least absolute shrinkage and selection operator (LASSO) penalty, by using receiver operating characteristic (ROC) with Bayesian bootstrap estimating area under the curve (AUC) diagnostic sensitivity for selected variable. Results show that the new approach provides a good correlation between the serum marker levels and the liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used in predicting the development of liver cirrhosis.

  14. Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

    Science.gov (United States)

    Chang, C. H.; Chan, H. C.; Chen, B. A.

    2016-12-01

    Classification of effective soil depth is a task of determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes the slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to the factors including average slope, effective soil depth, soil erosion and parental rock. However, site investigation of the effective soil depth requires a cost-effective field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with the environmental factors. The Wen-Shui Watershed located in central Taiwan was selected as the study area. The analysis of multinomial logistic regression is performed with the assistance of a Geographic Information System (GIS). The effective soil depth was categorized into four levels including deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An Error Matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. At the end, a map of effective soil depth was produced to help planners and decision makers in determining the slopeland utilizable limitation in the study area.

  15. Application of Logistic Regression Tree Model in Determining Habitat Distribution of Astragalus verus

    Directory of Open Access Journals (Sweden)

    M. Saki

    2013-03-01

    Full Text Available The relationship between plant species and environmental factors has always been a central issue in plant ecology. With rising power of statistical techniques, geo-statistics and geographic information systems (GIS), the development of predictive habitat distribution models of organisms has rapidly increased in ecology. This study aimed to evaluate the ability of Logistic Regression Tree model to create potential habitat map of Astragalus verus. This species produces Tragacanth and has economic value. A stratified-random sampling was applied to 100 sites (50 presence, 50 absence) of the given species, and environmental and edaphic factor maps were produced by using Kriging and Inverse Distance Weighting methods in the ArcGIS software for the whole study area. Relationships between species occurrence and environmental factors were determined by Logistic Regression Tree model and extended to the whole study area. The results indicated species occurrence has strong correlation with environmental factors such as mean daily temperature and clay, EC and organic carbon content of the soil. Species occurrence showed direct relationship with mean daily temperature and clay and organic carbon, and inverse relationship with EC. Model accuracy was evaluated both by Cohen’s kappa statistics (κ) and by area under Receiver Operating Characteristics curve based on independent test data set. Their values (kappa=0.9, AUC of ROC=0.96) indicated the high power of LRT to create potential habitat map on local scales. This model, therefore, can be applied to recognize potential sites for rangeland reclamation projects.
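
    Cohen's kappa, one of the two accuracy measures used in this abstract, compares observed agreement with the agreement expected by chance. The sketch below computes it from a confusion matrix. The counts are hypothetical: they were chosen for illustration only and are not the study's test data.

```python
def cohens_kappa(matrix):
    """Cohen's kappa from a square confusion matrix (rows: observed)."""
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(size)) / total
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)  # row total * column total
        for i in range(size)
    ) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical presence/absence test set of 40 sites.
kappa = cohens_kappa([[19, 1],
                      [1, 19]])
print(round(kappa, 2))
```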

  16. Urban Growth Modelling with Artificial Neural Network and Logistic Regression. Case Study: Sanandaj City, Iran

    Directory of Open Access Journals (Sweden)

    SASSAN MOHAMMADY

    2013-01-01

    Full Text Available Cities have shown remarkable growth in the past few decades due to their attraction and to economic, social and facilities centralization. Population growth and urban expansion, especially in developing countries, have led to a lack of resources, land use change from suitable agricultural land to urban land use, and marginalization. Under these circumstances, land use activity is a major issue and challenge for town and country planners. Different approaches have been attempted in urban expansion modelling. Artificial Neural Network (ANN) models are among the knowledge-based models that have been used for urban growth modelling. ANNs are powerful tools that use a machine learning approach to quantify and model complex behaviour and patterns. In this research, ANN and logistic regression have been employed to model and interpret urban growth. Our case study is Sanandaj city, and we used Landsat TM and ETM+ imagery acquired in 2000 and 2006. The dataset used includes distance to main roads, distance to the residence region, elevation, slope, and distance to green space. The Percent Area Match (PAM) obtained from modelling these changes with ANN is 90.47%, and the accuracy achieved for urban growth modelling with Logistic Regression (LR) is 88.91%. Percent Correct Match (PCM) and Figure of Merit were 91.33% and 59.07% for the ANN method and 90.84% and 57.07% for LR, respectively.

  17. Predicting students' success at pre-university studies using linear and logistic regressions

    Science.gov (United States)

    Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir

    2014-09-01

    The study is aimed to find the most suitable model that could predict the students' success at the medical pre-university studies, Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were the national high school exit examination-Sijil Pelajaran Malaysia (SPM) achievements such as Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results as well as gender and high school background factors. The outcomes showed that there is a significant difference in the final CGPA, Biology and Mathematics subjects at pre-university by gender factor, while by high school background also for Mathematics subject. In general, the correlation between the academic achievements at the high school and medical pre-university is moderately significant at α-level of 0.05, except for languages subjects. It was found also that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models were able to give the probability that is almost accurate with the real case. Hence, it could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.

  18. A two-stage logistic regression-ANN model for the prediction of distress banks: Evidence from 11 emerging countries

    National Research Council Canada - National Science Library

    Shu Ling Lin

    2010-01-01

      This paper proposes a new approach of two-stage hybrid model of logistic regression-ANN for the construction of a financial distress warning system for banking industry in emerging market during 1998-2006...

  19. Using latent variables in logistic regression to reduce multicollinearity, A case-control example: breast cancer risk factors

    Directory of Open Access Journals (Sweden)

    Mohamad Amin Pourhoseingholi

    2008-03-01

    Full Text Available

    Background: Logistic regression is one of the most widely used models to analyze the relation between one or more explanatory variables and a categorical response in the fields of epidemiology, health and medicine. When there is strong correlation among explanatory variables, i.e., multicollinearity, the efficiency of the model is reduced considerably. The objective of this research was to employ latent variables to reduce the effect of multicollinearity in the analysis of a case-control study of breast cancer risk factors.

    Methods: The data came from a case-control study in which 300 women with breast cancer were compared to the same number of controls. To assess the effect of multicollinearity, five highly correlated quantitative variables were selected. Ordinary logistic regression with collinear data was compared to two models containing latent variables generated using either factor analysis or principal components analysis. The estimated standard errors of the parameters were used to compare the efficiency of the models. We also conducted a simulation study to compare the efficiency of models with and without latent factors. All analyses were carried out using S-plus.

    Results: Logistic regression based on the five primary variables showed unusual odds ratios for age at first pregnancy (OR=67960, 95%CI: 10184-453503) and for total length of breast feeding (OR=0. On the other hand, the parameters estimated for logistic regression on latent variables generated by both factor analysis and principal components analysis were statistically significant (P<0.003). Their standard errors were smaller than those of ordinary logistic regression on the original variables. The simulation showed that in the case of normal error and 58% reliability, logistic regression based on latent variables is more efficient than the model with collinear variables.

    Conclusions: This research

  20. Regional Integrated Meteorological Forecasting and Warning Model for Geological Hazards Based on Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    XU Jing; YANG Chi; ZHANG Guoping

    2007-01-01

    Information model is adopted to integrate factors of various geosciences to estimate the susceptibility of geological hazards. Further combining the dynamic rainfall observations, Logistic regression is used for modeling the probabilities of geological hazard occurrences, upon which hierarchical warnings for rainfall-induced geological hazards are produced. The forecasting and warning model takes numerical precipitation forecasts on grid points as its dynamic input, forecasts the probabilities of geological hazard occurrences on the same grid, and translates the results into likelihoods in the form of a 5-level hierarchy. Validation of the model with observational data for the year 2004 shows that 80% of the geological hazards of the year have been identified as "likely enough to release warning messages". The model can satisfy the requirements of an operational warning system, thus is an effective way to improve the meteorological warnings for geological hazards.

  1. AN APPLICATION OF THE LOGISTIC REGRESSION MODEL IN THE EXPERIMENTAL PHYSICAL CHEMISTRY

    Directory of Open Access Journals (Sweden)

    Elpidio Corral-López

    2015-06-01

    Full Text Available Calculating the intensive property molar volume of ethanol-water mixtures from experimental densities by the tangent method in the Physical Chemistry Laboratory poses the problem of manually drawing the curve of molar volume versus mole fraction and tracing the tangent line. Using a statistical model, logistic regression, on a Texas VOYAGE graphing calculator made it possible to trace the curve and the tangents in situ, and also to evaluate the students' work during the experimental session. The percentage error between the molar volumes calculated using literature data and those obtained with the statistical method is minimal, which validates the model. Using the calculator with this application as a teaching support tool is advantageous, reducing the evaluation time from 3 weeks to 3 hours.

  2. Modeling Anthropogenic Fire Occurrence in the Boreal Forest of China Using Logistic Regression and Random Forests

    Directory of Open Access Journals (Sweden)

    Futao Guo

    2016-10-01

    Full Text Available Frequent and intense anthropogenic fires present meaningful challenges to forest management in the boreal forest of China. Understanding the underlying drivers of human-caused fire occurrence is crucial for making effective and scientifically based forest fire management plans. In this study, we applied logistic regression (LR) and Random Forests (RF) to identify important biophysical and anthropogenic factors that help to explain the likelihood of anthropogenic fires in the Chinese boreal forest. Results showed that anthropogenic fires were more likely to occur in areas close to railways and were significantly influenced by forest type. In addition, distance to settlement and distance to road were identified as important predictors of anthropogenic fire occurrence. The model comparison indicated that RF had greater ability than LR to predict forest fires caused by human activity in the Chinese boreal forest. High fire risk zones in the study area were identified based on RF, where we recommend increasing the allocation of fire management resources.

  3. Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations.

    Science.gov (United States)

    Hayes, Andrew F; Matthes, Jörg

    2009-08-01

    Researchers often hypothesize moderated effects, in which the effect of an independent variable on an outcome variable depends on the value of a moderator variable. Such an effect reveals itself statistically as an interaction between the independent and moderator variables in a model of the outcome variable. When an interaction is found, it is important to probe the interaction, for theories and hypotheses often predict not just interaction but a specific pattern of effects of the focal independent variable as a function of the moderator. This article describes the familiar pick-a-point approach and the much less familiar Johnson-Neyman technique for probing interactions in linear models and introduces macros for SPSS and SAS to simplify the computations and facilitate the probing of interactions in ordinary least squares and logistic regression. A script version of the SPSS macro is also available for users who prefer a point-and-click user interface rather than command syntax.
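    The pick-a-point approach mentioned above evaluates the effect of the focal variable at a chosen value of the moderator. The following is a minimal sketch of that arithmetic, not the authors' SPSS or SAS macros. It assumes a model with linear predictor b0 + b1*X + b2*M + b3*X*M, and all argument names and example values are hypothetical. In logistic regression the conditional effect is on the log-odds scale.

```python
import math


def conditional_effect(b1, b3, var_b1, var_b3, cov_b1_b3, m):
    """Effect of X at moderator value m, with its standard error and ratio.

    b1 is the coefficient of X and b3 the coefficient of the X*M product;
    the variances and covariance come from the fitted model's coefficient
    covariance matrix.
    """
    effect = b1 + b3 * m
    se = math.sqrt(var_b1 + 2.0 * m * cov_b1_b3 + m ** 2 * var_b3)
    return effect, se, effect / se


# Hypothetical estimates, probing the interaction at M = 2.0.
effect, se, ratio = conditional_effect(0.5, 0.2, 0.04, 0.01, 0.0, 2.0)
print(effect, se, ratio)
```

    The Johnson-Neyman technique instead solves for the moderator values at which this ratio equals the critical value, so no points need to be picked in advance.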

  4. Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression

    Science.gov (United States)

    Dineva, S.; Prodanova, K.; Mlachkova, D.

    2013-12-01

    The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers have reported that the diverticulum is related to the incidence of pancreatitis. The aim of this study is to investigate whether the presence of duodenal diverticula predisposes to the development of pancreatic disease. A total of 3966 patients who had undergone ERCP were studied retrospectively. They were divided into 2 groups, with and without PDD. Patients with duodenal diverticula had a higher rate of acute pancreatitis. Duodenal diverticula are a risk factor for acute idiopathic pancreatitis. A multiple logistic regression was performed to obtain adjusted estimates of the odds and to identify whether a PDD is a predictor of acute or chronic pancreatitis. The software package STATISTICA 10.0 was used for analyzing the real data.

  5. MULTIVARIATE STEPWISE LOGISTIC REGRESSION ANALYSIS ON RISK FACTORS OF VENTILATOR-ASSOCIATED PNEUMONIA IN COMPREHENSIVE ICU

    Institute of Scientific and Technical Information of China (English)

    管军; 杨兴易; 赵良; 林兆奋; 郭昌星; 李文放

    2003-01-01

    Objective To investigate the incidence, crude mortality and independent risk factors of ventilator-associated pneumonia (VAP) in a comprehensive ICU in China. Methods The clinical and microbiological data of all 97 patients receiving mechanical ventilation (>48 hr) in our comprehensive ICU during 1999.1-2000.12 were retrospectively collected and analysed. First, several statistically significant risk factors were screened out with univariate analysis; then independent risk factors were determined with multivariate stepwise logistic regression analysis. Results The incidence of VAP was 54.64% (15.60 cases per 1000 ventilation days), and the crude mortality was 47.42%. The interval between the establishment of an artificial airway and the diagnosis of VAP was 6.9 ± 4.3 d. Univariate analysis suggested that indwelling naso-gastric tube, corticosteroid, acid inhibitor, third-generation cephalosporin/imipenem, non-infection lung disease, and extrapulmonary infection were the statistically significant risk factors of

  6. A semiparametric Wald statistic for testing logistic regression models based on case-control data

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    We propose a semiparametric Wald statistic to test the validity of logistic regression models based on case-control data. The test statistic is constructed using a semiparametric ROC curve estimator and a nonparametric ROC curve estimator. The statistic has an asymptotic chi-squared distribution and is an alternative to the Kolmogorov-Smirnov-type statistic proposed by Qin and Zhang in 1997, the chi-squared-type statistic proposed by Zhang in 1999 and the information matrix test statistic proposed by Zhang in 2001. The statistic is easy to compute in the sense that it requires none of the following methods: using a bootstrap method to find its critical values, partitioning the sample data or inverting a high-dimensional matrix. We present some results on simulation and on analysis of two real examples. Moreover, we discuss how to extend our statistic to a family of statistics and how to construct its Kolmogorov-Smirnov counterpart.

  7. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    Science.gov (United States)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years Malaysia has faced drug addiction issues. The most serious is the relapse phenomenon among treated drug addicts (addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors are age, age at first taking drugs, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The study found that age and self-motivation are statistically significant predictors of relapse.

  8. Using Logistic Regression to Model New York City Restaurant Grades Over a Two-Year Period

    Directory of Open Access Journals (Sweden)

    David Nadler

    2014-07-01

    Full Text Available A knowledge gap exists in the role of restaurant type on the prediction of attaining the highest grade possible from the local health inspection agency. This study identified disparities using logistic regression between the issuance of a Grade A and restaurant type and location. This study tested the eight most inspected types of restaurants within the City of New York and calculated the odds ratios of their receiving the highest inspection grade by the New York City Department of Health and Mental Hygiene. A fitted equation has been proposed for the prediction of receiving the highest inspection grade based upon the citywide results of these eight restaurant types from calendar years 2011 and 2012. The results suggest that certain styles of restaurants have lower odds of receiving the highest grade in comparison to American-style restaurants.

  9. Peripheral vascular trauma in children: related factors by the logistic regression method

    Directory of Open Access Journals (Sweden)

    Raquel Nogueira Avelar Silva

    2014-03-01

    Full Text Available The objective of the present study was to identify the factors related to “peripheral vascular trauma” in children aged six months to 12 years. This prospective cohort study included children with peripheral vein punctured for the first time per side and excluded those with high/complete healing of trauma signs after removing the catheter. Daily clinical evaluations were performed in intervals shorter than 24 hours. Data were treated according to Pearson’s test and the logistic regression method. Among the 14 variables considered intervenient, four were statistically associated to the occurrence of trauma: dirtiness and humidity in the catheter insertion site, catheter caliber, and age. A causal relationship was found between the intervenient variables and the outcome, “peripheral vascular trauma”, thus, contributing to forming the knowledge of the peripheral venous puncture in children aged six months to 12 years. Descriptors: Child; Nursing Diagnosis; Veins; Injuries.

  10. Statistical modelling for thoracic surgery using a nomogram based on logistic regression.

    Science.gov (United States)

    Liu, Run-Zhong; Zhao, Ze-Rui; Ng, Calvin S H

    2016-08-01

    A well-developed clinical nomogram is a popular decision-tool, which can be used to predict the outcome of an individual, bringing benefits to both clinicians and patients. With just a few steps on a user-friendly interface, the approximate clinical outcome of patients can easily be estimated based on their clinical and laboratory characteristics. Therefore, nomograms have recently been developed to predict the different outcomes or even the survival rate at a specific time point for patients with different diseases. However, on the establishment and application of nomograms, there is still a lot of confusion that may mislead researchers. The objective of this paper is to provide a brief introduction on the history, definition, and application of nomograms and then to illustrate simple procedures to develop a nomogram with an example based on a multivariate logistic regression model in thoracic surgery. In addition, validation strategies and common pitfalls have been highlighted.

  11. Sensitivity Analysis to Select the Most Influential Risk Factors in a Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Jassim N. Hussain

    2008-01-01

    Full Text Available The traditional variable selection methods for survival data depend on iteration procedures, and control of this process assumes tuning parameters that are problematic and time consuming, especially if the models are complex and have a large number of risk factors. In this paper, we propose a new method based on global sensitivity analysis (GSA) to select the most influential risk factors. This contributes to simplification of the logistic regression model by excluding the irrelevant risk factors, thus eliminating the need to fit and evaluate a large number of models. Data from medical trials are suggested as a way to test the efficiency and capability of this method and as a way to simplify the model. This leads to construction of an appropriate model. The proposed method ranks the risk factors according to their importance.

  12. A simple and efficient algorithm for gene selection using sparse logistic regression.

    Science.gov (United States)

    Shevade, S K; Keerthi, S S

    2003-11-22

    This paper gives a new and efficient algorithm for the sparse logistic regression problem. The proposed algorithm is based on the Gauss-Seidel method and is asymptotically convergent. It is simple and extremely easy to implement; it neither uses any sophisticated mathematical programming software nor needs any matrix operations. It can be applied to a variety of real-world problems like identifying marker genes and building a classifier in the context of cancer diagnosis using microarray data. The gene selection method suggested in this paper is demonstrated on two real-world data sets and the results were found to be consistent with the literature. The implementation of this algorithm is available at the site http://guppy.mpe.nus.edu.sg/~mpessk/SparseLOGREG.shtml. Supplementary material is available at the same site.

  13. A binary logistic regression model for discriminating real protein-protein interface

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The selection and study of descriptive variables of protein-protein complex interface is a major question that many biologists come across when the research of protein-protein recognition is concerned. Several variables have been proposed to understand the structural or energetic features of complex interfaces. Here a systematic study of some of these "traditional" variables, as well as a few new ones, is introduced. With the values of these variables extracted from 42 PDB samples with real or false complex interfaces, a binary logistic regression analysis is performed, which results in an effective empirical model for the evaluation of binding probabilities of protein-protein interfaces. The model is validated with 12 samples, and satisfactory results are obtained for both the training and validation sets. Meanwhile, three potential dimeric interfaces of staphylokinase have been investigated and one with the best suitability to our model is proposed.

  14. Effective factors contraceptive use by logistic regression model in Tehran, 1996

    Directory of Open Access Journals (Sweden)

    Ramezani F

    1999-07-01

    Full Text Available Despite having no fertility intention, about 30% of couples do not use any kind of contraception, which leads to unwanted pregnancy. In this clinical trial study, 4177 subjects who had at least one living child and delivered in one of the 12 university hospitals in Tehran were recruited. The study was conducted in 1996. The questionnaire included questions about contraceptive use and about the subjects' attitudes regarding whether their current pregnancies were wanted or unwanted. Data were analysed using a Logistic Regression Model. Results showed that 20.3% of those who had no fertility intention did not use any kind of contraception method, and 41.1% of the subjects who were using a contraception method before pregnancy had become pregnant unintentionally. Based on the Logistic Regression Model, age, education, women's previous familiarity with contraception methods and husband's education were the most significant factors in contraceptive use. Subjects who were 20 years old or younger or 35 years old or older, and illiterate subjects, were at higher risk of non-use of contraception methods. This risk was not related to the gender of their children, which suggests a positive change in their perspectives towards sex and the number of children. It is suggested that health policy makers choose an appropriate model to enhance literacy, education and counseling for the correct usage of contraceptives and the prevention of unwanted pregnancy.

  15. Logistic regression analysis of the risk factors of acute renal failure complicating limb war injuries

    Directory of Open Access Journals (Sweden)

    Chang-zhi CHENG

    2011-06-01

    Full Text Available Objective To explore the risk factors for the complication of acute renal failure (ARF) in war injuries of the limbs. Methods The clinical data of 352 patients with limb injuries admitted to 303 Hospital of PLA from 1968 to 2002 were retrospectively analyzed. The patients were divided into an ARF group (n=9) and a non-ARF group (n=343) according to the occurrence of ARF, and a case-control study was carried out. Ten factors which might lead to death were analyzed by logistic regression to screen the risk factors for ARF, including causes of trauma, shock after injury, time of admission to hospital after injury, injured sites, combined trauma, number of surgical procedures, presence of foreign matters, features of fractures, amputation, and tourniquet time. Results Fifteen of the 352 patients died (4.3%); among them, 7 patients (46.7%) died of ARF, 3 (20.0%) of pulmonary embolism, 3 (20.0%) of gas gangrene, and 2 (13.3%) of multiple organ failure. Univariate analysis revealed that shock, time before admission to hospital, amputation and tourniquet time were the risk factors for ARF in the wounded with limb injuries, while the logistic regression analysis showed that only amputation was a risk factor for ARF (P < 0.05). Conclusion ARF is the primary cause of death in the wounded with limb injury. Prompt and accurate treatment and optimal timing of amputation may be beneficial in decreasing the incidence and mortality of ARF in the wounded with severe limb injury and ischemic necrosis.

  16. Logistic regression model for diagnosis of transition zone prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh; Fujiwara, Taiki [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim; Emberton, Mark [University College London, Research Department of Urology, London (United Kingdom); Kirkham, Alex; Allen, Clare [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2014-09-17

    We aimed to develop logistic regression (LR) models for classifying prostate cancer within the transition zone on multi-parametric magnetic resonance imaging (mp-MRI). One hundred and fifty-five patients (training cohort, 70 patients; temporal validation cohort, 85 patients) underwent mp-MRI and transperineal-template-prostate-mapping (TPM) biopsy. Positive cores were classified by cancer definitions: (1) any-cancer; (2) definition-1 [≥Gleason 4 + 3 or ≥ 6 mm cancer core length (CCL)] [high risk significant]; and (3) definition-2 (≥Gleason 3 + 4 or ≥ 4 mm CCL) cancer [intermediate-high risk significant]. For each, logistic-regression mp-MRI models were derived from the training cohort and validated internally and with the temporal cohort. Sensitivity/specificity and the area under the receiver operating characteristic (ROC-AUC) curve were calculated. LR model performance was compared to radiologists' performance. Twenty-eight of 70 patients from the training cohort, and 25/85 patients from the temporal validation cohort had significant cancer on TPM. The ROC-AUC of the LR model for classification of cancer was 0.73/0.67 at internal/temporal validation. The radiologist A/B ROC-AUC was 0.65/0.74 (temporal cohort). For patients scored by radiologists as Prostate Imaging Reporting and Data System (Pi-RADS) score 3, sensitivity/specificity of radiologist A 'best guess' and LR model was 0.14/0.54 and 0.71/0.61, respectively; and radiologist B 'best guess' and LR model was 0.40/0.34 and 0.50/0.76, respectively. LR models can improve classification of Pi-RADS score 3 lesions similar to experienced radiologists. (orig.)

  17. Modeling group size and scalar stress by logistic regression from an archaeological perspective.

    Directory of Open Access Journals (Sweden)

    Gianmarco Alberti

    Full Text Available Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in group size, provides scholars with a useful theoretical framework for understanding different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of the critical level of scalar stress on the basis of community size. Drawing upon Johnson's theory and on Dunbar's findings on the cognitive constraints on human group size, a model is built by means of Logistic Regression on the basis of data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of a non-critical vs. critical level of scalar stress for the sake of model building. The model, which is also tested against a sample of archaeological and ethnographic cases: (a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; (b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; (c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122-132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147-170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion.
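    Given an intercept and slope from a simple logistic regression on community size, the estimated probability follows from the logistic function. The sketch below shows only that step. The coefficient values are hypothetical placeholders, not the published estimates, and the function name is an assumption made for illustration.

```python
import math


def critical_stress_probability(size, intercept, slope):
    """Logistic-model probability of critical scalar stress at a given group size."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * size)))


# Hypothetical coefficients (NOT the article's estimates).
for size in (80, 120, 160):
    print(size, round(critical_stress_probability(size, -6.0, 0.05), 3))
```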

  18. Optimization of Game Formats in U-10 Soccer Using Logistic Regression Analysis

    Directory of Open Access Journals (Sweden)

    Amatria Mario

    2016-12-01

    Full Text Available Small-sided games provide young soccer players with better opportunities to develop their skills and progress as individual and team players. There is, however, little evidence on the effectiveness of different game formats in different age groups, and furthermore, these formats can vary between and even within countries. The Royal Spanish Soccer Association replaced the traditional grassroots 7-a-side format (F-7) with the 8-a-side format (F-8) in the 2011-12 season, and the country's regional federations gradually followed suit. The aim of this observational methodology study was to investigate which of these formats best suited the learning needs of U-10 players transitioning from 5-a-side futsal. We built a multiple logistic regression model to predict the success of offensive moves depending on the game format and the area of the pitch in which the move was initiated. Success was defined as a shot at the goal. We also built two simple logistic regression models to evaluate how the game format influenced the acquisition of technical-tactical skills. It was found that the probability of a shot at the goal was higher in F-7 than in F-8 for moves initiated in the Creation Sector-Own Half (0.08 vs 0.07) and the Creation Sector-Opponent's Half (0.18 vs 0.16). The probability was the same (0.04) in the Safety Sector. Children also had more opportunities to control the ball and pass or take a shot in the F-7 format (0.24 vs 0.20), and these were also more likely to be successful in this format (0.28 vs 0.19).

  19. Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression

    DEFF Research Database (Denmark)

    Larsen, Klaus; Merlo, Juan

    2005-01-01

    The logistic regression model is frequently used in epidemiologic studies, yielding odds ratio or relative risk interpretations. Inspired by the theory of linear normal models, the logistic regression model has been extended to allow for correlated responses by introducing random effects. However......, the model does not inherit the interpretational features of the normal model. In this paper, the authors argue that the existing measures are unsatisfactory (and some of them are even improper) when quantifying results from multilevel logistic regression analyses. The authors suggest a measure...... of heterogeneity, the median odds ratio, that quantifies cluster heterogeneity and facilitates a direct comparison between covariate effects and the magnitude of heterogeneity in terms of well-known odds ratios. Quantifying cluster-level covariates in a meaningful way is a challenge in multilevel logistic...
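    The abstract above is truncated and does not state how the median odds ratio (MOR) is computed. A commonly cited definition, assumed here, takes the cluster-level random intercept to be normally distributed on the log-odds scale with variance sigma^2 and sets MOR = exp(sqrt(2*sigma^2) * z), where z is the 75th percentile of the standard normal distribution. A minimal sketch under that assumption:

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """MOR from the random-intercept variance of a multilevel logistic model.

    Assumes normally distributed cluster effects on the log-odds scale.
    """
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


# With no between-cluster variance the MOR is 1 (no heterogeneity).
print(median_odds_ratio(0.0), median_odds_ratio(1.0))
```

    Because the MOR is on the odds ratio scale, it can be set side by side with the odds ratios of individual covariates.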

  20. Neck-focused panic attacks among Cambodian refugees; a logistic and linear regression analysis.

    Science.gov (United States)

    Hinton, Devon E; Chhean, Dara; Pich, Vuth; Um, Khin; Fama, Jeanne M; Pollack, Mark H

    2006-01-01

    Consecutive Cambodian refugees attending a psychiatric clinic were assessed for the presence and severity of current--i.e., at least one episode in the last month--neck-focused panic. Among the whole sample (N=130), in a logistic regression analysis, the Anxiety Sensitivity Index (ASI; odds ratio=3.70) and the Clinician-Administered PTSD Scale (CAPS; odds ratio=2.61) significantly predicted the presence of current neck panic (NP). Among the neck panic patients (N=60), in the linear regression analysis, NP severity was significantly predicted by NP-associated flashbacks (beta=.42), NP-associated catastrophic cognitions (beta=.22), and CAPS score (beta=.28). Further analysis revealed the effect of the CAPS score to be significantly mediated (Sobel test [Baron, R. M., & Kenny, D. A. (1986). The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173-1182]) by both NP-associated flashbacks and catastrophic cognitions. In the care of traumatized Cambodian refugees, NP severity, as well as NP-associated flashbacks and catastrophic cognitions, should be specifically assessed and treated.

  1. Statistical learning techniques applied to epidemiology: a simulated case-control comparison study with logistic regression

    Directory of Open Access Journals (Sweden)

    Land Walker H

    2011-01-01

    Background: When investigating covariate interactions and group associations with standard regression analyses, the relationship between the response variable and exposure may be difficult to characterize. When the relationship is nonlinear, linear modeling techniques do not capture the nonlinear information content. Statistical learning (SL) techniques with kernels are capable of addressing nonlinear problems without making parametric assumptions. However, these techniques do not produce findings relevant for epidemiologic interpretations. A simulated case-control study was used to contrast the information embedding characteristics and separation boundaries produced by a specific SL technique with logistic regression (LR) modeling representing a parametric approach. The SL technique comprised a kernel mapping in combination with a perceptron neural network. Because the LR model has an important epidemiologic interpretation, the SL method was modified to produce the analogous interpretation and generate odds ratios for comparison. Results: The SL approach is capable of generating odds ratios for main effects and risk factor interactions that better capture nonlinear relationships between exposure variables and outcome in comparison with LR. Conclusions: The integration of SL methods in epidemiology may improve both the understanding and interpretation of complex exposure/disease relationships.

  2. Logistic random effects regression models: a comparison of statistical packages for binary and ordinal outcomes

    Directory of Open Access Journals (Sweden)

    Steyerberg Ewout W

    2011-05-01

    Background: Logistic random effects models are a popular tool to analyze multilevel (also called hierarchical) data with a binary or ordinal outcome. Here, we aim to compare different statistical software implementations of these models. Methods: We used individual patient data from 8509 patients in 231 centers with moderate and severe Traumatic Brain Injury (TBI) enrolled in eight Randomized Controlled Trials (RCTs) and three observational studies. We fitted logistic random effects regression models with the 5-point Glasgow Outcome Scale (GOS) as outcome, both dichotomized as well as ordinal, with center and/or trial as random effects, and as covariates age, motor score, pupil reactivity or trial. We then compared the implementations of frequentist and Bayesian methods to estimate the fixed and random effects. Frequentist approaches included R (lme4), Stata (GLLAMM), SAS (GLIMMIX and NLMIXED), MLwiN ([R]IGLS) and MIXOR; Bayesian approaches included WinBUGS, MLwiN (MCMC), R package MCMCglmm and SAS experimental procedure MCMC. Three data sets (the full data set and two sub-datasets) were analysed using basically two logistic random effects models with either one random effect for the center or two random effects for center and trial. For the ordinal outcome in the full data set also a proportional odds model with a random center effect was fitted. Results: The packages gave similar parameter estimates for both the fixed and random effects and for the binary (and ordinal) models for the main study and when based on a relatively large number of level-1 (patient level) data compared to the number of level-2 (hospital level) data. However, when based on a relatively sparse data set, i.e. when the numbers of level-1 and level-2 data units were about the same, the frequentist and Bayesian approaches showed somewhat different results. The software implementations differ considerably in flexibility, computation time, and usability. There are also differences in

  3. Multiple logistic regression model of signalling practices of drivers on urban highways

    Science.gov (United States)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course. Other users are exposed to hazardous situations and the risk of accidents if a driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways, based on factors that may affect the driver's decision, such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data on these factors were collected manually. More than 2000 drivers who performed a lane-changing manoeuvre while driving on two sections of multilane highway were observed. The study finds that a relatively large proportion of drivers failed to give any signal when changing lanes. The analysis indicates that, although the proportion of drivers who failed to signal before a lane-changing manoeuvre is high, compliance among female drivers is better than among male drivers. A binary logistic model was developed to represent the probability that a driver signals before a lane-changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, vehicle speed and traffic volume influence the driver's decision to signal before a lane-changing manoeuvre on a multilane urban highway. In terms of the type of vehicle driven, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  4. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    Science.gov (United States)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  5. A Theoretical Study of the Self-perceived Academic Burden of Primary and Middle School Students: A Multiple Logistic Regression Model Analysis Based on the Beijing Survey Sample

    Institute of Scientific and Technical Information of China (English)

    王东

    2016-01-01

    The "objective theory" and the "construction theory" are the two orientations in research on the causes of students' academic burden. Based on empirical survey data from primary and middle school students in Beijing in 2014, a multiple logistic regression model covering both the "objective theory" and the "construction theory" was established. The model results confirm some hypotheses derived from the "objective theory": students with good grades report a lower level of self-perceived burden; where teacher quality is high, students' self-perceived burden is lighter; and where the school curriculum offers more choice, students' academic burden is also relatively lighter. The multiple logistic regression model also confirms some hypotheses of the "construction theory": the higher the educational attainment a student expects, the heavier the self-perceived burden; students under stronger examination pressure report a heavier burden; and the sense of the value of learning and the enjoyment of learning, which reflect learning attitudes, also have a significant effect on the level of self-perceived burden. The regression results indicate that academic burden does exist among primary and middle school students as an objective reality (objective theory), but students' perception of it varies in intensity (construction theory). On this basis, the author proposes the concept of a "sense of academic burden" in an attempt to integrate the two views. Compared with the traditional concept of "academic burden", the author argues that the "sense of academic burden" offers broader room for theoretical research and is also valuable for policy-oriented research on burden-reduction strategies.

  6. A logistic regression based approach for the prediction of flood warning threshold exceedance

    Science.gov (United States)

    Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara

    2016-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issue of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over the Santerno river catchment (about 450 km²) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out by using an 8-year-long database. The results are quite satisfactory, with slightly better performances
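
    The three precipitation periods described above can be sketched as window sums over an hourly series. This is only an illustration: the function name, the use of a single hourly list for both past and future rainfall, and the default window lengths are assumptions. In an operational setting the triggering rainfall would come from a forecast, not from observations.

```python
def precipitation_predictors(hourly_precip, issue_idx, antecedent_days=5, lead_hours=18):
    """Return (antecedent, last_24h, triggering) rainfall sums.

    hourly_precip : catchment-averaged hourly precipitation values
    issue_idx     : index of the forecast issue time within the series
    antecedent    : sum from antecedent_days days before up to 1 day before issue time
    last_24h      : sum over the 24 hours before issue time
    triggering    : sum over the next lead_hours hours after issue time
    """
    day_before = max(0, issue_idx - 24)
    antecedent_start = max(0, issue_idx - 24 * antecedent_days)
    antecedent = sum(hourly_precip[antecedent_start:day_before])
    last_24h = sum(hourly_precip[day_before:issue_idx])
    triggering = sum(hourly_precip[issue_idx:issue_idx + lead_hours])
    return antecedent, last_24h, triggering


if __name__ == "__main__":
    series = [1.0] * 240  # synthetic series: 10 days of 1 mm/h, for illustration only
    print(precipitation_predictors(series, issue_idx=144))
```

    The resulting triple would then serve as the independent variables of a logistic regression fitted separately for each river section and warning threshold, as the abstract states.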

  7. Flood susceptible analysis at Kelantan river basin using remote sensing and logistic regression model

    Science.gov (United States)

    Pradhan, Biswajeet

    In 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest-hit areas are along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang. In the south, the city of Johor was particularly hard hit. The flood cost nearly a billion ringgit in property damage and many lives. The extent of the damage could have been reduced if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We have evaluated the flood susceptibility and the effect of flood-related factors along the Kelantan river basin using Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived RADARSAT images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using RADARSAT imagery and then enlarged to 1:25,000 scale. Topographical, hydrological and geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen as influencing flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geologic database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood-susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that flood-prone areas can be mapped at 1:25,000, which is comparable to some conventional flood hazard map scales. The flood-prone areas delineated on these maps correspond to areas that would be inundated by significant flooding

  8. Predictive occurrence models for coastal wetland plant communities: delineating hydrologic response surfaces with multinomial logistic regression

    Science.gov (United States)

    Snedden, Gregg A.; Steyer, Gregory D.

    2013-01-01

    Understanding plant community zonation along estuarine stress gradients is critical for effective conservation and restoration of coastal wetland ecosystems. We related the presence of plant community types to estuarine hydrology at 173 sites across coastal Louisiana. Percent relative cover by species was assessed at each site near the end of the growing season in 2008, and hourly water level and salinity were recorded at each site Oct 2007–Sep 2008. Nine plant community types were delineated with k-means clustering, and indicator species were identified for each of the community types with indicator species analysis. An inverse relation between salinity and species diversity was observed. Canonical correspondence analysis (CCA) effectively segregated the sites across ordination space by community type, and indicated that salinity and tidal amplitude were both important drivers of vegetation composition. Multinomial logistic regression (MLR) and Akaike's Information Criterion (AIC) were used to predict the probability of occurrence of the nine vegetation communities as a function of salinity and tidal amplitude, and probability surfaces obtained from the MLR model corroborated the CCA results. The weighted kappa statistic, calculated from the confusion matrix of predicted versus actual community types, was 0.7 and indicated good agreement between observed community types and model predictions. Our results suggest that models based on a few key hydrologic variables can be valuable tools for predicting vegetation community development when restoring and managing coastal wetlands.
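
    The agreement measure mentioned above, a weighted kappa computed from a confusion matrix, can be sketched as follows. The abstract does not state which weighting scheme was used, so the linear disagreement weights here are an assumption, as are the function name and the toy matrix.

```python
def weighted_kappa(confusion):
    """Linearly weighted kappa from a square confusion matrix (list of lists).

    Rows are predicted classes and columns are observed classes (or the reverse).
    Disagreement weights |i - j| / (k - 1) are an assumed scheme.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * confusion[i][j]
            expected += w * row_tot[i] * col_tot[j] / n
    return 1.0 - observed / expected


if __name__ == "__main__":
    toy = [[8, 1, 0], [1, 7, 1], [0, 2, 9]]  # made-up counts, not from the study
    print(round(weighted_kappa(toy), 3))
```

    A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance.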

  9. Identification of Predictive Factors for Diagnosing the Malignancy Level of Breast Cancer Using the Stepwise Binary Logistic Regression Method

    Directory of Open Access Journals (Sweden)

    Retno Aulia Vinarti

    2014-01-01

    The World Health Organization (WHO) reported that deaths caused by cancer worldwide have increased significantly over the last four years. This is also reflected in the increase in breast cancer cases. In Indonesia, these two kinds of cases also account for the highest number of deaths among adult women. According to the Hospital Information System, breast cancer patients in inpatient or outpatient care amounted to 28.7%. More than 40% of all cancers can be prevented with early detection. Information technology can contribute through data mining techniques that shorten diagnosis time, improve accuracy and select the factors for early detection of breast cancer. The stepwise binary logistic regression method has the advantage of adding and removing independent variables according to their level of significance in the model. Based on the weighting analysis, the four highest-ranked variables that deserve most attention are the area of the cancer (area), fineness (smoothness), the number of dots (concave points) or the nucleus of the cancer, and the grey level of the cancer (texture). The accuracy and processing speed of diagnosing the severity of breast cancer can therefore be improved through this method.

  10. Prediction of cannabis and cocaine use in adolescence using decision trees and logistic regression

    Directory of Open Access Journals (Sweden)

    Alfonso L. Palmer

    2010-01-01

    Spain is one of the European countries with the highest prevalence of cannabis and cocaine use among young people. The aim of this study was to investigate the factors related to the consumption of cocaine and cannabis among adolescents. A questionnaire was administered to 9,284 students between 14 and 18 years of age in Palma de Mallorca (47.1% boys and 52.9% girls) whose mean age was 15.59 years. Logistic regression and decision trees were carried out in order to model the consumption of cannabis and cocaine. The results show the use of legal substances and committing fraudulence or theft are the main variables that raise the odds of consuming cannabis. In boys, cannabis consumption and a family history of drug use increase the odds of consuming cocaine, whereas in girls the use of alcohol, behaviours of fraudulence or theft and difficulty in some personal skills influence their odds of consuming cocaine. Finally, ease of access to the substance greatly raises the odds of consuming cocaine and cannabis in both genders. Decision trees highlight the role of consuming other substances and committing fraudulence or theft. The results of this study gain importance when it comes to putting into practice effective prevention programmes.

  11. A semiparametric Wald statistic for testing logistic regression models based on case-control data

    Institute of Scientific and Technical Information of China (English)

    WAN ShuWen

    2008-01-01

    We propose a semiparametric Wald statistic to test the validity of logistic regression models based on case-control data. The test statistic is constructed using a semiparametric ROC curve estimator and a nonparametric ROC curve estimator. The statistic has an asymptotic chi-squared distribution and is an alternative to the Kolmogorov-Smirnov-type statistic proposed by Qin and Zhang in 1997, the chi-squared-type statistic proposed by Zhang in 1999 and the information matrix test statistic proposed by Zhang in 2001. The statistic is easy to compute in the sense that it requires none of the following methods: using a bootstrap method to find its critical values, partitioning the sample data or inverting a high-dimensional matrix. We present some results on simulation and on analysis of two real examples. Moreover, we discuss how to extend our statistic to a family of statistics and how to construct its Kolmogorov-Smirnov counterpart.

  12. Determining the Impact of Residential Neighbourhood Crime on Housing Investment Using Logistic Regression

    Directory of Open Access Journals (Sweden)

    Sunday Emmanuel Olajide

    2016-12-01

    This paper discusses the impact of criminal activities on residential property value. With regard to criminal activities, the paper emphasizes the contribution of each component of property crime. One thousand (1000) structured questionnaires were administered to the residents of residential estates within the South Western States of Nigeria, of which 467 were considered usable after data screening. Purposive and systematic sampling techniques were used, and logistic regression was used to determine the impact of each of the components of residential property crime on housing investment. The results showed P-values of 0.000, 0.322, 0.335, 0.545 and 0.992 for violent crime, incivilities and street crime, burglary and theft, vandalism and robbery respectively. However, the R², which represents the generalisation of the impact of neighbourhood crime on housing investment, was 44% and the aggregate P-value was 0.000. Using the Hosmer and Lemeshow (H-L) test of goodness of fit, the model had approximately 89% predictive probability, which is considered excellent. This indicates that the alternative hypothesis is upheld: residential neighbourhood crime is capable of impacting residential property value. The policy implication of this result is that no effort should be spared in combating residential neighbourhood crime in order to boost and encourage housing investment.

  13. Fisher Scoring Method for Parameter Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia

    2017-06-01

    The GWOLR model combines the geographically weighted regression (GWR) and ordinal logistic regression (OLR) models. Its parameters are estimated by maximum likelihood. This estimation, however, yields a system of nonlinear equations that is difficult to solve, so a numerical approximation approach is required. The iterative approximation approach generally uses the Newton-Raphson (NR) method. The NR method has a disadvantage: its Hessian matrix consists of second derivatives evaluated at each iteration, so it does not always produce converging results. To address this, the NR method is modified by replacing its Hessian matrix with the Fisher information matrix, which is termed Fisher scoring (FS). The present research seeks to determine GWOLR model parameter estimates using the Fisher scoring method and to apply the estimation to data on the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities give the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of sufferers, the IR category of DHF in both villages can be determined.
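
    The Fisher scoring update described above, beta_new = beta + I(beta)^-1 U(beta) with score U and Fisher information I, can be sketched on a much simpler model than GWOLR. The example below assumes an ordinary binary logistic regression with one predictor and no geographic weights; the function name and the toy data are illustrative. For this simple model with the logit link, the expected and observed information coincide, so Fisher scoring and Newton-Raphson give the same steps. In the ordinal, geographically weighted case the two differ.

```python
import math


def fisher_scoring_logistic(x, y, n_iter=25, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1 * x by Fisher scoring (one predictor)."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        u0 = u1 = 0.0          # score vector components
        i00 = i01 = i11 = 0.0  # Fisher information matrix entries
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            u0 += yi - p
            u1 += xi * (yi - p)
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        # Solve the 2x2 system I * delta = U explicitly.
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


if __name__ == "__main__":
    # Toy data (not from the study): 1 of 4 successes at x=0, 3 of 4 at x=1.
    x = [0, 0, 0, 0, 1, 1, 1, 1]
    y = [0, 0, 0, 1, 0, 1, 1, 1]
    print(fisher_scoring_logistic(x, y))
```

    With a binary predictor the fitted probabilities reproduce the observed proportions, which gives a simple check on the result.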

  14. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    Science.gov (United States)

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors - including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target - that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and have lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed.

  15. Modeling susceptibility to deforestation of remaining ecosystems in North Central Mexico with logistic regression

    Institute of Scientific and Technical Information of China (English)

    L. Miranda-Aragón; E.J. Treviño-Garza; J. Jiménez-Pérez; O.A. Aguirre-Calderón; M.A. González-Tagle; M. Pompa-García; C.A. Aguirre-Salado

    2012-01-01

    Determining the underlying factors that foster deforestation and delineating forest areas by levels of susceptibility are among the main challenges when defining policies for forest management and planning at regional scale. The susceptibility to deforestation of remaining forest ecosystems (shrubland, temperate forest and rainforest) was assessed in the state of San Luis Potosi, located in north central Mexico. Spatial analysis techniques were used to detect the deforested areas in the study area during 1993-2007. Logistic regression was used to relate explanatory variables (such as social, investment, forest production, biophysical and proximity factors) with susceptibility to deforestation to construct predictive models with two focuses: general and by biogeographical zone. In all models, deforestation has a positive correlation with distance to rainfed agriculture, and a negative correlation with slope, distance to roads and distance to towns. Other variables were significant in some cases, but in others they had dual relationships, which varied in each biogeographical zone. The results show that the remaining rainforest of the Huasteca region is highly susceptible to deforestation. Both approaches show that more than 70% of the current rainforest area has high and very high levels of susceptibility to deforestation. The values represent a serious concern with regard to global warming if tree carbon is released to the atmosphere. However, after some considerations, encouraging forest environmental services appears to be the best alternative to achieve sustainable forest management.

  16. Modeling Haze Problems in the North of Thailand using Logistic Regression

    Directory of Open Access Journals (Sweden)

    Busayamas Pimpunchat

    2014-07-01

    At present, air pollution is a major problem in the upper northern region of Thailand. Air pollutants have an effect on human health, the economy and the travel industry. The severity of this problem clearly appears every year during the dry season, from February to April. In particular it becomes very serious in March, especially in Chiang Mai province where smoke haze is a major issue. This study looked into related data from 2005-2010 covering eight principal parameters: PM10 (particulate matter with a diameter smaller than 10 micrometers), CO (carbon monoxide), NO2 (nitrogen dioxide), SO2 (sulphur dioxide), RH (relative humidity), NO (nitrogen oxide), pressure, and rainfall. Overall haze problem occurrence was calculated from a logistic regression model. Its dependence on the eight parameters stated above was determined for design conditions using the correlation coefficients with PM10. The proposed overall haze problem modeling can be used as a quantitative assessment criterion for supporting decision making to protect human health. This study proposed to predict haze problem occurrence in 2011. The agreement of the results from the mathematical model with actual measured PM10 concentration data from the Pollution Control Department was quite satisfactory.

  17. [Clinical research XX. From clinical judgment to multiple logistic regression model].

    Science.gov (United States)

    Berea-Baltierra, Ricardo; Rivas-Ruiz, Rodolfo; Pérez-Rodríguez, Marcela; Palacios-Cruz, Lino; Moreno, Jorge; Talavera, Juan O

    2014-01-01

    The complexity of the causality phenomenon in clinical practice implies that the result of a maneuver is not solely caused by the maneuver, but by the interaction among the maneuver and other baseline factors or variables occurring during the maneuver. This requires methodological designs that allow the evaluation of these variables. When the outcome is a binary variable, we use the multiple logistic regression model (MLRM). This multivariate model is useful when we want to predict or explain the effect of a maneuver or exposure on the outcome while adjusting for the effect of several risk factors. In order to perform an MLRM, the outcome or dependent variable must be a binary variable and both categories must be mutually exclusive (e.g. alive/dead, healthy/ill); on the other hand, independent variables or risk factors may be either qualitative or quantitative. The effect measure obtained from this model is the odds ratio (OR) with 95% confidence intervals (CI), from which we can estimate the proportion of the outcome's variability explained through the risk factors. For these reasons, the MLRM is used in clinical research, since one of the main objectives in clinical practice comprises the ability to predict or explain an event where different risk or prognostic factors are taken into account.
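
    The effect measure described above, an odds ratio with a 95% confidence interval, is conventionally obtained from a fitted coefficient and its standard error as OR = exp(beta) with Wald limits exp(beta ± 1.96 * SE). A minimal sketch follows; the function name and the numerical inputs are hypothetical and do not come from the article.

```python
import math


def odds_ratio_ci(beta, se, z=1.959964):
    """Odds ratio and Wald confidence limits from a logistic coefficient.

    beta : estimated log-odds coefficient
    se   : its standard error
    z    : normal quantile (default gives a 95% interval)
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


if __name__ == "__main__":
    # Hypothetical coefficient and standard error, for illustration only.
    or_, lower, upper = odds_ratio_ci(beta=math.log(2.0), se=0.1)
    print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

    An interval that excludes 1 corresponds to a coefficient that differs from zero at the matching significance level.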

  18. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  19. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model.

    Science.gov (United States)

    Wang, Liguo; Park, Hyun Jung; Dasari, Surendra; Wang, Shengqin; Kocher, Jean-Pierre; Li, Wei

    2013-04-01

    Thousands of novel transcripts have been identified using deep transcriptome sequencing. This discovery of a large and 'hidden' transcriptome rejuvenates the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, Fickett TESTCODE statistic and hexamer usage bias. CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly.
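
    The comparison above is stated in terms of sensitivity and specificity. A minimal sketch of how these two quantities are computed from binary labels follows; the function name and the toy labels are assumptions for illustration and have no connection to the CPAT software or its data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels coded 1/0.

    Here 1 would stand for "coding" and 0 for "noncoding"; that coding
    is an assumption made for this example only.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # share of true positives recovered
    specificity = tn / (tn + fp)  # share of true negatives recovered
    return sensitivity, specificity


# Toy labels, invented for the example.
truth = [1, 1, 1, 0, 0]
calls = [1, 1, 0, 0, 1]
sens, spec = sensitivity_specificity(truth, calls)
print(sens, spec)
```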

  20. Binary Logistic Regression Analysis of Foramen Magnum Dimensions for Sex Determination

    Science.gov (United States)

    Kamath, Venkatesh Gokuldas

    2015-01-01

    Purpose. The structural integrity of foramen magnum is usually preserved in fire accidents and explosions due to its resistant nature and secluded anatomical position and this study attempts to determine its sexing potential. Methods. The sagittal and transverse diameters and area of foramen magnum of seventy-two skulls (41 male and 31 female) from south Indian population were measured. The analysis was done using Student's t-test, linear correlation, histogram, Q-Q plot, and Binary Logistic Regression (BLR) to obtain a model for sex determination. The predicted probabilities of BLR were analysed using Receiver Operating Characteristic (ROC) curve. Result. BLR analysis and ROC curve revealed that the predictability of the dimensions in sexing the crania was 69.6% for sagittal diameter, 66.4% for transverse diameter, and 70.3% for area of foramen. Conclusion. The sexual dimorphism of foramen magnum dimensions is established. However, due to considerable overlapping of male and female values, it is unwise to singularly rely on the foramen measurements. However, considering the high sex predictability percentage of its dimensions in the present study and the studies preceding it, the foramen measurements can be used to supplement other sexing evidence available so as to precisely ascertain the sex of the skeleton. PMID:26346917
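
    The abstract evaluates predicted probabilities with an ROC curve. One way to sketch the area under that curve is as the proportion of (positive, negative) pairs in which the positive case receives the higher score, counting ties as one half. The function name and the scores below are hypothetical; the study's data are not reproduced here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the positive class, 0 for the negative class
    scores: predicted probabilities (or any monotone score)
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Invented scores: three of the four positive/negative pairs are ordered correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

    The pairwise form is quadratic in the sample size, which is fine for a sample of a few dozen skulls but would be replaced by a rank-based computation for large data.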

  1. Predicting the "graduate on time (GOT)" of PhD students using binary logistics regression model

    Science.gov (United States)

    Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd

    2016-10-01

    The Malaysian government has recently set a new goal to produce 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to fill the gap. Increasing the number of graduates with a PhD is a challenging process. On many occasions, it has been noted that the struggle to reach the target is even more daunting, and that implementation is far too ideal. Progress has been further slowed as the attrition rate increases. This study aims to apply a proposed model that incorporates several factors in predicting the number of PhD students who will complete their PhD studies on time. A binary logistic regression model is proposed and used on the data set to determine this number. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the results are compared with the actual number for validation purposes.

  2. Use of binary logistic regression technique with MODIS data to estimate wild fire risk

    Science.gov (United States)

    Fan, Hong; Di, Liping; Yang, Wenli; Bonnlander, Brian; Li, Xiaoyan

    2007-11-01

    Many forest fires occur across the globe each year, destroying life and property and strongly impacting ecosystems. In recent years, wildland fires and altered fire disturbance regimes have become a significant management and science problem affecting ecosystems and the wildland/urban interface across the United States and globally. In this paper, we discuss the estimation of 504 probability models for forecasting fire risk for 14 fuel types, 12 months, and one day/week/month in advance, which use 19 years of historical fire data in addition to meteorological and vegetation variables. MODIS land products are utilized as a major data source, and binary logistic regression was adopted to estimate the fire forecast probability. In order to better model the change of fire risk with the transition of seasons, some spatial and temporal stratification strategies were applied. In order to explore the possibilities of real-time prediction, the Matlab distributed computing toolbox was used to accelerate the prediction. Finally, this study gives an evaluation and validation of the predictions based on the ground truth collected. Validation results indicate that these fire risk models have achieved nearly 70% prediction accuracy and that MODIS data are a potential data source for implementing near real-time fire risk prediction.

  3. Identification of the security threshold by logistic regression applied to fuel under accident conditions

    Energy Technology Data Exchange (ETDEWEB)

    Gomes, Daniel de Souza; Baptista Filho, Benedito; Oliveira, Fabio Branco de, E-mail: dsgomes@ipen.br, E-mail: bdbfilho@ipen.br, E-mail: fabio@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil); Giovedi, Claudia, E-mail: claudia.giovedi@labrisco.usp.br [Universidade de Sao Paulo (POLI/USP), Sao Paulo, SP (Brazil). Lab. de Analise, Avaliacao e Gerenciamento de Risco

    2015-07-01

    A reactivity-initiated accident (RIA) is a disastrous failure, which occurs because of an unexpected rise in the fission rate and reactor power. This sudden increase in the reactor power may activate processes that might lead to the failure of fuel cladding. In severe accidents, a disruption of fuel and core melting can occur. The purpose of the present research is to study the patterns of such accidents using exploratory data analysis techniques. A study based on applied statistics was used for simulations. We chose peak enthalpy, pulse width, burnup, fission gas release, and the oxidation of zirconium as input parameters and set the safety boundary conditions. This new approach includes logistic regression. With this, the present research also aims to develop the ability to identify the conditions and the probability of failures. Zirconium-based alloys with 1% niobium, used to fabricate the cladding of the fuel rod elements, were analyzed for high burnup limits at 65 MWd/kgU. The data are based on six decades of investigations from experimental programs: tests performed in American reactors such as the Transient Reactor Test facility (TREAT) and the Power Burst Facility (PBF), experiments carried out in the Japanese program at the Nuclear Safety Research Reactor (NSRR), and experiments in Kazakhstan at the Impulse Graphite Reactor (IGR). The database obtained from these tests served as support for our study. (author)

  4. Lasso logistic regression, GSoft and the cyclic coordinate descent algorithm: application to gene expression data.

    Science.gov (United States)

    Garcia-Magariños, Manuel; Antoniadis, Anestis; Cao, Ricardo; González-Manteiga, Wenceslao

    2010-01-01

    Statistical methods generating sparse models are of great value in the gene expression field, where the number of covariates (genes) under study runs into the thousands while the sample sizes seldom reach a hundred individuals. For phenotype classification, we propose different lasso logistic regression approaches with specific penalizations for each gene. These methods are based on a generalized soft-threshold (GSoft) estimator. We also show that a recent algorithm for convex optimization, namely, the cyclic coordinate descent (CCD) algorithm, provides a way to solve the optimization problem significantly faster than other competing methods. Viewing GSoft as an iterative thresholding procedure allows us to get the asymptotic properties of the resulting estimates in a straightforward manner. Results are obtained for simulated and real data. The leukemia and colon datasets are commonly used to evaluate new statistical approaches, so they are useful for establishing comparisons with similar methods. Furthermore, biological meaning is extracted from the leukemia results, and compared with previous studies. In summary, the approaches presented here give rise to sparse, interpretable models that are competitive with similar methods developed in the field.
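
    For orientation, the ordinary soft-threshold operator, which lasso coordinate-descent updates are commonly built on, can be sketched as below. This is the plain operator with a single threshold, not the generalized GSoft estimator of the abstract, whose exact form is not given here; the function name and the per-gene thresholds in the example are hypothetical.

```python
def soft_threshold(z, lam):
    """Plain soft-thresholding: shrink z toward zero by lam, clipping at zero.

    Equivalent to sign(z) * max(|z| - lam, 0) for lam >= 0.
    """
    if lam < 0:
        raise ValueError("lam must be non-negative")
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


# Hypothetical per-coefficient thresholds, echoing the idea of a
# gene-specific penalty; the numbers are invented.
raw = [1.5, -0.2, 0.7, -2.0]
penalties = [0.5, 0.5, 1.0, 0.25]
shrunk = [soft_threshold(z, lam) for z, lam in zip(raw, penalties)]
print(shrunk)
```

    Coefficients whose magnitude falls below their threshold are set exactly to zero, which is the mechanism that makes the resulting models sparse.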

  5. Variable Selection for Functional Logistic Regression in fMRI Data Analysis

    Directory of Open Access Journals (Sweden)

    Nedret BILLOR

    2015-03-01

    This study was motivated by a classification problem in Functional Magnetic Resonance Imaging (fMRI), a noninvasive imaging technique which allows an experimenter to take images of a subject's brain over time. As fMRI studies usually have a small number of subjects and we assume that there is a smooth, underlying curve describing the observations in fMRI data, this results in incredibly high-dimensional datasets that are functional in nature. High dimensionality is one of the biggest problems in statistical analysis of fMRI data. There is also a need for the development of better classification methods. One of the best things about the fMRI technique is its noninvasiveness. If statistical classification methods are improved, it could aid the advancement of noninvasive diagnostic techniques for mental illness or even degenerative diseases such as Alzheimer's. In this paper, we develop a variable selection technique, which tackles high dimensionality and correlation problems in fMRI data, based on L1 regularization-group lasso for the functional logistic regression model where the response is binary and represents two separate classes; the predictors are functional. We assess our method with a simulation study and an application to a real fMRI dataset.

  6. Evaluation of Inference Adequacy in Cumulative Logistic Regression Models: An Empirical Validation of ISWRidge Relationships

    Institute of Scientific and Technical Information of China (English)

    Cheng-Wu CHEN; Hsien-Chueh Peter YANG; Chen-Yuan CHEN; Alex Kung-Hsiung CHANG; Tsung-Hao CHEN

    2008-01-01

    Internal solitary wave propagation over a submarine ridge results in energy dissipation, in which the hydrodynamic interaction between the wave and the ridge affects the marine environment. This study analyzes the effects of ridge height and potential energy during wave-ridge interaction with binary and cumulative logistic regression models. In testing the global null hypothesis, all p-values are <0.001 under three statistical methods: Likelihood Ratio, Score, and Wald. When the two kinds of models are compared, the test values obtained by the cumulative logistic regression models are better than those obtained by the binary logistic regression models. Although this study employed a cumulative logistic regression model, three probability functions 1, 2 and 3, are utilized for investigating the weighted influence of factors on wave reflection. Deviance and Pearson tests are applied to check the goodness-of-fit of the proposed model. The analytical results demonstrated that both ridge height (X1) and potential energy (X2) significantly impact (p<0.0001) the amplitude-based reflected rate; the p-values for the deviance and Pearson tests are all >0.05 (0.2839 and 0.3438, respectively). That is, the goodness-of-fit between ridge height (X1) and potential energy (X2) can further predict parameters under the scenario of the best parsimonious model. Investigation of 6 measures of predictive power (R2, Max-rescaled R2, Somers' D, Gamma, Tau-a, and c) indicates that these predictive estimates of the proposed model have better predictive ability than ridge height alone, and are very similar to the interaction of ridge height and potential energy. It can be concluded that the goodness-of-fit and prediction ability of the cumulative logistic regression model are better than those of the binary logistic regression model.
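
    To make the cumulative model concrete, the sketch below turns a set of ordered cutpoints and a linear predictor into category probabilities under the proportional-odds form logit P(Y <= j) = alpha_j - eta. The sign convention, the function name, and the numbers are assumptions for illustration; software packages differ on the sign, and the abstract does not state which one was used.

```python
import math


def cumulative_logit_probs(cutpoints, eta):
    """Category probabilities for an ordinal outcome.

    cutpoints: increasing thresholds alpha_1 < ... < alpha_{J-1}
    eta:       linear predictor x'beta (hypothetical value)
    Assumes logit P(Y <= j) = alpha_j - eta.
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cumulative = [1.0 / (1.0 + math.exp(-(a - eta))) for a in cutpoints]
    cumulative.append(1.0)  # P(Y <= J) is always 1
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Three ordered categories with invented cutpoints and eta = 0.
print(cumulative_logit_probs([-1.0, 1.0], 0.0))
```

    With a single cutpoint the same function reduces to the binary logistic model, which is one way to see the two models compared in the abstract as members of one family.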

  7. Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.

    Science.gov (United States)

    Heinze, Georg; Puhr, Rainer

    2010-03-30

    Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.
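
    For reference, Firth's correction for an ordinary likelihood adds half the log-determinant of the Fisher information to the log-likelihood. The second line below is only a presumed analogue for the conditional setting described in the abstract; the symbols $\ell_c$ and $I_c$ are notation introduced here, and the authors' exact formulation may differ.

```latex
% Firth-type penalized log-likelihood (general form)
\ell^{*}(\beta) = \ell(\beta) + \tfrac{1}{2}\,\log\det I(\beta)

% Presumed conditional analogue (assumption, notation introduced here):
% \ell_c is the conditional log-likelihood, I_c its information matrix
\ell_c^{*}(\beta) = \ell_c(\beta) + \tfrac{1}{2}\,\log\det I_c(\beta)
```

    Because the penalty term tends to minus infinity as the information matrix degenerates, the penalized maximum stays finite even when the unpenalized likelihood is monotone.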

  8. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)

  9. Oral health-related risk behaviours and attitudes among Croatian adolescents--multiple logistic regression analysis.

    Science.gov (United States)

    Spalj, Stjepan; Spalj, Vedrana Tudor; Ivanković, Luida; Plancak, Darije

    2014-03-01

    The aim of this study was to explore the patterns of oral health-related risk behaviours in relation to dental status, attitudes, motivation and knowledge among Croatian adolescents. The assessment was conducted in a sample of 750 male subjects, military recruits aged 18-28 in Croatia, using a questionnaire and clinical examination. The mean number of decayed, missing and filled teeth (DMFT) and the Significant Caries Index (SiC) were calculated. Multiple logistic regression models were created for analysis. Although the models of risk behaviours were statistically significant, their explanatory values were quite low. Five of them--rarely brushing teeth, not using hygiene auxiliaries, rarely visiting the dentist, toothache as a primary reason to visit the dentist, and demand for tooth extraction due to toothache--had the highest explanatory values, ranging from 21-29%, and correctly classified 73-89% of subjects. Toothache as a primary reason to visit the dentist, extraction as preferable therapy when toothache occurs, not having brushing education in school and frequent gingival bleeding were significantly related to the population with high caries experience (DMFT ≥ 14 according to SiC), producing odds ratios of 1.6 (95% CI 1.07-2.46), 2.1 (95% CI 1.29-3.25), 1.8 (95% CI 1.21-2.74) and 2.4 (95% CI 1.21-2.74) respectively. The DMFT ≥ 14 model had a low explanatory value of 6.5% and correctly classified 83% of subjects. It can be concluded that oral health-related risk behaviours are interrelated. A poor association was seen between attitudes concerning oral health and oral health-related risk behaviours, indicating insufficient motivation to change lifestyle and habits. Self-reported oral hygiene habits were not strongly related to dental status.

  10. BLOOD PRESSURE AWARENESS AMONG GENERAL POPULATION: A RURAL WEST BENGAL EXPERIENCE WITH LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    Sanjoy Kumar Sadhukhan

    2012-02-01

    Objectives: The study was conducted with the objective of finding out the awareness of self blood pressure in a rural community of West Bengal and the factors associated with it. Methods: A community-based cross-sectional study on self BP awareness among adults (≥18 years) was carried out in a rural community of West Bengal through house-to-house visits. Total study subjects were 1201 (Male=598; Female=603), of which 132 (11%) were hypertensive. Results: Only 17.2% of all study subjects were aware of their own BP readings, with no male-female difference. This awareness was significantly associated with age, education, economic status and hypertension, which remained significant even after multiple logistic regression. Even among hypertensives, only 38% were aware of their self BP. Nearly 67.11% of the study subjects had no knowledge about complications of hypertension. About 86.92% of the study subjects were ignorant about the lifestyle changes required to prevent hypertension. Regarding hypertension control/treatment, 72.85% of study subjects were unaware. In general, males had better knowledge compared to females, although not always statistically significant. Conclusion: Self BP awareness among this study population was very poor even among the hypertensives, leading to a high risk of cerebrovascular accidents and coronary heart diseases. Interpersonal communication in medical facilities as well as other strategies like group discussions (general and focal), mass media and the general education system can be utilized to improve the situation. [National J of Med Res 2012; 2(1): 55-58]

  11. Shock index correlates with extravasation on angiographs of gastrointestinal hemorrhage: a logistics regression analysis.

    Science.gov (United States)

    Nakasone, Yutaka; Ikeda, Osamu; Yamashita, Yasuyuki; Kudoh, Kouichi; Shigematsu, Yoshinori; Harada, Kazunori

    2007-01-01

    We applied multivariate analysis to the clinical findings in patients with acute gastrointestinal (GI) hemorrhage and compared the relationship between these findings and angiographic evidence of extravasation. Our study population consisted of 46 patients with acute GI bleeding. They were divided into two groups. In group 1 we retrospectively analyzed 41 angiograms obtained in 29 patients (age range, 25-91 years; average, 71 years). Their clinical findings, including the shock index (SI), diastolic blood pressure, hemoglobin, platelet counts, and age, were quantitatively analyzed. In group 2, consisting of 17 patients (age range, 21-78 years; average, 60 years), we prospectively applied statistical analysis by a logistic regression model to their clinical findings and then assessed 21 angiograms obtained in these patients to determine whether our model was useful for predicting the presence of angiographic evidence of extravasation. On 18 of 41 (43.9%) angiograms in group 1 there was evidence of extravasation; in 3 patients it was demonstrated only by selective angiography. Factors significantly associated with angiographic visualization of extravasation were the SI and patient age. For differentiation between cases with and cases without angiographic evidence of extravasation, the maximum cutoff point was between 0.51 and 0.53. Of the 21 angiograms obtained in group 2, 13 (61.9%) showed evidence of extravasation; in 1 patient it was demonstrated only on selective angiograms. We found that in 90% of the cases, the prospective application of our model correctly predicted the angiographically confirmed presence or absence of extravasation. We conclude that in patients with GI hemorrhage, angiographic visualization of extravasation is associated with the pre-embolization SI. Patients with a high SI value should undergo study to facilitate optimal treatment planning.

  12. Large scale identification and categorization of protein sequences using structured logistic regression.

    Directory of Open Access Journals (Sweden)

    Bjørn P Pedersen

    BACKGROUND: Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences, and SLR seems well-suited for this task. The classification of P-type ATPases, a large family of ATP-driven membrane pumps transporting essential cations, was selected as a test case that would generate important biological information as well as provide a proof-of-concept for the application of SLR to a large scale bioinformatics problem. RESULTS: Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known sequences, we analysed 9.3 million sequences in the UniProtKB and attempted to classify a large number of P-type ATPases. To examine the distribution of pumps across organisms, we also applied SLR to 1,123 complete genomes from the Entrez genome database. Finally, we analysed the predicted membrane topology of the identified P-type ATPases. CONCLUSIONS: Using the SLR-based classification tool we are able to run a large scale study of P-type ATPases. This study provides proof-of-concept for the application of SLR to a bioinformatics problem, and the analysis of P-type ATPases pinpoints new and interesting targets for further biochemical characterization and structural analysis.

  13. To resuscitate or not to resuscitate: a logistic regression analysis of physician-related variables influencing the decision.

    Science.gov (United States)

    Einav, Sharon; Alon, Gady; Kaufman, Nechama; Braunstein, Rony; Carmel, Sara; Varon, Joseph; Hersch, Moshe

    2012-09-01

    To determine whether variables in physicians' backgrounds influenced their decision to forego resuscitating a patient they did not previously know. Questionnaire survey of a convenience sample of 204 physicians working in the departments of internal medicine, anaesthesiology and cardiology in 11 hospitals in Israel. Twenty per cent of the participants had elected to forego resuscitating a patient they did not previously know without additional consultation. Physicians who had more frequently elected to forego resuscitation had practised medicine for more than 5 years (p=0.013), estimated the number of resuscitations they had performed as being higher (p=0.009), and perceived their experience in resuscitation as sufficient (p=0.001). The variable that predicted the outcome of always performing resuscitation in the logistic regression model was less than 5 years of experience in medicine (OR 0.227, 95% CI 0.065 to 0.793; p=0.02). Physicians' level of experience may affect the probability of a patient's receiving resuscitation, whereas the physicians' personal beliefs and values did not seem to affect this outcome.

  14. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    Science.gov (United States)

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  15. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Energy Technology Data Exchange (ETDEWEB)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam [Pusat Pengajian Sains Matematik, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia amirul@unisel.edu.my, zalila@cs.usm.my, norlida@usm.my, adam@usm.my (Malaysia)

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variable is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.
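
    The statement that q response categories give q-1 logit equations can be sketched as follows: each equation models the log odds of one category against a reference category, and the q probabilities are recovered from the q-1 linear predictors. The function name, the choice of the first category as reference, and the numbers are assumptions made for this example.

```python
import math


def baseline_category_probs(linear_predictors):
    """Recover q class probabilities from q-1 baseline-category logits.

    linear_predictors[k] is eta_k = log(P(Y = k+1) / P(Y = reference)),
    with the reference category listed first in the output.
    """
    exps = [math.exp(eta) for eta in linear_predictors]
    denom = 1.0 + sum(exps)  # the reference category contributes exp(0) = 1
    return [1.0 / denom] + [e / denom for e in exps]


# Invented linear predictors for three categories, e.g.
# normal weight (reference), overweight, obese.
probs = baseline_category_probs([-0.5, -1.2])
print(probs, sum(probs))
```

    Exponentiating a coefficient in one of the q-1 equations gives the odds ratio for that category relative to the reference, holding the other predictors fixed.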

  16. Estimation of Logistic Regression Models in Small Samples. A Simulation Study Using a Weakly Informative Default Prior Distribution

    Science.gov (United States)

    Gordovil-Merino, Amalia; Guardia-Olmos, Joan; Pero-Cebollero, Maribel

    2012-01-01

    In this paper, we used simulations to compare the performance of classical and Bayesian estimations in logistic regression models using small samples. In the performed simulations, conditions were varied, including the type of relationship between independent and dependent variable values (i.e., unrelated and related values), the type of variable…

  17. Establishing the change in antibiotic resistance of Enterococcus faecium strains isolated from Dutch broilers by logistic regression and survival analysis

    NARCIS (Netherlands)

    Stegeman, J.A.; Vernooij, J.C.M.; Khalifa, O.A.; Broek, van den J.; Mevius, D.J.

    2006-01-01

    In this study, we investigated the change in the resistance of Enterococcus faecium strains isolated from Dutch broilers against erythromycin and virginiamycin in 1998, 1999 and 2001 by logistic regression analysis and survival analysis. The E. faecium strains were isolated from caecal samples that

  18. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    Science.gov (United States)

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  19. Diagnosis of hepatic fibrosis in hepatitis B patients by logistic regression modeling based on plasma amino acid ratio and age

    Institute of Scientific and Technical Information of China (English)

    张占卿

    2013-01-01

    Objective: To explore the efficacy of logistic regression modeling based on plasma amino acid profile and patient age for diagnosing hepatic fibrosis in patients with chronic hepatitis B (CHB). Methods: One hundred and forty-eight patients (108 males; mean age: 38.1±11.9 years, range: 16-72 years) histologically

  20. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Science.gov (United States)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variable is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratios. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also the interaction between breakfast intake in a week and sleep duration, and the interaction between gender and protein intake.

  1. Risk factors for subclinical intramammary infection in dairy goats in two longitudinal field studies evaluated by Bayesian logistic regression

    DEFF Research Database (Denmark)

    Koop, Gerrit; Collar, Carol A.; Toft, Nils

    2013-01-01

    are imperfect tests, particularly lacking sensitivity, which leads to misclassification and thus to biased estimates of odds ratios in risk factor studies. The objective of this study was to evaluate risk factors for the true (latent) IMI status of major pathogens in dairy goats. We used Bayesian logistic......, caprine arthritis encephalitis-virus infection status, and kidding season), and uncontrollable risk factors (parity, lactation stage, milk yield, pregnancy status, and breed) were measured in the Dutch study, the Californian study or in both studies. Bayesian logistic regression models were constructed...... in which the true (but latent) infection status was linked to the joint test results, as functions of test sensitivity and specificity. The latent IMI status was the dependent variable in the logistic regression model with risk factors as independent variables and with random herd and goat effects...

  2. Application of fused lasso logistic regression to the study of corpus callosum thickness in early Alzheimer's disease.

    Science.gov (United States)

    Lee, Sang H; Yu, Donghyeon; Bachman, Alvin H; Lim, Johan; Ardekani, Babak A

    2014-01-15

    We propose a fused lasso logistic regression to analyze callosal thickness profiles. The fused lasso regression imposes penalties on both the l1-norm of the model coefficients and their successive differences, and finds only a small number of non-zero coefficients which are locally constant. An iterative method of solving logistic regression with fused lasso regularization is proposed to make this a practical procedure. In this study we analyzed callosal thickness profiles sampled at 100 equal intervals between the rostrum and the splenium. The method was applied to corpora callosa of elderly normal controls (NCs) and patients with very mild or mild Alzheimer's disease (AD) from the Open Access Series of Imaging Studies (OASIS) database. We found specific locations in the genu and splenium of AD patients that are proportionally thinner than those of NCs. Callosal thickness in these regions combined with the Mini Mental State Examination scores differentiated AD from NC with 84% accuracy.
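
    The penalty described above, on both the l1-norm of the coefficients and their successive differences, can be written out as a short sketch. The tuning parameters `lam1` and `lam2` and the example coefficient profiles are hypothetical; the abstract does not report the values used.

```python
def fused_lasso_penalty(beta, lam1, lam2):
    """lam1 * sum|beta_j| + lam2 * sum|beta_j - beta_(j-1)|."""
    sparsity = sum(abs(b) for b in beta)
    smoothness = sum(abs(b1 - b0) for b0, b1 in zip(beta, beta[1:]))
    return lam1 * sparsity + lam2 * smoothness

# Two profiles with the same l1-norm: one locally constant, one jagged.
locally_constant = [0.0, 0.0, 1.5, 1.5, 1.5, 0.0]
jagged = [0.0, 1.5, 0.0, 1.5, 0.0, 1.5]

pen_constant = fused_lasso_penalty(locally_constant, lam1=0.1, lam2=0.5)
pen_jagged = fused_lasso_penalty(jagged, lam1=0.1, lam2=0.5)
```

    The jagged profile is penalised more heavily, so minimising the penalised likelihood favours coefficient profiles that are both sparse and locally constant along the callosal thickness profile.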

  3. Demand Analysis of Logistics Information Matching Platform: A Survey from Highway Freight Market in Zhejiang Province

    Science.gov (United States)

    Chen, Daqiang; Shen, Xiahong; Tong, Bing; Zhu, Xiaoxiao; Feng, Tao

    With increasing competition in the logistics industry and growing pressure to lower logistics costs, the construction of a logistics information matching platform for highway transportation plays an important role, and the accuracy of the platform design is key to whether it operates successfully. Based on survey results on how logistics service providers, customers and regulation authorities access information, and on an in-depth information demand analysis of a logistics information matching platform for highway transportation in Zhejiang province, a survey analysis for the framework of such a platform is provided.

  4. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy: The Case of Neighbourhoods and Health.

    Directory of Open Access Journals (Sweden)

    Juan Merlo

    Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of association (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that distinguishes between "specific" (measures of association) and "general" (measures of variance) contextual effects. Performing two empirical examples we illustrate the methodology, interpret the results and discuss the implications of this kind of analysis in public health. We analyse 43,291 individuals residing in 218 neighbourhoods in the city of Malmö, Sweden in 2006. We study two individual outcomes (psychotropic drug use and choice of private vs. public general practitioner, GP) for which the relative importance of neighbourhood as a source of individual variation differs substantially. In Step 1 of the analysis, we evaluate the OR and the area under the receiver operating characteristic curve (AUC) for individual-level covariates (i.e., age, sex and individual low income). In Step 2, we assess general contextual effects using the AUC. Finally, in Step 3 the OR for a specific neighbourhood characteristic (i.e., neighbourhood income) is interpreted jointly with the proportional change in variance (PCV) and the proportion of ORs in the opposite direction (POOR) statistics. For both outcomes, information on individual characteristics (Step 1) provides a low discriminatory accuracy (AUC = 0.616 for psychotropic drugs; AUC = 0.600 for choosing a private GP). Accounting for neighbourhood of residence (Step 2) only improved the AUC for choosing a private GP (+0.295 units). High neighbourhood income (Step 3) was strongly associated with choosing a private GP (OR = 3.50), but the PCV was only 11% and the POOR 33%. Applying an innovative stepwise multilevel analysis, we observed that, in Malmö, the neighbourhood context per se had a negligible influence on individual use of psychotropic drugs, but

  5. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach.

    Science.gov (United States)

    Amo, Laura

    2016-10-12

    Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps among college-educated and noncollege-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of eHealth behavior. After controlling for college

  6. Bias of using odds ratio estimates in multinomial logistic regressions to estimate relative risk or prevalence ratio and alternatives

    Directory of Open Access Journals (Sweden)

    Suzi Alves Camey

    2014-01-01

    Recent studies have emphasized that there is no justification for using the odds ratio (OR) as an approximation of the relative risk (RR) or prevalence ratio (PR). Erroneous interpretations of the OR as RR or PR must be avoided, as several studies have shown that the OR is not a good approximation for these measures when the outcome is common (> 10%). For multinomial outcomes it is usual to use multinomial logistic regression. In this context, there are no studies showing the impact of using the OR as an approximation of the RR or PR. This study aimed to present and discuss alternative methods to multinomial logistic regression based upon robust Poisson regression and the log-binomial model. The approaches were compared by simulating various possible scenarios. The results showed that the proposed models give more precise and accurate estimates of the RR or PR than multinomial logistic regression, as in the case of the binary outcome. Thus, for multinomial outcomes too, the OR must not be used as an approximation of the RR or PR, since this may lead to incorrect conclusions.
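
    The divergence between the OR and the RR for common outcomes can be checked with a small numerical sketch for a binary outcome. The outcome probabilities below are hypothetical and chosen only so that the RR equals 2 in both scenarios.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Relative risk: ratio of outcome probabilities."""
    return p_exposed / p_unexposed

def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio: ratio of outcome odds."""
    odds_exposed = p_exposed / (1.0 - p_exposed)
    odds_unexposed = p_unexposed / (1.0 - p_unexposed)
    return odds_exposed / odds_unexposed

# Rare outcome (1% vs 2%): the OR stays close to the RR of 2.
rr_rare, or_rare = risk_ratio(0.02, 0.01), odds_ratio(0.02, 0.01)

# Common outcome (30% vs 60%): same RR of 2, but the OR is 3.5.
rr_common, or_common = risk_ratio(0.60, 0.30), odds_ratio(0.60, 0.30)
```

    Reading the common-outcome OR of 3.5 as if it were a relative risk would overstate the effect by 75%, which is the misinterpretation the abstract warns against.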

  7. Ultrasonic Diagnosis of Breast Nodular Lesions by Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    傅增顺

    2012-01-01

    Objective: To establish a logistic regression model for the ultrasonographic diagnosis of breast nodular lesions. Methods: The gray-scale ultrasonography (US) and color Doppler flow imaging (CDFI) characteristics, together with some clinical symptoms, of 205 breast nodular lesions confirmed by surgical pathology were evaluated retrospectively. A logistic model for predicting malignancy of the lesions was built from these characteristics, and a receiver operating characteristic (ROC) curve was used to assess its predictive performance. Results: Nine ultrasonographic characteristics entered the initial screening for the logistic model: posterior echo change, lesion mobility, color Doppler flow signal within the lesion, spiculation sign, microcalcification within the lesion, hyperechoic halo sign, capsule, structural change of the axillary lymph nodes, and aspect ratio. After screening, the three significant factors (posterior echo change, lesion mobility and color Doppler flow signal within the lesion) were entered into a further logistic regression analysis, which improved the goodness of fit. The area under the ROC curve of the logistic regression model was 0.981. Conclusion: A logistic regression model of ultrasonographic characteristics helps in the differential diagnosis of benign and malignant breast lesions.

  8. Fourth annual state of logistics survey for South Africa 2007: logistics for regional growth and development

    CSIR Research Space (South Africa)

    Ittmann, HW

    2008-06-01

    worldwide. While South Africa is ranked a comforting 24th out of 150 countries, the high internal – or domestic – logistics costs remain the biggest concern for industry in our country. If South Africa wants to compete in the global market place... is 100% owned by Imperial Holdings and is home to 70 operating companies. In sub-Saharan Africa, operations are segmented into three key divisions, namely Transport and Warehousing, Consumer Products and Specialised Freight. Imperial Logistics has...

  9. An empirical study of statistical properties of variance partition coefficients for multi-level logistic regression models

    Science.gov (United States)

    Li, J.; Gray, B.R.; Bates, D.M.

    2008-01-01

    Partitioning the variance of a response by design levels is challenging for binomial and other discrete outcomes. Goldstein (2003) proposed four definitions for variance partitioning coefficients (VPC) under a two-level logistic regression model. In this study, we explicitly derived formulae for the multi-level logistic regression model and subsequently studied the distributional properties of the calculated VPCs. Using simulations and a vegetation dataset, we demonstrated associations between different VPC definitions, the importance of methods for estimating VPCs (by comparing VPCs obtained using Laplace and penalized quasi-likelihood methods), and bivariate dependence between VPCs calculated at different levels. Such an empirical study lends immediate support to wider applications of VPC in scientific data analysis.
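
    As an illustration of one VPC definition for a two-level logistic model, the sketch below uses the latent-variable formulation, in which the level-1 residual follows a standard logistic distribution with variance pi^2/3. Whether and how the paper uses this particular definition is not stated in the abstract; the variance value and the function name `vpc_latent` are hypothetical.

```python
import math

LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3  # variance of a standard logistic

def vpc_latent(var_level2):
    """Share of latent-scale variance attributable to the higher level."""
    return var_level2 / (var_level2 + LOGISTIC_RESIDUAL_VARIANCE)

# Hypothetical level-2 (group) random-intercept variance.
vpc = vpc_latent(0.8)
```

    Because the level-1 variance is fixed on the latent scale, this definition depends only on the estimated random-effect variance, so differences between estimation methods carry straight through to the VPC.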

  10. Susceptibility mapping of shallow landslides using kernel-based Gaussian process, support vector machines and logistic regression

    Science.gov (United States)

    Colkesen, Ismail; Sahin, Emrehan Kutlug; Kavzoglu, Taskin

    2016-06-01

    Identification of landslide-prone areas and production of accurate landslide susceptibility zonation maps have been crucial topics for hazard management studies. Since the prediction of susceptibility is one of the main processing steps in landslide susceptibility analysis, selection of a suitable prediction method plays an important role in the success of the susceptibility zonation process. Although simple statistical algorithms (e.g. logistic regression) have been widely used in the literature, the use of advanced non-parametric algorithms in landslide susceptibility zonation has recently become an active research topic. The main purpose of this study is to investigate the possible application of kernel-based Gaussian process regression (GPR) and support vector regression (SVR) for producing a landslide susceptibility map of the Tonya district of Trabzon, Turkey. Results of these two regression methods were compared with the logistic regression (LR) method, which is regarded as a benchmark. Results showed that while the kernel-based GPR and SVR methods generally produced similar results (90.46% and 90.37%, respectively), they outperformed the conventional LR method by about 18%. While confirming the superiority of the GPR method, statistical tests based on ROC statistics, success rate and prediction rate curves revealed a significant improvement in susceptibility map accuracy from applying the kernel-based GPR and SVR methods.

  11. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    Science.gov (United States)

    Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.
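
    The AUC values reported above summarise how well predicted probabilities separate referred from non-referred clients. A minimal sketch of the statistic follows, computed as the share of (positive, negative) pairs that are ranked correctly, with ties counted as one half. The scores are hypothetical and do not come from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted probabilities of referral.
referred = [0.9, 0.8, 0.4]
not_referred = [0.7, 0.3, 0.2, 0.1]
area = auc(referred, not_referred)
```

    Here 11 of the 12 pairs are ordered correctly, so the AUC is 11/12; a value of 0.5 would indicate no discrimination and 1.0 perfect separation.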

  12. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Seyyed Salman Alavi

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with and without accidents. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (SADS) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds of road accidents 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license.

  13. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Science.gov (United States)

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with and without accidents. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who were referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds of road accidents 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license. PMID:28293047

  14. Reproductive risk factors assessment for anaemia among pregnant women in India using a multinomial logistic regression model.

    Science.gov (United States)

    Perumal, Vanamail

    2014-07-01

    To assess reproductive risk factors for anaemia among pregnant women in urban and rural areas of India. The International Institute of Population Sciences, India, carried out the third National Family Health Survey in 2005-2006 to estimate a key indicator from a sample of ever-married women in the reproductive age group 15-49 years. Data on various dimensions were collected using a structured questionnaire, and anaemia was measured using a portable HemoCue instrument. Anaemia prevalence among pregnant women was compared between rural and urban areas using the chi-square test and odds ratio. Multinomial logistic regression analysis was used to determine risk factors. Anaemia prevalence was assessed among 3355 pregnant women from rural areas and 1962 pregnant women from urban areas. Moderate-to-severe anaemia is significantly more common in rural areas (32.4%) than in urban areas (27.3%), with an excess risk of 30%. Gestational age-specific prevalence of anaemia increases significantly in rural areas after 6 months. Pregnancy duration is a significant risk factor in both urban and rural areas. In rural areas, increasing age at marriage and mass media exposure are significant protective factors against anaemia. However, more births in the last five years, alcohol consumption and smoking habits are significant risk factors. In rural areas, various reproductive factors and lifestyle characteristics constitute significant risk factors for moderate-to-severe anaemia. Therefore, intensive health education on reproductive practices and the impact of lifestyle characteristics is warranted to reduce anaemia prevalence. © 2014 John Wiley & Sons Ltd.

  15. Influential factors of red-light running at signalized intersection and prediction using a rare events logistic regression model.

    Science.gov (United States)

    Ren, Yilong; Wang, Yunpeng; Wu, Xinkai; Yu, Guizhen; Ding, Chuan

    2016-10-01

    Red light running (RLR) has become a major safety concern at signalized intersections. To prevent RLR-related crashes, it is critical to identify the factors that significantly impact drivers' RLR behavior, and to predict potential RLR in real time. In this research, nine months of RLR events extracted from high-resolution traffic data collected by loop detectors at three signalized intersections were used to identify the factors that significantly affect RLR behavior. The data analysis indicated that occupancy time, time gap, used yellow time, time left to yellow start, whether the preceding vehicle runs through the intersection during yellow, and whether there is a vehicle passing through the intersection on the adjacent lane were significant factors for RLR behavior. Furthermore, due to the rare-events nature of RLR, a modified rare events logistic regression model was developed for RLR prediction. The rare events logistic regression method has been applied in many fields for rare events studies and shows impressive performance, but so far no previous research has applied this method to study RLR. The results showed that the rare events logistic regression model performed significantly better than the standard logistic regression model. More importantly, the proposed RLR prediction method is based purely on data collected from a single advance loop detector located 400 feet from the stop-bar. This brings great potential for future field applications of the proposed method, since loops have been widely implemented at many intersections and can collect data in real time. This research is expected to contribute significantly to the improvement of intersection safety.

  16. Examining asymmetric effects in the South African Philips curve: Evidence from logistic smooth transition regression (LSTR) models

    OpenAIRE

    Phiri, Andrew

    2015-01-01

    This study contributes to the foregoing literature by investigating asymmetric behaviour within the South African short-run Phillips curve for three versions of the Phillips curve specification namely; the New Classical Phillips curve, the New Keynesian Phillips curve and the Hybrid New Keynesian Phillips curve. To this end, we employ a logistic smooth transition regression (LSTR) econometric model to each of the aforementioned versions of the Phillips curve specifications for quarterly data ...
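
    The core of an LSTR model is a logistic transition function that moves the regression coefficients smoothly between two regimes. The sketch below shows that mechanism for a single regressor; the transition variable, the smoothness parameter `gamma`, the threshold `c` and both regime slopes are hypothetical, since the abstract does not report the specification or estimates.

```python
import math

def transition(s, gamma, c):
    """Logistic transition function G(s; gamma, c), bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-gamma * (s - c)))

def lstr_mean(x, s, phi_low, phi_high, gamma, c):
    """Prediction whose slope shifts smoothly from phi_low to phi_high."""
    g = transition(s, gamma, c)
    return (phi_low * (1.0 - g) + phi_high * g) * x

# Hypothetical settings: threshold c = 6.0 and moderate smoothness gamma = 5.0.
g_low = transition(4.0, gamma=5.0, c=6.0)    # well below the threshold
g_mid = transition(6.0, gamma=5.0, c=6.0)    # exactly at the threshold
g_high = transition(8.0, gamma=5.0, c=6.0)   # well above the threshold
y_mid = lstr_mean(1.0, 6.0, phi_low=0.2, phi_high=0.8, gamma=5.0, c=6.0)
```

    Larger values of `gamma` make the switch between regimes sharper; as `gamma` grows the function approaches a step at `c`, which is how asymmetric behaviour around a threshold is captured.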

  17. A comparative study of slope failure prediction using logistic regression, support vector machine and least square support vector machine models

    Science.gov (United States)

    Zhou, Lim Yi; Shan, Fam Pei; Shimizu, Kunio; Imoto, Tomoaki; Lateh, Habibah; Peng, Koay Swee

    2017-08-01

    A comparative study of logistic regression, support vector machine (SVM) and least square support vector machine (LSSVM) models has been done to predict the slope failure (landslide) along East-West Highway (Gerik-Jeli). The effects of two monsoon seasons (southwest and northeast) that occur in Malaysia are considered in this study. Two related factors of occurrence of slope failure are included in this study: rainfall and underground water. For each method, two predictive models are constructed, namely SOUTHWEST and NORTHEAST models. Based on the results obtained from logistic regression models, two factors (rainfall and underground water level) contribute to the occurrence of slope failure. The accuracies of the three statistical models for two monsoon seasons are verified by using Relative Operating Characteristics curves. The validation results showed that all models produced prediction of high accuracy. For the results of SVM and LSSVM, the models using RBF kernel showed better prediction compared to the models using linear kernel. The comparative results showed that, for SOUTHWEST models, three statistical models have relatively similar performance. For NORTHEAST models, logistic regression has the best predictive efficiency whereas the SVM model has the second best predictive efficiency.
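
    The two kernels compared above differ in how they measure similarity between observations. A minimal sketch of both follows; the `gamma` value and the example feature vectors (standing in for rainfall and underground water level) are hypothetical.

```python
import math

def linear_kernel(x, y):
    """Inner product: yields a linear decision boundary."""
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * squared distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Hypothetical standardised (rainfall, water level) observations.
k_lin = linear_kernel([1.0, 2.0], [3.0, 4.0])
k_rbf = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
k_self = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
```

    Because the RBF kernel depends on distance rather than direction, it can represent non-linear boundaries, which is one plausible reason for the better predictions reported for the RBF-based models.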

  18. The identification of menstrual blood in forensic samples by logistic regression modeling of miRNA expression.

    Science.gov (United States)

    Hanson, Erin K; Mirza, Mohid; Rekab, Kamel; Ballantyne, Jack

    2014-11-01

    We report the identification of sensitive and specific miRNA biomarkers for menstrual blood, a tissue that might provide probative information in certain specialized instances. We incorporated these biomarkers into qPCR assays and developed a quantitative statistical model using logistic regression that permits the prediction of menstrual blood in a forensic sample with a high, and measurable, degree of accuracy. Using the developed model, we achieved 100% accuracy in determining the body fluid of interest for a set of test samples (i.e. samples not used in model development). The development, and details, of the logistic regression model are described. Testing and evaluation of the finalized logistic regression modeled assay using a small number of samples was carried out to preliminarily estimate the limit of detection (LOD), specificity in admixed samples and expression of the menstrual blood miRNA biomarkers throughout the menstrual cycle (25-28 days). The LOD was blood was identified only during the menses phase of the female reproductive cycle in two donors.

  19. Logistic Regression Analysis of Contrast-Enhanced Ultrasound and Conventional Ultrasound Characteristics of Sub-centimeter Thyroid Nodules.

    Science.gov (United States)

    Zhao, Rui-Na; Zhang, Bo; Yang, Xiao; Jiang, Yu-Xin; Lai, Xing-Jian; Zhang, Xiao-Yan

    2015-12-01

    The purpose of the study described here was to determine specific characteristics of thyroid microcarcinoma (TMC) and explore the value of contrast-enhanced ultrasound (CEUS) combined with conventional ultrasound (US) in the diagnosis of TMC. Characteristics of 63 patients with TMC and 39 with benign sub-centimeter thyroid nodules were retrospectively analyzed. Multivariate logistic regression analysis was performed to determine independent risk factors. Four variables were included in the logistic regression models: age, shape, blood flow distribution and enhancement pattern. The area under the receiver operating characteristic curve was 0.919. With 0.113 selected as the cutoff value, sensitivity, specificity, positive predictive value, negative predictive value and accuracy were 90.5%, 82.1%, 89.1%, 84.2% and 87.3%, respectively. Independent risk factors for TMC determined with the combination of CEUS and conventional US were age, shape, blood flow distribution and enhancement pattern. Age was negatively correlated with malignancy, whereas shape, blood flow distribution and enhancement pattern were positively correlated. The logistic regression model involving CEUS and conventional US was found to be effective in the diagnosis of sub-centimeter thyroid nodules.
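
    The reported performance figures can be reproduced from a 2x2 confusion table. The counts below are not given in the abstract; they are inferred as the only whole-number split of the 63 TMC and 39 benign nodules that matches the reported percentages, and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-table counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Inferred counts (assumption): 63 TMC = 57 detected + 6 missed,
# 39 benign = 32 correctly ruled out + 7 false positives.
metrics = diagnostic_metrics(tp=57, fn=6, tn=32, fp=7)
percentages = {name: round(100 * value, 1) for name, value in metrics.items()}
```

    All five values depend on the chosen probability cutoff (0.113 in the study); a different cutoff would trade sensitivity against specificity.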

  20. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

    Science.gov (United States)

    Schaeben, Helmut; Semmler, Georg

    2016-09-01

    The objective of prospectivity modeling is prediction of the conditional probability of the presence (T = 1) or absence (T = 0) of a target T given favorable or prohibitive predictors B, or construction of a two-class (0, 1) classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence, even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate for a lack of conditional independence, whatever the consecutive processing order of the predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate for violations of joint conditional independence if the predictors are indicators.
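
    Estimating conventional weights of evidence "by counting" can be sketched as follows for a single binary predictor B. The cell counts and function names are hypothetical. With several predictors the weights would simply be summed on the logit scale, which is exactly where the joint conditional independence assumption enters.

```python
import math

def wofe_weights(n11, n10, n01, n00):
    """Weights for one binary predictor B, estimated from cell counts.

    n11: B=1,T=1   n10: B=1,T=0   n01: B=0,T=1   n00: B=0,T=0
    """
    p_b1_given_t1 = n11 / (n11 + n01)
    p_b1_given_t0 = n10 / (n10 + n00)
    w_plus = math.log(p_b1_given_t1 / p_b1_given_t0)
    w_minus = math.log((1.0 - p_b1_given_t1) / (1.0 - p_b1_given_t0))
    return w_plus, w_minus

def posterior_prob(prior_prob, weights):
    """Add weights to the prior logit, then map back to a probability."""
    logit = math.log(prior_prob / (1.0 - prior_prob)) + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical counts for 400 cells, 40 of which contain the target.
n11, n10, n01, n00 = 30, 70, 10, 290
prior = (n11 + n01) / (n11 + n10 + n01 + n00)
w_plus, w_minus = wofe_weights(n11, n10, n01, n00)
p_given_present = posterior_prob(prior, [w_plus])
p_given_absent = posterior_prob(prior, [w_minus])
```

    For one predictor the result is exact: the posterior given B = 1 equals 30/100 and the posterior given B = 0 equals 10/300, matching the raw conditional frequencies. Summing weights across predictors reproduces the true conditional probability only when the predictors are conditionally independent given T.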

  1. Building vulnerability to hydro-geomorphic hazards: Estimating damage probability from qualitative vulnerability assessment using logistic regression

    Science.gov (United States)

    Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida

    2016-10-01

    bivariate analyses were applied to better characterize each vulnerability parameter. Multiple corresponding analyses revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.

  2. Logistic quantile regression provides improved estimates for bounded avian counts: A case study of California Spotted Owl fledgling production

    Science.gov (United States)

    Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
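
    The three steps named in this abstract (jitter, logit transform, quantile estimation, then averaging over repetitions) can be sketched for the simplest possible case, an intercept-only model, where the linear quantile regression estimate reduces to a sample quantile. Everything below is an assumption made for illustration: the function names, the bounds placed just outside the jittered range, the made-up counts, and the back-transformation ceil(z - 1), which is the convention commonly used for jittered counts. It is not the authors' code and uses no covariates.

```python
import numpy as np


def logit_bounded(z, lower, upper):
    """Logit transform of z onto the real line, given bounds lower < z < upper."""
    return np.log((z - lower) / (upper - z))


def inv_logit_bounded(h, lower, upper):
    """Inverse of logit_bounded."""
    return (np.exp(h) * upper + lower) / (1.0 + np.exp(h))


def jittered_count_quantile(counts, tau, lower, upper, n_rep=50, seed=0):
    """Intercept-only sketch: jitter, logit-transform, take the tau-quantile,
    repeat n_rep times, average, then back-transform to the count scale."""
    rng = np.random.default_rng(seed)
    counts = np.asarray(counts, dtype=float)
    estimates = []
    for _ in range(n_rep):
        z = counts + rng.uniform(0.0, 1.0, size=counts.shape)  # step 1: jitter
        h = logit_bounded(z, lower, upper)                      # step 2: logit
        estimates.append(np.quantile(h, tau))                   # step 3: quantile
    h_bar = float(np.mean(estimates))
    # Quantiles are equivariant to monotonic transformations, so the averaged
    # estimate can be mapped back through the inverse logit.
    z_q = float(inv_logit_bounded(h_bar, lower, upper))
    return z_q, int(np.ceil(z_q - 1.0))


# Made-up fledgling counts (0 to 3 per territory), not the study data.
counts = np.array([0] * 40 + [1] * 20 + [2] * 30 + [3] * 10)
z_median, count_median = jittered_count_quantile(
    counts, tau=0.5, lower=-0.001, upper=4.001
)
print(f"jittered median = {z_median:.3f}, count median = {count_median}")
```

    In a full analysis the sample quantile in step 3 would be replaced by a linear quantile regression on covariates; the jittering, transformation and back-transformation stay the same.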

  3. Geothermal Favorability Map Derived From Logistic Regression Models of the Western United States (favorabilitysurface.zip)

    Data.gov (United States)

    U.S. Geological Survey, Department of the Interior — This is a surface showing relative favorability for the presence of geothermal systems in the western United States. It is an average of 12 models that correlates...

  4. An alternative to evaluate the efficiency of in vitro culture medium using a logistic regression model

    Directory of Open Access Journals (Sweden)

    Daniel Furtado Ferreira

    2003-01-01

    The evaluation of a culture medium for the in vitro culture of a species is performed using its physical and/or chemical properties. However, the analysis of the experimental results makes it possible to evaluate its quality. In this sense, this work presents an alternative using a logistic model to evaluate the culture medium to be used in vitro. The probabilities provided by this model will be used as a medium evaluator index. The importance of this index is based on the formalization of a statistical criterion for the selection of the adequate culture medium to be used in in vitro culture without excluding its physical and/or chemical properties. To demonstrate this procedure, an experiment determining the ideal medium for the in vitro culture of primary explants of Ipeca [Psychotria ipecacuanha (Brot.) Stokes] was evaluated. The differentiation of the culture medium was based on the presence and absence of the growth regulator BAP (6-benzylaminopurine). A logistic model was fitted as a function of the weight of fresh and dry matter. Minimum, medium and maximum probabilities obtained with this model showed that the culture medium containing BAP was the most adequate for the explant growth. Due to the high discriminative power of these media, detected by the model, their use is recommended as an alternative to select culture medium for similar experiments.

  5. Investigating nonlinear speculation in cattle, corn, and hog futures markets using logistic smooth transition regression models

    OpenAIRE

    Röthig, Andreas; Chiarella, Carl

    2006-01-01

    This article explores nonlinearities in the response of speculators' trading activity to price changes in live cattle, corn, and lean hog futures markets. Analyzing weekly data from March 4, 1997 to December 27, 2005, we reject linearity in all of these markets. Using smooth transition regression models, we find a similar structure of nonlinearities with regard to the number of different regimes, the choice of the transition variable, and the value at which the transition occurs.

  6. Third annual state of logistics survey for South Africa 2006: Implementing logistics strategies in a developing economy

    CSIR Research Space (South Africa)

    Ittmann, HW

    2007-07-01

    in the global market place. In addition, those within the second economy who require focused assistance, specifically from a logistics and supply chain management point of view, cannot be ignored. The CSIR (Council for Scientific and Industrial Research..., and market and economic research. We would like to thank the following organisation for participating in the survey: And a special thanks to our sponsor... Report edited by Isabel Meyer, Ilse Hobbs and Mario...

  7. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.
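
    The workflow described here (fit a logistic regression on the ground, then evaluate the resulting classifier in real time) can be sketched as follows. This is a toy illustration under stated assumptions, not the EFT-1 or Orion software: the simulated data, the two noisy measurements, the 0.5 threshold and the function names fit_logistic and trigger are all hypothetical.

```python
import numpy as np


def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Ground step: fit weights by gradient descent on the mean log loss."""
    Xb = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w


def trigger(w, x, threshold=0.5):
    """In-flight step: one dot product and one sigmoid per evaluation."""
    p = 1.0 / (1.0 + np.exp(-(w[0] + np.dot(w[1:], x))))
    return bool(p >= threshold), float(p)


# Simulated training set: a hidden state s (a standardized altitude margin)
# is observed only through two noisy measurements. The trigger condition
# holds when s < 0.
rng = np.random.default_rng(42)
s = rng.uniform(-3.0, 3.0, size=2000)
X = np.column_stack([s + rng.normal(0.0, 1.0, s.size),
                     s + rng.normal(0.0, 1.0, s.size)])
y = (s < 0).astype(float)

w = fit_logistic(X, y)
pred = (1.0 / (1.0 + np.exp(-(w[0] + X @ w[1:]))) >= 0.5).astype(float)
accuracy = float(np.mean(pred == y))
print(f"training accuracy = {accuracy:.3f}")
print(trigger(w, np.array([-3.0, -3.0])), trigger(w, np.array([3.0, 3.0])))
```

    Only the fitted weights need to be carried on board in such a design; the threshold can be moved away from 0.5 to trade early triggers against late ones.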

  8. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles

  9. Transmission Risks of Schistosomiasis Japonica: Extraction from Back-propagation Artificial Neural Network and Logistic Regression Model

    Science.gov (United States)

    Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong

    2013-01-01

    Background: The transmission of schistosomiasis japonica in a local setting is still poorly understood in the lake regions of the People's Republic of China (P. R. China), and its transmission patterns are closely related to human, social and economic factors. Methodology/Principal Findings: We aimed to apply the integrated approach of artificial neural network (ANN) and logistic regression model in assessment of transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China. By using the back-propagation (BP) of the ANN model, 16 factors out of 27 factors were screened, and the top five factors ranked by the absolute value of mean impact value (MIV) were mainly related to human behavior, i.e. integration of water contact history and infection history, family with past infection, history of water contact, infection history, and infection times. The top five factors screened by the logistic regression model were mainly related to social economics, i.e. village level, economic conditions of family, age group, education level, and infection times. The risk of human infection with S. japonicum is higher in people aged 15 or younger, with lower education, in villages with a higher infection rate, in poorer families, and in people who have been infected more than once. Conclusion/Significance: Both the BP artificial neural network and the logistic regression model, established on a small scale, suggested that individual behavior and socioeconomic status are the most important risk factors in the transmission of schistosomiasis japonica. The results suggest that the young population (≤15) in higher-risk areas is the main target for interventions to control disease transmission. PMID:23556015

  10. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    Science.gov (United States)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, according to the Logistic Regression technique and Random Forests technique that gave best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are: the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  11. Adjusting for unmeasured confounding due to either of two crossed factors with a logistic regression model.

    Science.gov (United States)

    Li, Li; Brumback, Babette A; Weppelmann, Thomas A; Morris, J Glenn; Ali, Afsar

    2016-08-15

    Motivated by an investigation of the effect of surface water temperature on the presence of Vibrio cholerae in water samples collected from different fixed surface water monitoring sites in Haiti in different months, we investigated methods to adjust for unmeasured confounding due to either of the two crossed factors site and month. In the process, we extended previous methods that adjust for unmeasured confounding due to one nesting factor (such as site, which nests the water samples from different months) to the case of two crossed factors. First, we developed a conditional pseudolikelihood estimator that eliminates fixed effects for the levels of each of the crossed factors from the estimating equation. Using the theory of U-Statistics for independent but non-identically distributed vectors, we show that our estimator is consistent and asymptotically normal, but that its variance depends on the nuisance parameters and thus cannot be easily estimated. Consequently, we apply our estimator in conjunction with a permutation test, and we investigate use of the pigeonhole bootstrap and the jackknife for constructing confidence intervals. We also incorporate our estimator into a diagnostic test for a logistic mixed model with crossed random effects and no unmeasured confounding. For comparison, we investigate between-within models extended to two crossed factors. These generalized linear mixed models include covariate means for each level of each factor in order to adjust for the unmeasured confounding. We conduct simulation studies, and we apply the methods to the Haitian data. Copyright © 2016 John Wiley & Sons, Ltd.

  12. Applicability of the Ricketts' posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians

    Directory of Open Access Journals (Sweden)

    Ivan Perez

    2016-01-01

    Background: The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed of six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than the literature and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed.

  13. A Two-Stage Penalized Logistic Regression Approach to Case-Control Genome-Wide Association Studies

    Directory of Open Access Journals (Sweden)

    Jingyuan Zhao

    2012-01-01

    We propose a two-stage penalized logistic regression approach to case-control genome-wide association studies. This approach consists of a screening stage and a selection stage. In the screening stage, main-effect and interaction-effect features are screened by using L1-penalized logistic likelihoods. In the selection stage, the retained features are ranked by the logistic likelihood with the smoothly clipped absolute deviation (SCAD) penalty (Fan and Li, 2001) and Jeffrey’s Prior penalty (Firth, 1993), a sequence of nested candidate models is formed, and the models are assessed by a family of extended Bayesian information criteria (J. Chen and Z. Chen, 2008). The proposed approach is applied to the analysis of the prostate cancer data of the Cancer Genetic Markers of Susceptibility (CGEMS) project in the National Cancer Institute, USA. Simulation studies are carried out to compare the approach with the pair-wise multiple testing approach (Marchini et al. 2005) and the LASSO-patternsearch algorithm (Shi et al. 2007).

  14. A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

    Directory of Open Access Journals (Sweden)

    Chong Wei

    2015-01-01

    Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.

  15. Logistic Regression Analysis of Risk Factors for Stroke

    Institute of Scientific and Technical Information of China (English)

    黄炜

    2014-01-01

    Objective: To analyze the risk factors for stroke in patients treated at our hospital, in order to provide clinical experience for effectively preventing and controlling the occurrence of stroke. Methods: Single-factor analysis and multivariate non-conditional logistic regression were used to compare 133 stroke patients with 87 healthy people. Results: Multivariate logistic regression analysis (forward selection) showed that stroke risk in patients at our hospital was associated with hypertension, diabetes, history of TIA, lipid abnormality and BMI. Conclusion: Active primary prevention targeting the risk factors for stroke can reduce its incidence.

  16. Least Square Support Vector Machine Classifier vs a Logistic Regression Classifier on the Recognition of Numeric Digits

    Directory of Open Access Journals (Sweden)

    Danilo A. López-Sarmiento

    2013-11-01

    This paper compares the performance of a multi-class least squares support vector machine (LSSVM mc) with that of a multi-class logistic regression classifier on the problem of recognizing handwritten numeric digits (0-9). The comparison used a data set consisting of 5000 images of handwritten numeric digits (500 images for each digit from 0 to 9), each image of 20 x 20 pixels. The inputs to each of the systems were vectors of 400 dimensions corresponding to each image (no feature extraction was performed). Both classifiers used a OneVsAll strategy to enable multi-class classification and a random cross-validation function in the process of minimizing the cost function. The comparison metrics were precision and training time under the same computational conditions. Both techniques showed a precision above 95 %, with LS-SVM slightly more accurate. The computational cost, however, differed markedly: LS-SVM training required 16.42 % less time than the logistic regression model under the same low computational conditions.

  17. Modeling Typhoon Event-Induced Landslides Using GIS-Based Logistic Regression: A Case Study of Alishan Forestry Railway, Taiwan

    Directory of Open Access Journals (Sweden)

    Sheng-Chuan Chen

    2013-01-01

    This study develops a model for evaluating the hazard level of landslides at Alishan Forestry Railway, Taiwan, by using logistic regression with the assistance of a geographical information system (GIS). A typhoon event-induced landslide inventory, independent variables, and a triggering factor were used to build the model. The environmental factors such as bedrock lithology from the geology database; topographic aspect, terrain roughness, profile curvature, and distance to river, from the topographic database; and the vegetation index value from SPOT 4 satellite images were used as variables that influence landslide occurrence. The area under curve (AUC) of a receiver operator characteristic (ROC) curve was used to validate the model. Effects of parameters on landslide occurrence were assessed from the corresponding coefficient that appears in the logistic regression function. Thereafter, the model was applied to predict the probability of landslides for rainfall data of different return periods. Using a predicted map of probability, the study area was classified into four ranks of landslide susceptibility: low, medium, high, and very high. As a result, most high susceptibility areas are located on the western portion of the study area. Several train stations and railways are located on sites with a high susceptibility ranking.
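
    The AUC used for validation in this abstract has a simple interpretation: it is the probability that a randomly chosen landslide cell receives a higher predicted probability than a randomly chosen non-landslide cell, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name and the six scores are made up for illustration and are not taken from the study.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up predicted probabilities for three landslide cells (label 1)
# and three stable cells (label 0).
scores = [0.90, 0.80, 0.35, 0.60, 0.20, 0.10]
labels = [1, 1, 1, 0, 0, 0]
print(f"AUC = {auc_from_scores(scores, labels):.3f}")  # 8 of 9 pairs ordered correctly
```

    The double loop is quadratic in the number of cells; for a full raster a rank-based computation would be preferable, but the result is the same quantity.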

  18. Good Corporate Governance and Predicting Financial Distress Using Logistic and Probit Regression Model

    Directory of Open Access Journals (Sweden)

    Juniarti Juniarti

    2013-01-01

    The study aims to test whether good corporate governance (GCG) is able to predict the probability of companies experiencing financial difficulties. Financial ratios that are traditionally used for predicting bankruptcy are retained in this study. In addition, this study compares logit and probit regression models, which are widely used in accounting research on bankruptcy prediction, to determine which model is superior. The sample in this study consists of infrastructure, transportation, utilities & trade, services and hotels companies experiencing financial distress in the period 2008-2011. The results show that GCG and three other control variables, i.e., DTA, CR and company category, do not significantly predict the probability of companies experiencing financial difficulties. NPM is the only variable that significantly distinguishes healthy firms from distressed ones. In general, the logit and probit models do not lead to different conclusions. Both models confirm the goodness of fit of the models and the results of hypothesis testing. In terms of classification accuracy, the logit model gives more accurate predictions than the probit model.

  19. Potential misinterpretation of treatment effects due to use of odds ratios and logistic regression in randomized controlled trials.

    Directory of Open Access Journals (Sweden)

    Mirjam J Knol

    BACKGROUND: In randomized controlled trials (RCTs), the odds ratio (OR) can substantially overestimate the risk ratio (RR) if the incidence of the outcome is over 10%. This study determined the frequency of use of ORs, the frequency of overestimation of the OR as compared with its accompanying RR in published RCTs, and we assessed how often regression models that calculate RRs were used. METHODS: We included 288 RCTs published in 2008 in five major general medical journals (Annals of Internal Medicine, British Medical Journal, Journal of the American Medical Association, Lancet, New England Journal of Medicine). If an OR was reported, we calculated the corresponding RR, and we calculated the percentage of overestimation by using the formula . RESULTS: Of 193 RCTs with a dichotomous primary outcome, 24 (12.4%) presented a crude and/or adjusted OR for the primary outcome. In five RCTs (2.6%), the OR differed more than 100% from its accompanying RR on the log scale. Forty-one of all included RCTs (n = 288; 14.2%) presented ORs for other outcomes, or for subgroup analyses. Nineteen of these RCTs (6.6%) had at least one OR that deviated more than 100% from its accompanying RR on the log scale. Of 53 RCTs that adjusted for baseline variables, 15 used logistic regression. Alternative methods to estimate RRs were only used in four RCTs. CONCLUSION: ORs and logistic regression are often used in RCTs and in many articles the OR did not approximate the RR. Although the authors did not explicitly misinterpret these ORs as RRs, misinterpretation by readers can seriously affect treatment decisions and policy making.
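
    The background statement, that the OR drifts away from the RR as the outcome becomes common, follows directly from the definitions and can be seen with two made-up pairs of risks. The sketch below uses only the textbook definitions of risk ratio and odds ratio; the numbers are illustrative and the overestimation formula used in the study is not reproduced.

```python
def odds(p):
    """Odds corresponding to a probability p, with 0 <= p < 1."""
    return p / (1.0 - p)


def risk_ratio(risk_treated, risk_control):
    return risk_treated / risk_control


def odds_ratio(risk_treated, risk_control):
    return odds(risk_treated) / odds(risk_control)


# Two hypothetical trials with the same risk ratio of 2.
for label, r1, r0 in [("rare outcome", 0.02, 0.01), ("common outcome", 0.40, 0.20)]:
    rr = risk_ratio(r1, r0)
    oratio = odds_ratio(r1, r0)
    print(f"{label}: RR = {rr:.3f}, OR = {oratio:.3f}")
# rare outcome:   RR = 2.000, OR = 2.020
# common outcome: RR = 2.000, OR = 2.667
```

    Reading the second OR as if it were an RR would overstate the treatment effect by a third, which is the kind of misinterpretation the abstract warns about.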

  20. Survey on farmers' willingness for training on modern distance education based on the binary logistic regression model: taking rural areas of the Pearl River Delta as an example

    Institute of Scientific and Technical Information of China (English)

    田兴国; 陈敏慧; 崔建勋; 何淑群; 吕建秋

    2012-01-01

    [Objective] Farmers' willingness to take part in modern distance education training was studied in order to promote the development of modern distance education in agricultural science and technology in rural areas of the Pearl River Delta, and to improve farmers' modern production technology and management skills. [Method] A questionnaire survey of farmers' training needs in rural areas of the Pearl River Delta was conducted, and factors including the individual characteristics of farmers, the conditions of rural modern distance education and the choice of training characteristics were analyzed through the binary logistic regression model. [Result] The training time farmers were willing to accept, participation in a farmer specialty cooperative organization, the training costs they were willing to accept, the number of rural distance education facilities, and educational background were positively related to farmers' willingness for training, whereas computer and internet skills and the desired length of each class were negatively related to farmers' willingness for training. [Suggestion] A few suggestions were put forward, such as increasing investment in modern distance education training, accelerating the construction of farmer specialty cooperative organizations, and innovating in training content and training methods.

  1. Assessment of the classification abilities of the CNS multi-parametric optimization approach by the method of logistic regression.

    Science.gov (United States)

    Raevsky, O A; Polianczyk, D E; Mukhametov, A; Grigorev, V Y

    2016-08-01

    Assessment of "CNS drugs/CNS candidates" classification abilities of the multi-parametric optimization (CNS MPO) approach was performed by logistic regression. It was found that five of the six separately used physical-chemical properties (topological polar surface area, number of hydrogen-bonded donor atoms, basicity, lipophilicity of compound in neutral form and at pH = 7.4) provided accuracy of recognition below 60%. Only the descriptor of molecular weight (MW) could correctly classify two-thirds of the studied compounds. Aggregation of all six properties in the MPOscore did not improve the classification, which was worse than the classification using only MW. The results of our study demonstrate the imperfection of the CNS MPO approach; in its current form it is not very useful for computer design of new, effective CNS drugs.

  2. Binary Logistic Regression Modeling of Idle CO Emissions in Order to Estimate Predictors Influences in Old Vehicle Park

    Directory of Open Access Journals (Sweden)

    Branimir Milosavljević

    2015-01-01

    This paper determines, by experiment, the CO emissions at idle for 1,785 vehicles powered by spark ignition engines, in order to verify the correctness of emissions values with a representative sample of vehicles in Serbia. The permissible emissions limits were considered for three (3) fitted binary logistic regression (BLR) models, and the key reason for such analysis is finding the predictors that can have a crucial influence on the accuracy of the estimation of whether such vehicles have correct emissions or not. Having summarized the research results, we found that vehicles produced in Serbia (hereinafter referred to as “domestic vehicles”) cause more pollution than imported cars (hereinafter referred to as “foreign vehicles”), although domestic vehicles are of lower average age and mileage. Another trend was observed: low-power vehicles and vehicles produced before 1992 are potentially more serious polluters.

  3. Reducing a spatial database to its effective dimensionality for logistic-regression analysis of incidence of livestock disease.

    Science.gov (United States)

    Duchateau, L; Kruska, R L; Perry, B D

    1997-10-01

    Large databases with multiple variables, selected because they are available and might provide an insight into establishing causal relationships, are often difficult to analyse and interpret because of multicollinearity. The objective of this study was to reduce the dimensionality of a multivariable spatial database of Zimbabwe, containing many environmental variables that were collected to predict the distribution of outbreaks of theileriosis (the tick-borne infection of cattle caused by Theileria parva and transmitted by the brown ear tick). Principal-component analysis and varimax rotation of the principal components were first used to select a reduced number of variables. The logistic-regression model was evaluated by appropriate goodness-of-fit tests.

  4. Modelling the spatial distribution of Fasciola hepatica in bovines using decision tree, logistic regression and GIS query approaches for Brazil.

    Science.gov (United States)

    Bennema, S C; Molento, M B; Scholte, R G; Carvalho, O S; Pritsch, I

    2017-11-01

    Fascioliasis is a condition caused by the trematode Fasciola hepatica. In this paper, the spatial distribution of F. hepatica in bovines in Brazil was modelled using a decision tree approach and a logistic regression, combined with a geographic information system (GIS) query. In the decision tree and the logistic model, isothermality had the strongest influence on disease prevalence. Also, the 50-year average precipitation in the warmest quarter of the year was included as a risk factor, having a negative influence on the parasite prevalence. The risk maps developed using both techniques showed a predicted higher prevalence mainly in the South of Brazil. The prediction performance seemed to be high, but both techniques failed to reach a high accuracy in predicting the medium and high prevalence classes for the entire country. The GIS query map, based on the range of isothermality, minimum temperature of coldest month, precipitation of warmest quarter of the year, altitude and the average daily land surface temperature, showed a possibility of presence of F. hepatica in a very large area. The risk maps produced using these methods can be used to focus activities of animal and public health programmes, even on non-evaluated F. hepatica areas.

  5. Logistic Regression Analysis on Heart Diseases of Traditional Chinese Medicine

    Institute of Scientific and Technical Information of China (English)

    宋观礼; 郭伟星; 张启明

    2011-01-01

    Objective: To provide objective data for the clinical diagnosis, prevention and treatment of heart diseases. Methods: A database of Traditional Chinese Medicine clinical cases was built from the case records of famous ancient and modern doctors of Traditional Chinese Medicine, and unconditional multivariate logistic regression was used to screen variables stepwise (P < 0.05). Results: Logistic regression results were obtained for the common clinical syndromes, pathogeneses or pathological results, symptoms and Chinese medicinal herbs of heart diseases, and the importance of the pathogeneses or pathological results, symptoms and Chinese medicinal herbs was expressed quantitatively. Conclusion: Based on the statistical results, the patterns of onset (including clinical syndromes and pathogeneses or pathological results), symptom characteristics and medication patterns of heart diseases were summarized. The physiological functions of the heart in Traditional Chinese Medicine were also inferred and confirmed from the logistic regression data.

  6. Prediction of Foreign Object Debris/Damage (FOD) type for elimination in the aeronautics manufacturing environment through logistic regression model

    Science.gov (United States)

    Espino, Natalia V.

    Foreign Object Debris/Damage (FOD) is a costly and high-risk problem that aeronautics companies such as Boeing and Lockheed Martin, among others, face on their production lines every day. They spend an average of $350 thousand per year fixing FOD problems. FOD can put the lives of pilots, passengers and other crew at high risk. FOD refers to any type of foreign object, particle, debris or agent in the manufacturing environment that could contaminate or damage the product or otherwise undermine quality control standards. FOD falls into the following categories: panstock, manufacturing debris, tools/shop aids, consumables and trash. Although aeronautics industries have put many prevention plans in place, such as housekeeping and "clean as you go" philosophies, training, and use of RFID for tooling control, none of them has been able to completely eradicate the problem. This research presents a logistic regression statistical model approach to predict the probability of FOD type under given specific circumstances such as workstation, month and aircraft/jet being built. FOD Quality Assurance Reports of the last three years were provided by an aeronautical industry for this study. By predicting the type of FOD, custom reduction/elimination plans can be put in place to diminish the problem. Different aircraft were analyzed, and so different models were developed through the same methodology. The results presented are predictions of FOD type for each aircraft and workstation throughout the year, obtained by applying the proposed logistic regression models. This research would help aeronautic industries to address the FOD problem correctly, to identify root causes and to establish actual reduction/elimination plans.

  7. Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results

    Science.gov (United States)

    Osborne, Jason W.

    2012-01-01

    Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These…
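
    The "simple math" this record refers to is the conversion between probabilities, odds and logits. A minimal sketch of that arithmetic follows; it is not taken from the article, and the function names and the coefficient values are hypothetical, chosen only for illustration.

```python
import math


def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Probability that corresponds to a log-odds value z."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a predictor with coefficient beta."""
    return math.exp(beta)


# Hypothetical intercept and slope, used only to show the conversion.
intercept, slope = -1.0, 0.5
for x in (0, 1, 2):
    # Linear predictor on the logit scale, then back to a probability.
    print(x, round(inv_logit(intercept + slope * x), 3))
```

    Reporting the odds ratio or the predicted probabilities, rather than the raw logit coefficients, is usually what makes the results readable to a non-specialist audience.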

  8. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands

    Directory of Open Access Journals (Sweden)

    Lindsay M. Veazey

    2016-07-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i.

  9. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    Science.gov (United States)

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.

  10. LOGISTIC REGRESSION ANALYSIS ON RELATIONSHIP OF SERUM HOMOCYSTEINE, FOLIC ACID, AND VITAMIN B12 WITH CORONARY ARTERIOPATHY

    Institute of Scientific and Technical Information of China (English)

    王真; 郭静宣; 毛节明; 王天成; 赵一呜

    2001-01-01

    Objective: To investigate the relationship of serum homocysteine (Hcy), folic acid and vitamin B12 with coronary arteriopathy. Methods: In a cross-sectional study, serum Hcy levels of 210 cases with coronary heart disease (CHD), 115 non-CHD subjects from a consecutive series of subjects with chest pain or myocardial infarction (MI) undergoing diagnostic coronary angiography, and 63 subjects undergoing health examination were measured using high-performance liquid chromatography (HPLC) with fluorescence detection. Serum folic acid and vitamin B12 levels were measured by radioimmunoassay. Serum cholesterol and lipoproteins were also measured. Information on conventional risk factors was collected by interview. Results: Coronary arteriopathy was related to male sex, smoking, diabetes, folic acid, vitamin B12, ApoA1 and Hcy level. The mean serum Hcy level was significantly higher in CHD patients than in non-CHD patients (19.01±10.36 μmol/L, n=210 vs 11.5±4.97 μmol/L, n=115, P<0.01). The mean serum folic acid and vitamin B12 levels were significantly lower in CHD patients (4.5±1.5 pg/ml vs 414.6±142.3 pg/ml) than in non-CHD patients (5.6±1.4 ng/ml vs 537.7±136.6 ng/ml), P<0.01. There was no difference in mean serum Hcy level between non-CHD cases and the healthy subjects. The mean serum ApoA1 level (1188.8±206.1 mmol/L vs 1262.1±201.4 mmol/L) was significantly lower in CHD patients than in non-CHD patients, P<0.05. CHD patients had higher rates of smoking, aging and diabetes than non-CHD patients. By multivariate logistic regression, the ORs of Hcy, aging, male sex and diabetes were all ≥1 (P<0.01), which means all these factors are independent risk factors. With the forward method, when folic acid, vitamin B12 and Hcy entered the regression model, the coefficient of Hcy changed greatly, indicating multicollinearity in the logistic regression. Conclusion: The results of our study showed that Hcy, male, senility and diabetes were all independent risk

  11. Digital soil mapping using multiple logistic regression on terrain parameters in southern Brazil

    Directory of Open Access Journals (Sweden)

    Elvio Giasson

    2006-06-01

    Soil surveys are necessary sources of information for land use planning, but they are not always available. This study proposes the use of multiple logistic regressions to predict the occurrence of soil types based on reference areas. From a digitized soil map and terrain parameters derived from the digital elevation model in the ArcView environment, several sets of multiple logistic regressions were defined using the statistical software Minitab, establishing relationships between explanatory terrain variables and soil types, using either the original legend or a simplified legend, and with or without stratification of the study area by drainage classes. Terrain parameters such as elevation, distance to stream, flow accumulation, and topographic wetness index were the variables that best explained soil distribution. Stratification by drainage classes did not have a significant effect. Simplification of the original legend increased the accuracy of the method in predicting soil distribution.

  12. Adverse events associated with incretin-based drugs in Japanese spontaneous reports: a mixed effects logistic regression model

    Directory of Open Access Journals (Sweden)

    Daichi Narushima

    2016-03-01

    Background: Spontaneous Reporting Systems (SRSs) are passive systems composed of reports of suspected Adverse Drug Events (ADEs), and are used for Pharmacovigilance (PhV), namely, drug safety surveillance. Exploration of analytical methodologies to enhance SRS-based discovery will contribute to more effective PhV. In this study, we proposed a statistical modeling approach for SRS data to address heterogeneity by a reporting time point. Furthermore, we applied this approach to analyze ADEs of incretin-based drugs such as DPP-4 inhibitors and GLP-1 receptor agonists, which are widely used to treat type 2 diabetes. Methods: SRS data were obtained from the Japanese Adverse Drug Event Report (JADER) database. Reported adverse events were classified according to the MedDRA High Level Terms (HLTs). A mixed effects logistic regression model was used to analyze the occurrence of each HLT. The model treated DPP-4 inhibitors, GLP-1 receptor agonists, hypoglycemic drugs, concomitant suspected drugs, age, and sex as fixed effects, while the quarterly period of reporting was treated as a random effect. Before application of the model, Fisher’s exact tests were performed for all drug-HLT combinations. Mixed effects logistic regressions were performed for the HLTs that were found to be associated with incretin-based drugs. Statistical significance was determined by a two-sided p-value <0.01 or a 99% two-sided confidence interval. Finally, the models with and without the random effect were compared based on Akaike’s Information Criteria (AIC), in which a model with a smaller AIC was considered satisfactory. Results: The analysis included 187,181 cases reported from January 2010 to March 2015. It showed that 33 HLTs, including pancreatic, gastrointestinal, and cholecystic events, were significantly associated with DPP-4 inhibitors or GLP-1 receptor agonists. In the AIC comparison, half of the HLTs reported with incretin-based drugs favored the random effect

  13. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    Science.gov (United States)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using breast cancer occurrence data from 2007-2011. Independent socio-economic variables describing breast cancer occurrence, such as age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status, were obtained for each case. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression in SPSS statistical software; relations between the occurrence of breast cancer across socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  14. 6th Annual state of logistics survey for South Africa 2009

    CSIR Research Space (South Africa)

    Ittman, H

    2010-03-01

    issues are raised and elaborated on in this survey, for example, the cost of bad roads and humanitarian logistics. We believe that findings from studies published in the sixth State of Logistics™ survey will, as in the past, be referenced in numerous.... And this is critical – right now. THE COST OF BAD ROADS TO THE ECONOMY Deteriorating road quality can potentially have many negative effects on the vehicle maintenance costs of a company, which in turn can translate into increased logistics costs and may eventually...

  15. To Set Up a Logistic Regression Prediction Model for Hepatotoxicity of Chinese Herbal Medicines Based on Traditional Chinese Medicine Theory

    Science.gov (United States)

    Liu, Hongjie; Li, Tianhao; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua

    2016-01-01

    Aims. To establish a logistic regression (LR) prediction model for hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. The hepatotoxic and nonhepatotoxic Chinese HMs were related with four properties (p flavors (p 0.05). In total, 12 variables from the four properties and five flavors were available for the LR. Four variables, warm and neutral of the four properties and pungent and salty of the five flavors, were selected to establish the LR prediction model, with a cutoff value of 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables that affected hepatotoxicity. Based on these results, the established LR prediction model had some predictive power for hepatotoxicity of Chinese HMs. PMID:27656240

  16. Comparison of logistic regression and neural network classifiers in the detection of hard exudates in retinal images.

    Science.gov (United States)

    Garcia, Maria; Valverde, Carmen; Lopez, Maria I; Poza, Jesus; Hornero, Roberto

    2013-01-01

    Diabetic Retinopathy (DR) is a common cause of visual impairment in industrialized countries. Automatic recognition of DR lesions in retinal images can contribute to the diagnosis and screening of this disease. The aim of this study is to automatically detect one of these lesions: hard exudates (EXs). Based on their properties, we extracted a set of features from image regions and selected the subset that best discriminated between EXs and the retinal background using logistic regression (LR). The LR model obtained, a multilayer perceptron (MLP) classifier and a radial basis function (RBF) classifier were subsequently used to obtain the final segmentation of EXs. Our database contained 130 images with variable color, brightness, and quality. Fifty of them were used to obtain the training examples. The remaining 80 images were used to test the performance of the method. The highest statistics were achieved for MLP or RBF. Using a lesion based criterion, our results reached a mean sensitivity of 95.9% (MLP) and a mean positive predictive value of 85.7% (RBF). With an image-based criterion, we achieved a 100% mean sensitivity, 87.5% mean specificity and 93.8% mean accuracy (MLP and RBF).
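
    The performance figures quoted in this record (sensitivity, specificity, positive predictive value, accuracy) are all ratios of confusion-matrix counts. A minimal sketch of those definitions follows; the function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return standard binary-classification rates from confusion-matrix counts.

    tp/fp/tn/fn are the numbers of true positives, false positives,
    true negatives and false negatives.
    """
    return {
        "sensitivity": tp / (tp + fn),   # detected positives / all positives
        "specificity": tn / (tn + fp),   # detected negatives / all negatives
        "ppv": tp / (tp + fp),           # correct positive calls / all positive calls
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

    Whether the counts refer to lesions or to whole images changes the figures, which is why the abstract reports lesion-based and image-based criteria separately.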

  17. To Set Up a Logistic Regression Prediction Model for Hepatotoxicity of Chinese Herbal Medicines Based on Traditional Chinese Medicine Theory.

    Science.gov (United States)

    Liu, Hongjie; Li, Tianhao; Chen, Lingxiu; Zhan, Sha; Pan, Meilan; Ma, Zhiguo; Li, Chenghua; Zhang, Zhe

    2016-01-01

    Aims. To establish a logistic regression (LR) prediction model for hepatotoxicity of Chinese herbal medicines (HMs) based on traditional Chinese medicine (TCM) theory and to provide a statistical basis for predicting hepatotoxicity of HMs. Methods. The correlations of hepatotoxic and nonhepatotoxic Chinese HMs with four properties, five flavors, and channel tropism were analyzed with the chi-square test for two-way unordered categorical data. An LR prediction model was established and the accuracy of its predictions was evaluated. Results. The hepatotoxic and nonhepatotoxic Chinese HMs were related with four properties (p 0.05). In total, 12 variables from the four properties and five flavors were available for the LR. Four variables, warm and neutral of the four properties and pungent and salty of the five flavors, were selected to establish the LR prediction model, with a cutoff value of 0.204. Conclusions. Warm and neutral of the four properties and pungent and salty of the five flavors were the variables that affected hepatotoxicity. Based on these results, the established LR prediction model had some predictive power for hepatotoxicity of Chinese HMs.

  18. A Novel Method for Earthquake-triggered Landslides Susceptibility Mapping: Combining the Newmark Displacement Value with Logistic Regression Model

    Science.gov (United States)

    Lin, Q.; Wang, Y.; Song, C.

    2016-12-01

    The Newmark displacement model has been used to predict earthquake-triggered landslides. Logistic regression (LR) is also a common landslide hazard assessment method. We combined the Newmark displacement model and LR and applied them to Wenchuan County and Beichuan County in China, which were affected by the Ms 8.0 Wenchuan earthquake on May 12th, 2008, to develop a mechanism-based landslide occurrence probability model and improve the predictive accuracy. A total of 1904 landslide sites in Wenchuan County and 3800 random non-landslide sites were selected as the training dataset. We applied the Newmark model and obtained the distribution of permanent displacement (Dn) for a 30 × 30 m grid. Four factors (Dn, topographic relief, and distances to drainages and roads) were used as independent variables for LR. Then, a combined model was obtained, with an AUC (area under the curve) value of 0.797 for Wenchuan County. A total of 617 landslide sites and non-landslide sites in Beichuan County were used as a validation dataset with AUC = 0.753. The proposed method may also be applied to earthquake-induced landslides in other regions.
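
    The AUC values reported in this record summarize how well predicted probabilities rank landslide sites above non-landslide sites. A minimal sketch of that calculation follows, using the pairwise (Mann-Whitney) definition; the function name and the scores are hypothetical, and the abstract does not say how the authors computed their AUC.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve from predicted scores.

    Computed as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities, for illustration only.
landslide_scores = [0.9, 0.4]
non_landslide_scores = [0.6, 0.1]
print(auc(landslide_scores, non_landslide_scores))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so values computed on a separate validation area indicate how well the model transfers.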

  19. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    Science.gov (United States)

    Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan

    2016-07-01

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocenters and seismic features of the recorded signals in time, frequency, and time-frequency domains. We applied the technique to 440 microearthquakes -1.7deep and shallow events based on the knowledge gained from existing patterns. The cross validation test showed that events with depth shallower than 250 m can be discriminated from events with hypocentral depth between 1000 to 2000 m with 88% and 90.7% accuracy using logistic regression (LR) and artificial neural network (ANN) models, respectively. Similar results were obtained using single station seismograms. The results show that the spectral features have the highest correlation to source depth. Spectral centroids and 2D cross-correlations in the time-frequency domain are two new seismic features used in this study that showed to be promising measures for seismic event classification. The used machine learning techniques have application for efficient automatic classification of low energy signals recorded at one or more seismic stations.

  20. A nesting site suitability model for rock partridge (Alectoris graeca) in the Apennine Mountains using logistic regression

    Directory of Open Access Journals (Sweden)

    Lorenzo Boccia

    2010-01-01

    The rock partridge has undergone a decline throughout its entire distribution area, including the population of the central Italian Apennine Mountains. Areas of suitable habitat for this species have been reduced due to landscape fragmentation and the dynamics of domestic animal and wildlife management. The present study was conducted in the Province of Rieti, Lazio Region. Geographical and land use predictors were evaluated in a GIS environment to identify the most relevant factors influencing the presence of rock partridge during the nesting period. Logistic regression was then implemented to create a model, characterised by a good level of adequacy, for predicting rock partridge nesting site habitat characteristics. Correct predictions of presence and absence were made in 65.2% and 98.6% of cases, respectively. The ROC value was 0.771, which is statistically significant (P<0.001). The results show that, on a local scale, slope (log), distance from forests, and the presence of bare rocks were statistically significant factors. On a landscape scale, the percentage of forests, the presence of sparse vegetation (over 60%), and a negative Mean Shape Index (MSI) were found to be statistically significant.

  1. Occurrence probability assessment of earthquake-triggered landslides with Newmark displacement values and logistic regression: The Wenchuan earthquake, China

    Science.gov (United States)

    Wang, Ying; Song, Chongzhen; Lin, Qigen; Li, Juan

    2016-04-01

    The Newmark displacement model has been used to predict earthquake-triggered landslides. Logistic regression (LR) is also a common landslide hazard assessment method. We combined the Newmark displacement model and LR and applied them to Wenchuan County and Beichuan County in China, which were affected by the Ms 8.0 Wenchuan earthquake on May 12th, 2008, to develop a mechanism-based landslide occurrence probability model and improve the predictive accuracy. A total of 1904 landslide sites in Wenchuan County and 3800 random non-landslide sites were selected as the training dataset. We applied the Newmark model and obtained the distribution of permanent displacement (Dn) for a 30 × 30 m grid. Four factors (Dn, topographic relief, and distances to drainages and roads) were used as independent variables for LR. Then, a combined model was obtained, with an AUC (area under the curve) value of 0.797 for Wenchuan County. A total of 617 landslide sites and non-landslide sites in Beichuan County were used as a validation dataset with AUC = 0.753. The proposed method may also be applied to earthquake-induced landslides in other regions.

  2. Seismic features and automatic discrimination of deep and shallow induced-microearthquakes using neural network and logistic regression

    Science.gov (United States)

    Mousavi, S. Mostafa; Horton, Stephen P.; Langston, Charles A.; Samei, Borhan

    2016-10-01

    We develop an automated strategy for discriminating deep microseismic events from shallow ones on the basis of the waveforms recorded on a limited number of surface receivers. Machine-learning techniques are employed to explore the relationship between event hypocentres and seismic features of the recorded signals in time, frequency and time-frequency domains. We applied the technique to 440 microearthquakes -1.7 train the system to discriminate between deep and shallow events based on the knowledge gained from existing patterns. The cross-validation test showed that events with depth shallower than 250 m can be discriminated from events with hypocentral depth between 1000 and 2000 m with 88 per cent and 90.7 per cent accuracy using logistic regression and artificial neural network models, respectively. Similar results were obtained using single station seismograms. The results show that the spectral features have the highest correlation to source depth. Spectral centroids and 2-D cross-correlations in the time-frequency domain are two new seismic features used in this study that showed to be promising measures for seismic event classification. The used machine-learning techniques have application for efficient automatic classification of low energy signals recorded at one or more seismic stations.

  3. Food security and vulnerability modeling of East Java Province based on Geographically Weighted Ordinal Logistic Regression Semiparametric (GWOLRS) model

    Directory of Open Access Journals (Sweden)

    N.W. Surya Wardhani

    2014-10-01

    Modeling food security based on the characteristics of an area is affected by geographical location, which means that geographical location affects the region’s potential. Therefore, a statistical modeling method is needed that takes into account the geographical location of the observations. In this setting, predictor variables may be global or local; when some of the predictor variables are global and the others are local, the Geographically Weighted Ordinal Logistic Regression Semiparametric (GWOLRS) model can be used to analyze the data. The data used are the 2011 food resilience and food insecurity data for East Java Province. The results showed that three predictor variables are influenced by location: the percentage of poor people (%), rice production per district (tons) and life expectancy (%). These three predictor variables are local because they have a significant influence in some districts/cities but no significant effect in others, while the other two variables, clean water and good quality road length (km), are assumed global because they are not a significant factor for any of the districts/towns in East Java.

  4. Identifying Environmental and Social Factors Predisposing to Pathological Gambling Combining Standard Logistic Regression and Logic Learning Machine.

    Science.gov (United States)

    Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco

    2017-03-02

    Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.

  5. Financial performance monitoring of the technical efficiency of critical access hospitals: a data envelopment analysis and logistic regression modeling approach.

    Science.gov (United States)

    Wilson, Asa B; Kerr, Bernard J; Bastian, Nathaniel D; Fulton, Lawrence V

    2012-01-01

    From 1980 to 1999, rural designated hospitals closed at a disproportionally high rate. In response to this emergent threat to healthcare access in rural settings, the Balanced Budget Act of 1997 made provisions for the creation of a new rural hospital--the critical access hospital (CAH). The conversion to CAH and the associated cost-based reimbursement scheme significantly slowed the closure rate of rural hospitals. This work investigates which methods can ensure the long-term viability of small hospitals. This article uses a two-step design to focus on a hypothesized relationship between technical efficiency of CAHs and a recently developed set of financial monitors for these entities. The goal is to identify the financial performance measures associated with efficiency. The first step uses data envelopment analysis (DEA) to differentiate efficient from inefficient facilities within a data set of 183 CAHs. Determining DEA efficiency is an a priori categorization of hospitals in the data set as efficient or inefficient. In the second step, DEA efficiency is the categorical dependent variable (efficient = 0, inefficient = 1) in the subsequent binary logistic regression (LR) model. A set of six financial monitors selected from the array of 20 measures were the LR independent variables. We use a binary LR to test the null hypothesis that recently developed CAH financial indicators had no predictive value for categorizing a CAH as efficient or inefficient, (i.e., there is no relationship between DEA efficiency and fiscal performance).

  6. Spatial Analysis of Severe Fever with Thrombocytopenia Syndrome Virus in China Using a Geographically Weighted Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Liang Wu

    2016-11-01

    Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on the SFTSV cover; (2) a GWLR model is suitable for exploring SFTSV cover in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies.

  7. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    Science.gov (United States)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2017-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  8. Adjustment of State Owned and Foreign-Funded Enterprises in China to economic reforms, 1980s-2007: a logistic smooth transition regression (LSTR) approach

    OpenAIRE

    Aizenman, Joshua; Geng, Nan

    2009-01-01

    This paper applies a logistic smooth transition regression approach to the estimation of a homogenous aggregate value added production function of the State Owned (SOE) and Foreign-Funded Enterprises (FFE) in China, 1980s-2007. The transition associated with the economic reforms in China is estimated applying a curvilinear logistic function, where the speed and the timing of the transition are endogenously determined by the data. We find high but gradually declining markups in both ...
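
    As an illustration of the logistic transition idea described above, here is a minimal sketch of the textbook logistic transition function, in which one parameter controls the speed of the transition and another its timing. The function names, the parameter names `gamma` and `c`, and the numeric values are assumptions made for illustration; the abstract does not give the authors' exact specification or estimates.

```python
import math


def logistic_transition(t, gamma, c):
    """Standard logistic transition weight in (0, 1).

    gamma: speed of the transition (larger means more abrupt); hypothetical name.
    c:     timing (midpoint) of the transition; hypothetical name.
    """
    return 1.0 / (1.0 + math.exp(-gamma * (t - c)))


def smooth_transition_value(t, before, after, gamma, c):
    """Blend a pre-reform value and a post-reform value with the logistic weight."""
    g = logistic_transition(t, gamma, c)
    return (1.0 - g) * before + g * after


# Illustrative values only: a transition centred on 1995 with moderate speed.
for year in (1985, 1995, 2005):
    weight = logistic_transition(year, gamma=0.5, c=1995)
    print(year, round(weight, 3))
```

    In an actual LSTR estimation, `gamma` and `c` would be estimated jointly with the regression coefficients rather than fixed in advance.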

  9. A Conditional Logistic Regression Analysis on Risk Factors of Ankylosing Spondylitis

    Institute of Scientific and Technical Information of China (English)

    李俊杰; 季加芬; 董永珍; 杨恺; 贾南; 刘长云

    2013-01-01

    Objective: To screen for risk factors of ankylosing spondylitis (AS) using conditional logistic regression analysis, to evaluate the relative importance of the main risk factors for developing AS, and to explore protective factors, in order to provide data for eugenic counseling and evidence for the primary prevention of AS. Methods: The sample sizes of the case group and the control group were determined from the incidence of AS reported in current epidemiological data, and a 1:2 case-control study was carried out. Patients were surveyed by questionnaire, and the data were analyzed by logistic regression. Results: Chi-square screening identified nine factors with a statistically significant association with the incidence of AS: gender, birth season, HLA-B27 test result, history of rheumatoid arthritis, occupation category, smoking, drinking, history of relocation to a new home, and physical exercise. These nine factors were then entered into the logistic regression analysis, in which gender, season of birth, HLA-B27, and smoking were identified as risk factors for AS. Conclusion: The incidence of AS is influenced by genetic and environmental factors. Gender, season of birth, HLA-B27, and smoking are the main risk factors for AS; avoiding exposure to these risk factors may reduce the incidence of AS.

  10. 9th state of logistics survey for South Africa: connecting neighbours - engaging the world

    CSIR Research Space (South Africa)

    Viljoen, N

    2013-06-01

    The 9th State of Logistics survey for South Africa 2012 delivers a message of action. South Africa must make great strides in addressing critical issues relating to the road freight sector, shifting freight from road to rail and addressing rampant...

  11. Study of risk factors affecting both hypertension and obesity outcome by using multivariate multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Sepedeh Gholizadeh

    2016-07-01

    Background: Obesity and hypertension are among the most important non-communicable diseases, and many studies have examined their prevalence and risk factors univariately within individual geographic regions. Studying the factors that affect obesity and hypertension jointly may be important, and is addressed in this study. Materials & Methods: This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times, and the average was taken as one of the response variables. Hypertension was defined as systolic blood pressure ≥140 and/or diastolic blood pressure ≥90, and obesity was defined as body mass index ≥25. Data were analyzed using a multilevel, multivariate logistic regression model in MLwiN software. Results: The intraclass correlations at the cluster level were 33% for high blood pressure and 37% for obesity, so a two-level model was fitted to the data. The prevalence of obesity and hypertension was 43.6% (95% CI: 40.6-46.5) and 29.4% (95% CI: 26.6-32.1), respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption, and physical activity were the factors affecting blood pressure (p≤0.05). Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity, and place of residence affected obesity (p≤0.05). Conclusion: Multilevel models that take the level structure into account provide more precise estimates. Because obesity and hypertension are major risk factors for cardiovascular disease, identifying high-risk groups allows careful planning for the prevention of non-communicable diseases and the promotion of public health.

  12. Comparison of artificial neural network and logistic regression models for predicting in-hospital mortality after primary liver cancer surgery.

    Directory of Open Access Journals (Sweden)

    Hon-Yi Shi

    BACKGROUND: Since most published articles comparing the performance of artificial neural network (ANN) models and logistic regression (LR) models for predicting hepatocellular carcinoma (HCC) outcomes used only a single dataset, the essential issue of internal validity (reproducibility) of the models has not been addressed. This study aimed to validate the use of an ANN model for predicting in-hospital mortality in HCC surgery patients in Taiwan and to compare the predictive accuracy of the ANN model with that of the LR model. METHODOLOGY/PRINCIPAL FINDINGS: Patients who underwent HCC surgery during the period from 1998 to 2009 were included in the study. This study retrospectively compared 1,000 pairs of LR and ANN models based on initial clinical data for 22,926 HCC surgery patients. For each pair of ANN and LR models, the area under the receiver operating characteristic curve (AUROC), Hosmer-Lemeshow (H-L) statistic and accuracy rate were calculated and compared using paired t-tests. A global sensitivity analysis was also performed to assess the relative significance of input parameters in the system model and the relative importance of variables. Compared to the LR models, the ANN models had a better accuracy rate in 97.28% of cases, a better H-L statistic in 41.18% of cases, and a better AUROC in 84.67% of cases. Surgeon volume was the most influential (sensitive) parameter affecting in-hospital mortality, followed by age and length of stay. CONCLUSIONS/SIGNIFICANCE: In comparison with the conventional LR model, the ANN model in the study was more accurate in predicting in-hospital mortality and had higher overall performance indices. Further studies of this model may consider the effect of a more detailed database that includes complications and clinical examination findings as well as more detailed outcome data.

  13. Identification and validation of a logistic regression model for predicting serious injuries associated with motor vehicle crashes.

    Science.gov (United States)

    Kononen, Douglas W; Flannagan, Carol A C; Wang, Stewart C

    2011-01-01

    A multivariate logistic regression model, based upon National Automotive Sampling System Crashworthiness Data System (NASS-CDS) data for calendar years 1999-2008, was developed to predict the probability that a crash-involved vehicle will contain one or more occupants with serious or incapacitating injuries. These vehicles were defined as containing at least one occupant coded with an Injury Severity Score (ISS) of greater than or equal to 15, in planar, non-rollover crash events involving Model Year 2000 and newer cars, light trucks, and vans. The target injury outcome measure was developed by the Centers for Disease Control and Prevention (CDC)-led National Expert Panel on Field Triage in their recent revision of the Field Triage Decision Scheme (American College of Surgeons, 2006). The parameters to be used for crash injury prediction were subsequently specified by the National Expert Panel. Model input parameters included: crash direction (front, left, right, and rear), change in velocity (delta-V), multiple vs. single impacts, belt use, presence of at least one older occupant (≥ 55 years old), presence of at least one female in the vehicle, and vehicle type (car, pickup truck, van, and sport utility). The model was developed using predictor variables that may be readily available, post-crash, from OnStar-like telematics systems. Model sensitivity and specificity were 40% and 98%, respectively, using a probability cutpoint of 0.20. The area under the receiver operator characteristic (ROC) curve for the final model was 0.84. Delta-V (mph), seat belt use and crash direction were the most important predictors of serious injury. Due to the complexity of factors associated with rollover-related injuries, a separate screening algorithm is needed to model injuries associated with this crash mode. Copyright © 2010 Elsevier Ltd. All rights reserved.
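
    To make the reported operating point concrete, the following minimal sketch shows how sensitivity and specificity are computed once predicted probabilities are dichotomized at a cutpoint such as 0.20. The function name and the toy probabilities and labels are invented for illustration; they are not NASS-CDS data and do not reproduce the 40% and 98% figures in the abstract.

```python
def sensitivity_specificity(probs, labels, cutpoint=0.20):
    """Classify each case as positive when its predicted probability >= cutpoint,
    then return (sensitivity, specificity) against the true 0/1 labels."""
    tp = fn = tn = fp = 0
    for p, y in zip(probs, labels):
        predicted_positive = p >= cutpoint
        if y == 1 and predicted_positive:
            tp += 1          # seriously injured, flagged by the model
        elif y == 1:
            fn += 1          # seriously injured, missed by the model
        elif predicted_positive:
            fp += 1          # not seriously injured, flagged anyway
        else:
            tn += 1          # not seriously injured, correctly not flagged
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): predicted probabilities and true outcomes.
toy_probs = [0.05, 0.10, 0.25, 0.60, 0.15, 0.30]
toy_labels = [0, 0, 0, 1, 1, 1]
sens, spec = sensitivity_specificity(toy_probs, toy_labels, cutpoint=0.20)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Raising the cutpoint generally trades sensitivity for specificity, which is why a single pair of values is always reported together with the cutpoint used.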

  14. Predicting China’s SME Credit Risk in Supply Chain Financing by Logistic Regression, Artificial Neural Network and Hybrid Models

    Directory of Open Access Journals (Sweden)

    You Zhu

    2016-05-01

    Based on logistic regression (LR) and artificial neural network (ANN) methods, we construct an LR model, an ANN model and three types of a two-stage hybrid model. The two-stage hybrid model integrates the LR and ANN approaches. We predict the credit risk of China’s small and medium-sized enterprises (SMEs) for financial institutions (FIs) in supply chain financing (SCF) by applying the above models. In the empirical analysis, the quarterly financial and non-financial data of 77 listed SMEs and 11 listed core enterprises (CEs) in the period of 2012–2013 are chosen as the samples. The empirical results show that: (i) the “negative signal” prediction accuracy ratio of the ANN model is better than that of the LR model; (ii) the two-stage hybrid model type I predicts “positive signals” better than the ANN model; (iii) the two-stage hybrid model type II is stronger than the two-stage hybrid model type I at predicting both “positive signals” and “negative signals”; and (iv) the “negative signal” predictive power of the two-stage hybrid model type III is stronger than that of the two-stage hybrid model type II. In summary, the two-stage hybrid model type III has the best classification capability for forecasting SME credit risk in SCF, and can be a useful prediction tool for China’s FIs.

  15. SU-F-BRD-01: A Logistic Regression Model to Predict Objective Function Weights in Prostate Cancer IMRT

    Energy Technology Data Exchange (ETDEWEB)

    Boutilier, J; Chan, T; Lee, T [University of Toronto, Toronto, Ontario (Canada); Craig, T; Sharpe, M [University of Toronto, Toronto, Ontario (Canada); The Princess Margaret Cancer Centre - UHN, Toronto, ON (Canada)

    2014-06-15

    Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulated radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant when compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights compared to the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows the treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.

  16. Logistic regression analysis of the outcome on 90 d and associated factors in conscious patients with intracerebral hemorrhage

    Directory of Open Access Journals (Sweden)

    ZHEN Zhi-gang

    2013-09-01

    Objective: To investigate the outcome at 90 d and the factors influencing that outcome in conscious patients with intracerebral hemorrhage (ICH). Methods: Two hundred and twenty-five patients with ICH were admitted to our hospital within 6 h after onset and were suitable for conservative medical therapy. Patients were divided into two groups, the conscious group [Glasgow Coma Scale (GCS) score ≥ 9] and the coma group (GCS score ≤ 8). Clinical features including gender, age, National Institute of Health Stroke Scale (NIHSS) score, etc., were recorded. The prognosis of these patients at 90 d after onset was evaluated by the following indices: survival or death; favorable prognosis [modified Rankin Scale (mRS) score ≤ 2] or unfavorable prognosis (mRS score ≥ 3, death or severe disability). The differences in clinical features and prognosis between the conscious group and the coma group were explored. The prognosis of the patients in the conscious group was analyzed, and influencing factors for prognosis were explored. Results: Multifactorial logistic regression analysis indicated that hyperglycemia, higher NIHSS score, rehemorrhagia and hematemesis were independent risk factors for 90-day mortality. On the other hand, advanced age, higher NIHSS score, rehemorrhagia and hematemesis were independent risk factors for death or severe disability at 90 days. Conclusion: In ICH patients who were conscious on admission, hyperglycemia, advanced age, higher NIHSS score, rehemorrhagia and hematemesis are strong predictors of mortality and unfavourable outcome. Controlling hyperglycemia and preventing rehemorrhagia and hematemesis are important elements for reducing 90-day mortality and severe disability.

  17. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    Science.gov (United States)

    Yao, W.; Poleswki, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. The evidence theory infers a degree of belief for pixel labelling from different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas, which consist of two data sources from LiDAR and a color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy can be explained not only by the nature of the input data sources (e.g., the above-ground height from the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation), but also by the decision-level fusion of the CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  18. Binary Logistic Regression Analysis for Detecting Differential Item Functioning: Effectiveness of R[superscript 2] and Delta Log Odds Ratio Effect Size Measures

    Science.gov (United States)

    Hidalgo, Mª Dolores; Gómez-Benito, Juana; Zumbo, Bruno D.

    2014-01-01

    The authors analyze the effectiveness of the R[superscript 2] and delta log odds ratio effect size measures when using logistic regression analysis to detect differential item functioning (DIF) in dichotomous items. A simulation study was carried out, and the Type I error rate and power estimates under conditions in which only statistical testing…

  19. A Logistic Regression Analysis of Turkey's 15-Year-Olds' Scoring above the OECD Average on the PISA'09 Reading Assessment

    Science.gov (United States)

    Kasapoglu, Koray

    2014-01-01

    This study aims to investigate which factors are associated with Turkey's 15-year-olds' scoring above the OECD average (493) on the PISA'09 reading assessment. Collected from a total of 4,996 15-year-old students from Turkey, data were analyzed by logistic regression analysis in order to model the data of students who were split into two: (1)…

  1. The Research about the Comparison of RBF Neural Network with Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    姚应水; 叶明全

    2011-01-01

    Objective: The RBF neural network is an important classification model in data mining; this study explores its application to discriminant analysis problems in medicine through a comparison with logistic regression. Methods: The performance of the RBF neural network and the logistic regression model was compared on an example data set using several statistical indexes. Results: The fit on the training data and the generalization ability of the RBF neural network were clearly better than those of the logistic regression model. Conclusion: The RBF neural network has good prospects for application in medical statistics.

  2. Logistic versus Hazards Regression Analyses in Evaluation Research: An Exposition and Application to the North Carolina Court Counselors' Intensive Protective Supervision Project.

    Science.gov (United States)

    Land, Kenneth C.; And Others

    1994-01-01

    Advantages of using logistic and hazards regression techniques in assessing the overall impact of a treatment program and the differential impact on client subgroups are examined and compared using data from a juvenile court program for status offenders. Implications are drawn for management and effectiveness of intensive supervision programs.…

  3. Logistic regression and artificial neural network models for mapping of regional-scale landslide susceptibility in volcanic mountains of West Java (Indonesia)

    Science.gov (United States)

    Ngadisih, Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.

    2016-05-01

    West Java Province is the most landslide-prone area in Indonesia owing to extreme geomorphological conditions, climatic conditions and densely populated settlements with extensive completed and ongoing development activities. A landslide susceptibility map at regional scale in this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in the volcanic mountains of West Java Province. In addition, the models are used to identify the factors that contribute most to landslide events in the study area. The spatial database built on a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano are the most influential factors for landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.

  4. Detection of Aberrant Responding on a Personality Scale in a Military Sample: An Application of Evaluating Person Fit with Two-Level Logistic Regression

    Science.gov (United States)

    Woods, Carol M.; Oltmanns, Thomas F.; Turkheimer, Eric

    2008-01-01

    Person-fit assessment is used to identify persons who respond aberrantly to a test or questionnaire. In this study, S. P. Reise's (2000) method for evaluating person fit using 2-level logistic regression was applied to 13 personality scales of the Schedule for Nonadaptive and Adaptive Personality (SNAP; L. Clark, 1996) that had been administered…

  5. Three Statistical Testing Procedures in Logistic Regression: Their Performance in Differential Item Functioning (DIF) Investigation. Research Report. ETS RR-09-35

    Science.gov (United States)

    Paek, Insu

    2009-01-01

    Three statistical testing procedures well-known in the maximum likelihood approach are the Wald, likelihood ratio (LR), and score tests. Although well-known, the application of these three testing procedures in the logistic regression method to investigate differential item function (DIF) has not been rigorously made yet. Employing a variety of…

  6. Multivariate logistic regression analysis of postoperative complications and risk model establishment of gastrectomy for gastric cancer: A single-center cohort report.

    Science.gov (United States)

    Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing

    2016-01-01

    Reporting of surgical complications is common, but few reports provide information about the severity of complications or estimate their risk factors, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis, and 11 significant variables from the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation $p = \exp\left(\sum_i B_i X_i\right) / \left(1 + \exp\left(\sum_i B_i X_i\right)\right)$, a multivariate logistic regression predictive model for the risk of postoperative morbidity was developed: $p = 1 / \left(1 + e^{4.810 - 1.287X_1 - 0.504X_2 - 0.500X_3 - 0.474X_4 - 0.405X_5 - 0.318X_6 - 0.316X_7 - 0.305X_8 - 0.278X_9 - 0.255X_{10} - 0.138X_{11}}\right)$. The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo severity grading system and logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
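
    The risk equation quoted in this abstract can be evaluated directly. The sketch below uses the intercept term and the eleven coefficients exactly as printed; the function name is hypothetical, and the abstract does not state which risk factor corresponds to which of X1 to X11 or how each is coded, so the inputs here are placeholders only.

```python
import math

# Values as printed in the abstract's equation
# p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11)).
INTERCEPT_TERM = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Return the predicted probability of postoperative complications.

    x: sequence of 11 predictor values X1..X11. Their mapping to clinical
       variables and their coding are NOT given in the abstract (assumption).
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values")
    exponent = INTERCEPT_TERM - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Placeholder inputs: all predictors at 0, then all predictors at 1.
print(round(predicted_risk([0] * 11), 4))
print(round(predicted_risk([1] * 11), 4))
```

    Because every coefficient enters the exponent with a negative sign, increasing any predictor raises the predicted risk under this equation.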

  7. Using ordinal logistic regression to evaluate the performance of laser-Doppler predictions of burn-healing time

    Directory of Open Access Journals (Sweden)

    Pape Sarah A

    2009-02-01

    Background: Laser-Doppler imaging (LDI) of cutaneous blood flow is beginning to be used by burn surgeons to predict the healing time of burn wounds; predicted healing time is used to determine wound treatment as either dressings or surgery. In this paper, we carry out a statistical analysis of the performance of the technique. Methods: We used data from a study carried out by five burn centers: LDI was done once between days 2 and 5 post burn, and healing was assessed at both 14 days and 21 days post burn. Random-effects ordinal logistic regression and other models such as the continuation ratio model were used to model healing time as a function of the LDI data, and of demographic and wound history variables. Statistical methods were also used to study the false-color palette, which enables the laser-Doppler imager to be used by clinicians as a decision-support tool. Results: Overall, diagnoses are over 90% correct. Related questions addressed were which blood flow summary statistic was best and whether, given the blood flow measurements, demographic and observational variables had any additional predictive power (age, sex, race, % total body surface area burned (%TBSA), site and cause of burn, day of LDI scan, burn center). It was found that mean laser-Doppler flux over a wound area was the best statistic, and that, given the same mean flux, women recover slightly more slowly than men. Further, the likely degradation in predictive performance on moving to a patient group with larger %TBSA than those in the data sample was studied, and shown to be small. Conclusion: Modeling healing time is a complex statistical problem, with random effects due to multiple burn areas per individual, and censoring caused by patients missing hospital visits and undergoing surgery. This analysis applies state-of-the-art statistical methods such as the bootstrap and permutation tests to a medical problem of topical interest.
New medical findings are

  8. Methodologies for the assessment of earthquake-triggered landslides hazard. A comparison of Logistic Regression and Artificial Neural Network models.

    Science.gov (United States)

    García-Rodríguez, M. J.; Malpica, J. A.; Benito, B.

    2009-04-01

    In recent years, interest in landslide hazard assessment studies has increased substantially. They are appropriate for evaluation and mitigation plan development in landslide-prone areas. There are several techniques available for landslide hazard research at a regional scale. Generally, they can be classified in two groups: qualitative and quantitative methods. Most qualitative methods tend to be subjective, since they depend on expert opinions and represent hazard levels in descriptive terms. On the other hand, quantitative methods are objective and they are commonly used due to the correlation between the instability factors and the location of the landslides. Within this group, statistical approaches and new heuristic techniques based on artificial intelligence (artificial neural network (ANN), fuzzy logic, etc.) provide rigorous analysis to assess landslide hazard over large regions. However, they depend on qualitative and quantitative data, scale, types of movements and characteristic factors used. We analysed and compared an approach for assessing earthquake-triggered landslide hazard using logistic regression (LR) and artificial neural networks (ANN) with a back-propagation learning algorithm. One application has been developed in El Salvador, a country of Central America where earthquake-triggered landslides are common phenomena. In a first phase, we analysed the susceptibility and hazard associated with the seismic scenario of the 13 January 2001 earthquake. We calibrated the models using data from the landslide inventory for this scenario. These analyses require input variables representing physical parameters that contribute to the initiation of slope instability, for example, slope gradient, elevation, aspect, mean annual precipitation, lithology, land use, and terrain roughness, while the occurrence or non-occurrence of landslides is considered as the dependent variable.
The results of the landslide susceptibility analysis are checked using landslide

  9. Performance comparison between Logistic regression, decision trees, and multilayer perceptron in predicting peripheral neuropathy in type 2 diabetes mellitus

    Institute of Scientific and Technical Information of China (English)

    LI Chang-ping; ZHI Xin-yue; MA Jun; CUI Zhuang; ZHU Zi-long; ZHANG Cui; HU Liang-ping

    2012-01-01

    Background Various methods can be applied to build predictive models for clinical data with a binary outcome variable. This research aims to explore the process of constructing common predictive models, logistic regression (LR), decision tree (DT) and multilayer perceptron (MLP), as well as to focus on specific details when applying the methods mentioned above: what preconditions should be satisfied, how to set parameters of the model, how to screen variables and build accurate models quickly and efficiently, and how to assess the generalization ability (that is, prediction performance) reliably by the Monte Carlo method in the case of small sample size. Methods All 274 patients (including 137 with type 2 diabetes mellitus with diabetic peripheral neuropathy and 137 with type 2 diabetes mellitus without diabetic peripheral neuropathy) from the Metabolic Disease Hospital in Tianjin participated in the study. There were 30 variables such as sex, age, glycosylated hemoglobin, etc. On account of the small sample size, the classification and regression tree (CART) and the chi-squared automatic interaction detector tree (CHAID) were combined by means of 100 times 5-7 fold stratified cross-validation to build the DT. The MLP was constructed using the Schwarz Bayes Criterion to choose the number of hidden layers and hidden layer units, along with the Levenberg-Marquardt (L-M) optimization algorithm, weight decay and a preliminary training method. Subsequently, LR was applied by the best subset method with the Akaike Information Criterion (AIC) to make the best use of information and avoid overfitting. Eventually, a 10 to 100 times 3-10 fold stratified cross-validation method was used to compare the generalization ability of DT, MLP and LR in view of the areas under the receiver operating characteristic (ROC) curves (AUC). Results The AUC of DT, MLP and LR were 0.8863, 0.8536 and 0.8802, respectively. As the larger the AUC of a specific prediction model is, the higher diagnostic ability presents, MLP performed optimally, and then

  10. "Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results"

    Directory of Open Access Journals (Sweden)

    Jason W. Osborne

    2012-06-01

    Full Text Available Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These outcomes represent important social science lines of research: retention in, or dropout from school, using illicit drugs, underage alcohol consumption, antisocial behavior, purchasing decisions, voting patterns, risky behavior, and so on. The goal of this paper is to briefly lead the reader through the surprisingly simple mathematics that underpins logistic regression: probabilities, odds, odds ratios, and logits. Anyone with spreadsheet software or a scientific calculator can follow along, and in turn, this knowledge can be used to make much more interesting, clear, and accurate presentations of results (especially to non-technical audiences). In particular, I will share an example of an interaction in logistic regression, how it was originally graphed, and how the graph was made substantially more user-friendly by converting the original metric (logits) to a more readily interpretable metric (probability) through three simple steps.
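
    The conversion from logits to probabilities mentioned in this record can be sketched in a few lines. This is a minimal illustration, assuming the three steps are (1) compute the predicted logit, (2) exponentiate it to obtain odds, and (3) divide the odds by one plus the odds; the abstract does not spell the steps out. The coefficients `b0` and `b1` are hypothetical values chosen for illustration, not results from the paper.

```python
import math


def predicted_logit(intercept, slope, x):
    """Step 1: linear predictor on the logit (log-odds) scale."""
    return intercept + slope * x


def logit_to_odds(logit):
    """Step 2: exponentiate the logit to get odds."""
    return math.exp(logit)


def odds_to_probability(odds):
    """Step 3: convert odds to a probability."""
    return odds / (1.0 + odds)


def logit_to_probability(logit):
    """All three conversions chained together."""
    return odds_to_probability(logit_to_odds(logit))


# Hypothetical coefficients, for illustration only.
b0, b1 = -1.0, 0.5
for x in (0, 2, 4):
    z = predicted_logit(b0, b1, x)
    print(f"x={x}: logit={z:.2f}, probability={logit_to_probability(z):.3f}")
```

    Plotting the probability column instead of the logit column is what makes a graph of this kind easier for a non-technical audience to read.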

  11. Patterns and trends in occupational attainment of first jobs in the Netherlands, 1930–1995: ordinary least squares regression versus conditional multinomial logistic regression

    NARCIS (Netherlands)

    Dessens, Jos A. G.; Jansen, Wim; Ganzeboom, Harry B. G.; Heijden, Peter G. M. van der

    2003-01-01

    This paper brings together the virtues of linear regression models for status attainment models formulated by second-generation social mobility researchers and the strengths of log-linear models formulated by third-generation researchers, into fourth-generation social mobility models, by using condi

  12. Hospital-level associations with 30-day patient mortality after cardiac surgery: a tutorial on the application and interpretation of marginal and multilevel logistic regression

    Directory of Open Access Journals (Sweden)

    Sanagou Masoumeh

    2012-03-01

    Full Text Available Abstract Background Marginal and multilevel logistic regression methods can estimate associations between hospital-level factors and patient-level 30-day mortality outcomes after cardiac surgery. However, it is not widely understood how the interpretation of hospital-level effects differs between these methods. Methods The Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) registry provided data on 32,354 patients undergoing cardiac surgery in 18 hospitals from 2001 to 2009. The logistic regression methods related 30-day mortality after surgery to hospital characteristics with concurrent adjustment for patient characteristics. Results Hospital-level mortality rates varied from 1.0% to 4.1% of patients. Ordinary, marginal and multilevel regression methods differed with regard to point estimates and conclusions on statistical significance for hospital-level risk factors, with ordinary logistic regression giving inappropriately narrow confidence intervals. The median odds ratio, MOR, from the multilevel model was 1.2 whereas ORs for most patient-level characteristics were of greater magnitude, suggesting that unexplained between-hospital variation was not as relevant as patient-level characteristics for understanding mortality rates. For hospital-level characteristics in the multilevel model, 80% interval ORs, IOR-80%, supplemented the usual ORs from the logistic regression. The IOR-80% was (0.8 to 1.8) for academic affiliation and (0.6 to 1.3) for the median annual number of cardiac surgery procedures. The width of these intervals reflected the unexplained variation between hospitals in mortality rates; the inclusion of one in each interval suggested an inability to add meaningfully to explaining variation in mortality rates. Conclusions Marginal and multilevel models take different approaches to account for correlation between patients within hospitals and they lead to different interpretations for hospital-level odds ratios.
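
    The MOR and IOR-80% quoted in this record can be related to the model parameters with a short sketch. It assumes the standard definitions for a random-intercept logistic model, in which the hospital-level intercepts are normally distributed with variance `var_hospital`; the abstract itself gives no formulas, and the variance and coefficient used below are hypothetical inputs, not values from the study.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution, mean 0 and sd 1


def median_odds_ratio(var_hospital):
    """MOR = exp(sqrt(2 * variance) * z_0.75) for a random-intercept model."""
    z75 = STD_NORMAL.inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * var_hospital) * z75)


def interval_odds_ratio_80(beta, var_hospital):
    """IOR-80%: exp(beta -/+ sqrt(2 * variance) * z_0.90)."""
    z90 = STD_NORMAL.inv_cdf(0.90)
    spread = math.sqrt(2.0 * var_hospital) * z90
    return math.exp(beta - spread), math.exp(beta + spread)


# Hypothetical inputs: a between-hospital variance and a log odds ratio.
var_hospital = 0.0365
beta = math.log(1.2)
print("MOR:", round(median_odds_ratio(var_hospital), 3))
print("IOR-80%:", tuple(round(v, 3) for v in interval_odds_ratio_80(beta, var_hospital)))
```

    Under these definitions a MOR of 1 corresponds to no between-hospital variation, and an IOR-80% that contains 1 indicates that the hospital-level effect is small relative to the unexplained variation between hospitals.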

  13. A Comparison Between the Empirical Logistic Regression Method and the Maximum Likelihood Estimation Method

    Institute of Scientific and Technical Information of China (English)

    张婷婷; 高金玲

    2014-01-01

    To address the difficulty of solving the iterative algorithm for maximum likelihood estimation in logistic regression, this paper examines a simpler estimator, empirical logistic regression, from both a theoretical and an applied perspective. The empirical logistic regression method and the maximum likelihood estimation method are analyzed in detail and compared. The results show that, when the sample size is very large, the empirical logistic regression method is more scientifically sound and practical than the maximum likelihood estimation method, and the two methods give consistent results for the same data. However, empirical logistic regression is simpler than maximum likelihood estimation, which is very important for practitioners.
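
    As a rough illustration of why an empirical logistic approach avoids iteration, the sketch below applies the empirical logistic transform to grouped data and fits a straight line by weighted least squares in closed form. It is not taken from the paper: the 0.5 continuity correction, the variance approximation used for the weights, and the grouped data are all assumptions made for illustration.

```python
import math


def empirical_logit(successes, trials):
    """Empirical logistic transform with a 0.5 continuity correction."""
    return math.log((successes + 0.5) / (trials - successes + 0.5))


def empirical_logit_variance(successes, trials):
    """Approximate variance of the empirical logit (assumed form)."""
    return 1.0 / (successes + 0.5) + 1.0 / (trials - successes + 0.5)


def weighted_line_fit(x, z, w):
    """Closed-form weighted least squares for z = intercept + slope * x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    zbar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxz = sum(wi * (xi - xbar) * (zi - zbar) for wi, xi, zi in zip(w, x, z))
    slope = sxz / sxx
    return zbar - slope * xbar, slope


# Hypothetical grouped data: events out of trials at each predictor level.
levels = [0.0, 1.0, 2.0, 3.0]
trials = [50, 50, 50, 50]
events = [5, 12, 27, 41]

z = [empirical_logit(y, n) for y, n in zip(events, trials)]
w = [1.0 / empirical_logit_variance(y, n) for y, n in zip(events, trials)]
intercept, slope = weighted_line_fit(levels, z, w)
print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

    No iterative optimisation is needed because the fit reduces to a weighted linear regression, which is one reading of the simplicity the abstract describes.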

  14. Comprehensive Logistics

    CERN Document Server

    Gudehus, Timm

    2012-01-01

    Modern logistics comprises operative logistics, analytical logistics and management of logistic networks. Central task of operative logistics is the efficient supply of required goods at the right place within the right time. Tasks of analytical logistics are designing optimal networks and systems, developing strategies for planning, scheduling and operation, and organizing efficient order and performance processes. Logistic management plans, implements and operates logistic networks and schedules orders, stocks and resources. This reference-book offers a unique survey of modern logistics. It contains proven strategies, rules and tools for the solution of a multitude of logistic problems. The analytically derived algorithms and formulas can be used for the computer-based planning of logistic systems and for the dynamic scheduling of orders and resources in supply networks. They enable significant improvements of performance, quality and costs. Their application is demonstrated by several examples from industr...

  15. Exploring improvements in patient logistics in Dutch hospitals with a survey

    Science.gov (United States)

    2012-01-01

    Background Research showed that promising approaches such as benchmarking, operations research, lean management and six sigma, could be adopted to improve patient logistics in healthcare. To our knowledge, little research has been conducted to obtain an overview on the use, combination and effects of approaches to improve patient logistics in hospitals. We therefore examined the approaches and tools used to improve patient logistics in Dutch hospitals, the reported effects of these approaches on performance, the applied support structure and the methods used to evaluate the effects. Methods A survey among experts on patient logistics in 94 Dutch hospitals. The survey data were analysed using cross tables. Results Forty-eight percent of all hospitals participated. Ninety-eight percent reported to have used multiple approaches, 39% of them used five or more approaches. Care pathways were the preferred approach by 43% of the hospitals, followed by business process re-engineering and lean six sigma (both 13%). Flowcharts were the most commonly used tool, they were used on a regular basis by 94% of the hospitals. Less than 10% of the hospitals used data envelopment analysis and critical path analysis on a regular basis. Most hospitals (68%) relied on external support for process analyses and education on patient logistics, only 24% had permanent internal training programs on patient logistics. Approximately 50% of the hospitals that evaluated the effects of approaches on efficiency, throughput times and financial results, reported that they had accomplished their goals. Goal accomplishment in general hospitals ranged from 63% to 67%, in academic teaching hospitals from 0% to 50%, and in teaching hospitals from 25% to 44%. More than 86% performed an evaluation, 53% performed a post-intervention measurement. 
Conclusions Patient logistics appeared to be a rather new subject as most hospitals had not selected a single approach, they relied on external support and they did

  16. Exploring improvements in patient logistics in Dutch hospitals with a survey

    Directory of Open Access Journals (Sweden)

    van Lent Wineke AM

    2012-08-01

    Full Text Available Abstract Background Research showed that promising approaches such as benchmarking, operations research, lean management and six sigma, could be adopted to improve patient logistics in healthcare. To our knowledge, little research has been conducted to obtain an overview on the use, combination and effects of approaches to improve patient logistics in hospitals. We therefore examined the approaches and tools used to improve patient logistics in Dutch hospitals, the reported effects of these approaches on performance, the applied support structure and the methods used to evaluate the effects. Methods A survey among experts on patient logistics in 94 Dutch hospitals. The survey data were analysed using cross tables. Results Forty-eight percent of all hospitals participated. Ninety-eight percent reported to have used multiple approaches, 39% of them used five or more approaches. Care pathways were the preferred approach by 43% of the hospitals, followed by business process re-engineering and lean six sigma (both 13%). Flowcharts were the most commonly used tool, they were used on a regular basis by 94% of the hospitals. Less than 10% of the hospitals used data envelopment analysis and critical path analysis on a regular basis. Most hospitals (68%) relied on external support for process analyses and education on patient logistics, only 24% had permanent internal training programs on patient logistics. Approximately 50% of the hospitals that evaluated the effects of approaches on efficiency, throughput times and financial results, reported that they had accomplished their goals. Goal accomplishment in general hospitals ranged from 63% to 67%, in academic teaching hospitals from 0% to 50%, and in teaching hospitals from 25% to 44%. More than 86% performed an evaluation, 53% performed a post-intervention measurement. Conclusions Patient logistics appeared to be a rather new subject as most hospitals had not selected a single approach, they relied on

  17. Relative accuracy of spatial predictive models for lynx Lynx canadensis derived using logistic regression-AIC, multiple criteria evaluation and Bayesian approaches

    Institute of Scientific and Technical Information of China (English)

    Hejun KANG; Shelley M.ALEXANDER

    2009-01-01

    We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike's Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans.

  18. Relative accuracy of spatial predictive models for lynx Lynx canadensis derived using logistic regression-AIC, multiple criteria evaluation and Bayesian approaches

    Directory of Open Access Journals (Sweden)

    Shelley M. ALEXANDER

    2009-02-01

    Full Text Available We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike’s Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans [Current Zoology 55(1): 28–40, 2009].
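
    The three error measures named in this record can be computed from a presence/absence contingency table as sketched below. The denominators are an assumption (omission error as a share of observed presences, commission error as a share of observed absences), since conventions vary and the abstract does not define them; the counts are hypothetical and are not the study's data.

```python
def contingency_measures(true_pos, false_neg, false_pos, true_neg):
    """Overall accuracy, omission error and commission error.

    Assumed definitions:
      overall accuracy = correctly classified points / all points
      omission error   = missed presences / observed presences
      commission error = false presences / observed absences
    """
    total = true_pos + false_neg + false_pos + true_neg
    return {
        "overall_accuracy": (true_pos + true_neg) / total,
        "omission_error": false_neg / (true_pos + false_neg),
        "commission_error": false_pos / (false_pos + true_neg),
    }


# Hypothetical counts for illustration only.
measures = contingency_measures(true_pos=40, false_neg=10, false_pos=5, true_neg=45)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

    Reporting all three numbers side by side shows how a model with the best overall accuracy can still rank differently on omission or commission error.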

  19. A Mathematical Tool for Inference in Logistic Regression with Small-Sized Data Sets: A Practical Application on ISW-Ridge Relationships

    Directory of Open Access Journals (Sweden)

    Cheng-Wu Chen

    2008-11-01

    Full Text Available The general approach to modeling binary data for the purpose of estimating the propagation of an internal solitary wave (ISW) is based on the maximum likelihood estimate (MLE) method. In cases where the number of observations in the data is small, any inferences made based on the asymptotic distribution of changes in the deviance may be unreliable for binary data (the model's lack of fit is described in terms of a quantity known as the deviance; the deviance for binary data is given by D. Collett (2003)). Logistic regression shows that the P-values for the likelihood ratio test and the score test are both <0.05. However, the null hypothesis is not rejected in the Wald test. The seeming discrepancies in P-values obtained between the Wald test and the other two tests are a sign that the large-sample approximation is not stable. We find that the parameters and the odds ratio estimates obtained via conditional exact logistic regression are different from those obtained via unconditional asymptotic logistic regression. Using exact results is a good idea when the sample size is small and the approximate P-values are <0.10. Thus in this study exact analysis is more appropriate.

  20. Comparing machine learning and logistic regression methods for predicting hypertension using a combination of gene expression and next-generation sequencing data.

    Science.gov (United States)

    Held, Elizabeth; Cape, Joshua; Tintle, Nathan

    2016-01-01

    Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.

  1. The adaptive Lasso for Logistic regression models

    Institute of Scientific and Technical Information of China (English)

    王娉; 郭鹏江; 夏志明

    2012-01-01

    Aim To estimate the parameters in the logistic model. Methods Adaptive weights are used in the L1 penalty, which gives the adaptive Lasso. Results The adaptive Lasso selects variables and estimates parameters simultaneously for the logistic model. Conclusion Under certain regularity conditions, the adaptive Lasso estimator for the logistic model enjoys the oracle properties.
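
    For orientation, the adaptive Lasso criterion for a logistic model is usually written in the form below. The abstract gives no formulas, so the notation is an assumption: $\tilde{\beta}$ denotes an initial consistent estimate (for example the unpenalised maximum likelihood estimate), $\gamma > 0$ is a fixed exponent, and $\lambda_n$ is the tuning parameter.

```latex
\hat{\beta}
  = \arg\min_{\beta}
    \left\{
      -\sum_{i=1}^{n}
        \left[ y_i \, x_i^{\top}\beta - \log\!\left(1 + e^{x_i^{\top}\beta}\right) \right]
      + \lambda_n \sum_{j=1}^{p} \hat{w}_j \,\lvert \beta_j \rvert
    \right\},
\qquad
\hat{w}_j = \frac{1}{\lvert \tilde{\beta}_j \rvert^{\gamma}} .
```

    The first term is the negative logistic log-likelihood. Coefficients whose initial estimates are small receive large weights and are penalised more heavily, which is how variable selection and estimation can happen in one step.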

  2. Using Historical Data and Quasi-Likelihood Logistic Regression Modeling to Test Spatial Patterns of Channel Response to Peak Flows in a Mountain Watershed

    Science.gov (United States)

    Faustini, J. M.; Jones, J. A.

    2001-12-01

    This study used an empirical modeling approach to explore landscape controls on spatial variations in reach-scale channel response to peak flows in a mountain watershed. We used historical cross-section surveys spanning 20 years at five sites on 2nd to 5th-order channels and stream gaging records spanning up to 50 years. We related the observed proportion of cross-sections at a site exhibiting detectable change between consecutive surveys to the recurrence interval of the largest peak flow during the corresponding period using a quasi-likelihood logistic regression model. Stream channel response was linearly related to flood size or return period through the logit function, but the shape of the response function varied according to basin size, bed material, and the presence or absence of large wood. At the watershed scale, we hypothesized that the spatial scale and frequency of channel adjustment should increase in the downstream direction as sediment supply increases relative to transport capacity, resulting in more transportable sediment in the channel and hence increased bed mobility. Consistent with this hypothesis, cross sections from the 4th and 5th-order main stem channels exhibit more frequent detectable changes than those at two steep third-order tributary sites. Peak flows able to mobilize bed material sufficiently to cause detectable changes in 50% of cross-section profiles had an estimated recurrence interval of 3 years for the 4th and 5th-order channels and 4 to 6 years for the 3rd-order sites. This difference increased for larger magnitude channel changes; peak flows with recurrence intervals of about 7 years produced changes in 90% of cross sections at the main stem sites, but flows able to produce the same level of response at tributary sites were three times less frequent. At finer scales, this trend of increasing bed mobility in the downstream direction is modified by variations in the degree of channel confinement by bedrock and landforms, the

  3. Meteorological Factor Analysis of Freezing Injury to Overwintering Tea Based on Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    段永春

    2015-01-01

    Thirty-one meteorological factors that might lead to heavy freezing injury to tea trees during overwintering were chosen as independent variables from 45 years of meteorological data in three major tea-producing areas of Shandong Province (Rizhao, Qingdao and Linyi). Whether or not heavy freezing injury occurred during overwintering was taken as the dependent variable. Single-factor logistic regression analysis was conducted, and 9 meteorological factors with statistical significance were chosen for multivariate logistic regression analysis. A logistic regression model for the occurrence of heavy freezing injury to overwintering tea trees was then established and evaluated. The results showed that the average air temperature in January, the average air temperature in July of the previous year, rainfall in November of the previous year, the average air temperature in November of the previous year, and relative air humidity in February determined the occurrence of heavy freezing injury to overwintering tea trees, with the average air temperature in January being the main factor.

  4. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics.

    Science.gov (United States)

    Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications.

  5. Comparing Machine Learning Classifiers and Linear/Logistic Regression to Explore the Relationship between Hand Dimensions and Demographic Characteristics

    Science.gov (United States)

    2016-01-01

    Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075

  6. Fifth annual state of logistics survey for South Africa: logistics value and cost drivers from a macro and micro-economic perspective

    CSIR Research Space (South Africa)

    Ittmann, HW

    2008-01-01

    Full Text Available in vehicle damage and costs, vehicle operating costs, pavement damage and costs, damage to transported cargo, environmental damage and costs, and increases in congestion and decreases in safety. A limited case study indicated that trucks travelling... STATE OF LOGISTICS SURVEY FOR SOUTH AFRICA 2008 BULK MINING Most bulk mining in South Africa is transported by rail, mostly on world-class ‘export machines’, i.e. the coal line between Mpumalanga...

  7. Tropical Forest Fire Susceptibility Mapping at the Cat Ba National Park Area, Hai Phong City, Vietnam, Using GIS-Based Kernel Logistic Regression

    Directory of Open Access Journals (Sweden)

    Dieu Tien Bui

    2016-04-01

    Full Text Available The Cat Ba National Park area (Vietnam) with its tropical forest is recognized as being part of the world biodiversity conservation by the United Nations Educational, Scientific and Cultural Organization (UNESCO) and is a well-known destination for tourists, with around 500,000 travelers per year. This area has been the site for many research projects; however, no project has been carried out for forest fire susceptibility assessment. Thus, protection of the forest including fire prevention is one of the main concerns of the local authorities. This work aims to produce a tropical forest fire susceptibility map for the Cat Ba National Park area, which may be helpful for the local authorities in forest fire protection management. To achieve this purpose, first, historical forest fires and related factors were collected from various sources to construct a GIS database. Then, a forest fire susceptibility model was developed using Kernel logistic regression. The quality of the model was assessed using the Receiver Operating Characteristic (ROC) curve, area under the ROC curve (AUC), and five statistical evaluation measures. The usability of the resulting model is further compared with a benchmark model, the support vector machine (SVM). The results show that the Kernel logistic regression model has a high level of performance in both the training and validation dataset, with a prediction capability of 92.2%. Since the Kernel logistic regression model outperforms the benchmark model, we conclude that the proposed model is a promising alternative tool that should also be considered for forest fire susceptibility mapping in other areas. The results of this study are useful for the local authorities in forest planning and management.
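
    The area under the ROC curve used here to assess model quality can be computed without plotting the curve, as in the sketch below. It relies on the known equivalence between the AUC and the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties counted as one half). The scores and labels are hypothetical; this is not the evaluation code of the study.

```python
def auc(scores, labels):
    """AUC as the share of positive/negative pairs ranked correctly.

    scores: model outputs, higher meaning more likely positive
    labels: 1 for positive (e.g. fire), 0 for negative (no fire)
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # pair ranked correctly
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for four locations.
print(auc([0.9, 0.3, 0.8, 0.1], [1, 1, 0, 0]))
```

    The pairwise loop is quadratic in the number of cases, which is fine for a small validation set; a rank-based formula would be the usual choice for larger data.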

  8. Discriminating between adaptive and carcinogenic liver hypertrophy in rat studies using logistic ridge regression analysis of toxicogenomic data: The mode of action and predictive models.

    Science.gov (United States)

    Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu; Yoshinari, Kouichi; Honda, Hiroshi

    2017-03-01

    Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment. Copyright © 2017 Elsevier Inc. All rights reserved.

  9. Evaluating risk factors for endemic human Salmonella Enteritidis infections with different phage types in Ontario, Canada using multinomial logistic regression and a case-case study approach

    Directory of Open Access Journals (Sweden)

    Varga Csaba

    2012-10-01

    Background: Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods: Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results: In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions: Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors.
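
    The odds ratios and confidence intervals quoted above are exponentiated regression coefficients. The sketch below shows that relationship and back-calculates an approximate standard error from the reported interval for contact with dogs. It assumes a Wald-type interval that is symmetric on the log scale with z = 1.96; the abstract does not state how the interval was computed, and the function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=1.96):
    """Approximate standard error of the coefficient implied by a reported CI
    (assumes the CI is symmetric on the log scale)."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Reported in the abstract: OR = 2.17, 95% CI 1.01-4.68 (contact with dogs).
se_dogs = se_from_ci(1.01, 4.68)
or_dogs, lo_dogs, hi_dogs = odds_ratio_ci(math.log(2.17), se_dogs)
print(f"SE ~ {se_dogs:.3f}; OR = {or_dogs:.2f}, CI {lo_dogs:.2f}-{hi_dogs:.2f}")
```

    The reconstructed limits agree with the reported ones to within rounding, which is consistent with (but does not prove) a Wald interval.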

  10. Extraction of potential debris source areas by logistic regression technique: a case study from Barla, Besparmak and Kapi mountains (NW Taurids, Turkey)

    Science.gov (United States)

    Tunusluoglu, M. C.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.

    2008-03-01

    Debris flow is one of the most destructive mass movements. Regional debris flow susceptibility or hazard assessments can sometimes be more difficult than those for other mass movements. Determination of debris accumulation zones and debris source areas, which is one of the most crucial stages in debris flow investigations, can be very difficult because of morphological restrictions. The main goal of the present study is to extract debris source areas by logistic regression analyses based on data from the slopes of the Barla, Besparmak and Kapi Mountains in the SW part of the Taurids Mountain belt of Turkey, where formation of debris material is clearly evident and common. To achieve this goal, extensive field observations to identify the areal extent of debris source areas and debris material, air-photo studies to determine the debris source areas, and desk studies including Geographical Information System (GIS) applications and statistical assessments were performed. To ensure that the training data used in the logistic regression analyses were representative, a random sampling procedure was applied. Using the results of the logistic regression analysis, the debris source area probability map of the region was produced. However, according to the field experience of the authors, the produced map yielded over-predicted results. On the basis of the field observations, the main source of the over-prediction is the structural relation between the bedding planes and slope aspects: for the generation of debris, the dip of the bedding planes relative to the slope face must be taken into consideration. To eliminate this problem, an approach has been developed using the probability distribution of the aspect values. With the application of this structural adjustment, the final adjusted debris source area probability map is obtained for the study area. The field observations revealed that the actual debris source areas in the field coincide with

  11. Construction of hazard maps of Hantavirus contagion using Remote Sensing, logistic regression and Artificial Neural Networks: case Araucanía Region, Chile

    CERN Document Server

    Alvarez, G; Salinas, R

    2016-01-01

    In this research, methods and computational results based on statistical analysis, mathematical modelling and in situ data collection are presented, with the aim of producing a hazard map of Hantavirus infection in the Araucanía Region, Chile. The development of this work involves several elements such as Landsat satellite images, biological information regarding seropositivity of Hantavirus and information concerning positive cases of infection detected in the region. All this information has been processed to find a function that models the danger of contagion in the region, through logistic regression analysis and Artificial Neural Networks.

  12. [Multiple imputation and complete case analysis in logistic regression models: a practical assessment of the impact of incomplete covariate data].

    Science.gov (United States)

    Camargos, Vitor Passos; César, Cibele Comini; Caiaffa, Waleska Teixeira; Xavier, Cesar Coelho; Proietti, Fernando Augusto

    2011-12-01

    Researchers in the health field often deal with the problem of incomplete databases. Complete Case Analysis (CCA), which restricts the analysis to subjects with complete data, reduces the sample size and may result in biased estimates. Based on statistical grounds, Multiple Imputation (MI) uses all collected data and is recommended as an alternative to CCA. Data from the study Saúde em Beagá, attended by 4,048 adults from two of nine health districts in the city of Belo Horizonte, Minas Gerais State, Brazil, in 2008-2009, were used to evaluate CCA and different MI approaches in the context of logistic models with incomplete covariate data. Peculiarities in some variables in this study allowed analyzing a situation in which the missing covariate data are recovered and thus the results before and after recovery are compared. Based on the analysis, even the more simplistic MI approach performed better than CCA, since it was closer to the post-recovery results.

  13. Delimitation of areas for planting eucalyptus trees using logistic regressions

    Directory of Open Access Journals (Sweden)

    Rodrigo Teske

    2012-07-01

    Effective usable area is a key parameter in land acquisition and afforestation planning. The purpose of this research was to generate predictive maps of areas suitable for planting eucalyptus trees using binary logistic regressions and geomorphometric variables. The relationships between the predicting variables and suitable areas for planting eucalyptus trees were modeled, and the variable that best explained the occurrence of suitable lands was distance from rivers. The generated map showing areas suitable for planting had a high ability to reproduce the original planting map. Logistic regressions demonstrated the feasibility of using this approach to map suitability for eucalyptus forestation.

  14. Determinants of the probability of adopting quality protein maize (QPM) technology in Tanzania: A logistic regression analysis

    Directory of Open Access Journals (Sweden)

    Gregory, T.

    2013-06-01

    Adoption of technology is an important factor in economic development. The thrust of this study was to establish the factors affecting adoption of QPM technology in the Northern zone of Tanzania. Primary data were collected from a random sample of 120 smallholder maize farmers in four villages. The data were analysed using descriptive and quantitative methods. A logit model was used to determine factors that influence adoption of QPM technology. The regression results indicated that education of the household head, farmers' participation in demonstration trials, attendance at field days, and number of livestock owned positively influenced the rate of adoption of the technology. Access to credit and farmers' perception of poor QPM marketing negatively influenced the rate of adoption. The study recommended that the government ensure an efficient input-output linkage for QPM production.

  15. Driver injury severity outcome analysis in rural interstate highway crashes: a two-level Bayesian logistic regression interpretation.

    Science.gov (United States)

    Chen, Cong; Zhang, Guohui; Liu, Xiaoyue Cathy; Ci, Yusheng; Huang, Helai; Ma, Jianming; Chen, Yanyan; Guan, Hongzhi

    2016-12-01

    There is a high potential of severe injury outcomes in traffic crashes on rural interstate highways due to the significant amount of high speed traffic on these corridors. Hierarchical Bayesian models are capable of incorporating between-crash variance and within-crash correlations into traffic crash data analysis and are increasingly utilized in traffic crash severity analysis. This paper applies a hierarchical Bayesian logistic model to examine the significant factors at crash and vehicle/driver levels and their heterogeneous impacts on driver injury severity in rural interstate highway crashes. Analysis results indicate that the majority of the total variance is induced by the between-crash variance, showing the appropriateness of the utilized hierarchical modeling approach. Three crash-level variables and six vehicle/driver-level variables are found significant in predicting driver injury severities: road curve, maximum vehicle damage in a crash, number of vehicles in a crash, wet road surface, vehicle type, driver age, driver gender, driver seatbelt use and driver alcohol or drug involvement. Among these variables, road curve, functional and disabled vehicle damage in crash, single-vehicle crashes, female drivers, senior drivers, motorcycles and driver alcohol or drug involvement tend to increase the odds of drivers being incapacitated or killed in rural interstate crashes, while wet road surface, male drivers and driver seatbelt use are more likely to decrease the probability of severe driver injuries. The developed methodology and estimation results provide insightful understanding of the internal mechanism of rural interstate crashes and beneficial references for developing effective countermeasures for rural interstate crash prevention.

  16. Explaining marital patterns and trends in Namibia: a regression analysis of 1992, 2000 and 2006 demographic and survey data.

    Directory of Open Access Journals (Sweden)

    Lillian Pazvakawambwa

    BACKGROUND: Marriage is a significant event in the life course of individuals, and creates a system that characterizes societal and economic structures. Marital patterns and dynamics have changed considerably over the years, with decreasing proportions of marriage and increased levels of divorce and cohabitation in developing countries. Although such changes have been reported in African societies including Namibia, they have largely remained unexplained. OBJECTIVES AND METHODS: In this paper, we examined trends and patterns of marital status of women of marriageable age (15 to 49 years) in Namibia using the 1992, 2000 and 2006 Demographic and Health Survey (DHS) data. Trends were established for selected demographic variables. Two binary logistic regression models, for ever-married versus never-married and for cohabitation versus married, were fitted to establish factors associated with such nuptial systems. Further, multinomial logistic regression models, adjusted for bio-demographic and socio-economic variables, were fitted separately for each year to establish determinants of type of union (never married, married and cohabitation). RESULTS AND CONCLUSIONS: Findings indicate a general change away from marriage, with a shift in singulate mean age at marriage. Cohabitation was prevalent among those less than 30 years of age; its odds were higher in urban areas and have increased since 1992. Be that as it may, marriage remained a persistent nuptiality pattern, common among the less educated and employed, but with lower odds in urban areas. Results from the multinomial model suggest that marital status was associated with age at marriage, total children born, region, place of residence, education level and religion. We conclude that marital patterns have undergone significant transformation over the past two decades in Namibia, with the traditional marriage framework coexisting with cohabitation, and a sizeable proportion remaining unmarried to the late 30s. A shift in the

  17. A case study using support vector machines, neural networks and logistic regression in a GIS to identify wells contaminated with nitrate-N

    Science.gov (United States)

    Dixon, Barnali

    2009-09-01

    Accurate and inexpensive identification of potentially contaminated wells is critical for water resources protection and management. The objectives of this study are to 1) assess the suitability of approximation tools such as neural networks (NN) and support vector machines (SVM) integrated in a geographic information system (GIS) for identifying contaminated wells and 2) use logistic regression and feature selection methods to identify significant variables for transporting contaminants in and through the soil profile to the groundwater. Fourteen GIS derived soil hydrogeologic and landuse parameters were used as initial inputs in this study. Well water quality data (nitrate-N) from 6,917 wells provided by Florida Department of Environmental Protection (USA) were used as an output target class. The use of the logistic regression and feature selection methods reduced the number of input variables to nine. Receiver operating characteristics (ROC) curves were used for evaluation of these approximation tools. Results showed superior performance with the NN as compared to SVM especially on training data while testing results were comparable. Feature selection did not improve accuracy; however, it helped increase the sensitivity or true positive rate (TPR). Thus, a higher TPR was obtainable with fewer variables.

  18. Understanding data in clinical research: a simple graphical display for plotting data (up to four independent variables) after binary logistic regression analysis.

    Science.gov (United States)

    Mesa, José Luis

    2004-01-01

    In clinical research, suitable visualization techniques of data after statistical analysis are crucial for researchers' and physicians' understanding. Logistic regression models are common statistical techniques for analyzing data in clinical research. Among these, the application of binary logistic regression analysis (LRA) has greatly increased during past years, due to its diagnostic accuracy and because scientists often want to analyze in a dichotomous way whether some event will occur or not. Such an analysis lacks a suitable, understandable, and widely used graphical display, instead providing an understandable logit function based on a linear model for the natural logarithm of the odds in favor of the occurrence of the dependent variable, Y. By simple exponential transformation, such a logit equation can be transformed into a logistic function, resulting in predicted probabilities for the presence of the dependent variable, P(Y=1|X). This model can be used to generate a simple graphical display for binary LRA. For the case of a single predictor or explanatory (independent) variable, X, a plot can be generated with X represented by the abscissa (i.e., horizontal axis) and P(Y=1|X) represented by the ordinate (i.e., vertical axis). For the case of multiple predictor models, I propose here a relief 3D surface graphic in order to plot up to four independent variables (two continuous and two discrete). By using this technique, any researcher or physician would be able to transform a less understandable logit function into a figure that is easier to grasp, thus leading to better knowledge and interpretation of data in clinical research. For this, a sophisticated statistical package is not necessary, because the graphical display may be generated by using any 2D or 3D surface plotter.
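
    The transformation described above, from a linear logit to a predicted probability, can be sketched in a few lines. This is an illustration only: the intercept and slope are hypothetical values chosen for the example, not coefficients from the paper, and the resulting (X, probability) pairs are what a 2D plotter would draw for the single-predictor case.

```python
import math


def predicted_probability(intercept, coefs, x):
    """Logistic function applied to the linear predictor (the logit)."""
    logit = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical single-predictor model: logit = -3.0 + 0.5 * X
b0, b1 = -3.0, 0.5
curve = [(x, predicted_probability(b0, [b1], [x])) for x in range(0, 13, 2)]

for x, p in curve:
    print(f"X = {x:2d}  predicted probability = {p:.3f}")
```

    With two continuous predictors the same function evaluated over a grid gives the surface for a 3D plot, and each combination of the discrete predictors would yield a separate surface.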

  19. EXAMINING THE PROFITABILITY OF TURKISH COMMERCIAL BANKS WITH THE LOGISTIC REGRESSION ANALYSIS IN CRISIS YEARS 1999, 2000, 2001 AND 2008

    Directory of Open Access Journals (Sweden)

    Açelya TELLİ

    2016-01-01

    The financial crises that occurred, especially after 1980, damaged the Turkish banking system in particular and then the whole financial system, and prompted serious study by researchers. In this vein, the aim of this study is to examine the profitability performance of commercial banks in the Turkish banking system in the financial crisis years 1999, 2000, 2001 and 2008. For this purpose, three profitability ratios published by the Banks Association of Turkey were taken as dependent variables, four financial ratio groups were taken as independent variables, and a regression model was set up for each year. The results show that the income and expense ratio group stood out for the dependent variables profitability of assets and profitability of equity in the analyzed years. For the dependent variable net interest margin, no ratio group stood out in 1999, 2000 and 2008, but in 2001 the income and expense ratio group and the capital adequacy ratio group stood out.

  20. Recursive and non-linear logistic regression: moving on from the original EuroSCORE and EuroSCORE II methodologies.

    Science.gov (United States)

    Poullis, Michael

    2014-11-01

    EuroSCORE II, despite improving on the original EuroSCORE system, has not solved all the calibration and predictability issues. Recursive, non-linear and mixed recursive and non-linear regression analysis were assessed with regard to sensitivity, specificity and predictability of the original EuroSCORE and EuroSCORE II systems. The original logistic EuroSCORE, EuroSCORE II and recursive, non-linear and mixed recursive and non-linear regression analyses of these risk models were assessed via receiver operator characteristic curves (ROC) and Hosmer-Lemeshow statistic analysis with regard to the accuracy of predicting in-hospital mortality. Analysis was performed for isolated coronary artery bypass grafts (CABGs) (n = 2913), aortic valve replacement (AVR) (n = 814), mitral valve surgery (n = 340), combined AVR and CABG (n = 517), aortic (n = 350), miscellaneous cases (n = 642), and combinations of the above cases (n = 5576). The original EuroSCORE had an ROC below 0.7 for isolated AVR and combined AVR and CABG. None of the methods described increased the ROC above 0.7. The EuroSCORE II risk model had an ROC below 0.7 for isolated AVR only. Recursive regression, non-linear regression, and mixed recursive and non-linear regression all increased the ROC above 0.7 for isolated AVR. The original EuroSCORE had a Hosmer-Lemeshow statistic that was above 0.05 for all patients and the subgroups analysed. All of the techniques markedly increased the Hosmer-Lemeshow statistic. The EuroSCORE II risk model had a Hosmer-Lemeshow statistic that was significant for all patients (P linear regression failed to improve on the original Hosmer-Lemeshow statistic. The mixed recursive and non-linear regression using the EuroSCORE II risk model was the only model that produced an ROC of 0.7 or above for all patients and procedures and had a Hosmer-Lemeshow statistic that was highly non-significant. The original EuroSCORE and the EuroSCORE II risk models do not have adequate ROC and Hosmer

  1. Logistic regression analysis of clinical and ultrasonic features of breast nodules

    Institute of Scientific and Technical Information of China (English)

    张秀梅; 邵玉红; 熊霞; 万远廉

    2011-01-01

    Objective: To create a breast nodule estimation model based on grayscale and color Doppler ultrasonography using logistic regression that can screen out the specific features for distinguishing malignant from benign breast nodules. Methods: From July 2009 to May 2010, 217 patients were enrolled in the study at Peking University First Hospital. Clinical data and ultrasonic features were evaluated in 219 breast nodules (133 malignant, 86 benign) of 217 patients confirmed by surgical pathology. Clinical and ultrasonic features that differed significantly between the malignant and benign groups in univariate comparison were entered into a multivariable binary logistic regression analysis to screen out significant indexes for differentiating malignant from benign nodules. A receiver operating characteristic (ROC) curve was made to assess the diagnostic value of the logistic regression model. Correlation was analyzed between the logistic regression model and surgical pathology. Results: Logistic regression model: Logit(p) = -16.884 + 0.037 × age + 3.228 × longitudinal-transverse axis ratio + 1.412 × border + 2.663 × halo + 1.813 × microcalcification + 1.157 × resistance index + 2.204 × enlarged axillary lymph node (χ² = 167.107, P = 0.000). The areas under the ROC curve for probability and for identification of malignant and benign breast nodules were 0.948 and 0.882, respectively. Diagnostic sensitivity, specificity and accuracy were 91.6%, 84.9% and 88.9%. The logistic regression model correlated positively with surgical pathology (r = 0.768, P = 0.000). Conclusion: Our logistic regression model can effectively differentiate malignant breast nodules from benign ones and can identify the ultrasonic features associated with breast cancer.
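
    Each coefficient in the reported logit equation can be read as a log odds ratio: exponentiating it gives the multiplicative change in the odds of malignancy per one-unit increase in that predictor, holding the others fixed. The sketch below does only that conversion. The dictionary keys are shorthand introduced here, and because the abstract does not say how each predictor was coded or scaled, what "one unit" means for each feature remains an open assumption.

```python
import math

# Coefficients as reported in the abstract; key names are illustrative shorthand.
COEFFICIENTS = {
    "age": 0.037,
    "longitudinal_transverse_axis_ratio": 3.228,
    "border": 1.412,
    "halo": 2.663,
    "microcalcification": 1.813,
    "resistance_index": 1.157,
    "enlarged_axillary_lymph_node": 2.204,
}

# Odds ratio per one-unit increase in each predictor: OR = exp(coefficient).
ODDS_RATIOS = {name: math.exp(beta) for name, beta in COEFFICIENTS.items()}

for name, value in sorted(ODDS_RATIOS.items(), key=lambda kv: -kv[1]):
    print(f"{name:36s} OR per unit = {value:.2f}")
```

    Predicted probabilities for individual nodules are deliberately not computed here, since they would depend on the unreported coding of the predictors.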

  2. Weather Model Level of Forest Fire Danger Based on Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    张伟; 王峰; 郭艳芬; 郑煜

    2013-01-01

    Using the fire records and meteorological data of the Daxing'an Mountain Area Forestry Bureau in Heilongjiang Province from 1975 to 2004, a forest fire danger weather level model was established by logistic regression with the best ratios and tested against forest fire data. The model performs well in application and can provide a reference for the local forestry department when formulating fire prevention strategies.

  3. The comparison of landslide ratio-based and general logistic regression landslide susceptibility models in the Chishan watershed after 2009 Typhoon Morakot

    Science.gov (United States)

    WU, Chunhung

    2015-04-01

    The research built the original logistic regression landslide susceptibility model (abbreviated as or-LRLSM) and the landslide ratio-based logistic regression landslide susceptibility model (abbreviated as lr-LRLSM), compared their performance, and explained the error sources of the two models. The research assumes that the performance of the logistic regression model can be better if the distribution of landslide ratio and the weighted value of each variable are similar. Landslide ratio is the ratio of landslide area to total area in a specific area and is a useful index for evaluating the seriousness of landslide disasters in Taiwan. The research adopted the landslide inventory induced by 2009 Typhoon Morakot in the Chishan watershed, which was the most serious disaster event of the last decade in Taiwan. The research adopted a 20 m grid as the basic unit in building the LRLSM, and six variables, including elevation, slope, aspect, geological formation, accumulated rainfall, and bank erosion, were included in the two models. In building the or-LRLSM, the six variables were divided into continuous variables (elevation, slope, and accumulated rainfall) and categorical variables (aspect, geological formation and bank erosion), while in building the lr-LRLSM all variables were classified based on landslide ratio and treated as categorical. Because the number of basic units in the whole Chishan watershed was too large to process with commercial software, the research used random sampling instead of all basic units. The research adopted equal proportions of landslide and non-landslide units in the logistic regression analysis. The research drew 10 random samples and selected the group with the best Cox & Snell R2 and Nagelkerke R2 values as the database for the following analysis. 
Based on the best result from 10 random sampling groups, the or-LRLSM (lr-LRLSM) is significant at the 1% level with Cox & Snell R2 = 0.190 (0.196) and Nagelkerke R2
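
    The two pseudo-R2 statistics named above are both functions of the log-likelihoods of the fitted model and of the intercept-only (null) model. The sketch below uses the standard definitions; the sample size and the model log-likelihood are hypothetical, chosen only to make the example run. It assumes a balanced sample, as the abstract describes, so that the null model predicts 0.5 for every unit.

```python
import math


def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell R2 = 1 - (L_null / L_model) ** (2 / n), from log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


# Hypothetical balanced sample: half landslide units, half non-landslide units.
n = 1000
ll_null = n * math.log(0.5)   # intercept-only model predicts 0.5 for every unit
ll_model = -600.0             # hypothetical log-likelihood of the fitted model

cs = cox_snell_r2(ll_null, ll_model, n)
nk = nagelkerke_r2(ll_null, ll_model, n)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")
```

    For a balanced sample the maximum attainable Cox & Snell value is 1 - 0.25 = 0.75, so under that assumption the Nagelkerke value is simply the Cox & Snell value divided by 0.75.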

  4. Study on the interaction under logistic regression modeling

    Institute of Scientific and Technical Information of China (English)

    邱宏; 余德新; 王晓蓉; 付振明; 谢立亚

    2008-01-01

    When a study on epidemiological causation is carried out, logistic regression is commonly used to estimate the independent effects of risk factors, as well as to examine possible interactions among individual risk factors by adding one or more product terms to the regression model. In a logistic or Cox's regression model, the regression coefficient of the product term estimates the interaction on a multiplicative scale, while statistical significance indicates departure from multiplicativity. Rothman argues that when biologic interaction is examined, we need to focus on interaction as departure from additivity rather than departure from multiplicativity. He presents three indices to measure interaction on an additive scale, or departure from additivity, using logarithmic models such as the logistic or Cox's regression model. In this paper, we use data from a case-control study of female lung cancer in Hong Kong to calculate the regression coefficients and covariance matrix of a logistic model in SPSS. We then introduce an Excel spreadsheet set up by Tomas Andersson to calculate the indices of interaction on an additive scale and the corresponding confidence intervals. The results can be used as
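
    The three additive-interaction indices referred to above are usually given as the relative excess risk due to interaction (RERI), the attributable proportion (AP) and the synergy index (S). The sketch below computes their point estimates from the coefficients of a logistic model with two binary exposures and their product term, using the odds ratio as an approximation to the relative risk. The coefficient values are hypothetical, and confidence intervals, which need the covariance matrix mentioned in the abstract, are omitted.

```python
import math


def additive_interaction(b1, b2, b3):
    """Point estimates of RERI, AP and S from logistic coefficients:
    b1, b2 for the two binary exposures, b3 for their product term."""
    or10 = math.exp(b1)            # exposed to factor 1 only
    or01 = math.exp(b2)            # exposed to factor 2 only
    or11 = math.exp(b1 + b2 + b3)  # exposed to both factors
    reri = or11 - or10 - or01 + 1.0
    ap = reri / or11
    s = (or11 - 1.0) / ((or10 - 1.0) + (or01 - 1.0))
    return reri, ap, s


# Hypothetical coefficients: OR10 = 2, OR01 = 3, no multiplicative interaction.
reri, ap, s = additive_interaction(math.log(2.0), math.log(3.0), 0.0)
print(f"RERI = {reri:.2f}, AP = {ap:.2f}, S = {s:.2f}")
```

    In this example the product-term coefficient is zero, yet RERI is positive and S exceeds 1, which illustrates why a non-significant product term says nothing about interaction on the additive scale.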

  5. The severity of Minamata disease declined in 25 years: temporal profile of the neurological findings analyzed by multiple logistic regression model.

    Science.gov (United States)

    Uchino, Makoto; Hirano, Teruyuki; Satoh, Hiroshi; Arimura, Kimiyoshi; Nakagawa, Masanori; Wakamiya, Jyunji

    2005-01-01

    Minamata disease (MD) was caused by ingestion of seafood from the methylmercury-contaminated areas. Although 50 years have passed since the discovery of MD, there have been only a few studies on the temporal profile of neurological findings in certified MD patients. Thus, we evaluated changes in neurological symptoms and signs of MD using discriminants by multiple logistic regression analysis. The severity of predictive index declined in 25 years in most of the patients. Only a few patients showed aggravation of neurological findings, which was due to complications such as spino-cerebellar degeneration. Patients with chronic MD aged over 45 years had several concomitant diseases so that their clinical pictures were complicated. It was difficult to differentiate chronic MD using statistically established discriminants based on sensory disturbance alone. In conclusion, the severity of MD declined in 25 years along with the modification by age-related concomitant disorders.

  6. Logistic Regression Analysis of Related Risk Factors of Emotional Disorders in Children

    Institute of Scientific and Technical Information of China (English)

    高鸿云; 冯金英; 徐俊冕; 郑士俊

    2001-01-01

    Objective: To identify the related psychosocial risk factors of emotional disorders in children. Methods: A case-control approach was used, in which diagnosis was made by clinical interview according to ICD-10 criteria. Eighty-eight cases and controls separately filled out a general condition inventory. The results were entered into a logistic regression model for analysis. Results: Children with a timid personality, without kindergarten education, or with parents who were administrative or technical personnel were apt to have emotional disorders. Children who were usually counseled by their mothers had fewer emotional disorders than those who were beaten. Conclusion: Emotional disorders were the result of multiple factors. Prevention of children's emotional disorders should focus on the children's personality and family education.

  7. Applications of Adaptive Elastic Net Procedure for Logistic Regression Model

    Institute of Scientific and Technical Information of China (English)

    李春红; 黄登香; 戴洪帅

    2015-01-01

    In this paper, we consider the adaptive Elastic Net procedure for the logistic regression model and prove the Oracle property of its estimates. Using numerical simulations and a real example, we compare its estimates with those of the Lasso, the adaptive Lasso, and the Elastic Net procedures; the results show that the adaptive Elastic Net procedure performs better, owing to the Oracle property.

  8. Comparison of Artificial Neural Network with Logistic Regression as Classification Models for Variable Selection for Prediction of Breast Cancer Patient Outcomes

    Directory of Open Access Journals (Sweden)

    Valérie Bourdès

    2010-01-01

    The aim of this study was to compare multilayer perceptron neural networks (NNs) with standard logistic regression (LR) to identify key covariates impacting on mortality from cancer causes, disease-free survival (DFS), and disease recurrence using the Area Under the Receiver-Operating Characteristic curve (AUROC) in breast cancer patients. From 1996 to 2004, 2,535 patients diagnosed with primary breast cancer entered into the study at a single French centre, where they received standard treatment. For specific mortality as well as DFS analysis, the areas under the ROC curves were greater with the NN models than with the LR model, with better sensitivity and specificity. Four predictive factors were retained by both approaches for mortality: clinical size stage, Scarff Bloom Richardson grade, number of invaded nodes, and progesterone receptor. The results enhanced the relevance of the use of NN models in predictive analysis in oncology, which appeared to be more accurate in prediction in this French breast cancer cohort.
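
    The AUROC used in this comparison can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise computation follows; the function name `auroc` and the toy labels and scores are illustrative assumptions, not data or code from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 0/1 outcomes; scores: predicted risks or scores.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: 3 of the 4 positive/negative pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auroc(toy_labels, toy_scores))  # 0.75
```

    Counting ties as half a win makes the result agree with the trapezoidal area under the empirical ROC curve.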

  9. Using occupancy modeling and logistic regression to assess the distribution of shrimp species in lowland streams, Costa Rica: Does regional groundwater create favorable habitat?

    Science.gov (United States)

    Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.

    2016-01-01

    Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.

  10. Application of GIS and logistic regression to fossil pollen data in modelling present and past spatial distribution of the Colombian savanna

    Energy Technology Data Exchange (ETDEWEB)

    Flantua, Suzette G.A.; Boxel, John H. van; Hooghiemstra, Henry; Smaalen, John van [University of Amsterdam, Faculty of Science, Institute for Biodiversity and Ecosystem Dynamics, Amsterdam (Netherlands)

    2007-12-15

    Climate changes affect the abundance, geographic extent, and floral composition of vegetation, which are reflected in the pollen rain. Sediment cores taken from lakes and peat bogs can be analysed for their pollen content. The fossil pollen records provide information on the temporal changes in climate and palaeo-environments. Although the complexity of the variables influencing vegetation distribution requires a multi-dimensional approach, only a few research projects have used GIS to analyse pollen data. This paper presents a new approach to palynological data analysis by combining GIS and spatial modelling. Eastern Colombia was chosen as a study area owing to the migration of the forest-savanna boundary since the last glacial maximum, and the availability of pollen records. Logistic regression has been used to identify the climatic variables that determine the distribution of savanna and forest in eastern Colombia. These variables were used to create a predictive land-cover model, which was subsequently implemented into a GIS to perform spatial analysis on the results. The palynological data from the study area were incorporated into the GIS. Reconstructed maps of past vegetation distribution by interpolation showed a new approach of regional multi-site data synthesis related to climatic parameters. The logistic regression model resulted in a map with 85.7% predictive accuracy, which is considered useful for the reconstruction of future and past land-cover distributions. The suitability of palynological GIS application depends on the number of pollen sites, the distribution of the pollen sites over the area of interest, and the degree of overlap of the age ranges of the pollen records.

  11. Interaction between continuous variables in logistic regression model

    Institute of Scientific and Technical Information of China (English)

    邱宏; 余德新; 谢立亚; 王晓蓉; 付振明

    2010-01-01

    Rothman argued that interaction estimated as departure from additivity better reflects biological interaction. In a logistic regression model, however, the product term reflects interaction as departure from multiplicativity. So far, the literature on estimating interaction on an additive scale using logistic regression has focused mainly on two dichotomous factors. The objective of the present report was to provide a method to examine interaction as departure from additivity between two continuous variables or between one continuous variable and one categorical variable. We used data from a lung cancer case-control study among males in Hong Kong as an example to illustrate the bootstrap re-sampling method for calculating the corresponding confidence intervals. The free software R (version 2.8.1) was used to estimate interaction on the additive scale.
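
    One common measure of departure from additivity is the relative excess risk due to interaction (RERI). The sketch below assumes a model logit(p) = b0 + b1*x1 + b2*x2 + b3*x1*x2 and unit increases in x1 and x2; whether the report used RERI or another additive measure is not stated in the abstract, and the function names and the percentile rule are illustrative assumptions rather than the authors' R program.

```python
import math


def reri(b1, b2, b3):
    """RERI for unit increases in x1 and x2, from logistic coefficients."""
    or10 = math.exp(b1)            # odds ratio for x1 alone
    or01 = math.exp(b2)            # odds ratio for x2 alone
    or11 = math.exp(b1 + b2 + b3)  # odds ratio for both together
    return or11 - or10 - or01 + 1.0


def percentile_interval(replicates, alpha=0.05):
    """Percentile bootstrap interval from RERI values of resampled fits."""
    ordered = sorted(replicates)
    n = len(ordered)
    lower = ordered[int(math.floor((alpha / 2) * (n - 1)))]
    upper = ordered[int(math.ceil((1 - alpha / 2) * (n - 1)))]
    return lower, upper


# With no product term (b3 = 0) the model is exactly multiplicative,
# yet the additive-scale measure is not zero: 6 - 2 - 3 + 1 = 2.
print(reri(math.log(2), math.log(3), 0.0))
```

    In a bootstrap analysis, each replicate in `replicates` would come from refitting the model to a resampled data set and calling `reri` on the refitted coefficients.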

  12. Logistic regression: An example of its use in Endocrinology

    Directory of Open Access Journals (Sweden)

    Emma Domínguez Alonso

    2001-04-01

    An approach to logistic regression, one of the most frequently used multivariate statistical techniques of recent decades, was made in order to guide its correct use. Practical questions were considered, such as the number of subjects necessary for its application, the situations in which it should be used, the types of variables to which it may be applied and the ways in which they may be included in the model, and the interpretation of the results. An example of the application of this technique in a study in the field of Endocrinology was given. It was concluded that logistic regression is very useful in any field of medical research when we need to determine the effect of a group of variables, considered potentially influential, on the occurrence of a certain process.

  13. The risk factors of baby bottle tooth decay by logistic regression analysis

    Institute of Scientific and Technical Information of China (English)

    刘淑杰; 朱剑东; 陈旭; 魏庆

    2000-01-01

    AIM: To study the risk factors of baby bottle tooth decay (BBTD). METHODS: Risk factors for caries were surveyed and plaque pH was measured. The quantitative relationship between the risk factors and BBTD was studied by logistic regression analysis. RESULTS: Feeding time, bottle contents, sweet-food habit, and plaque pH were the risk factors. The agreement between the observed occurrence of BBTD and the classification given by the logistic regression analysis was 96.2%. CONCLUSION: These four risk factors predict BBTD more accurately than bacterial parameters alone.

  14. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    Science.gov (United States)

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. Regression analysis was performed using logistic regression (LR) analysis. The LR model was significant (p < 0.001), with all independent variables: cumulative rainfall, heavy rainfall and mean temperature making a statistically significant contribution to the model. The study has found that rainfall, particularly heavy rainfall in the preceding 7 days of an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also impacts waterborne VTEC outbreak occurrence.

  15. New Approaches To Photometric Redshift Prediction Via Gaussian Process Regression In The Sloan Digital Sky Survey

    CERN Document Server

    Way, M J; Gazis, P R; Srivastava, A N

    2009-01-01

    Expanding upon the work of Way & Srivastava 2006 we demonstrate how the use of training sets of comparable size continue to make Gaussian Process Regression a competitive and in many ways a superior approach to that of Neural Networks and other least-squares fitting methods. This is possible via new matrix inversion techniques developed for Gaussian Processes that do not require that the kernel matrix be sparse. This development, combined with a neural-network kernel function appears to give superior results for this problem. We demonstrate that there appears to be a minimum number of training set galaxies needed to obtain the optimal fit when using our Gaussian Process Regression rank-reduction methods. We also find that morphological information included with many photometric surveys appears, for the most part, to make the photometric redshift evaluation slightly worse rather than better. This would indicate that morphological information simply adds noise from the Gaussian Process point of view. In add...

  16. Binary Logistic Regression Versus Boosted Regression Trees in Assessing Landslide Susceptibility for Multiple-Occurring Regional Landslide Events: Application to the 2009 Storm Event in Messina (Sicily, southern Italy).

    Science.gov (United States)

    Lombardo, L.; Cama, M.; Maerker, M.; Parisi, L.; Rotigliano, E.

    2014-12-01

    This study aims at comparing the performances of Binary Logistic Regression (BLR) and Boosted Regression Trees (BRT) methods in assessing landslide susceptibility for multiple-occurrence regional landslide events within the Mediterranean region. A test area was selected in the north-eastern sector of Sicily (southern Italy), corresponding to the catchments of the Briga and the Giampilieri streams both stretching for few kilometres from the Peloritan ridge (eastern Sicily, Italy) to the Ionian sea. This area was struck on the 1st October 2009 by an extreme climatic event resulting in thousands of rapid shallow landslides, mainly of debris flows and debris avalanches types involving the weathered layer of a low to high grade metamorphic bedrock. Exploiting the same set of predictors and the 2009 landslide archive, BLR- and BRT-based susceptibility models were obtained for the two catchments separately, adopting a random partition (RP) technique for validation; besides, the models trained in one of the two catchments (Briga) were tested in predicting the landslide distribution in the other (Giampilieri), adopting a spatial partition (SP) based validation procedure. All the validation procedures were based on multi-folds tests so to evaluate and compare the reliability of the fitting, the prediction skill, the coherence in the predictor selection and the precision of the susceptibility estimates. All the obtained models for the two methods produced very high predictive performances, with a general congruence between BLR and BRT in the predictor importance. In particular, the research highlighted that BRT-models reached a higher prediction performance with respect to BLR-models, for RP based modelling, whilst for the SP-based models the difference in predictive skills between the two methods dropped drastically, converging to an analogous excellent performance. 
However, when looking at the precision of the probability estimates, BLR demonstrated to produce more robust

  17. Logistic regression analysis on risk factors of endometrial cancer

    Institute of Scientific and Technical Information of China (English)

    曾艳华

    2012-01-01

    Objective: To explore the risk factors of endometrial cancer in Jintan city. Methods: A case-control study was conducted. A total of 165 patients who were treated in the department of obstetrics and gynecology of Jintan People's Hospital and diagnosed with endometrial cancer by pathological examination from December 2005 to June 2011 were selected as the case group, and 528 healthy women undergoing physical examination during the same period were selected as the control group. Univariate and multivariate unconditional logistic regression analyses were used to analyze the risk factors of endometrial cancer. Results: Univariate analysis showed that age ≤ 50 years, age ≥ 61 years, overweight BMI, hypertension, diabetes mellitus, age at menarche ≤ 12 years, age at first delivery ≤ 20 years, and a history of breast cancer, endometrial cancer, colon cancer, or ovarian cancer in first-degree relatives were associated with endometrial cancer. In the multivariate stepwise logistic regression analysis, the variables finally entered into the regression equation were age ≤ 50 years, age ≥ 61 years, overweight BMI, hypertension, diabetes mellitus, age at menarche ≤ 12 years, and a history of colon cancer or ovarian cancer in first-degree relatives. Conclusion: Age ≥ 61 years, overweight BMI, hypertension, diabetes mellitus, age at menarche ≤ 12 years, and a history of colon cancer or ovarian cancer in first-degree relatives are risk factors for endometrial cancer, whereas age ≤ 50 years is a protective factor.

  18. Logistic regression analysis of influence factors in psoriasis patients with metabolic syndrome

    Institute of Scientific and Technical Information of China (English)

    程娟; 张丽; 惠让松; 郐大余; 段红岩; 姚永良; 李安信; 杨雪琴

    2012-01-01

    Objective: To explore the factors related to metabolic syndrome in patients with psoriasis vulgaris. Methods: Two hundred and two outpatients with confirmed psoriasis vulgaris were investigated by questionnaire survey, physical examination, and laboratory examination. Logistic regression was used to analyze the factors related to metabolic syndrome. Results: Logistic regression analysis showed that older age, type A personality, skipping breakfast, and drinking were independent risk factors for metabolic syndrome in patients with psoriasis vulgaris. Conclusion: Relieving mental stress and improving lifestyle may help to reduce the occurrence of metabolic syndrome in patients with psoriasis vulgaris.

  19. Ordinal logistic regression models: application in quality of life studies

    Directory of Open Access Journals (Sweden)

    Mery Natali Silva Abreu

    2008-01-01

    Quality of life has been increasingly emphasized in public health research in recent years. Typically, the results of quality of life are measured by means of ordinal scales. In these situations, specific statistical methods are necessary, because procedures such as dichotomizing the outcome variable or disregarding its ordering lose information and may lead to incorrect inferences. Ordinal logistic regression models are appropriate in many of these situations. This article presents a review of the proportional odds model, partial proportional odds model, continuation ratio model, and stereotype model. The fit, statistical inference, and comparisons between models are illustrated with data from a study on quality of life in 273 patients with schizophrenia. All tested models showed good fit, but the proportional odds or partial proportional odds models proved to be the best choice due to the nature of the data and ease of interpretation of the results. Ordinal logistic models perform differently depending on categorization of outcome, adequacy in relation to assumptions, goodness-of-fit, and parsimony.
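
    For orientation, the proportional odds model named in this abstract is usually written in terms of cumulative logits. The notation below (an outcome Y with ordered categories 1, ..., J, cut-points α_j, and a common slope vector β) is a textbook convention assumed here, not taken from the article, and some software uses the opposite sign for β.

```latex
\operatorname{logit} P(Y \le j \mid \mathbf{x})
  = \log \frac{P(Y \le j \mid \mathbf{x})}{P(Y > j \mid \mathbf{x})}
  = \alpha_j - \boldsymbol{\beta}^{\top}\mathbf{x},
  \qquad j = 1, \dots, J-1 .
```

    The same β applies at every cut-point j, which is the proportional odds assumption; the partial proportional odds model relaxes it by letting some coefficients depend on j.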

  20. Data analysis of hemorrhagic gastroenteritis to identify risk factors by logistic regression

    Directory of Open Access Journals (Sweden)

    Paula Roberta Mendes

    2004-04-01

    This paper presents a study of how to fit a logistic regression model to predict the probability of death of dogs with hemorrhagic gastroenteritis. The logistic model is recommended for dichotomous response variables in cohort studies. Using a census procedure, 176 animals treated for hemorrhagic gastroenteritis were registered at four veterinary clinics in the city of Lavras, southern Minas Gerais, between 1992 and 1999. The variables sex, age, days of hospitalization, and number of clinical visits were selected by Pearson's chi-square test or Fisher's exact test and then included in the model. The parameters were estimated by the maximum likelihood method. The results showed that, when infected dogs were clinically treated only once, animals older than six months had odds of death 15.45 times (P<0.05) larger than those younger than six months. When infected animals younger than six months were clinically treated only once, their odds of death were 20.251 times (P<0.05) higher than if they had received two to seven treatments.

  1. Power of multifactor dimensionality reduction and penalized logistic regression for detecting gene-gene Interaction in a case-control study

    Directory of Open Access Journals (Sweden)

    Brott Marcia J

    2009-12-01

    Background: There is a growing awareness that interaction between multiple genes plays an important role in the risk of common, complex multi-factorial diseases. Many common diseases are affected by certain genotype combinations (associated with some genes and their interactions). The identification and characterization of these susceptibility genes and gene-gene interactions have been limited by small sample sizes and the large number of potential interactions between genes. Several methods have been proposed to detect gene-gene interaction in a case-control study. Penalized logistic regression (PLR), a variant of logistic regression with L2 regularization, is a parametric approach to detect gene-gene interaction. On the other hand, Multifactor Dimensionality Reduction (MDR) is a nonparametric and genetic model-free approach to detect genotype combinations associated with disease risk. Methods: We compared the power of MDR and PLR for detecting two-way and three-way interactions in a case-control study through extensive simulations. We generated several interaction models with different magnitudes of interaction effect. For each model, we simulated 100 datasets, each with 200 cases and 200 controls and 20 SNPs. We considered a wide variety of models, such as models with just main effects, models with only interaction effects, and models with both main and interaction effects. We also compared the performance of MDR and PLR in detecting gene-gene interaction associated with acute rejection (AR) in kidney transplant patients. Results: In this paper, we have studied the power of MDR and PLR for detecting gene-gene interaction in a case-control study through extensive simulation. We have compared their performances for different two-way and three-way interaction models. We have studied the effect of different allele frequencies on these methods. We have also assessed their performance on a real dataset. As expected, none of these methods were

  2. A Multi-way Multi-task Learning Approach for Multinomial Logistic Regression*. An Application in Joint Prediction of Appointment Miss-opportunities across Multiple Clinics.

    Science.gov (United States)

    Alaeddini, Adel; Hong, Seung Hee

    2017-08-11

    Whether they have been engineered for it or not, most healthcare systems experience a variety of unexpected events, such as appointment miss-opportunities, that can have a significant impact on their revenue, cost and resource utilization. In this paper, a multi-way multi-task learning model based on multinomial logistic regression is proposed to jointly predict the occurrence of different types of miss-opportunities at multiple clinics. An extension of L1/L2 regularization is proposed to enable transfer of information among various types of miss-opportunities as well as different clinics. A proximal algorithm is developed to transform the convex but non-smooth likelihood function of the multi-way multi-task learning model into a convex and smooth optimization problem solvable using a gradient descent algorithm. A dataset of real attendance records of patients at four different clinics of a VA medical center is used to verify the performance of the proposed multi-task learning approach. Additionally, a simulation study investigating more general data situations is provided to highlight specific aspects of the proposed approach. Various individual and integrated multinomial logistic regression models with and without a LASSO penalty, along with a number of other common classification algorithms, are fitted and compared against the proposed multi-way multi-task learning approach. Fivefold cross-validation is used to estimate the parameters of the compared models and their predictive accuracy. The multi-way multi-task learning framework enables the proposed approach to achieve a considerable rate of parameter shrinkage and superior prediction accuracy across various types of miss-opportunities and clinics. The proposed approach provides an integrated structure to effectively transfer knowledge among different miss-opportunities and clinics to reduce model size, increase estimation efficacy, and, more importantly, improve prediction results. The proposed framework can be

  3. Logistic regression analysis of factors affecting blood culture positive and countermeasures

    Institute of Scientific and Technical Information of China (English)

    柴建华; 常洪美; 李炼; 凌冬; 陈玲; 张丕

    2015-01-01

    Objective: To investigate the factors affecting blood culture positivity and to put forward corresponding countermeasures. Methods: Cases with blood culture specimens submitted from January to June 2014 were analyzed by the single-factor χ² test combined with multivariate logistic regression analysis to investigate the factors influencing blood culture positivity. Results: The blood culture positive rate in the hospital was 12.83%. Logistic regression analysis found that the degree of fever (OR=1.772, P=0.002), whether antimicrobial agents were used (OR=0.551, P=0.026), and whether the timing of blood collection was correct (OR=4.585, P=0.047) were factors influencing blood culture positivity. Conclusion: The blood culture positive rate in the hospital is low, and corresponding countermeasures should be adopted according to the identified influencing factors.

  4. Malocclusion and temporomandibular disorders: conditional logistic regression analysis

    Institute of Scientific and Technical Information of China (English)

    薛飞; 姚博; 宋新亮; 高晓峰; 戴庆; 卜宏

    2015-01-01

    Objective: To investigate the relationship between temporomandibular disorders (TMD) and malocclusion. Methods: The TMD group included 613 consecutive patients whose chief complaint was TMD. The control group included 682 consecutive patients who sought orthodontic treatment for irregular dentition alone. The malocclusions of the patients were recorded, and the two groups were analyzed by paired multivariate conditional logistic regression. Results: The factors that entered the logistic regression model were edge-to-edge occlusion, anterior crossbite, posterior crossbite, and scissors bite. The odds ratios were 0.523, 0.270, 1.824, and 1.946, respectively. Conclusion: Edge-to-edge occlusion, anterior crossbite, posterior crossbite, and scissors bite were associated with TMD. Patients with anterior tooth abnormalities that affect appearance have a high demand for orthodontic treatment. Posterior crossbite and scissors bite were risk factors, and patients with posterior tooth abnormalities may be prone to TMD.

  5. Conditional Logistic Regression on Influencing Factors of Fall Injuries among Undergraduates

    Institute of Scientific and Technical Information of China (English)

    许珊丹; 向兵; 张玲

    2011-01-01

    Objective: To study the factors influencing fall injuries among undergraduates and to provide a theoretical basis for effective prevention measures. Methods: A total of 1219 undergraduates at a medical college in Wuhan were surveyed by stratified cluster random sampling, and 65 fall injury cases were identified. Controls were obtained by 1:1 matching in a case-control design. Visual acuity, attention, and psychological status were tested in the 65 pairs. Data were analyzed by single-factor conditional logistic regression, and multivariate analysis was performed with a conditional logistic regression model. Results: Poor eyesight, lack of concentration, depression, and anxiety were risk factors for fall injuries. Conclusion: Fall injuries among college students were correlated with individual characteristics. Undergraduates with poor eyesight, lack of concentration, depression, and anxiety are more likely to be injured. Effective measures should be taken for such students to prevent fall injuries.
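
    In the simplest case of a 1:1 matched design with a single binary exposure, the conditional maximum-likelihood estimate of the odds ratio reduces to the ratio of the two kinds of discordant pairs. The sketch below illustrates only that special case; the function name and the pair counts are hypothetical and are not taken from the study, which fitted multivariate conditional models.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio for 1:1 matched pairs and one binary exposure.

    pairs: iterable of (case_exposed, control_exposed) booleans.
    Concordant pairs carry no information and drop out of the estimate.
    """
    pairs = list(pairs)
    case_only = sum(1 for case, ctrl in pairs if case and not ctrl)
    control_only = sum(1 for case, ctrl in pairs if ctrl and not case)
    if control_only == 0:
        raise ZeroDivisionError("no pairs with only the control exposed")
    return case_only / control_only


# Hypothetical counts: 6 pairs with only the case exposed,
# 2 with only the control exposed, 7 concordant pairs.
toy_pairs = ([(True, False)] * 6 + [(False, True)] * 2
             + [(True, True)] * 3 + [(False, False)] * 4)
print(matched_pair_odds_ratio(toy_pairs))  # 3.0
```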

  6. Regularized Logistic Regression Model in Financial Early-warning System with Listed Companies

    Institute of Scientific and Technical Information of China (English)

    张恒; 秦宾; 许金凤

    2011-01-01

    Based on the regularization technique of statistical learning theory, an L1-norm penalized logistic regression model is proposed; for comparison, an ordinary logistic regression model and an L2-regularized logistic regression model are also established. Using financial data from years T-3 and T-2 for ST companies and normal companies listed on the Shanghai and Shenzhen stock markets, simulation experiments are conducted for an empirical analysis of financial early warning for listed companies. The results demonstrate the effectiveness of the L1-regularized logistic regression model, which improves the interpretability of the model while maintaining its prediction accuracy.
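
    The interpretability gain attributed to the L1 penalty comes from its ability to set coefficients exactly to zero. The sketch below shows one standard way to fit such a model, proximal gradient descent with soft-thresholding; the solver choice, function names, tuning values, and toy data are all assumptions for illustration, since the abstract does not say how the model was fitted.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink toward zero, clip at zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_l1_logistic(X, y, lam=0.1, step=0.1, iters=2000):
    """Minimise mean log-loss + lam * sum(|w_j|); the intercept is unpenalised."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical toy data: feature 0 is related to the label, feature 1 is not.
base = [(-1.0, 0), (-1.0, 0), (-1.0, 1), (1.0, 1), (1.0, 1), (1.0, 0)]
X = [[x0, s] for x0, _ in base for s in (1.0, -1.0)]
y = [label for _, label in base for _s in (1.0, -1.0)]
w, b = fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=2000)
print(w, b)  # the weight on the uninformative feature is exactly 0.0
```

    Replacing `soft_threshold` with a plain multiplicative shrinkage, `v / (1 + t)`, would give the proximal step for an L2 (ridge) penalty of the form `(lam / 2) * sum(w_j ** 2)`, which shrinks coefficients but does not zero them out.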

  7. Traffic Accident Data Analysis of Third-class Urban Roadways Using Logistic Regression Models

    Institute of Scientific and Technical Information of China (English)

    邓瑶望; 李凌宇; 陈雨人

    2014-01-01

    According to the statistical data of Urumqi City from 2006 to 2010, nine different crash types of traffic accidents on urban roadways were selected as the dependent variables, and nine factors relating to road facilities and road environment were selected as the independent variables. Based on the binary logistic regression model, this paper established linear correlation models between crash types and the nine influencing factors, estimated the model parameters, analyzed the reliability and goodness of fit of the models, and investigated the impact of the independent variables, individually and in combination, on the dependent variables. The paper also predicted the risk of each crash type under various road conditions using a multinomial logistic model, compared the predictions with the actual cases, and tested the goodness of fit of the model.

  8. Forest cover dynamics analysis and prediction modelling using logistic regression model (case study: forest cover at Indragiri Hulu Regency, Riau Province)

    Science.gov (United States)

    Nahib, Irmadi; Suryanta, Jaka

    2017-01-01

    Forest destruction, climate change and global warming could reduce the indirect benefits of forests, because forests are the largest carbon sink and play a very important role in the global carbon cycle. To support the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program, attention is paid to forest cover changes as the basis for calculating carbon stock changes. This study explores the forest cover dynamics and a prediction model of forest cover in Indragiri Hulu Regency, Riau Province, Indonesia. It aims to analyse explanatory variables associated with forest conversion processes and to predict forest cover change using a logistic regression model (LRM). The main data used are land use/cover maps (1990-2011). Performance of the developed model was assessed by comparing the predicted forest cover change with the actual forest cover in 2011. The analysis showed that forest cover decreased continuously between 1990 and 2011, with a loss of 165,284.82 ha (35.19 %) of forest area. The LRM successfully predicted the forest cover for the period 2010 with reasonably high accuracy (ROC = 92.97 % and 70.26 %).

  9. Examination By Multinomial Logistic Regression Model Of The Factors Affecting The Types Of Domestic Violence Against Women A Case Of Turkey

    Directory of Open Access Journals (Sweden)

    Erkan Ari

    2015-08-01

    In this paper, factors affecting the types of domestic violence against women were determined by a multinomial logistic regression model. We used the data of the Research on Domestic Violence against Women in Turkey, conducted by the Turkish Statistical Institute in 2008. The type of domestic violence against women, which has four levels, was used as the dependent variable. Twelve independent variables were used after irrelevant variables were removed from the data set via the chi-square test of independence. The maximum likelihood estimates and the odds ratios of the model variables were then obtained, and the validity of the model was tested by the likelihood ratio test. Finally, comparisons were made for three categories, based on the odds ratios relative to the selected reference category. In terms of odds ratios, the variables education level of the woman and husband's work sector were statistically significant only in comparison one; the variables agnation with husband, education level of husband, frequency of seeing the husband drunk, and frequency of the husband's gambling were statistically significant in both comparisons one and three; and the variables region, being deceived by the husband, and common-law female for husband were statistically significant in all comparisons.
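
    With a four-level outcome and one reference category, a multinomial (baseline-category) logit model yields three sets of coefficients, one per comparison. The sketch below shows how category probabilities and odds against the reference follow from the linear predictors; the numeric values are hypothetical and unrelated to the study's estimates.

```python
import math


def multinomial_probs(linear_predictors):
    """Category probabilities for a baseline-category logit model.

    linear_predictors holds eta_k = x . beta_k for each non-reference
    category; the reference category has eta = 0 by construction.
    Returns [P(reference), P(category 1), P(category 2), ...].
    """
    etas = [0.0] + list(linear_predictors)
    m = max(etas)  # subtract the maximum for numerical stability
    exps = [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical linear predictors for the three non-reference categories
etas = [0.4, -0.2, 1.1]
probs = multinomial_probs(etas)

# exp(eta_k) is the odds of category k versus the reference category, so
# exp(beta) for a single covariate is the odds ratio in that comparison.
odds_vs_reference = [p / probs[0] for p in probs[1:]]
```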

  10. Spatial prediction of Lactarius deliciosus and Lactarius salmonicolor mushroom distribution with logistic regression models in the Kızılcasu Planning Unit, Turkey.

    Science.gov (United States)

    Mumcu Kucuker, Derya; Baskent, Emin Zeki

    2015-01-01

    Integration of non-wood forest products (NWFPs) into forest management planning has become an increasingly important issue in forestry over the last decade. Among NWFPs, mushrooms are valued due to their medicinal, commercial, high nutritional and recreational importance. Commercial mushroom harvesting also provides important income to local dwellers and contributes to the economic value of regional forests. Sustainable management of these products at the regional scale requires information on their locations in diverse forest settings and the ability to predict and map their spatial distributions over the landscape. This study focuses on modeling the spatial distribution of commercially harvested Lactarius deliciosus and L. salmonicolor mushrooms in the Kızılcasu Forest Planning Unit, Turkey. The best models were developed based on topographic, climatic and stand characteristics, separately through logistic regression analysis using SPSS™. The best topographic model provided better classification success (69.3 %) than the best climatic (65.4 %) and stand (65 %) models. However, the overall best model, with 73 % overall classification success, used a mix of several variables. The best models were integrated into an Arc/Info GIS program to create spatial distribution maps of L. deliciosus and L. salmonicolor in the planning area. Our approach may be useful to predict the occurrence and distribution of other NWFPs and provide a valuable tool for designing silvicultural prescriptions and preparing multiple-use forest management plans.

  11. Shallow landslide susceptibility model for the Oria river basin, Gipuzkoa province (North of Spain). Application of the logistic regression and comparison with previous studies.

    Science.gov (United States)

    Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange

    2016-04-01

    In the Oria river basin (885 km²), shallow landslides are very frequent; they cause roadblocks and damage to infrastructure and property, leading to large economic losses every year. Because zoning the territory into landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone places using an objective and reproducible methodology. To do so, a quantitative multivariate method, logistic regression, was used. Landslide points recorded in the field and randomly selected stable points were used, together with the independent variables Lithology, Land Use, Distance to the transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI), to produce a landslide susceptibility map. The model was validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result was compared with two landslide susceptibility models previously applied to the study area at different scales: ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and the comparisons highlight large differences from previous studies.
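
    The AUC used for validation above can be read as the probability that a randomly chosen landslide point receives a higher susceptibility score than a randomly chosen stable point. A minimal sketch of that pairwise definition follows; the scores are hypothetical, not values from the study.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores for landslide and stable points
landslide_scores = [0.9, 0.8, 0.4]
stable_scores = [0.7, 0.3, 0.2]
value = auc(landslide_scores, stable_scores)  # 8 of 9 pairs are ordered correctly
```

    The double loop is quadratic in the number of points; for large rasters a rank-based formulation would be preferable.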

  12. Classification of endometrial lesions by nuclear morphometry features extracted from liquid-based cytology samples: a system based on logistic regression model.

    Science.gov (United States)

    Zygouris, Dimitrios; Pouliakis, Abraham; Margari, Niki; Chrelias, Charalampos; Terzakis, Emmanouil; Koureas, Nikolaos; Panayiotides, Ioannis; Karakitsos, Petros

    2014-08-01

    To investigate the potential of a computerized system for the discrimination of benign from malignant endometrial nuclei and lesions. A total of 228 histologically confirmed liquid-based cytological smears were collected: 117 within normal limits cases, 66 malignant cases, 37 hyperplasias without atypia, and 8 cases of hyperplasia with atypia. From each case we extracted nuclear morphometric features from about 100 nuclei using a custom image analysis system. Initially we performed feature selection, and subsequently we applied a logistic regression model that classified each nucleus as benign or malignant. Based on the results of the nucleus classification process, we constructed an algorithm to discriminate endometrium cases as benign or malignant. The proposed system had an overall accuracy for the classification of endometrial nuclei equal to 83.02%, specificity of 85.09%, and sensitivity of 77.01%. For the case classification the overall accuracy was 92.98%, specificity was 92.86%, and sensitivity was 93.24%. The proposed computerized system can be applied for the classification of endometrial nuclei and lesions as it outperformed the standard cytological diagnosis. This study highlights interesting diagnostic features of endometrial nuclear morphology, and the proposed method can be a useful tool in the everyday practice of the cytological laboratory.
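
    The case-level figures above are consistent with a confusion matrix in which hyperplasia with atypia is grouped with the malignant cases (66 + 8 = 74) and the remaining cases are benign (117 + 37 = 154). The sketch below checks this; the grouping and the individual cell counts are inferred assumptions, not values stated in the abstract.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts: 69 of 74 malignant and 143 of 154 benign cases classified
# correctly. These are inferred to match the reported percentages.
sens, spec, acc = classification_metrics(tp=69, fn=5, tn=143, fp=11)
print(f"sensitivity={100 * sens:.2f}%  specificity={100 * spec:.2f}%  accuracy={100 * acc:.2f}%")
```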

  13. Segmentation and profiling consumers in a multi-channel environment using a combination of self-organizing maps (SOM) method and logistic regression

    Directory of Open Access Journals (Sweden)

    Seyed Ali Akbar Afjeh

    2014-05-01

    Market segmentation plays an essential role in understanding people's interest in purchasing various products and services through various channels. This paper presents an empirical investigation of consumers' purchasing attitudes and information gathering in a multi-channel environment. A questionnaire was designed and distributed among 800 people who were at least 18 years of age and had some experience of purchasing goods and services on the internet, by catalog or in regular shopping centers. The self-organizing map (SOM) clustering technique was applied to consumers' interest in gathering information and purchasing products through the internet, catalogs and shopping centers, and it identified four segments. The questionnaire contained two types of questions. The first group covered participants' personal characteristics such as age, gender and income. The second group covered participants' psychographic characteristics, including price consciousness, quality consciousness and time pressure. Using multinomial logistic regression, the study determines consumers' behavior in each of the four segments.

  14. Supply and demand analysis for flood insurance by using logistic regression model: case study at Citarum watershed in South Bandung, West Java, Indonesia

    Science.gov (United States)

    Sidi, P.; Mamat, M.; Sukono; Supian, S.

    2017-01-01

    Floods occur regularly in the Citarum river basin. Their adverse effects can extend to all of residents' property, including the destruction of houses, and the damage to residential buildings is usually substantial. After each flood, the government and several social organizations provide funds to repair buildings, but these donations are very limited and cannot cover the entire cost of the necessary repairs. Insurance products for property damage caused by floods are therefore considered very important. However, does the public also consider them necessary? In this paper, the factors that affect the supply and demand of insurance products for buildings damaged by floods are analyzed using ordinal logistic regression. The analysis shows that these factors include age, economic circumstances, family situation, insurance motivation, and lifestyle. Taken simultaneously, these factors affect the supply and demand of insurance products for buildings damaged by floods by 65.7%.
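
    Ordinal logistic regression is commonly fitted as a proportional-odds model, in which the cumulative probability of each ordered response level depends on a shared linear predictor and level-specific cutpoints. The sketch below assumes that formulation (the abstract does not say which variant was used), and the cutpoints and predictor value are hypothetical.

```python
import math


def ordinal_probs(eta, cutpoints):
    """Category probabilities under a proportional-odds model.

    P(Y <= j) = sigmoid(cutpoints[j] - eta), with cutpoints in increasing
    order; eta = x . beta is shared by all response levels.
    """
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


# Hypothetical three-level response (e.g. low / medium / high demand)
probs = ordinal_probs(eta=0.8, cutpoints=[-1.0, 0.5])
```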

  15. Association of perceived stress with stressful life events, lifestyle and sociodemographic factors: a large-scale community-based study using logistic quantile regression.

    Science.gov (United States)

    Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza

    2012-01-01

    The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used for modeling perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable, and as a function of the most important stressful life events, as the predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors' effects depending on an individual's location on the distribution of perceived stress. The results showed that among four stressful life events, family conflicts and social problems were more correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. Also, high perceived stress was associated with low or middle levels of income. The results of the current research suggested that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equality, social skills training, and a healthy lifestyle may have potential benefits for large parts of the population, most notably females and less educated people.
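
    Logistic quantile regression handles a bounded outcome by modeling quantiles of a logit-transformed score and mapping the fitted quantiles back to the original scale; this works because quantiles are preserved under monotone transformations. The sketch below illustrates only that transform step. The bounds and scores are hypothetical and do not reflect the GHQ scoring used in the study.

```python
import math
import statistics


def bounded_logit(y, lower, upper):
    """Map a bounded outcome in (lower, upper) to the real line."""
    return math.log((y - lower) / (upper - y))


def inverse_bounded_logit(z, lower, upper):
    """Map a value on the real line back to (lower, upper)."""
    return lower + (upper - lower) / (1.0 + math.exp(-z))


# Hypothetical scores on a 0-12 scale; the bounds are set slightly outside
# the observed range so that boundary scores stay finite after transformation.
LOWER, UPPER = -0.5, 12.5
scores = [1, 3, 4, 7, 12]
transformed = [bounded_logit(s, LOWER, UPPER) for s in scores]

# The median of the transformed scores equals the transformed median.
median_on_logit_scale = statistics.median(transformed)
median_back = inverse_bounded_logit(median_on_logit_scale, LOWER, UPPER)
```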

  16. Association of Perceived Stress with Stressful Life Events, Lifestyle and Sociodemographic Factors: A Large-Scale Community-Based Study Using Logistic Quantile Regression

    Directory of Open Access Journals (Sweden)

    Awat Feizi

    2012-01-01

    Objective. The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used for modeling perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable, and as a function of the most important stressful life events, as the predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors’ effects depending on an individual’s location on the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients monotonically decrease beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender’s coefficient over the majority of quantiles implied that females are more affected by stressors. Also, high perceived stress was associated with low or middle levels of income. Conclusions. The results of the current research suggested that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equality, social skills training, and a healthy lifestyle may have potential benefits for large parts of the population, most notably females and less educated people.

  17. Determiners of enterprise risk management applications in Turkey: An empirical study with logistic regression model on the companies included in ISE (Istanbul Stock Exchange)

    Directory of Open Access Journals (Sweden)

    Şerife Önder

    2012-10-01

    Enterprise risk management (ERM), which came along with the change in the understanding of risk management in companies, refers to evaluating all risks as a whole and managing them in line with the targets of the company. This study aims to determine the ERM application levels of the companies listed on the Istanbul Stock Exchange (ISE) and the factors that affect these applications. The existence of ERM in a company was related to having a senior manager in charge of risk management. A logistic regression model was established to explain ERM applications by profitability, leverage and company size. The analysis determined that about half of the financial sector companies within the ISE employed a chief risk officer (CRO), which means that a culture of risk management has been founded within these companies. Moreover, profitability was not found to be significant for ERM applications, while the most important factors affecting the applications were leverage and company size.

  18. Emergency department mental health presentations by people born in refugee source countries: an epidemiological logistic regression study in a Medicare Local region in Australia.

    Science.gov (United States)

    Enticott, Joanne C; Cheng, I-Hao; Russell, Grant; Szwarc, Josef; Braitberg, George; Peek, Anne; Meadows, Graham

    2015-01-01

    This study investigated whether people born in refugee source countries are disproportionately represented among those receiving a diagnosis of mental illness within emergency departments (EDs). The setting was the Cities of Greater Dandenong and Casey, the resettlement region for one-twelfth of Australia's refugees. An epidemiological, secondary data analysis compared mental illness diagnoses received in EDs by refugee and non-refugee populations. The data came from the Victorian Emergency Minimum Dataset for the 2008-09 financial year. Univariate and multivariate logistic regression created predictive models for mental illness using five variables: age, sex, refugee background, interpreter use and preferred language. Collinearity, model fit and model stability were examined. Multivariate analysis showed age and sex to be the only significant risk factors for mental illness diagnosis in EDs. 'Refugee status', 'interpreter use' and 'preferred language' were not associated with a mental health diagnosis following risk adjustment for the effects of age and sex. The disappearance of the univariate association after adjustment for age and sex is a salutary lesson for Medicare Locals and other health planners regarding the importance of adjusting analyses of health service data for demographic characteristics.

  19. [Logistic regression analysis of the risk factors for difficult airway and the cut-off value of height-to-thyromental distance ratio].

    Science.gov (United States)

    Jin, Hao; Chen, Ping

    2015-08-01

    To analyze the risk factors for difficult airway in laryngoscopy and mask ventilation. A total of 300 patients receiving general anesthesia with tracheal intubation were examined preoperatively for height, thyromental and sternomental distance (TMD), range of neck movement, inter-incisor distance, and modified Mallampati class. Intubation Difficult Score was used to identify a difficult laryngoscopy. Difficult airway was defined as either difficult laryngoscopy or difficult mask ventilation. The association between the airway characteristics and difficult airway was analyzed by logistic regression analysis, and the cut-off value for the height-to-TMD ratio was determined by the ROC curve. Eight airway characteristics were identified to contribute to a difficult airway, including (OR [95% CI]) the height-to-TMD ratio (3.58 [1.95-8.46]), modified Mallampati class (3.34 [1.82-7.14]), BMI (3.07 [1.64-6.69]), history of a previous difficult airway (2.79 [1.28-5.25]), a thick neck (2.15 [1.04-4.37]), range of neck movement (1.98 [0.96-3.89]), sternomental and angulus mandibulae distance (1.46 [0.67-3.04]), and inter-incisor distance (1.01 [0.49-2.54]). The optimal cut-off value for the height-to-TMD ratio was 22.8 for predicting a difficult airway.
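
    An odds ratio from logistic regression is the exponential of a coefficient, and its 95% confidence interval indicates significance at the 5% level only when it excludes 1. The sketch below applies that reading to the intervals reported above and, assuming Wald-type intervals (an assumption; the abstract does not state how the intervals were computed), recovers an approximate standard error on the log-odds scale.

```python
import math


def log_or_and_se(odds_ratio, ci_low, ci_high, z=1.96):
    """Approximate coefficient and standard error implied by an OR and 95% CI.

    Assumes a Wald interval, exp(beta +/- z * se), on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


def ci_excludes_one(ci_low, ci_high):
    """True when the interval lies entirely above or entirely below 1."""
    return ci_low > 1.0 or ci_high < 1.0


# (OR, lower, upper) as reported in the abstract
reported = {
    "height-to-TMD ratio": (3.58, 1.95, 8.46),
    "modified Mallampati class": (3.34, 1.82, 7.14),
    "BMI": (3.07, 1.64, 6.69),
    "previous difficult airway": (2.79, 1.28, 5.25),
    "thick neck": (2.15, 1.04, 4.37),
    "range of neck movement": (1.98, 0.96, 3.89),
    "sternomental and angulus mandibulae distance": (1.46, 0.67, 3.04),
    "inter-incisor distance": (1.01, 0.49, 2.54),
}

excluding_one = [name for name, (_, lo, hi) in reported.items() if ci_excludes_one(lo, hi)]
beta, se = log_or_and_se(*reported["height-to-TMD ratio"])
```

    On this reading, the intervals for the last three characteristics include 1.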

  20. Logistic regression analysis about the risk factors of small for gestational age

    Institute of Scientific and Technical Information of China (English)

    李盛强; 周守方; 袁贵龙

    2012-01-01

    Objective: To study the main risk factors for small for gestational age (SGA) infants and to provide a scientific basis for appropriate prevention and intervention measures. Methods: Fifty-nine singleton live-born SGA infants delivered between January 2008 and December 2009 were selected as the case group, and 65 appropriate for gestational age (AGA) infants with birth weights between the 10th and 90th percentiles were selected by completely random sampling as the control group. The two groups were compared in gestational age, growth hormone, insulin resistance, magnesium ion concentration, maternal body mass index, disease status, maternal age, paternal smoking, paternal alcohol abuse, etc. Single-factor analysis was performed first, and the variables with statistically significant differences were then entered into a multivariate unconditional logistic regression analysis. Results: Growth hormone, magnesium ion concentration, and maternal age at pregnancy were significantly lower in SGA infants than in AGA infants (P<0.05). Insulin resistance level, gestational hypertension, paternal alcohol abuse rate, and paternal smoking rate were significantly higher in SGA infants than in AGA infants (P<0.05). Logistic regression analysis showed that magnesium ion concentration, maternal age at pregnancy, and paternal alcohol abuse were independently associated with SGA. Conclusion: Magnesium ion may be an important regulator of fetal growth and development; young maternal age at pregnancy and paternal alcohol abuse greatly increase the likelihood of an SGA infant.

  1. First state of logistics survey for South Africa 2004: The case for measurement and revitalisation of basic logistics infrastructure in our dual economy

    CSIR Research Space (South Africa)

    Van Dyk, FE

    2005-02-01

    and Neil Jacobs for the development of the logistics cost model. Barry Saxton of Barloworld Logistics for valuable insights into the dynamics of the 3PL and 4PL world. Esli Rall and Johan Ackerman for industry-specific perspectives. We would... & rail. Conclusions & recommendations Industry level perspective: Macro-economic perspective: Reflecting the macro-economic logistics and transport reality Small business development perspective: Reflecting on logistics practices, the health...

  2. Stratification Logistic Regression Analysis on the Relationship between Body Mass Index, Waist Circumference and Hypertension

    Institute of Scientific and Technical Information of China (English)

    沈丽丽; 沈毅

    2014-01-01

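    Objective: To explore the relationship of body mass index (BMI) and waist circumference (WC) with adult hypertension in Gongshu District, and their value for predicting hypertension. Methods: Multistage cluster random sampling was used to select 3177 residents aged 18 years and older from 540 households in Gongshu District for a questionnaire survey and measurement of height, weight, waist circumference, and blood pressure. Age-stratified logistic regression was used to analyze the correlation between BMI, WC, and hypertension; ROC curves were used to compare the predictive performance of BMI and WC for hypertension across sex and age groups. Results: BMI and WC were higher in the hypertension group than in the normal group (P<0.05); mean SBP and DBP were higher in the abdominal obesity group than in the normal group (P<0.001); mean SBP and DBP were higher in the BMI obesity group than in the overweight group, and higher in the overweight group than in ... Mean systolic pressure, mean diastolic pressure, WC, hypertension prevalence, and abdominal obesity rate all rose with age. After adjustment for eight related factors (sex, education, occupation, marital status, family history of hypertension, intensity of occupational activity, smoking, and alcohol consumption), age-stratified logistic regression showed that among young adults the OR for hypertension was 15.167 in the BMI obesity group relative to the normal group and 6.995 in the abdominal obesity group relative to the normal-WC group; the partial regression coefficients β and the ORs of the BMI obesity and abdominal obesity groups both decreased with increasing age group. The areas under the ROC curves for predicting hypertension with WC and with BMI were all greater than 0.5 across sex and age strata. Conclusion: WC and BMI are both good predictors of hypertension in adults aged 18 years and older. Overall obesity and abdominal obesity affect blood pressure more strongly in young adults than in middle-aged and elderly people, and young adults therefore deserve particular attention.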

  3. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    Science.gov (United States)

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data were aggregated in PROC GENMOD. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data were then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). These data were overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  4. Logistic regression analysis of high risk factors of preterm delivery

    Institute of Scientific and Technical Information of China (English)

    王彩芳; 刘妙珍; 黄丽君

    2015-01-01

    Objective: To analyze the risk factors of premature delivery. Methods: The clinical data of 401 cases of preterm labor and 632 full-term deliveries admitted in the same period were retrospectively analyzed, and the risk factors for premature delivery were examined by single-factor analysis and multivariate logistic regression. Results: Single-factor analysis showed that pregnancy-induced hypertension, pregnancy complicated with diabetes, reproductive system infection, vaginal bleeding, pregnancy check-ups, history of premature labor, premature rupture of membranes, cervical incompetence, abnormal amniotic fluid, and peripheral blood lymphocyte count differed significantly between the two groups (χ2=195.47, 205.55, 156.09, 95.44, 100.13, 41.96, 106.61, 13.12, 2.28, 22.64, respectively; all P<0.05). Logistic regression analysis showed that pregnancy-induced hypertension, pregnancy complicated with diabetes, reproductive system infection, vaginal bleeding, pregnancy check-ups, history of premature labor, premature rupture of membranes, and peripheral blood lymphocyte count were independent risk factors (P<0.05). Conclusion: The high risk factors of premature delivery should be strictly observed and prevented in order to reduce the incidence of premature birth and improve maternal and child outcomes.

  5. Factors related to clinical pregnancy after vitrified-warmed embryo transfer: a retrospective and multivariate logistic regression analysis of 2313 transfer cycles.

    Science.gov (United States)

    Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi

    2013-07-01

    What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and if the reason to freeze embryos was to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly positively associated with a greater likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and the reason of freezing for OHSS to be positively and damaged blastomere to be negatively significantly associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors to affect clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experiences and the literature. With the thresholds of α entry = α removal= 0.05 for both variable entry and variable removal, eight variables were chosen to contribute the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). Eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under receptor operator characteristic

  6. A novel framework for predicting in vivo toxicities from in vitro data using optimal methods for dense and sparse matrix reordering and logistic regression.

    Science.gov (United States)

    DiMaggio, Peter A; Subramani, Ashwin; Judson, Richard S; Floudas, Christodoulos A

    2010-11-01

    In this work, we combine the strengths of mixed-integer linear optimization (MILP) and logistic regression for predicting the in vivo toxicity of chemicals using only their measured in vitro assay data. The proposed approach utilizes a biclustering method based on iterative optimal reordering (DiMaggio, P. A., McAllister, S. R., Floudas, C. A., Feng, X. J., Rabinowitz, J. D., and Rabitz, H. A. (2008). Biclustering via optimal re-ordering of data matrices in systems biology: rigorous methods and comparative studies. BMC Bioinformatics 9, 458-474.; DiMaggio, P. A., McAllister, S. R., Floudas, C. A., Feng, X. J., Rabinowitz, J. D., and Rabitz, H. A. (2010b). A network flow model for biclustering via optimal re-ordering of data matrices. J. Global. Optim. 47, 343-354.) to identify biclusters corresponding to subsets of chemicals that have similar responses over distinct subsets of the in vitro assays. The biclustering of the in vitro assays is shown to result in significant clustering based on assay target (e.g., cytochrome P450 [CYP] and nuclear receptors) and type (e.g., downregulated BioMAP and biochemical high-throughput screening protein kinase activity assays). An optimal method based on mixed-integer linear optimization for reordering sparse data matrices (DiMaggio, P. A., McAllister, S. R., Floudas, C. A., Feng, X. J., Li, G. Y., Rabinowitz, J. D., and Rabitz, H. A. (2010a). Enhancing molecular discovery using descriptor-free rearrangement clustering techniques for sparse data sets. AIChE J. 56, 405-418.; McAllister, S. R., DiMaggio, P. A., and Floudas, C. A. (2009). Mathematical modeling and efficient optimization methods for the distance-dependent rearrangement clustering problem. J. Global. Optim. 45, 111-129) is then applied to the in vivo data set (21.7% sparse) in order to cluster end points that have similar lowest effect level (LEL) values, where it is observed that the end points are effectively clustered according to (1) animal species (i.e., the

  7. Comparison of four prognostic models and a new Logistic regression model to predict short-term prognosis of acute-on-chronic hepatitis B liver failure

    Institute of Scientific and Technical Information of China (English)

    HE Wei-ping; HU Jin-hua; ZHAO Jun; TONG Jing-jing; DING Jin-biao; LIN Fang; WANG Hui-fen

    2012-01-01

    Background: Acute-on-chronic hepatitis B liver failure (ACLF-HBV) is a clinically severe disease associated with major life-threatening complications including hepatic encephalopathy and hepatorenal syndrome. The aim of this study was to evaluate the short-term prognostic predictability of the model for end-stage liver disease (MELD), MELD-based indices, and their dynamic changes in patients with ACLF-HBV, and to establish a new model for predicting the prognosis of ACLF-HBV. Methods: A total of 172 patients with ACLF-HBV who stayed in the hospital for more than 2 weeks were retrospectively recruited. The predictive accuracy of MELD, MELD-based indices, and their dynamic change (Δ) were compared using the area under the receiver operating characteristic curve method. The associations between mortality and patient characteristics were studied by univariate and multivariate analyses. Results: The 3-month mortality was 43.6%. The largest concordance (c) statistic predicting 3-month mortality was the MELD score at the end of 2 weeks of admission (0.8), followed by the MELD:sodium ratio (MESO) (0.796) and integrated MELD (iMELD) (0.758) scores, ΔMELD (0.752), ΔMESO (0.729), and MELD plus sodium (MELD-Na) (0.728) scores. In multivariate logistic regression analysis, the independent factors predicting prognosis were hepatic encephalopathy (OR=-3.466), serum creatinine, international normalized ratio (INR), and total bilirubin at the end of 2 weeks of admission (OR=10.302, 6.063, 5.208, respectively), and cholinesterase on admission (OR=0.255). This regression model had a greater prognostic value (c=0.85, 95% CI 0.791-0.909) compared to the MELD score at the end of 2 weeks of admission (Z=4.9851, P=-0.0256). Conclusions: MELD score at the end of 2 weeks of admission is a useful predictor for 3-month mortality in ACLF-HBV patients. Hepatic encephalopathy, serum creatinine, international normalized ratio, and total bilirubin at the end of 2 weeks of admission and cholinesterase on admission are
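
    The concordance (c) statistic used in this abstract to rank the scores equals the area under the ROC curve: the probability that a randomly chosen patient with the event received a higher score than a randomly chosen patient without it. A minimal Python sketch of that pairwise definition follows; the function name and the toy scores are illustrative assumptions, not data from the study.

```python
def c_statistic(scores_events, scores_nonevents):
    """Concordance (c) statistic: share of event/non-event pairs in which
    the event case has the higher score; tied pairs count as half."""
    concordant = 0.0
    for s_event in scores_events:
        for s_non in scores_nonevents:
            if s_event > s_non:
                concordant += 1.0
            elif s_event == s_non:
                concordant += 0.5
    return concordant / (len(scores_events) * len(scores_nonevents))


# Hypothetical risk scores (not taken from the study).
died = [0.9, 0.8, 0.4]
survived = [0.7, 0.3, 0.2]
print(round(c_statistic(died, survived), 3))
```

    A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.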

  8. Logistic Regression Analysis and Nursing Interventions for High-risk Factors for Pressure Sores in Patients in a Surgical Intensive Care Unit

    Institute of Scientific and Technical Information of China (English)

    Xin-Ran Wang; Bin-Ru Han

    2015-01-01

    Objective: To investigate the risk factors related to the development of pressure sores in critically ill surgical patients and to establish a basis for the formulation of effective precautions. Methods: A questionnaire regarding the factors for pressure sores in critically ill surgical patients was created using a case control study with reference to the pertinent literature. After being examined and validated by experts, the questionnaire was used to collect data about critically ill surgical patients in a grade A tertiary hospital. Among the 47 patients enrolled into the study, the 14 who developed nosocomial pressure sores were allocated to the pressure sore group, and the remaining 33 patients who met the inclusion criteria and did not exhibit pressure sores were allocated to the control group. Univariate and multivariate logistic regression analyses were employed to examine the differences in 22 indicators between the two groups in an attempt to identify the risk factors for pressure sores. Results: According to the univariate analyses, the maximum value of lactic acid in the arterial blood, the number of days of norepinephrine use, the number of days of mechanical ventilation, the number of days of blood purification, and the number of days of bowel incontinence were statistically greater in the pressure sore group than in the control group (P ... Conclusions: The best method for preventing and controlling pressure sores in critically ill surgical patients is to strongly emphasize the duration of the critical status and to give special attention to patients in a continuous state of shock. The adoption of measures specific to high-risk patient groups and risk factors, including the active control of primary diseases and the application of decompression measures during the treatment of the patients, is helpful for improving the quality of care in the prevention and control of pressure sores in critically ill patients.

  9. Dynamic Simulation of Urban Expansion Based on Cellular Automata and Logistic Regression Model: Case Study of the Hyrcanian Region of Iran

    Directory of Open Access Journals (Sweden)

    Meisam Jafari

    2016-08-01

    The hypothesis addressed in this article is to determine the extent of selected land use categories with respect to their effect on urban expansion. A model that combines a logistic regression model, Markov chain, together with cellular automata based modeling, is introduced here to simulate future urban growth and development in the Gilan Province, Iran. The model is calibrated based on data beginning in 1989 and ending in 2013 and is applied in making predictions for the years 2025 and 2037, across 12 urban development criteria. The relative operating characteristic (ROC) is validated with a very high rate of urban development. The analyzed results indicate that the area of urban land has increased by more than 1.7%, that is, from 36,012.5 ha in 1989 to 59,754.8 ha in 2013, and the area of the Caspian Hyrcanian forestland has reduced by 31,628 ha. The simulation results, with respect to prediction, indicate an alarming increase in the rate of urban development in the province by 2025 and 2037, that is, 0.82% and 1.3%, respectively. The development pattern is expected to be uneven and scattered, without following any particular direction. The development will occur close to the existing or newly-formed urban infrastructure and around major roads and commercial areas. If not controlled, this development trend will lead to the loss of 25,101 ha of Hyrcanian forest and, if continued, 21,774 ha of barren and open lands are expected to be destroyed by the year 2037. These results demonstrate the capacity of the integrated model in establishing comparisons with urban plans and their utility to explain both the volume and constraints of urban growth. It is beneficial to apply the integrated approach in urban dynamic assessment through land use modeling with respect to spatio-temporal representation in distinct urban development formats.

  10. An Objective Screening Method for Major Depressive Disorder Using Logistic Regression Analysis of Heart Rate Variability Data Obtained in a Mental Task Paradigm

    Directory of Open Access Journals (Sweden)

    Guanghao Sun

    2016-11-01

    Background and Objectives: Heart rate variability (HRV) has been intensively studied as a promising biological marker of major depressive disorder (MDD). Our previous study confirmed that autonomic activity and reactivity in depression revealed by HRV during rest and mental task (MT) conditions can be used as diagnostic measures and in clinical evaluation. In this study, logistic regression analysis (LRA) was utilized for the classification and prediction of MDD based on HRV data obtained in an MT paradigm. Methods: Power spectral analysis of HRV on R-R intervals before, during, and after an MT (random number generation) was performed in 44 drug-naïve patients with MDD and 47 healthy control subjects at the Department of Psychiatry in Shizuoka Saiseikai General Hospital. Logit scores of LRA determined by HRV indices and heart rates discriminated patients with MDD from healthy subjects. The high frequency (HF) component of HRV and the ratio of the low frequency (LF) component to the HF component (LF/HF) correspond to parasympathetic and sympathovagal balance, respectively. Results: The LRA achieved a sensitivity and specificity of 80.0% and 79.0%, respectively, at an optimum cutoff logit score (0.28). Misclassifications occurred only when the logit score was close to the cutoff score. Logit scores also correlated significantly with subjective self-rating depression scale scores (p < 0.05). Conclusion: HRV indices recorded during a mental task may be an objective tool for screening patients with MDD in psychiatric practice. The proposed method appears promising for not only objective and rapid MDD screening, but also evaluation of its severity.
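
    Sensitivity and specificity at a cutoff logit score, as reported in this abstract, can be computed by thresholding each subject's score and counting agreements with the diagnosis. The Python sketch below assumes that scores at or above the cutoff are classified as MDD; that direction, the function name, and the toy scores are assumptions for illustration and are not taken from the study.

```python
def sensitivity_specificity(logit_scores, is_patient, cutoff):
    """Classify a subject as a patient when the logit score is >= cutoff,
    then return (sensitivity, specificity) against the true labels."""
    tp = fn = tn = fp = 0
    for score, patient in zip(logit_scores, is_patient):
        predicted_patient = score >= cutoff
        if patient and predicted_patient:
            tp += 1          # patient correctly flagged
        elif patient:
            fn += 1          # patient missed
        elif predicted_patient:
            fp += 1          # healthy subject wrongly flagged
        else:
            tn += 1          # healthy subject correctly cleared
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical logit scores; only the cutoff value 0.28 comes from the abstract.
scores = [1.2, 0.5, 0.1, -0.4, 0.3, -1.0]
labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.28)
print(sens, spec)
```

    In this toy data the two errors (scores 0.1 and 0.3) sit next to the cutoff, mirroring the observation that misclassifications occurred only near the cutoff score.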

  11. Three-Level Mixed-Effects Logistic Regression Analysis Reveals Complex Epidemiology of Swine Rotaviruses in Diagnostic Samples from North America.

    Directory of Open Access Journals (Sweden)

    Nitipong Homwong

    Rotaviruses (RV) are important causes of diarrhea in animals, especially in domestic animals. Of the 9 RV species, rotavirus A, B, and C (RVA, RVB, and RVC, respectively) had been established as important causes of diarrhea in pigs. The Minnesota Veterinary Diagnostic Laboratory receives swine stool samples from North America to determine the etiologic agents of disease. Between November 2009 and October 2011, 7,508 samples from pigs with diarrhea were submitted to determine if enteric pathogens, including RV, were present in the samples. All samples were tested for RVA, RVB, and RVC by real time RT-PCR. The majority of the samples (82%) were positive for RVA, RVB, and/or RVC. To better understand the risk factors associated with RV infections in swine diagnostic samples, three-level mixed-effects logistic regression models (3L-MLMs) were used to estimate associations among RV species, age, and geographical variability within the major swine production regions in North America. The conditional odds ratios (cORs) for RVA and RVB detection were lower for 1-3 day old pigs when compared to any other age group. However, the cOR of RVC detection in 1-3 day old pigs was significantly higher (p ... 55 day old age groups. Furthermore, pigs in the 21-55 day old age group had statistically higher cORs of RV co-detection compared to 1-3 day old pigs (p < 0.001). The 3L-MLMs indicated that RV status was more similar within states than among states or within each region. Our results indicated that 3L-MLMs are a powerful and adaptable tool to handle and analyze large-hierarchical datasets. In addition, our results indicated that, overall, swine RV epidemiology is complex, and RV species are associated with different age groups and vary by regions in North America.

  12. Research on the Prediction of Credit Risk Based on Logistic Regression Method

    Institute of Scientific and Technical Information of China (English)

    肖超峰; 郭浩明

    2013-01-01

    The profits of commercial banks currently come mainly from the credit business. Credit risk has therefore long been one of the risks of greatest concern to commercial banks, and how to assess it is a central question in commercial banking. Compared with traditional credit risk assessment methods, which require more manual effort, this paper proposes a method that extracts the transaction behavior of interest directly from the bank's customer transaction data and then builds a logistic regression model on it to forecast the customer's probability of default (PD), which reflects the customer's ability to repay loans.
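
    As a rough illustration of forecasting a probability of default with logistic regression, the Python sketch below fits an intercept and weights by batch gradient descent on the log-loss and then scores two customers. The two features, the toy data, and the function names are hypothetical; the abstract does not state which transaction variables or fitting procedure the authors used.

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit intercept + weights by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)  # weights[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            p = sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))
            err = p - y  # derivative of the log-loss w.r.t. the linear score
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


def predict_pd(weights, x):
    """Predicted probability of default for one customer."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


# Hypothetical features per customer: [overdraft ratio, share of late payments].
rows = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.0], [0.4, 0.3],
        [0.6, 0.2], [0.7, 0.6], [0.8, 0.5], [0.9, 0.9]]
labels = [0, 0, 0, 0, 0, 1, 1, 1]  # 1 = defaulted
weights = fit_logistic(rows, labels)
print(predict_pd(weights, [0.9, 0.8]), predict_pd(weights, [0.1, 0.1]))
```

    In practice a statistics package would also report standard errors and fit diagnostics; the sketch only shows how a fitted model turns customer features into a PD.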

  13. Exploring improvements in patient logistics in Dutch hospitals with a survey

    NARCIS (Netherlands)

    Lent, van W.A.M.; Sanders, E.M.; Harten, van W.H.

    2012-01-01

    Background Research showed that promising approaches such as benchmarking, operations research, lean management and six sigma, could be adopted to improve patient logistics in healthcare. To our knowledge, little research has been conducted to obtain an overview on the use, combination and effects o

  14. Logistic Regression Analysis of Big Five Personality Predicting Blogging among Undergraduates

    Institute of Scientific and Technical Information of China (English)

    王明辉; 李宗波

    2011-01-01

    Objective: To investigate the relationship between the personality traits of undergraduates and blogging. Methods: A total of 401 Chinese undergraduates were surveyed using a self-developed Blogging Inventory and the NEO-FFI. Results: Logistic regression analysis showed that openness to experience (β=0.414, Wald=6.627, P<0.05) and conscientiousness (β=0.509, Wald=5.226, P<0.05) significantly and positively predicted blogging among undergraduates. Conclusion: Personality traits can effectively predict whether undergraduates blog.
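
    The reported coefficients and Wald statistics can be cross-checked against the stated significance level. The Python sketch below assumes the Wald value is the one-degree-of-freedom chi-square form, Wald = (β/SE)², which is how many statistics packages report it; the abstract does not say so explicitly. Under that assumption the standard error, the odds ratio exp(β), and the p-value follow directly.

```python
import math


def wald_summary(beta, wald):
    """Recover the standard error, odds ratio and p-value from a coefficient
    and its Wald chi-square statistic (1 df), assuming wald = (beta / se)**2."""
    se = abs(beta) / math.sqrt(wald)
    odds_ratio = math.exp(beta)
    # Upper tail of chi-square with 1 df: P(|Z| > sqrt(wald)) for standard normal Z.
    p_value = math.erfc(math.sqrt(wald / 2.0))
    return se, odds_ratio, p_value


# Coefficients and Wald values as reported in the abstract.
for name, beta, wald in [("openness", 0.414, 6.627),
                         ("conscientiousness", 0.509, 5.226)]:
    se, odds_ratio, p = wald_summary(beta, wald)
    print(f"{name}: SE={se:.3f}, OR={odds_ratio:.2f}, p={p:.3f}")
```

    Both p-values computed this way fall below 0.05, in line with the significance reported for the two traits.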

  15. Logistic regression analysis on the multi-factors of primary dysmenorrhea of female college students

    Institute of Scientific and Technical Information of China (English)

    陈丽娟; 闫妍

    2014-01-01

    Objective: To investigate the factors related to primary dysmenorrhea in female college students. Methods: Using cluster sampling with the class as the sampling unit, 1574 female college students were surveyed with a self-made questionnaire, and the factors influencing dysmenorrhea were analyzed statistically. Results: Among the 1574 subjects, the incidence of primary dysmenorrhea was 65.0% (1023/1574), including 757 mild cases (74.0%), 192 moderate cases (18.8%), and 74 severe cases (7.2%). According to logistic regression analysis, mother with history of dysmenorrhea [OR=1.352, 95% CI (1.087~1.569), P ... 25 [OR=0.695, 95% CI (0.554~0.951), P ... 8 h [OR=0.331, 95% CI (0.225~0.452), P<0.05] were protective factors for primary dysmenorrhea. Conclusion: The incidence of primary dysmenorrhea among female college students is high. An adverse psychosocial environment is the main risk factor for primary dysmenorrhea in female college students, and avoiding low body weight and getting sufficient sleep can help prevent it.

  16. Logistic regression analysis on risk factors of 258 children with asthma

    Institute of Scientific and Technical Information of China (English)

    李晶; 侯丽影; 周雅燕

    2012-01-01

    Objective: To explore the risk factors related to asthma and provide a reference for the prevention and treatment of asthma in children. Methods: A total of 258 children with asthma were selected as the observation group, and 240 people without a history of asthma were randomly selected from the outpatient department as the control group; unconditional logistic regression was used for multivariate analysis. Results: Univariate analysis showed statistically significant differences between the two groups in upper respiratory tract infection, exercise, weather changes, emotional changes, diet, rhinitis, personal allergy history, family allergy history, and family asthma history (P<0.05). Multivariate analysis showed that upper respiratory tract infection, exercise, diet, rhinitis, personal allergy history, family allergy history, and family asthma history were independent risk factors for asthma in children (P<0.05). Conclusion: Upper respiratory tract infection, exercise, diet, rhinitis, personal allergy history, family allergy history, and family asthma history are independent risk factors for asthma in children; appropriate intervention on these risk factors may reduce the incidence of asthma.

  17. Logistic Regression Analysis of Related Factors on Children With Tic Disorders

    Institute of Scientific and Technical Information of China (English)

    黄晓玲; 刘长云

    2011-01-01

    Objective: To investigate the factors related to tic disorders (TD) in children and provide clinical evidence for therapy. Methods: This was a prospective observational study with institutional ethics approval and written informed consent from the children's guardians. From January to December 2009, a total of 122 children who presented with tics at the child health clinic of Zibo Maternal and Child Health Care Hospital and were diagnosed with TD were selected as the study group, while 106 children without a history of TD seen at the same hospital during the same period were selected as the control group. There were no significant differences in gender ratio, age, etc. between the two groups (P>0.05). All the children were investigated with the "Related Factors on Children With Tic Disorders Logistic Scale" designed by Weifang Medical College. The risk factors of TD were analyzed by binary logistic regression. Results: Among the 15 factors examined individually, harsh family discipline, angry or sensitive personality, partial or addictive eating habits, family history of TD, history of recurrent respiratory tract infections, and attention deficit hyperactivity disorder (ADHD) differed significantly between the two groups (P ... 0.05). In the binary logistic regression analysis of the 15 factors, harsh family discipline, angry or sensitive personality, partial or addictive eating habits, family history of TD, history of recurrent respiratory tract infections, and ADHD were closely related to TD incidence (P ... 0.05). Conclusion: To treat and prevent TD, factors such as the family environment, a balanced diet, attention to the child's character, and strengthening the child's immune resistance should be taken into account.

  18. Logistic regression analysis of risk factors of oral lichen planus

    Institute of Scientific and Technical Information of China (English)

    郝玉娥

    2014-01-01

    Objective: To investigate and analyze the risk factors that may lead to oral lichen planus (OLP) and to provide a theoretical basis for its prevention and treatment. Methods: A total of 130 eligible patients with OLP were compared with 130 people without oral mucosal disease who had recently undergone health examinations at our hospital. The survey data were analyzed by binary multivariate logistic regression to screen for risk factors associated with the onset of OLP. Results: Logistic regression analysis showed that psychological factors, gastritis, local physical and chemical irritation, and autonomic nervous disorders were significantly correlated with the onset of OLP. Conclusion: The onset and progression of OLP are related to multiple factors; clinically, treatment directed at the causes should be carried out effectively to promote the prevention and prognosis of OLP.

  19. Regression analysis by example

    National Research Council Canada - National Science Library

    Chatterjee, Samprit; Hadi, Ali S

    2012-01-01

    .... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...

  20. The relationship between vehicle routing and scheduling and green logistics - a literature survey

    OpenAIRE

    Sbihi, A.; Eglese, R W

    2007-01-01

    The basic Vehicle Routing and Scheduling Problem (VRSP) is described followed by an outline of solution approaches. Different variations of the basic VRSP are examined that involve the consideration of additional constraints or other changes in the structure of the appropriate model. An introduction is provided to Green Logistics issues that are relevant to vehicle routing and scheduling including discussion of the environmental objectives that should be considered. Particular consideration i...

  1. The Relationship between Vehicle Routing & Scheduling and Green Logistics - A Literature Survey

    OpenAIRE

    Sbihi, Abdelkader; W. Eglese, Richard

    2007-01-01

    The basic Vehicle Routing and Scheduling Problem (VRSP) is described followed by an outline of solution approaches. Different variations of the basic VRSP are examined that involve the consideration of additional constraints or other changes in the structure of the appropriate model. An introduction is provided to Green Logistics issues that are relevant to vehicle routing and scheduling including discussion of the environmental objectives that should be considered. Particular consideration i...

  2. Electric Vehicles in Logistics and Transportation: A Survey on Emerging Environmental, Strategic, and Operational Challenges

    OpenAIRE

    Angel Alejandro Juan; Carlos Alberto Mendez; Javier Faulin; Jesica de Armas; Scott Erwin Grasman

    2016-01-01

    Current logistics and transportation (L&T) systems include heterogeneous fleets consisting of common internal combustion engine vehicles as well as other types of vehicles using “green” technologies, e.g., plug-in hybrid electric vehicles and electric vehicles (EVs). However, the incorporation of EVs in L&T activities also raises some additional challenges from the strategic, planning, and operational perspectives. For instance, smart cities are required to provide recharge stations fo...

  3. A Remote Sensing Based Approach for the Assessment of Debris Flow Hazards Using Artificial Neural Network and Binary Logistic Regression Modeling

    Science.gov (United States)

    El Kadiri, R.; Sultan, M.; Elbayoumi, T.; Sefry, S.

    2013-12-01

    Efforts to map the distribution of debris flows, to assess the factors controlling their development, and to identify the areas prone to their development are often hampered by the absence or paucity of appropriate monitoring systems and historical databases and the inaccessibility of these areas in many parts of the world. We developed methodologies that heavily rely on readily available observations extracted from remote sensing datasets and successfully applied these techniques over the Jazan province, in the Red Sea hills of Saudi Arabia. We first identified debris flows (10,334 locations) from high spatial resolution satellite datasets (e.g., GeoEye, Orbview), and verified a subset of these occurrences in the field. We then constructed a GIS to host the identified debris flow locations together with co-registered relevant data (e.g., lithology, elevation) and derived products (e.g., slope, normalized difference vegetation index). Spatial analysis of the data sets in the GIS indicated various degrees of correspondence between the distribution of debris flows and various variables (e.g., stream power index, topographic position index, normalized difference vegetation index, distance to stream, flow accumulation, slope and soil weathering index, aspect, elevation), suggesting a causal effect. For example, debris flows were found in areas of high slope, low distance to low stream orders and low vegetation index. To evaluate the extent to which these factors control landslide distribution, we constructed and applied: (1) a stepwise input selection by testing all input combinations to make the final model more compact and effective, (2) a statistic-based binary logistic regression (BLR) model, and (3) a mathematical-based artificial neural network (ANN) model. Only 80% (8267 locations) of the data was used for the construction of each of the models and the remaining samples (2067 locations) were used for the accuracy assessment purposes. Results
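
    The 80/20 division of the 10,334 mapped locations into model-construction and accuracy-assessment sets can be reproduced with a seeded random shuffle. The Python sketch below is one simple way to do it; the abstract does not describe how the authors drew their split, so the shuffle, the seed, and the function name are assumptions.

```python
import random


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle the items reproducibly and cut them into a training part
    and a held-out part used only for accuracy assessment."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 10,334 debris flow locations.
location_ids = range(10334)
train_ids, holdout_ids = split_train_test(location_ids)
print(len(train_ids), len(holdout_ids))
```

    Keeping the held-out locations out of model fitting is what makes the later accuracy figures an estimate of performance on unseen terrain.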

  4. Logistic regression analysis of related risk factors in allergic rhinitis

    Institute of Scientific and Technical Information of China (English)

    冯婷; 黄世铮; 鲁航

    2015-01-01

    Objective: To investigate the factors related to allergic rhinitis. Methods: A total of 200 patients with allergic rhinitis and 200 healthy people undergoing physical examination were selected by systematic sampling. For each participant, physical exercise, dietary habits, family history of allergic rhinitis, smoking history, dust in the working environment, nutritional status, airing of bedding, pollen allergy, daily sleep, dust mite allergy, pet-keeping history, food allergy, and window ventilation were recorded in detail, and logistic regression was used to analyze the factors related to allergic rhinitis. Results: Physical exercise, nutritional status, daily sleep, and dietary habits were not associated with allergic rhinitis; airing bedding, window ventilation, and use of air conditioning were protective factors; family history of allergic rhinitis, smoking history, pollen allergy, dust mite allergy, pet-keeping history, and food allergy were risk factors. Conclusion: Family history of allergic rhinitis, smoking history, pollen allergy, dust mite allergy, pet-keeping history, and food allergy are risk factors for allergic rhinitis.

  5. Logistic Regression Analysis of the Risk Factors of Ectopic Pregnancy

    Institute of Scientific and Technical Information of China (English)

    陈小华

    2014-01-01

    Objective: To study the risk factors for ectopic pregnancy in order to provide scientific evidence for improving prevention and treatment. Methods: A total of 140 patients with ectopic pregnancy hospitalized in the Department of Obstetrics and Gynecology of our hospital from March 2011 to October 2013 were selected as subjects, and logistic regression was used to analyze the factors related to ectopic pregnancy. Results: The significant risk factors for ectopic pregnancy were history of abortion (odds ratio [OR]=3.79, 95% confidence interval [CI]: 2.23-6.42), history of ectopic pregnancy (OR=6.98, 95% CI: 2.64-15.53), history of pelvic infection (OR=4.21, 95% CI: 2.43-7.15), and intrauterine device use (OR=2.19, 95% CI: 1.21-3.98). Conclusion: Strengthening reproductive health education, reducing unwanted pregnancy, and preventing pelvic infection can reduce the risk of ectopic pregnancy.
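
    Odds ratios and their confidence intervals in a logistic regression come from exponentiating the coefficient β and β ± z·SE. The Python sketch below shows that relationship and, going the other way, recovers β and SE from a reported interval. It assumes a Wald-type 95% interval (z = 1.96), which the abstract does not state; the function names are illustrative.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logit coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_ci(lower, upper, z=1.96):
    """Recover the coefficient and its standard error from CI limits,
    using the fact that a Wald interval is symmetric on the log scale."""
    beta = (math.log(lower) + math.log(upper)) / 2.0
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported interval for history of abortion: 95% CI 2.23-6.42 (OR=3.79).
beta, se = beta_se_from_ci(2.23, 6.42)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(f"beta={beta:.3f}, SE={se:.3f}, OR={odds_ratio:.2f} ({lower:.2f}-{upper:.2f})")
```

    The odds ratio recovered from the interval midpoint on the log scale is close to the reported 3.79, which is what one expects if the published interval is of the Wald type.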

  6. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    Science.gov (United States)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In the last decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often not much importance is given to multicollinearity assessment, whose effect is that the coefficient estimates are unstable, with opposite sign, and therefore difficult to interpret. Therefore, it should be evaluated every time in order to make a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, the multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding the cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture from Eastern Europe (that led to the development of the great complex Cucuteni-Trypillia). Therefore, identifying the areas susceptible to
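
    The variance inflation factor of predictor j is VIF_j = 1 / (1 - R_j²), where R_j² is obtained by regressing predictor j on all the other predictors. The Python sketch below computes it with ordinary least squares using only the standard library; the predictor names and values are hypothetical, and the stepwise selection procedure described in the abstract is not reproduced here.

```python
def _solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (a must be non-singular)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(y, xs):
    """R^2 of a least-squares fit of y on the columns in xs, with intercept."""
    n = len(y)
    cols = [[1.0] * n] + xs
    xtx = [[sum(ci[t] * cj[t] for t in range(n)) for cj in cols] for ci in cols]
    xty = [sum(c[t] * y[t] for t in range(n)) for c in cols]
    beta = _solve(xtx, xty)
    fitted = [sum(b * c[t] for b, c in zip(beta, cols)) for t in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))
    ss_tot = sum((yt - mean_y) ** 2 for yt in y)
    return 1.0 - ss_res / ss_tot


def vif(columns):
    """Variance inflation factor of each predictor column."""
    result = []
    for j, col in enumerate(columns):
        others = [c for i, c in enumerate(columns) if i != j]
        result.append(1.0 / (1.0 - r_squared(col, others)))
    return result


# Hypothetical terrain predictors; the second is nearly a copy of the first.
slope = [5.0, 12.0, 18.0, 25.0, 31.0, 40.0]
steepness_index = [5.5, 11.0, 19.0, 24.0, 32.0, 39.0]
curvature = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0]
print([round(v, 1) for v in vif([slope, steepness_index, curvature])])
```

    A VIF near 1 indicates a predictor that is almost uncorrelated with the others; large values flag the redundancy that makes coefficient estimates unstable. Exactly collinear predictors make the system singular, so the sketch would fail with a division by zero in that case.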

  7. Logistic regression analysis of correlative factors for 413 asthma children

    Institute of Scientific and Technical Information of China (English)

    陈建平; 赵婉莹; 何念海; 周平; 王刚; 汪万军

    2011-01-01

    Objective: To study the factors correlated with childhood asthma in Chongqing, China, and provide a reference for its prevention and treatment in the region. Methods: This study enrolled 413 children with asthma from the pediatric asthma outpatient department and pediatric ward of Southwest Hospital, Third Military Medical University, from June 2007 to December 2010. The control group comprised 420 children without asthma, selected by random numbers from outpatients and children who underwent physical examination at the hospital during the same period. Questionnaires, examinations, asthma history review, and follow-up of treatment were used to identify factors associated with asthma onset. Results: Ten factors were investigated and analyzed by unconditional logistic regression. Two were excluded and eight retained: six risk factors and two protective factors. High-risk factors included family heredity, atopic constitution, positive response to skin prick test (SPT), and high total IgE level. Conclusion: Genetic factors are closely correlated with asthma, and atopic constitution is a risk factor for childhood asthma. Breastfeeding and hepatitis A virus (HAV) infection are protective factors.

  8. Logistic regression analysis on factors affecting incarcerated pediatric inguinal hernia

    Institute of Scientific and Technical Information of China (English)

    苗春林; 王誉都

    2013-01-01

    Objective: To analyze the factors that may affect incarceration of pediatric inguinal hernia and to identify the independent risk factors. Methods: Clinical data of 1368 cases of pediatric inguinal hernia treated laparoscopically from January 2006 to April 2012 were analyzed retrospectively. Logistic regression was used to analyze the factors affecting incarceration. Results: Univariate analysis showed that six factors were related to incarcerated pediatric inguinal hernia: age, premature birth, inner diameter of the superficial inguinal ring, inner diameter of the deep inguinal ring, length of the inguinal canal, and type of deep inguinal ring (all P<0.05). Multivariate analysis showed that age, length of the inguinal canal, and inner diameter of the superficial inguinal ring were the independent risk factors (all P<0.05). Conclusions: Patient age, length of the inguinal canal, and inner diameter of the superficial inguinal ring are the independent risk factors affecting incarcerated pediatric inguinal hernia.

  9. Logistic regression analysis on risk factors of fetal growth retardation

    Institute of Scientific and Technical Information of China (English)

    肖云山

    2014-01-01

    Objective: To study factors associated with fetal growth retardation (FGR) and to evaluate the association between these variables and pregnancy outcomes. Methods: A case-control study was conducted at Xiamen Maternity and Child Health Care Hospital using delivery records from January 1, 2011 to December 31, 2011. The chi-square test, independent-samples t test, and multivariable unconditional logistic regression were used to evaluate the association between variables and pregnancy outcomes. Results: The factors associated with FGR were maternal age (P=0.047, OR=0.949, CI 0.901-0.999), anemia (P=0.008, OR=1.354, CI 0.164-0.766), hypamnios (P=0.034, OR=2.530, CI 1.074-5.964), and placental abnormality (P=0.015, OR=2.337, CI 1.180-4.626). Conclusion: Anemia, hypamnios, and placental abnormality are the main factors associated with FGR. Active intervention targeting these factors can help reduce the occurrence of FGR.
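    A reader can apply a quick consistency check to odds ratios reported with confidence intervals, as in the record above. The sketch below is illustrative only: it assumes the intervals are Wald-type 95% intervals (the abstract does not say), in which case the point estimate should lie inside the interval and close to the geometric mean of its bounds. The function names are hypothetical; the numbers are copied from the abstract as printed.

```python
import math

# (OR, CI lower, CI upper) exactly as printed in the abstract.
reported = {
    "maternal age": (0.949, 0.901, 0.999),
    "anemia": (1.354, 0.164, 0.766),
    "hypamnios": (2.530, 1.074, 5.964),
    "placental abnormality": (2.337, 1.180, 4.626),
}


def ci_contains(or_value: float, lower: float, upper: float) -> bool:
    """True when the point estimate lies inside its own interval."""
    return lower <= or_value <= upper


def geometric_midpoint(lower: float, upper: float) -> float:
    """Centre of an interval that is symmetric on the log-odds scale
    (assumption: Wald-type interval)."""
    return math.sqrt(lower * upper)


for name, (or_value, lower, upper) in reported.items():
    status = "ok" if ci_contains(or_value, lower, upper) else "CHECK"
    mid = geometric_midpoint(lower, upper)
    print(f"{name:22s} OR={or_value:.3f}  CI midpoint={mid:.3f}  {status}")
```

    An entry flagged CHECK points to a possible transcription problem in the printed figures; it does not say which of the printed numbers is wrong.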

  10. Risk factors related to cancer by Logistic regressive analysis

    Institute of Scientific and Technical Information of China (English)

    邱惠; 张艳; 雷海科; 冯长艳; 何美; 周琦

    2012-01-01

    Objective: To investigate the prevalence of cancer, assess its risk factors, and provide a scientific basis for prevention strategies. Methods: Thirty patients with confirmed cancer formed the case group, and 430 people were randomly sampled from the records of 4,290 people without cancer as the control group. Using SPSS 17.0 on the data of these 460 people, univariate analysis (chi-square and t tests) was performed first, followed by multivariable unconditional logistic regression to screen the main risk factors. Results: The prevalence of cancer was 0.69%. Univariate analysis showed that cancer was related to age, average annual income, BMI, educational level, family history of cancer, fruit and vegetable intake, frequency of oily and fatty food, smoking, drinking, and activity time. Multivariate analysis showed that age, BMI, family history of cancer, fruit and vegetable intake, oily and fatty food, smoking, and activity time were independent risk factors for cancer. Conclusion: Cancer is related to lifestyle, and changing unhealthy lifestyle habits can reduce the risk of cancer to some extent.

  11. The Logistic Regression Analysis on Gender Stereotype of Graduates Employment

    Institute of Scientific and Technical Information of China (English)

    王炳成; 王俐; 王森

    2016-01-01

    The study surveyed graduates of three universities in Qingdao and analyzed 264 valid questionnaires. Factor analysis yielded five facets of stereotype: female occupational stereotype, male occupational stereotype, female characteristics stereotype, male characteristics stereotype, and behavior stereotype. Logistic regression was then used to examine the relationship between these stereotypes and graduate employment. Results show that (1) female occupational stereotype has a significantly negative effect on graduate employment (p<0.05, β=-0.229): for each one-point increase in the measured value, the odds of signing versus not signing a graduate employment agreement decrease by a reported 7.96% (Exp(B)=0.796). (2) Male characteristics stereotype has a significantly positive effect on graduate employment (p<0.01, β=0.429): for each one-point increase in the measured value, the odds of signing versus not signing a graduate employment agreement increase by a reported 15.36% (Exp(B)=1.536).
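    The Exp(B) values in this record are the exponentials of the coefficients β. A minimal sketch of that conversion follows, using the two coefficients printed in the abstract; the function names are illustrative and not from the study. Under the standard interpretation, Exp(B) multiplies the odds for a one-unit increase in the predictor, so the implied percent change in odds is (Exp(B) - 1) × 100, which a reader can compare with the percentages the abstract reports.

```python
import math


def odds_ratio(beta: float) -> float:
    """Exp(B): multiplicative change in the odds per one-unit increase."""
    return math.exp(beta)


def percent_change_in_odds(beta: float) -> float:
    """Percent change in the odds implied by a logistic coefficient."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients as printed in the abstract.
for label, beta in (("female occupational stereotype", -0.229),
                    ("male characteristics stereotype", 0.429)):
    print(f"{label}: Exp(B) = {odds_ratio(beta):.3f}, "
          f"change in odds = {percent_change_in_odds(beta):+.1f}%")
```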

  12. Logistic regression analysis on risk factors of fetal growth restriction

    Institute of Scientific and Technical Information of China (English)

    陈忠; 许建娟

    2012-01-01

    Objective: To explore the risk factors for fetal growth restriction (FGR). Methods: A 1:1 matched case-control study was conducted. The clinical data of 276 neonates with FGR (FGR group) and 276 neonates with normal weight (normal group) born in the hospital in 2011, together with those of their mothers, were analyzed retrospectively. Results: There were statistically significant differences between the FGR group and the normal group in maternal age, educational level, family monthly income, smoking, drinking tea, history of gestational infection, gestational complications, and gestational age of the neonate (P < 0.05). With neonatal weight as the dependent variable and the other factors as independent variables, unconditional logistic regression analysis showed statistically significant effects for maternal age, educational level, family monthly income, smoking, drinking tea, history of gestational infection, gestational complications, and gestational age of the neonate (P < 0.05). Conclusion: High maternal age, low educational level, low family monthly income, smoking, drinking tea, history of gestational infection, gestational complications, and low gestational age of the neonate are high-risk factors for FGR. Active prevention targeting these factors during pregnancy should be undertaken to reduce the occurrence of FGR.

  13. Comparison and validation of Logistic Regression and Analytic Hierarchy Process models of landslide susceptibility in monoclinic regions. A case study in Moldavian Plateau, N-E Romania

    Science.gov (United States)

    Ciprian Margarint, Mihai; Niculita, Mihai

    2014-05-01

    Regions with monoclinic geological structure cover large portions of the earth's surface where similar landform patterns repeat very distinctly; the scarps of cuestas are characterized by similar values of morphometric variables. Landslides are associated with these cuesta scarps, and consequently a very high landslide susceptibility can be reported on their surface. In these regions, landslide susceptibility mapping can be carried out for the entire region or for test areas that have accurate, reliable, and available datasets of multi-temporal inventories and landslide predictors. Because of the similar geomorphology and landslide distribution, we think that if using test areas to extrapolate susceptibility models has any relevance, these areas should be targeted first. This case study tries to establish how far the influence of landslide predictors, obtained for a 90 km2 sample located in the northern part of the Moldavian Plateau (N-E Romania), can be used in other areas of the same physio-geographic region. In a first phase, landslide susceptibility was assessed and validated with a logistic regression (LR) approach, using a multiple landslide inventory. This inventory was created from orthorectified aerial images from 1978 and 2005, with both old and active landslides considered for each period. The modeling strategy was based on a distinct inventory of the depletion areas of all landslides for the 1978 phase, and on 30 covariates extracted from topographic maps and aerial images (from both the 1978 and 2005 periods). The geomorphometric variables were computed from a Digital Elevation Model (DEM) obtained by interpolation from 1:5000 contour data (2.5 m equidistance), at 10x10 m resolution. Distance from the river network, distance from roads, and land use were extracted from topographic maps and aerial images. By applying Akaike Information Criterion (AIC) the covariates with significance under 0.001 level

  14. Logistic regression analysis on risk factors for vascular dementia following cerebral infarction in 403 patients from Chongqing City Hospital and family follow-up studies

    Institute of Scientific and Technical Information of China (English)

    Hong Yang; Jingcheng Li; Huadong Zhou

    2007-01-01

    deficit scoring was carried out with the National Institutes of Health Stroke Scale. (5) The chi-square test was used for categorical variables and the t test for quantitative variables to compare the dementia and non-dementia groups. Dementia-related factors were analyzed with a multiple-factor logistic regression model. MAIN OUTCOME MEASURES: Incidence of dementia and dementia-related risk factors. RESULTS: Altogether 546 patients with stroke were enrolled; 403 participated in the final analysis and 143 dropped out. A total of 342 were followed up in the hospital and 61 at home. At 3 months after cerebral infarction, vascular dementia had occurred in 87 (21.6%) of 403 patients. The main risk factors were age (OR 1.179; 95% CI 1.130-1.230), low education level (OR 1.806; 95% CI 1.024-3.186), daily alcohol drinking (OR 3.447; 95% CI 1.591-7.468), stroke history (OR 2.531; 95% CI 1.419-4.512), atrial fibrillation (OR 3.475; 95% CI 1.712-7.057), dysphonia (OR 5.873; 95% CI 2.620-13.163), and left carotid artery infarction (OR 1.975; 95% CI 1.152-3.388). CONCLUSION: The incidence of vascular dementia is determined by the combined action of multiple risk factors. Dysphonia is the most important influencing factor.

  15. Logistic regression analysis of risk factors of Type 2 diabetes mellitus complicated with cardiovascular disease factors

    Institute of Scientific and Technical Information of China (English)

    刘明哲

    2015-01-01

    Objective: To explore the main risk factors for type 2 diabetes mellitus (T2DM) complicated with cardiovascular disease (CVD). Methods: A total of 128 patients with T2DM and CVD formed the CVD group and 107 patients with T2DM alone formed the control group; logistic regression was used to analyze the risk factors for concurrent CVD. Results: The risk of CVD in patients with a family history of T2DM was 1.535 times that of other patients (OR = 1.535, 95% CI 1.145-2.057, P = 0.036); in patients on a vegetarian diet it was 41.3% of that of other patients (OR = 0.413, 95% CI 0.210-0.815, P = 0.024); and in patients with hypertension it was 2.077 times that of other patients (OR = 2.077, 95% CI 1.301-2.813, P = 0.010). For each 1 mmol/L rise in TG, PBG, LDL-C, and HDL-C, the risk of concurrent CVD was, respectively, 1.192 times (OR = 1.192, 95% CI 1.012-1.372, P = 0.023), 1.125 times (OR = 1.125, 95% CI 1.043-1.218, P = 0.028), 1.712 times (OR = 1.712, 95% CI 1.203-2.231, P = 0.009), and 42.6% (OR = 0.426, 95% CI 0.239-0.776, P = 0.011) of that of other patients. For each 1% increase in HbA1c, the risk was 1.284 times that of other patients (OR = 1.284, 95% CI 1.132-1.413, P = 0.013); for each 1 kg/m2 increase in BMI, it was 1.508 times (OR = 1.508, 95% CI 1.143-1.825, P = 0.026). For each 1 mL/mmHg × 100 increase in C2, the risk was 33.9% of that of other patients (reported as OR = 1.508, 95% CI 1.143-1.825, P = 0.026). Conclusion: Family history of T2DM, hypertension, TG, PBG, LDL-C, HbA1c, and BMI are major risk factors for T2DM with CVD; vegetarian diet, HDL-C, and C2 are protective factors.

  16. Acquiring data for large aquatic resource surveys: the art of compromise among science, logistics, and reality

    Science.gov (United States)

    The US Environmental Protection Agency (EPA) is revising its strategy to obtain the information needed to answer questions pertinent to water-quality management efficiently and rigorously at national scales. One tool of this revised strategy is use of statistically based surveys ...

  18. Unitary Response Regression Models

    Science.gov (United States)

    Lipovetsky, S.

    2007-01-01

    The dependent variable in a regular linear regression is a numerical variable, and in a logistic regression it is a binary or categorical variable. In these models the dependent variable has varying values. However, there are problems yielding an identity output of a constant value which can also be modelled in a linear or logistic regression with…

  19. Measuring the Traveller Satisfaction of Tram Using Logistic Regression: A Case Study of Estram

    Directory of Open Access Journals (Sweden)

    Nuray GİRGİNER

    2008-01-01

    This study investigates passenger satisfaction with the tram, a mass transportation vehicle, using binomial logistic regression analysis, taking Eskisehir's tram system (Estram) as a case. Students make up a dense share of Eskisehir's population, so their satisfaction as passengers is important. The sample therefore consisted of 300 students of Anatolia University and Eskisehir Osmangazi University, both in Eskisehir, selected by simple random sampling. Using a set of subjective and objective variables, the study examines whether Estram satisfies these students. Treating satisfaction as a latent variable at the binomial level, binomial logistic regression was applied to student satisfaction. The analysis showed that all independent variables had a negative effect on student satisfaction with Estram.

  20. Logistic regression analysis on effect factors of simple obesity of preschool children

    Institute of Scientific and Technical Information of China (English)

    沙海滨; 贺圣文; 王燕琳; 王素珍; 周健; 王琳琳

    2011-01-01

    Objective: To explore the factors affecting simple obesity in preschool children and to provide a basis for measures to prevent childhood obesity. Methods: Using random cluster sampling, the parents of preschool children from 3 kindergartens were surveyed by questionnaire and the results were analyzed. Results: Multi-factor conditional logistic regression showed that eating fried food (OR = 1.804), eating snacks (OR = 0.095), picky eating (OR = 1.797), overeating (OR = 9.130), and parents' perception of their child's obesity (OR = 11.050) were the main factors affecting simple obesity in preschool children. Conclusion: Poor dietary behaviors and living habits of preschool children are closely related to simple obesity.

  1. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    Science.gov (United States)

    Foster, Guy M.; Graham, Jennifer L.

    2016-04-06

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete-water quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. 
The models documented in this report are useful for characterizing changes
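    The logistic models described in this record map a continuously monitored variable to the probability that a constituent exceeds a threshold. The sketch below shows only the general form of such a model; the intercept, slope, and sensor readings are hypothetical placeholders chosen for illustration and are not taken from the report.

```python
import math


def exceedance_probability(intercept: float, slope: float, x: float) -> float:
    """Logistic model: P(constituent exceeds threshold | sensor reading x)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * x)))


# Hypothetical coefficients for illustration only; not from the report.
INTERCEPT, SLOPE = -4.0, 0.05

for reading in (20.0, 80.0, 140.0):
    p = exceedance_probability(INTERCEPT, SLOPE, reading)
    print(f"sensor reading {reading:6.1f} -> P(exceedance) = {p:.3f}")
```

    With these placeholder values the probability crosses 0.5 where intercept + slope × x = 0, which is how a fitted model of this kind would be read against an alert level.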

  2. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
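    The two accuracy measures named in this record can be computed from a 2x2 contingency table once the model probabilities are converted to yes/no forecasts at a critical threshold. The sketch below uses the standard definitions: percent correct is the fraction of correct forecasts, and the Hanssen-Kuipers discriminant is the hit rate minus the false-alarm rate. The probabilities and observations are made up for illustration, and the function names are not from the paper.

```python
from typing import Sequence, Tuple


def contingency(probs: Sequence[float], observed: Sequence[bool],
                threshold: float) -> Tuple[int, int, int, int]:
    """Return (hits, false_alarms, misses, correct_negatives)."""
    hits = false_alarms = misses = correct_negatives = 0
    for p, obs in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and obs:
            hits += 1
        elif forecast_yes and not obs:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    return hits, false_alarms, misses, correct_negatives


def percent_correct(h: int, f: int, m: int, c: int) -> float:
    return (h + c) / (h + f + m + c)


def hanssen_kuipers(h: int, f: int, m: int, c: int) -> float:
    # Hit rate minus false-alarm rate; needs at least one event and one non-event.
    return h / (h + m) - f / (f + c)


# Made-up model probabilities and observations, for illustration only.
probs = [0.9, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6, 0.05]
observed = [True, True, True, False, False, False, False, False]
climatology = sum(observed) / len(observed)  # observed event frequency

for threshold in (0.5, climatology):
    table = contingency(probs, observed, threshold)
    print(f"threshold={threshold:.3f}  PC={percent_correct(*table):.3f}  "
          f"HKD={hanssen_kuipers(*table):.3f}")
```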

  3. FINANCIAL EARLY-WARNING MODEL OF LISTED COMPANIES USING T-LOGISTIC REGRESSION

    Institute of Scientific and Technical Information of China (English)

    徐征; 刘遵雄

    2015-01-01

    Classic logistic regression risks overfitting, which can be addressed by the regularization techniques of statistical learning theory. Optimizing a convex loss function ensures that the regularized risk minimization problem converges to the global optimum, but learning algorithms based on convex losses are susceptible to label noise. T-logistic regression was proposed as a remedy: it introduces the T distribution into logistic regression, and its non-convex loss function makes up for this deficiency of convex losses. Because the non-convex loss is difficult to optimize, the logarithm of the objective function is taken and convex multiplicative programming is used to solve for the parameters. A T-logistic financial early-warning model was built and tested empirically on financial data of companies listed in Shanghai and Shenzhen; the results show that the T-logistic regression model has good predictive performance and is tolerant to label noise.

  4. Perceived Service Quality and Determination of the Effect on Service Preference with Logistic Regression Analysis

    Directory of Open Access Journals (Sweden)

    Mehmet AKSARAYLI

    2011-01-01

    In this study, students' perceived service quality of the Dokuz Eylul University (DEU) Buca Girl Dormitory Service is investigated using the SERVQUAL scale, a common measure of service quality. The impacts of the dimensions of perceived service quality (tangibles, reliability, responsiveness, assurance, and empathy) on preference and recommendation are investigated by logistic regression analysis. The study concludes that perceived service quality affects both preference for and recommendation of the dormitory service.

  5. Electric Vehicles in Logistics and Transportation: A Survey on Emerging Environmental, Strategic, and Operational Challenges

    Directory of Open Access Journals (Sweden)

    Angel Alejandro Juan

    2016-01-01

    Current logistics and transportation (L&T) systems include heterogeneous fleets consisting of common internal combustion engine vehicles as well as other types of vehicles using “green” technologies, e.g., plug-in hybrid electric vehicles and electric vehicles (EVs). However, the incorporation of EVs in L&T activities also raises additional challenges from the strategic, planning, and operational perspectives. For instance, smart cities are required to provide recharge stations for electric-based vehicles, meaning that investment decisions need to be made about the number, location, and capacity of these stations. Similarly, the limited driving-range capabilities of EVs, which are restricted by the amount of electricity stored in their batteries, impose non-trivial additional constraints when designing efficient distribution routes. Accordingly, this paper identifies and reviews several open research challenges related to the introduction of EVs in L&T activities, including: (a) environmental-related issues; and (b) strategic, planning and operational issues associated with “standard” EVs and with hydrogen-based EVs. The paper also analyzes how the introduction of EVs in L&T systems generates new variants of the well-known Vehicle Routing Problem, one of the most studied optimization problems in the L&T field, and proposes the use of metaheuristics and simheuristics as the most efficient way to deal with these complex optimization problems.

  6. Logistic support provided to Australian disaster medical assistance teams: results of a national survey of team members

    Directory of Open Access Journals (Sweden)

    Peter Aitken

    2012-02-01

    It is likely that calls for disaster medical assistance teams (DMATs) will continue in response to international disasters. As part of a national survey, the present study was designed to evaluate the Australian DMAT experience and the need for logistic support. Data were collected via an anonymous mailed survey distributed via State and Territory representatives on the Australian Health Protection Committee, who identified team members associated with Australian DMAT deployments from the 2004 Asian Tsunami disaster. The response rate for this survey was 50% (59/118). Most of the personnel had deployed to the South East Asian Tsunami affected areas. The DMAT members had significant clinical and international experience. There was unanimous support for dedicated logistic support, with 80% (47/59) strongly agreeing. Only one respondent (2%) disagreed with teams being self sufficient for a minimum of 72 hours. Most felt that transport around the site was not a problem (59%; 35/59); however, 34% (20/59) felt that transport to the site itself was problematic. Only 37% (22/59) felt that pre-deployment information was accurate. Communication with local health providers and other agencies was felt to be adequate by 53% (31/59) and 47% (28/59) respectively, while only 28% (17/59) felt that documentation methods were easy to use and reliable. Less than half (47%; 28/59) felt that equipment could be moved easily between areas by team members and 37% (22/59) that packaging enabled materials to be found easily. The maximum safe container weight was felt to be between 20 and 40 kg by 58% (34/59). This study emphasises the importance of dedicated logistic support for DMAT and the need for teams to be self sufficient for a minimum period of 72 hours. There is a need for accurate pre-deployment information to guide resource prioritisation with clearly labelled pre-packaging to assist access on site. Container weights should be restricted to between 20 and 40 kg, which would assist

  7. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  8. The Relationship between Logistics Sophistication and Drivers of the Outsourcing of Logistics Activities

    Directory of Open Access Journals (Sweden)

    Peter Wanke

    2008-10-01

    A strong link has been established between operational excellence and the degree of sophistication of logistics organization, a function of factors such as performance monitoring, investment in Information Technology [IT] and the formalization of logistics organization, as proposed in the Bowersox, Daugherty, Dröge, Germain and Rogers (1992) Leading Edge model. At the same time, shippers have been increasingly outsourcing their logistics activities to third party providers. This paper, based on a survey of large Brazilian shippers, addresses a gap in the literature by investigating the relationship between dimensions of logistics organization sophistication and drivers of logistics outsourcing. To this end, the dimensions behind the logistics sophistication construct were first investigated. Results from factor analysis led to the identification of six dimensions of logistics sophistication. By means of multivariate logistic regression analyses it was possible to relate some of these dimensions, such as the formalization of the logistics organization, to certain drivers of the outsourcing of logistics activities of Brazilian shippers, such as cost savings. These results indicate the possibility of segmenting shippers according to characteristics of their logistics organization, which may be particularly useful to logistics service providers.

  9. Application of the binary Logistic regression mode to analyze ultrasonographic features of the solid breast tumors

    Institute of Scientific and Technical Information of China (English)

    曾婕; 罗葆明; 智慧; 杨海云

    2008-01-01

    Objective: To evaluate the application of the binary logistic regression model to the analysis of ultrasonographic indexes of solid breast tumors. Methods: The indexes of two-dimensional gray-scale ultrasonography, two-dimensional color Doppler flow imaging, three-dimensional gray-scale ultrasonography, three-dimensional color Doppler flow imaging, and ultrasonic elastography were evaluated in 151 breast lesions confirmed by surgical pathology. A logistic regression model for predicting breast malignancy on the basis of ultrasonographic indexes was obtained by forward stepwise regression. A receiver operating characteristic (ROC) curve was used to assess the performance of the logistic regression model. Results: Six ultrasonic indexes finally entered the logistic regression model: elasticity score, shape, internal echo, resistance index (RI), posterior acoustic alteration, and the converging pattern in the coronal plane. The area under the ROC curve was 0.996, and the percentage of correct predictions was 97.35%. Conclusions: The multivariate binary logistic regression model can describe and analyze the differential diagnosis of malignant and benign solid breast tumors by ultrasonography and can select the indexes that are valuable for differential diagnosis.

  10. Large scale landslide susceptibility assessment using the statistical methods of logistic regression and BSA – study case: the sub-basin of the small Niraj (Transylvania Depression, Romania)

    Directory of Open Access Journals (Sweden)

    S. Roşca

    2015-11-01

    The large number of GIS models available for identifying the probability of landslide occurrence makes it difficult to select a specific one. The present study focuses on the application of two quantitative models: the logistic and the BSA models. The comparative analysis of the results aims at identifying the most suitable model. The territory of the Niraj Mic Basin (87 km2) is characterised by a wide variety of landforms, with their morphometric, morphographical, and geological characteristics, and by a high complexity of land use types, and active landslides exist there. For this reason it serves as the test area for applying the two models and comparing the results. The complexity of the input variables is illustrated by 16 factors represented as 72 dummy variables, analysed on the basis of their importance within the model structures. Testing the statistical significance of each variable reduced the number of dummy variables to 12 that were considered significant for the test area within the logistic model, whereas for the BSA model all the variables were employed. The predictive capacity of the models was tested through the area under the ROC curve, which indicated a good accuracy (AUROC = 0.86 for the testing area) and predictability of the logistic model (AUROC = 0.63 for the validation area).
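    The AUROC values quoted in this record can be read as the probability that a randomly chosen landslide cell receives a higher model score than a randomly chosen non-landslide cell. A minimal sketch of that rank-based computation follows; the scores are invented for illustration and the function name is an assumption, not part of the study.

```python
from typing import Sequence


def auroc(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly, counting ties as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented susceptibility scores, for illustration only.
landslide_cells = [0.9, 0.4]
stable_cells = [0.5, 0.1]
print(f"AUROC = {auroc(landslide_cells, stable_cells):.2f}")
```

    The pairwise loop is quadratic in the number of cells, which is fine for a sketch; for a full raster a sorting-based or library implementation would be the usual choice.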

  11. Research on a tax inspection case-selection model based on logistic regression

    Institute of Scientific and Technical Information of China (English)

    王艳杰; 李清; 齐鑫鑫

    2012-01-01

    Traditional manual selection of cases for tax inspection is strongly affected by human factors and lacks scientific rigor and accuracy. Building an inspection case-selection model and having a computer select cases automatically can avoid these drawbacks and improve efficiency. Nine financial indicators were initially chosen; after screening, eight were used to build a logistic discriminant model for tax inspection case selection, whose overall accuracy when applied back to the sample reached 79.6%.

  12. Logistic Regression Analysis of Gallbladder Lesions of ≥1 cm in Diameter Diagnosed by Ultrasound

    Institute of Scientific and Technical Information of China (English)

    陈晓然; 唐少珊; 于冬梅; 刘站

    2013-01-01

    Purpose To establish a logistic regression model for the ultrasound diagnosis of protruding gallbladder lesions ≥1 cm in diameter, and to identify the sonographic features that help distinguish benign from malignant lesions. Materials and Methods The sonographic features of 165 patients with pathologically confirmed protruding gallbladder lesions ≥1 cm in diameter were analyzed retrospectively, including the number, size and shape of the lesions, the width of the base, coexisting gallstones, continuity of the gallbladder wall, and whether color Doppler flow imaging detected blood flow signals. A binary logistic regression model was built by multivariate regression analysis, and its ability to predict whether these lesions were benign or malignant was evaluated. Results Three feature variables entered the logistic regression model: lesion shape, base width, and detection of blood flow signals on color Doppler flow imaging. These were sensitive indicators for differentiating benign from malignant protruding gallbladder lesions. The accuracy, sensitivity and specificity of the model in predicting benign or malignant lesions ≥1 cm in diameter were 97.0%, 93.8% and 97.3%, respectively, and the area under the ROC curve was 0.979. Conclusion Binary logistic regression analysis can identify the sonographic features that are meaningful for the differential diagnosis of protruding gallbladder lesions ≥1 cm in diameter; lesion shape, base width and blood flow signals are of important value in distinguishing benign from malignant lesions.

  13. Risk Factors for Birth Defects: A Conditional Logistic Regression Analysis of a Case-Control Study in Guangdong Province of China

    Institute of Scientific and Technical Information of China (English)

    王志瑾; 穆荔

    1999-01-01

    To study risk factors and their association with birth defects, data were collected from 329 cases and 329 controls in 38 hospitals in Guangdong Province of China in 1988. Information was obtained from cases and controls with the same questionnaire, which listed 23 risk factors. We used a multivariate logistic model to identify the variables that significantly increased the risk of birth defects. The risk factors included maternal educational level, medicine taken during pregnancy, and antenatal care. The results suggest that strengthening antenatal care is the main preventive measure against birth defects.

  14. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    Science.gov (United States)

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  16. Application of decision tree and logistic regression on the health literacy prediction of hypertension patients

    Institute of Scientific and Technical Information of China (English)

    李现文; 李春玉; Miyong Kim; 李贞姬; 黄德镐; 朱琴淑; 金今姬

    2012-01-01

    Objective To study and evaluate the feasibility and accuracy of decision tree methods and logistic regression for predicting the health literacy of hypertension patients. Method Two health literacy prediction models were built, one with logistic regression analysis and one with the Answer Tree software, and the receiver operating characteristic (ROC) curve was used to compare them. Result The sensitivity (82.5%) and Youden index (50.9%) of the logistic regression model were higher than those of the decision tree model (77.9%, 48.0%). The specificity of the decision tree model (70.1%) was higher than that of the logistic regression model (68.4%), and its error rate (29.9%) was lower than that of the logistic regression model (31.6%). The areas under the ROC curve were similar for the decision tree and logistic regression models (0.813 vs 0.847). Conclusion The decision tree model predicted the health literacy of hypertension patients about as well as the logistic regression model. A health literacy screening strategy for hypertension patients can be derived from the decision tree model, which suggests that data mining methods can be used to predict health literacy in patients with chronic disease.
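
    The figures reported in this record are mutually consistent under the usual definitions, which a short calculation can confirm. The sketch below assumes that the Youden index is sensitivity + specificity - 1 and that the reported "error rate" is 1 - specificity; the second assumption is inferred from the numbers, not stated in the abstract.

```python
def youden_index(sensitivity, specificity):
    """Youden's J = sensitivity + specificity - 1, with inputs as proportions."""
    return sensitivity + specificity - 1.0


# (sensitivity, specificity) as reported in the abstract, converted to proportions.
models = {
    "logistic regression": (0.825, 0.684),
    "decision tree": (0.779, 0.701),
}

for name, (sens, spec) in models.items():
    j = youden_index(sens, spec)
    false_positive_rate = 1.0 - spec  # assumed to be the abstract's "error rate"
    print(f"{name}: J = {j:.3f}, 1 - specificity = {false_positive_rate:.3f}")
```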

  17. Using Meta-Regression to Explore Moderating Effects in Surveys of International Achievement

    Science.gov (United States)

    Benton, Tom

    2014-01-01

    This article demonstrates how meta-analytic techniques that have typically been used to synthesize findings across numerous studies can also be applied to examine why relationships between background characteristics and outcomes may vary across different locations in a single multi-site survey. This application is particularly…

  18. Logistic regression analysis on risk factors of autism spectrum disorders

    Institute of Scientific and Technical Information of China (English)

    刘栋; 张淑云; 邹时朴; 冯昶; 范广勤

    2016-01-01

    non-ASD controls (normal children, matched on gender) in Jiangxi Children's Hospital were selected to undergo the risk factor survey for ASD. The survey content included 10 categories: general status, birth, feeding, past history, mother's pregnancy and her health condition during pregnancy and environmental exposure, parents' occupational exposure, family history, and relevant test results. Logistic regression analysis was performed to analyze the results of the survey. Results The possible risk factors for ASD increased if the mother had a virus infection 2 years before pregnancy (OR = 7.97, 95% CI: 2.42-26.31), had occupational exposure (OR = 3.99, 95% CI: 1.27-12.52), was exposed to volatile organic compounds during pregnancy (OR = 22.21, 95% CI: 2.28-216.09), lived close to transport passageways during pregnancy (OR = 0.59, 95% CI: 0.38-0.93), or had a family heredity history (OR = 58.50, 95% CI: 5.81-589.57). Breastfeeding (OR = 0.81, 95% CI: 0.66-0.98) might be a protective factor in ASD. Conclusions In addition to genetic factors, the uterine environment from conception to birth and the growth environment play an important role in the pathogenesis of ASD.

  19. Construction of Financial Crisis Predicting Model for Listed Companies Based on Logistic Regression

    Institute of Scientific and Technical Information of China (English)

    吴英

    2011-01-01

    Using a set of financial indicators and the financial data of listed companies in China, this paper constructs a financial crisis early-warning model for listed companies based on logistic regression. Testing shows that the model has some practical application value.

  20. Urban Expansion Prediction for Zhangzhou City Based on GIS and Spatiotemporal Logistic Regression Model

    Institute of Scientific and Technical Information of China (English)

    杨云龙; 周小成; 吴波

    2011-01-01

    This paper proposes a new method for predicting urban expansion with a spatiotemporal logistic regression model. The method first adds a spatial autocorrelation structure to the traditional logistic regression model to build a spatial logistic regression model. Then, using nearly 20 years of data (1989-2009) for the Zhangzhou urban area, it builds several sub-spatial logistic regression models Mi that simulate urban expansion in different periods, and combines this time series of Mi by single exponential smoothing into a spatiotemporal logistic regression prediction model that accounts for both spatial and temporal complexity. The new method overcomes a limitation of the traditional logistic regression approach, namely that data on the influencing factors for the prediction year are hard to obtain. Because the model also considers the long-term temporal complexity of urban expansion, that is, the fact that the influencing factors differ between periods, it is closer to the actual expansion process and is therefore expected to predict more accurately. Taking the Zhangzhou urban area in Fujian Province as an example, urban expansion in 2009 was predicted with three methods: the traditional logistic regression model, the spatial logistic regression model that only adds the spatial autocorrelation structure, and the new spatiotemporal logistic regression model. The results show that the new method predicts better than the other two: overall prediction accuracy was 81.02%, 83.82% and 87.00%, respectively; the accuracy for urban land rose from 63.59% to 67.35% and 73.34%; and the area under the ROC curve (AUC) rose from 0.826 to 0.883 and 0.924.
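
    The record above describes combining period-specific models by single exponential smoothing but does not say exactly what is smoothed. The sketch below assumes, purely for illustration, that smoothing is applied coefficient by coefficient to the fitted sub-models; the function name, the smoothing constant `alpha`, and the coefficient values are all hypothetical.

```python
def smooth_coefficients(coef_series, alpha=0.5):
    """Single exponential smoothing applied element-wise to coefficient vectors.

    coef_series: list of coefficient lists, ordered from oldest to newest period
    alpha: smoothing constant in (0, 1]; larger values weight recent periods more
    Uses S_1 = x_1 and S_t = alpha * x_t + (1 - alpha) * S_{t-1}.
    """
    smoothed = list(coef_series[0])
    for coefs in coef_series[1:]:
        smoothed = [alpha * c + (1.0 - alpha) * s for c, s in zip(coefs, smoothed)]
    return smoothed


# Hypothetical coefficients (intercept, distance to road, slope) for three periods.
period_models = [
    [-1.0, -0.50, -0.20],
    [-0.8, -0.60, -0.10],
    [-0.6, -0.40, -0.30],
]
print(smooth_coefficients(period_models, alpha=0.5))
```

    The smoothed vector could then serve as the coefficients of a single prediction model; whether the authors smoothed coefficients, predicted probabilities, or something else is not stated in the abstract.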

  1. Logistic regression analysis of damp-heat and cold-damp impeding syndrome of rheumatoid arthritis: a perspective in Chinese medicine.

    Science.gov (United States)

    Wang, Zhi-Zhong; Fang, Yong-Fei; Wang, Yong; Mu, Fang-Xiang; Chen, Jun; Zou, Qing-Hua; Zhong, Bing; Li, Jing-Yi; Bo, Gan-Ping; Zhang, Rong-Hua

    2012-08-01

    To investigate a method for quantitative differential diagnosis of damp-heat and cold-damp impeding syndrome of rheumatoid arthritis (RA) in Chinese medicine (CM). Laboratory parameters were collected from 306 patients with RA. The clinical symptoms and laboratory parameters were compared between patients with these two syndromes (158 with RA of damp-heat impeding syndrome, and 148 with RA of cold-damp impeding syndrome), and a regression equation was established to facilitate discrimination of the two RA syndromes. There were significant differences in disease activity score in 28 joints [DAS28(4)], erythrocyte sedimentation rate (ESR), white blood cell count (WBC), C-reactive protein (CRP), platelet count (PLT), albumin (ALB) and globulin (GLB) between the two syndromes of RA (P […] heat from cold-damp impeding syndrome. The regression equation was as follows: P = 1/{1 + exp[-(3.0 - 0.021X1 - 0.196X2 - 0.163X3 - 1.559X4 + 1.504X5 - 0.927X6 - 1.039X7 + 1.070X8 + 1.330X9)]}. The independent variables X1-X9 were ESR, WBC, CRP, hot joint, cold joint, thirst, sweating, aversion to wind and cold, and cold limbs. A P value > 0.5 signified cold-damp impeding syndrome, and a P value […] heat impeding syndrome. The accuracy was 90.2%. The regression equation may be useful for discriminating damp-heat from cold-damp impeding syndrome of RA.
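
    The regression equation in this record can be evaluated directly. The sketch below copies the intercept and coefficients from the abstract; the variable names, the units of the laboratory values, and the 0/1 coding of the symptom variables are assumptions, since the abstract does not give them.

```python
import math

# Intercept and coefficients as printed in the abstract (X1..X9 in order).
INTERCEPT = 3.0
COEFFICIENTS = {
    "esr": -0.021,                       # X1
    "wbc": -0.196,                       # X2
    "crp": -0.163,                       # X3
    "hot_joint": -1.559,                 # X4 (assumed coded 0/1)
    "cold_joint": 1.504,                 # X5 (assumed coded 0/1)
    "thirst": -0.927,                    # X6 (assumed coded 0/1)
    "sweating": -1.039,                  # X7 (assumed coded 0/1)
    "aversion_to_wind_and_cold": 1.070,  # X8 (assumed coded 0/1)
    "cold_limbs": 1.330,                 # X9 (assumed coded 0/1)
}


def cold_damp_probability(features):
    """Return P = 1 / (1 + exp(-z)), with z the linear predictor of the equation."""
    z = INTERCEPT + sum(COEFFICIENTS[name] * features[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical patient, for illustration only.
patient = {
    "esr": 40.0, "wbc": 8.0, "crp": 10.0,
    "hot_joint": 1, "cold_joint": 0, "thirst": 1,
    "sweating": 0, "aversion_to_wind_and_cold": 0, "cold_limbs": 0,
}
p = cold_damp_probability(patient)
print(f"P = {p:.3f} ->", "cold-damp" if p > 0.5 else "not classified as cold-damp")
```

    Only the P > 0.5 rule is stated intact in the abstract, so the sketch reports the complementary case without naming a syndrome.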

  2. Logistic regression study of blood donors' willingness to donate and blood donation services

    Institute of Scientific and Technical Information of China (English)

    赵燕

    2016-01-01

    Objective To explore blood donors' willingness to donate and the donation services of the Zaozhuang blood station, in order to promote the sustainable development of unpaid blood donation. Methods From March to August 2015, donors visiting the station were surveyed by questionnaire on their basic characteristics, donation history, willingness to donate, sources of information about blood donation, and the donation service. Results A total of 2,860 blood donors were sampled and 2,759 valid questionnaires were obtained. The composition of donors differed significantly by gender, age, educational level, occupation, monthly income (yuan) and number of donations (P<0.05). "Giving love" was the donors' main motivation. The main source of blood donation knowledge was "radio and television". Logistic regression analysis showed that sex, willingness to donate, and service attitude were the main factors affecting blood donation. Conclusion In Zaozhuang, donors' gender and willingness to donate are the main factors influencing blood donation, and the service attitude of blood station staff is also a major factor influencing donation and the donation environment.

  3. Multivariate Logistic Regression Analysis of Anxiety and Depression in Peritoneal Dialysis Patients

    Institute of Scientific and Technical Information of China (English)

    陈伟; 黄燕林

    2011-01-01

    Objective To study the risk factors for anxiety and depression in peritoneal dialysis patients and to provide clinical nurses with evidence for psychological intervention. Methods A total of 169 peritoneal dialysis patients were assessed with Zung's Self-Rating Anxiety Scale (SAS) and Self-Rating Depression Scale (SDS). The influencing factors were analyzed by univariate and multivariate logistic regression. Results The mean anxiety score was 41.24±9.11 and the mean depression score was 48.71±12.06. The incidences of anxiety and depression were 17.8% and 52.6%, respectively. Independent factors for anxiety were working status, age, dry skin, skin itching, and mid-upper arm circumference. Independent factors for depression were educational background, medical expenses, working status, appetite, grip strength, calf circumference, edema, and skin itching. Conclusion Many factors contribute to anxiety and depression in peritoneal dialysis patients. Medical staff should pay attention to the psychological status of these patients and tailor psychological intervention to each patient's condition.

  4. [Logistic regression analysis for factors affecting the successful rate of nano-carbon in sentinel lymph node biopsy].

    Science.gov (United States)

    Wang, Xinzheng; Liu, Jinbiao; Hou, Yongqiang; Wang, Ning; Wang, Mingjun

    2016-04-01

    To explore the factors affecting the success rate of nano-carbon in sentinel lymph node biopsy. A total of 270 patients with breast cancer who were treated in the First Affiliated Hospital of Henan University of Science and Technology from January 2013 to March 2015 were chosen and given sentinel lymph node (SLN) biopsy with nano-carbon, and the influential factors were examined by logistic analysis. The success rate of biopsy, accuracy, sensitivity and false negative rate were 92.2%, 97.6%, 93.1% and 6.8%, respectively. Age, primary tumor lesions, body mass index, axillary lymph node status, number of SLN and pathological grade were the factors affecting successful biopsy (all P […] tumor location, affected sides, injection sites and chemotherapy showed little effect on the success rate of biopsy (all P > 0.05). The nano-carbon tracer method is a reliable method for sentinel lymph node biopsy. Body mass index, age, and number of lymph node metastases greatly affect the success rate of biopsy.

  5. LOGISTIC REGRESSION ANALYSIS OF RISK FACTORS OF COPD IN RURAL AREAS OF FENGKAI

    Institute of Scientific and Technical Information of China (English)

    钱形邦; 梁洪雁; 李秋生; 侯浩联; 侯秋华; 植彩雄

    2015-01-01

    Objective To study the prevalence and relevant risk factors of chronic obstructive pulmonary disease (COPD) in rural areas of Fengkai, in order to provide a scientific basis for the effective prevention of COPD. Methods Stratified random cluster sampling was used to collect data from 1,386 subjects (aged 40 years or older) in rural areas of Fengkai County. All subjects completed a questionnaire and underwent spirometry. Univariate and multivariate unconditional logistic regression were used to analyze the risk factors related to COPD. Results (1) The overall prevalence of COPD was 10.24%. (2) The relevant risk factors of COPD were age (OR=4.002, 95% CI 2.339-7.605), smoking (OR=3.846, 95% CI 1.925-7.564), occupational dust exposure (OR=5.339, 95% CI 3.062-9.743), wood and coal fire (OR=1.206, 95% CI 0.895-2.666), poor kitchen ventilation equipment (OR=2.599, 95% CI 1.056-4.009), bad cooking habits (OR=1.408, 95% CI 0.758-2.255), and personal history of lung disease (OR=1.296, 95% CI 1.015-2.847). Conclusion COPD results from the combined action of multiple factors. Use of cleaner fuels and improved ventilation may reduce the incidence of COPD.
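
    When reading a list of odds ratios like the one above, a common first check is whether each 95% confidence interval excludes 1, since an interval containing 1 is compatible with no association at the 5% level. The sketch below applies that check to the values quoted in the abstract; the dictionary keys and the helper name are hypothetical labels.

```python
# (odds ratio, lower 95% limit, upper 95% limit) as quoted in the abstract.
reported = {
    "age": (4.002, 2.339, 7.605),
    "smoking": (3.846, 1.925, 7.564),
    "occupational dust exposure": (5.339, 3.062, 9.743),
    "wood and coal fire": (1.206, 0.895, 2.666),
    "poor kitchen ventilation": (2.599, 1.056, 4.009),
    "bad cooking habits": (1.408, 0.758, 2.255),
    "history of lung disease": (1.296, 1.015, 2.847),
}


def ci_excludes_one(lower, upper):
    """True when the interval lies entirely above or entirely below OR = 1."""
    return lower > 1.0 or upper < 1.0


for factor, (odds_ratio, lower, upper) in reported.items():
    verdict = "excludes 1" if ci_excludes_one(lower, upper) else "includes 1"
    print(f"{factor}: OR = {odds_ratio}, 95% CI {lower}-{upper} ({verdict})")
```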

  6. A free-knot spline modeling framework for piecewise linear logistic regression in complex samples with body mass index and mortality as an example

    Directory of Open Access Journals (Sweden)

    Scott W. Keith

    2014-09-01

    This paper details the design, evaluation, and implementation of a framework for detecting and modeling nonlinearity between a binary outcome and a continuous predictor variable adjusted for covariates in complex samples. The framework provides familiar-looking parameterizations of output in terms of linear slope coefficients and odds ratios. Estimation methods focus on maximum likelihood optimization of piecewise linear free-knot splines formulated as B-splines. Correctly specifying the optimal number and positions of the knots improves the model, but is marked by computational intensity and numerical instability. Our inference methods utilize both parametric and nonparametric bootstrapping. Unlike other nonlinear modeling packages, this framework is designed to incorporate multistage survey sample designs common to nationally representative datasets. We illustrate the approach and evaluate its performance in specifying the correct number of knots under various conditions with an example using body mass index (BMI; kg/m²) and the complex multi-stage sampling design from the Third National Health and Nutrition Examination Survey to simulate binary mortality outcomes data having realistic nonlinear sample-weighted risk associations with BMI. BMI and mortality data provide a particularly apt example and area of application since BMI is commonly recorded in large health surveys with complex designs, often categorized for modeling, and nonlinearly related to mortality. When complex sample design considerations were ignored, our method was generally similar to or more accurate than two common model selection procedures, Schwarz's Bayesian Information Criterion (BIC) and Akaike's Information Criterion (AIC), in terms of selecting the correct number of knots. Our approach provided accurate knot selections when complex sampling weights were incorporated, while AIC and BIC were not effective under these conditions.
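
    To make the idea of a piecewise linear logistic model concrete, the sketch below builds a linear spline from hinge terms max(0, x - knot) with fixed knots. This is a simplified stand-in: the paper uses a B-spline formulation, estimates the knot positions, and handles survey weights, none of which is attempted here. The knot locations and coefficients are hypothetical.

```python
import math


def piecewise_linear_basis(x, knots):
    """Basis for a continuous piecewise linear function: x plus one hinge per knot."""
    return [x] + [max(0.0, x - k) for k in knots]


def predicted_probability(x, intercept, slopes, knots):
    """Logistic model whose log-odds are piecewise linear in x.

    slopes[0] is the slope before the first knot; each later entry is the
    CHANGE in slope at the corresponding knot.
    """
    basis = piecewise_linear_basis(x, knots)
    log_odds = intercept + sum(s * b for s, b in zip(slopes, basis))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical knots (BMI 22 and 30) and coefficients giving a U-shaped risk curve.
knots = [22.0, 30.0]
slopes = [-0.10, 0.10, 0.08]
for bmi in (18.0, 25.0, 35.0):
    print(bmi, round(predicted_probability(bmi, 0.0, slopes, knots), 4))
```

    With this parameterization the slope in each segment is the running sum of the entries in `slopes`, which is what allows results to be reported as ordinary per-unit odds ratios within each BMI range.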

  7. EMPIRICAL STUDY OF DIFFERENT FACTORS EFFECTS ON ARTICLES PUBLICATION REGARDING SURVEY INTERVIEWER CHARACTERISTICS USING MULTILEVEL REGRESSION MODEL

    Directory of Open Access Journals (Sweden)

    Alina MOROŞANU

    2013-06-01

    The purpose of this study is to evaluate the effects that certain factors may have on the publication of articles about survey interviewer characteristics. The author reviewed the literature from the various fields in which such articles have been published and which can be found in online article databases. The analysis covered 243 articles written by researchers between 1949 and 2012. Using the statistical software R and a multilevel regression model, the results showed that the period in which the articles were written, and the interaction between the number of authors and the number of pages, most strongly affect their publication in journals with a given level of impact factor.

  8. Logistic regression analysis on prevalence rate of cervical cancer and risk factors in outpatients

    Institute of Scientific and Technical Information of China (English)

    文彩虹; 冯晓庆; 罗荣城

    2013-01-01

    Objective: To explore the prevalence rate of cervical cancer and analyze the related risk factors. Methods: The data of 65 patients with a definite diagnosis of cervical cancer and 114 randomly selected healthy women confirmed to have no cervical intraepithelial neoplasia (CIN) were analyzed retrospectively. A questionnaire survey and related examinations were performed, and the indexes were then analyzed by univariate and multivariate non-conditional logistic regression. Results: The prevalence rate of cervical cancer was 240/100,000. Univariate analysis showed that the factors related to cervical cancer were human papillomavirus (HPV) infection (P<0.01), serum selenium content <1.06 μg/ml (P<0.01), cervicitis lasting >2 years (P<0.01), more than 3 sexual partners (P<0.05), and contraception with condoms (P<0.01). Multivariate non-conditional logistic regression analysis showed that the main risk factors for cervical cancer were HPV infection (P<0.01) and serum selenium content <1.06 μg/ml (P<0.01). Conclusion: The prevalence rate of cervical cancer is at a high level. HPV infection and low serum selenium content are correlated with the occurrence of cervical cancer, so preventing HPV infection and supplementing selenium can play active roles in controlling cervical cancer in women.

  9. Empirical Analysis of Logistics Demand Forecasting of Hebei Based on Multi-linear Regression Model

    Institute of Scientific and Technical Information of China (English)

    周晓娟; 景志英

    2013-01-01

    In this paper, we use a multiple linear regression model to forecast and analyze the logistics demand of Hebei Province. Drawing on previous studies, we selected the research indicators and, in keeping with the strict data requirements of the statistical method, took the relevant indicators from the Hebei statistical yearbooks for 1990-2009 as the data source. Stepwise regression was performed on the data to eliminate multicollinearity, and the resulting regression model was tested and shown to be suitable for forecasting. Finally, on the basis that freight volume is the key to logistics demand forecasting, we offer policy suggestions in three areas for speeding up the development of the logistics industry in Hebei.

  10. Logistic regression analysis on influencing factors of medical abortion outcome

    Institute of Scientific and Technical Information of China (English)

    刘梅云; 张红杰; 朱继红; 高敬

    2014-01-01

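    Objective To explore the factors that influence the outcome of medical abortion and to provide a basis for developing corresponding interventions. Methods In a case-control study, women who required surgical intervention after medical abortion because of incomplete abortion or failure formed the case group, and women who had a complete abortion after medical abortion formed the control group; the factors influencing the outcome of medical abortion were analyzed. Data were analyzed with SPSS 16.0. Results A total of 32 cases and 170 controls were surveyed. Univariate analysis showed that a history of surgical abortion (χ2=4.691, P=0.030), the place of medical abortion (χ2=13.487, P=0.000), the gestational week at medical abortion (χ2=6.747, P=0.009), and a diagnosis of vaginitis before medical abortion (χ2=22.153, P=0.000) had statistically significant effects on whether the abortion was complete. Multivariate logistic regression analysis showed that the place of medical abortion (OR=3.693, P=0.009) and a diagnosis of vaginitis before medical abortion (OR=4.520, P=0.000) were independent factors influencing the outcome of medical abortion. Conclusion Medical abortion performed in private clinics…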