WorldWideScience

Sample records for victim logistic regression

  1. Applied logistic regression

    CERN Document Server

    Hosmer, David W; Sturdivant, Rodney X

    2013-01-01

    A new edition of the definitive guide to logistic regression modeling for health science and other applications. This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-

  2. [Understanding logistic regression].

    Science.gov (United States)

    El Sanharawi, M; Naudet, F

    2013-10-01

    Logistic regression is one of the most common multivariate analysis models utilized in epidemiology. It allows the measurement of the association between the occurrence of an event (qualitative dependent variable) and factors susceptible to influence it (explicative variables). The choice of explicative variables that should be included in the logistic regression model is based on prior knowledge of the disease physiopathology and the statistical association between the variable and the event, as measured by the odds ratio. The main steps for the procedure, the conditions of application, and the essential tools for its interpretation are discussed concisely. We also discuss the importance of the choice of variables that must be included and retained in the regression model in order to avoid the omission of important confounding factors. Finally, by way of illustration, we provide an example from the literature, which should help the reader test his or her knowledge. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
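
As a rough illustration of the odds ratio mentioned in this abstract, the sketch below computes an odds ratio and a Wald-type 95% confidence interval from a single 2x2 table. The counts and the function name `odds_ratio` are hypothetical and are not taken from the article; in a fitted logistic regression model the same quantity would be obtained by exponentiating a coefficient.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    All four counts are assumed to be positive.
    """
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio (Woolf's formula).
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI ({lower:.2f}, {upper:.2f})")
```

An interval that includes 1 would usually be read as compatible with no association between the factor and the event.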

  3. Logistic regression models

    CERN Document Server

    Hilbe, Joseph M

    2009-01-01

    This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect: great clarity. The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … . (Annette J. Dobson, Biometric...

  4. An Introduction to Logistic Regression.

    Science.gov (United States)

    Cizek, Gregory J.; Fitzgerald, Shawn M.

    1999-01-01

    Where linearity cannot be assumed, logistic regression may be appropriate. This article describes conditions and tests for using logistic regression; introduces the logistic-regression model, the use of logistic-regression software, and some applications in published literature. Univariate and multiple independent-variable conditions and…

  5. Logistic Regression: Concept and Application

    Science.gov (United States)

    Cokluk, Omay

    2010-01-01

    The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…

  6. Logistic regression: a brief primer.

    Science.gov (United States)

    Stoltzfus, Jill C

    2011-10-01

    Regression techniques are versatile in their application to medical research because they can measure associations, predict outcomes, and control for confounding variable effects. As one such technique, logistic regression is an efficient and powerful way to analyze the effect of a group of independent variables on a binary outcome by quantifying each independent variable's unique contribution. Using components of linear regression reflected in the logit scale, logistic regression iteratively identifies the strongest linear combination of variables with the greatest probability of detecting the observed outcome. Important considerations when conducting logistic regression include selecting independent variables, ensuring that relevant assumptions are met, and choosing an appropriate model building strategy. For independent variable selection, one should be guided by such factors as accepted theory, previous empirical investigations, clinical considerations, and univariate statistical analyses, with acknowledgement of potential confounding variables that should be accounted for. Basic assumptions that must be met for logistic regression include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers. Additionally, there should be an adequate number of events per independent variable to avoid an overfit model, with commonly recommended minimum "rules of thumb" ranging from 10 to 20 events per covariate. Regarding model building strategies, the three general types are direct/standard, sequential/hierarchical, and stepwise/statistical, with each having a different emphasis and purpose. Before reaching definitive conclusions from the results of any of these methods, one should formally quantify the model's internal validity (i.e., replicability within the same data set) and external validity (i.e., generalizability beyond the current sample). 
The resulting logistic regression model
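
The events-per-variable rule of thumb quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function name and the counts are hypothetical, and it assumes the common convention that the "events" are the less frequent of the two outcome categories.

```python
def events_per_variable(n_events, n_nonevents, n_predictors):
    """Events per candidate predictor, using the rarer outcome as the event count."""
    limiting_count = min(n_events, n_nonevents)
    return limiting_count / n_predictors


# Hypothetical study: 60 events, 440 non-events, 8 candidate predictors.
epv = events_per_variable(n_events=60, n_nonevents=440, n_predictors=8)
meets_lower_rule = epv >= 10  # lower end of the 10 to 20 range quoted above
print(f"EPV = {epv:.1f}, meets the 10-per-variable rule: {meets_lower_rule}")
```

With these made-up numbers the model would fall short of even the lower rule of thumb, which suggests reducing the number of candidate predictors or collecting more events.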

  7. Logistic regression for circular data

    Science.gov (United States)

    Al-Daffaie, Kadhem; Khan, Shahjahan

    2017-05-01

    This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
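
A minimal sketch of the kind of fit this abstract describes, written in Python rather than the R used by the authors. It assumes that the circular predictor enters the linear predictor through cosine and sine terms, which is one common linear-circular formulation and may differ from the paper's exact model. The data are simulated, and `fit_logistic_newton` is a hypothetical helper name.

```python
import numpy as np


def fit_logistic_newton(X, y, n_iter=25, tol=1e-10):
    """Maximum likelihood logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        score = X.T @ (y - p)                      # gradient of the log-likelihood
        info = X.T @ (X * (p * (1 - p))[:, None])  # Fisher information X'WX
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


# Simulated data: theta is an angle in radians (for example, a wind direction).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=500)
X = np.column_stack([np.ones_like(theta), np.cos(theta), np.sin(theta)])
true_beta = np.array([0.3, 1.0, -0.5])  # assumed values for the simulation
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ true_beta)))

beta_hat = fit_logistic_newton(X, y)
print(beta_hat)
```

Using both cos(theta) and sin(theta) keeps the fitted probability periodic, so angles of 0 and 2*pi receive the same prediction.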

  8. Standards for Standardized Logistic Regression Coefficients

    Science.gov (United States)

    Menard, Scott

    2011-01-01

    Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…

  9. Inconsistency Between Univariate and Multiple Logistic Regressions

    OpenAIRE

    WANG, HONGYUE; Peng, Jing; Wang, Bokai; Lu, Xiang; ZHENG, Julia Z.; Wang, Kejia; Tu, Xin M.; Feng, Changyong

    2017-01-01

    Summary: Logistic regression is a popular statistical method for studying the effects of covariates on binary outcomes. It has been widely used in both clinical trials and observational studies. However, the results from the univariate regression and from the multiple logistic regression tend to be conflicting. A covariate may show a very strong effect on the outcome in the multiple regression but not in the univariate regression, and vice versa. These facts have not been well appreciated in biom...

  10. Predicting Social Trust with Binary Logistic Regression

    Science.gov (United States)

    Adwere-Boamah, Joseph; Hufstedler, Shirley

    2015-01-01

    This study used binary logistic regression to predict social trust with five demographic variables from a national sample of adult individuals who participated in The General Social Survey (GSS) in 2012. The five predictor variables were respondents' highest degree earned, race, sex, general happiness and the importance of personally assisting…

  11. A Logistic Regression Model for Personnel Selection.

    Science.gov (United States)

    Raju, Nambury S.; And Others

    1991-01-01

    A two-parameter logistic regression model for personnel selection is proposed. The model was tested with a database of 84,808 military enlistees. The probability of job success was related directly to trait levels, addressing such topics as selection, validity generalization, employee classification, selection bias, and utility-based fair…

  12. Hierarchical Logistic Regression in Course Placement

    Science.gov (United States)

    Schulz, E. Matthew; Betebenner, Damian; Ahn, Meeyeon

    2004-01-01

    Whether hierarchical logistic regression can reduce the sample size requirement for estimating optimal cutoff scores in a course placement service where predictive validity is measured by a threshold utility function is explored. Data from courses with varying class size were randomly partitioned into two halves per course. Nonhierarchical and…

  13. Targeting: Logistic Regression, Special Cases and Extensions

    Directory of Open Access Journals (Sweden)

    Helmut Schaeben

    2014-12-01

    Full Text Available Logistic regression is a classical linear model for logit-transformed conditional probabilities of a binary target variable. It recovers the true conditional probabilities if the joint distribution of predictors and the target is of log-linear form. Weights-of-evidence is an ordinary logistic regression with parameters equal to the differences of the weights of evidence if all predictor variables are discrete and conditionally independent given the target variable. The hypothesis of conditional independence can be tested in terms of log-linear models. If the assumption of conditional independence is violated, the application of weights-of-evidence corrupts not only the predicted conditional probabilities, but also their rank transform. Logistic regression models that include interaction terms can account for the lack of conditional independence; appropriate interaction terms compensate exactly for violations of conditional independence. Multilayer artificial neural nets may be seen as nested regression-like models, with some sigmoidal activation function. Most often, the logistic function is used as the activation function. If the net topology, i.e., its control, is sufficiently versatile to mimic interaction terms, artificial neural nets are able to account for violations of conditional independence and yield very similar results. Weights-of-evidence cannot reasonably include interaction terms; subsequent modifications of the weights, as often suggested, cannot emulate the effect of interaction terms.

  14. Logistic regression a self-learning text

    CERN Document Server

    Kleinbaum, David G

    1994-01-01

    This textbook provides students and professionals in the health sciences with a presentation of the use of logistic regression in research. The text is self-contained, and designed to be used both in class or as a tool for self-study. It arises from the author's many years of experience teaching this material and the notes on which it is based have been extensively used throughout the world.

  15. Prediction of Rainfall Using Logistic Regression

    OpenAIRE

    A. H. M. Rahmatullah Imon; Manos C. Roy; S. K. Bhattacharjee

    2012-01-01

    The use of logistic regression modeling has exploded during the past decade for prediction and forecasting. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of climate system. It is well known that the variability and intensity of rainfall act on natural, agricultural, human and even total biological system. So it is essential to be able to predict rainfall by findi...

  16. Leukemia prediction using sparse logistic regression.

    Directory of Open Access Journals (Sweden)

    Tapio Manninen

    Full Text Available We describe a supervised prediction method for diagnosis of acute myeloid leukemia (AML) from patient samples based on flow cytometry measurements. We use a data-driven approach with machine learning methods to train a computational model that takes in flow cytometry measurements from a single patient and gives a confidence score of the patient being AML-positive. Our solution is based on an [Formula: see text] regularized logistic regression model that aggregates AML test statistics calculated from individual test tubes with different cell populations and fluorescent markers. The model construction is entirely data driven and no prior biological knowledge is used. The described solution scored a 100% classification accuracy in the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukaemia Challenge against a gold standard consisting of 20 AML-positive and 160 healthy patients. Here we perform a more extensive validation of the prediction model performance and further improve and simplify our original method, showing that statistically equal results can be obtained by using simple average marker intensities as features in the logistic regression model. In addition to the logistic regression based model, we also present other classification models and compare their performance quantitatively. The key benefit of our prediction method compared to other solutions with similar performance is that our model uses only a small fraction of the flow cytometry measurements, making our solution highly economical.

  17. Interpreting parameters in the logistic regression model with random effects

    DEFF Research Database (Denmark)

    Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben

    2000-01-01

    interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects...

  18. Prediction of Rainfall Using Logistic Regression

    Directory of Open Access Journals (Sweden)

    A.H.M. Rahmatullah Imon

    2012-07-01

    Full Text Available The use of logistic regression modeling has exploded during the past decade for prediction and forecasting. From its original acceptance in epidemiologic research, the method is now commonly employed in almost all branches of knowledge. Rainfall is one of the most important phenomena of climate system. It is well known that the variability and intensity of rainfall act on natural, agricultural, human and even total biological system. So it is essential to be able to predict rainfall by finding out the appropriate predictors. In this paper an attempt has been made to use logistic regression for predicting rainfall. It is evident that the climatic data are often subjected to gross recording errors though this problem often goes unnoticed to the analysts. In this paper we have used very recent screening methods to check and correct the climatic data that we use in our study. We have used fourteen years’ daily rainfall data to formulate our model. Then we use two years’ observed daily rainfall data treating them as future data for the cross validation of our model. Our findings clearly show that if we are able to choose appropriate predictors for rainfall, logistic regression model can predict the rainfall very efficiently.

  19. Multinomial logistic regression in workers' health

    Science.gov (United States)

    Grilo, Luís M.; Grilo, Helena L.; Gonçalves, Sónia P.; Junça, Ana

    2017-11-01

    In European countries, notably in Portugal, it is common to hear people mention that they are exposed to excessive and continuous psychosocial stressors at work. This is increasing in diverse activity sectors, such as the Services sector. A representative sample was collected from a Portuguese Services organization by applying an internationally validated survey whose variables were measured in five ordered categories on a Likert-type scale. A multinomial logistic regression model is used to estimate the probability of each category of the dependent variable, general health perception, where, among other independent variables, burnout appears as statistically significant.

  20. Supporting Regularized Logistic Regression Privately and Efficiently.

    Science.gov (United States)

    Li, Wenfa; Liu, Hongzhe; Yang, Peng; Xie, Wei

    2016-01-01

    As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  1. Supporting Regularized Logistic Regression Privately and Efficiently.

    Directory of Open Access Journals (Sweden)

    Wenfa Li

    Full Text Available As one of the most popular statistical and machine learning models, logistic regression with regularization has found wide adoption in biomedicine, social sciences, information technology, and so on. These domains often involve data of human subjects that are contingent upon strict privacy regulations. Concerns over data privacy make it increasingly difficult to coordinate and conduct large-scale collaborative studies, which typically rely on cross-institution data sharing and joint analysis. Our work here focuses on safeguarding regularized logistic regression, a widely used statistical model that has nonetheless not been investigated from a data security and privacy perspective. We consider a common use scenario of multi-institution collaborative studies, such as in the form of research consortia or networks as widely seen in genetics, epidemiology, social sciences, etc. To make our privacy-enhancing solution practical, we demonstrate a non-conventional and computationally efficient method leveraging distributed computing and strong cryptography to provide comprehensive protection over individual-level and summary data. Extensive empirical evaluations on several studies validate the privacy guarantee, efficiency and scalability of our proposal. We also discuss the practical implications of our solution for large-scale studies and applications from various disciplines, including genetic and biomedical studies, smart grid, network analysis, etc.

  2. Logistic regression applied to natural hazards: rare event logistic regression with replications

    Directory of Open Access Journals (Sweden)

    M. Guns

    2012-06-01

    Full Text Available Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  3. Logistic regression applied to natural hazards: rare event logistic regression with replications

    Science.gov (United States)

    Guns, M.; Vanacker, V.

    2012-06-01

    Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.

  4. Logistic regression against a divergent Bayesian network

    Directory of Open Access Journals (Sweden)

    Noel Antonio Sánchez Trujillo

    2015-01-01

    Full Text Available This article is a discussion about two statistical tools used for prediction and causality assessment: logistic regression and Bayesian networks. Using data of a simulated example from a study assessing factors that might predict pulmonary emphysema (where fingertip pigmentation and smoking are considered), we posed the following questions. Is pigmentation a confounding, causal or predictive factor? Is there perhaps another factor, like smoking, that confounds? Is there a synergy between pigmentation and smoking? The results, in terms of prediction, are similar with the two techniques; regarding causation, differences arise. We conclude that, in decision-making, the combination of a statistical tool used with common sense and previous evidence, which takes years or even centuries to develop, is better than the automatic and exclusive use of statistical resources.

  5. Using Dominance Analysis to Determine Predictor Importance in Logistic Regression

    Science.gov (United States)

    Azen, Razia; Traxel, Nicole

    2009-01-01

    This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…

  6. Gaussian Process Regression Model in Spatial Logistic Regression

    Science.gov (United States)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.

  7. A Methodology for Generating Placement Rules that Utilizes Logistic Regression

    Science.gov (United States)

    Wurtz, Keith

    2008-01-01

    The purpose of this article is to provide the necessary tools for institutional researchers to conduct a logistic regression analysis and interpret the results. Aspects of the logistic regression procedure that are necessary to evaluate models are presented and discussed with an emphasis on cutoff values and choosing the appropriate number of…

  8. A logistic regression estimating function for spatial Gibbs point processes

    DEFF Research Database (Denmark)

    Baddeley, Adrian; Coeurjolly, Jean-François; Rubak, Ege

    We propose a computationally efficient logistic regression estimating function for spatial Gibbs point processes. The sample points for the logistic regression consist of the observed point pattern together with a random pattern of dummy points. The estimating function is closely related...

  9. Spatial correlation in Bayesian logistic regression with misclassification

    DEFF Research Database (Denmark)

    Bihrmann, Kristine; Toft, Nils; Nielsen, Søren Saxmose

    2014-01-01

    Standard logistic regression assumes that the outcome is measured perfectly. In practice, this is often not the case, which could lead to biased estimates if not accounted for. This study presents Bayesian logistic regression with adjustment for misclassification of the outcome applied to data...

  10. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent variables and a dependent variable. When the dependent variable is categorical, a logistic regression model is used to calculate the odds; when its categories are ordered levels, the logistic regression model is ordinal. The GWOLR model is an ordinal logistic regression model that is influenced by the geographical location of the observation site. Parameter estimation is needed to infer population values from a sample. The purpose of this research is to estimate the parameters of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The research yields a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.

  11. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    Science.gov (United States)

    Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...

  12. Tree-based model checking for logistic regression.

    Science.gov (United States)

    Su, Xiaogang

    2007-05-10

    A tree procedure is proposed to check the adequacy of a fitted logistic regression model. The proposed method not only makes natural assessment for the logistic model, but also provides clues to amend its lack-of-fit. The resulting tree-augmented logistic model facilitates a refined model with meaningful interpretation. We demonstrate its use via simulation studies and an application to the Pima Indians diabetes data. Copyright 2006 John Wiley & Sons, Ltd.

  13. An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy

    DEFF Research Database (Denmark)

    Merlo, Juan; Wagner, Philippe; Ghith, Nermin

    2016-01-01

    BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting...

  14. Infinite Parameter Estimates in Logistic Regression: Opportunities, not Problems.

    Science.gov (United States)

    Rindskopf, David

    2002-01-01

    Asserts that, in principle, an analyst should be satisfied with infinite slope estimates in logistic regression because they indicate that a predictor is perfect. Using simple approaches, hypothesis tests may be performed and confidence intervals calculated even when a slope is infinite. Some functions of parameters that are infinite are still…

  15. Two-factor logistic regression in pediatric liver transplantation

    Science.gov (United States)

    Uzunova, Yordanka; Prodanova, Krasimira; Spasov, Lyubomir

    2017-12-01

    Using a two-factor logistic regression analysis an estimate is derived for the probability of absence of infections in the early postoperative period after pediatric liver transplantation. The influence of both the bilirubin level and the international normalized ratio of prothrombin time of blood coagulation at the 5th postoperative day is studied.

  16. Bayesian logistic regression in detection of gene–steroid interaction ...

    Indian Academy of Sciences (India)

    cancer. Bayesian logistic regression in PROC GENMOD in SAS statistical software, ver. 9.4 was used to detect gene–steroid interactions influencing cancer. Single marker analysis using PLINK identified 12 SNPs associated with cancer (P < 0.05); especially, SNP rs6532496 revealed the strongest association with cancer ...

  17. Uncertainties in spatially aggregated predictions from a logistic regression model

    NARCIS (Netherlands)

    Horssen, P.W. van; Pebesma, E.J.; Schot, P.P.

    2002-01-01

    This paper presents a method to assess the uncertainty of an ecological spatial prediction model which is based on logistic regression models, using data from the interpolation of explanatory predictor variables. The spatial predictions are presented as approximate 95% prediction intervals. The

  18. Geographically Weighted Logistic Regression Applied to Credit Scoring Models

    Directory of Open Access Journals (Sweden)

    Pedro Henrique Melo Albuquerque

    Full Text Available Abstract This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC), granted to clients residing in the Distrito Federal (DF), to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR) techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters), leading to the conclusion that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated viability in using the GWLR technique to develop credit scoring models for the target population in the study.

  19. A Solution to Separation and Multicollinearity in Multiple Logistic Regression.

    Science.gov (United States)

    Shen, Jianzhao; Gao, Sujuan

    2008-10-01

    In dementia screening tests, item selection for shortening an existing screening test can be achieved using multiple logistic regression. However, maximum likelihood estimates for such logistic regression models often experience serious bias or even non-existence because of separation and multicollinearity problems resulting from a large number of highly correlated items. Firth (1993, Biometrika, 80(1), 27-38) proposed a penalized likelihood estimator for generalized linear models and it was shown to reduce bias and the non-existence problems. Ridge regression has been used in logistic regression to stabilize the estimates in cases of multicollinearity. However, neither approach solves the problem addressed by the other. In this paper, we propose a double penalized maximum likelihood estimator combining Firth's penalized likelihood equation with a ridge parameter. We present a simulation study evaluating the empirical performance of the double penalized likelihood estimator in small to moderate sample sizes. We demonstrate the proposed approach using current screening data from a community-based dementia study.

  20. Determination of riverbank erosion probability using Locally Weighted Logistic Regression

    Science.gov (United States)

    Ioannidou, Elena; Flori, Aikaterini; Varouchakis, Emmanouil A.; Giannakis, Georgios; Vozinaki, Anthi Eirini K.; Karatzas, George P.; Nikolaidis, Nikolaos

    2015-04-01

    Riverbank erosion is a natural geomorphologic process that affects the fluvial environment. The most important issue concerning riverbank erosion is the identification of the vulnerable locations. An alternative to the usual hydrodynamic models to predict vulnerable locations is to quantify the probability of erosion occurrence. This can be achieved by identifying the underlying relations between riverbank erosion and the geomorphological or hydrological variables that prevent or stimulate erosion. Thus, riverbank erosion can be determined by a regression model using independent variables that are considered to affect the erosion process. The impact of such variables may vary spatially, therefore, a non-stationary regression model is preferred instead of a stationary equivalent. Locally Weighted Regression (LWR) is proposed as a suitable choice. This method can be extended to predict the binary presence or absence of erosion based on a series of independent local variables by using the logistic regression model. It is referred to as Locally Weighted Logistic Regression (LWLR). Logistic regression is a type of regression analysis used for predicting the outcome of a categorical dependent variable (e.g. binary response) based on one or more predictor variables. The method can be combined with LWR to assign weights to local independent variables of the dependent one. LWR allows model parameters to vary over space in order to reflect spatial heterogeneity. The probabilities of the possible outcomes are modelled as a function of the independent variables using a logistic function. Logistic regression measures the relationship between a categorical dependent variable and, usually, one or several continuous independent variables by converting the dependent variable to probability scores. Then, a logistic regression is formed, which predicts success or failure of a given binary variable (e.g. erosion presence or absence) for any value of the independent variables. The

  1. Parameter Estimation for Improving Association Indicators in Binary Logistic Regression

    Directory of Open Access Journals (Sweden)

    Mahdi Bashiri

    2012-02-01

    Full Text Available The aim of this paper is to estimate the parameters of binary logistic regression by maximizing the log-likelihood function with improved association indicators. The parameter estimation steps are explained, and then measures of association are introduced and their calculation is analyzed. Moreover, new related indicators based on membership degree level are proposed. Association measures indicate the number of successes relative to failures in a given number of independent Bernoulli trials. In parameter estimation, the values of the existing indicators are not sensitive to the parameter values, whereas the proposed indicators are sensitive to the estimated parameters during the iterative procedure. The contribution of this study is therefore a new association indicator for binary logistic regression that is more sensitive to the estimated parameters while the log-likelihood is maximized in the iterative procedure.

  2. Sugarcane Land Classification with Satellite Imagery using Logistic Regression Model

    Science.gov (United States)

    Henry, F.; Herwindiati, D. E.; Mulyono, S.; Hendryli, J.

    2017-03-01

    This paper discusses the classification of sugarcane plantation area from Landsat-8 satellite imagery. The classification process uses the binary logistic regression method with time series data of the normalized difference vegetation index as input. The process is divided into two steps: training and classification. The purpose of the training step is to identify the best parameters of the regression model using the gradient descent algorithm. The best fit of the model can be utilized to classify sugarcane and non-sugarcane area. The experiment shows high accuracy and successfully maps the sugarcane plantation area, with a best Cohen’s Kappa value of 0.7833 (strong) and 89.167% accuracy.
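
    The training step described in this abstract (fitting the regression parameters by gradient descent, then classifying) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the toy NDVI values, the learning rate, the iteration count, and the function names `fit_logistic_gd` and `predict` are all assumptions made for the example.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_gd(X, y, lr=0.5, n_iter=2000):
    """Fit weights (bias first) by batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(n_iter):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi  # derivative of the log-loss with respect to the logit
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return 1 if the fitted probability is at least 0.5, else 0."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1 if sigmoid(z) >= 0.5 else 0


# Hypothetical toy data: each row is a short NDVI time series for one pixel;
# label 1 stands for "sugarcane", label 0 for "non-sugarcane".
X = [[0.20, 0.30, 0.25], [0.25, 0.20, 0.30], [0.30, 0.35, 0.30],
     [0.60, 0.70, 0.80], [0.70, 0.75, 0.70], [0.65, 0.80, 0.75]]
y = [0, 0, 0, 1, 1, 1]

w = fit_logistic_gd(X, y)
preds = [predict(w, x) for x in X]
```

    On real imagery the inputs would normally be scaled and the model evaluated on held-out pixels; the sketch only shows the shape of the training loop.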

  3. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

    Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. At first, we illustrate a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for the assessment of the performance of a given model by using an example taken from a management study.

  4. Logistic Regression Model on Antenna Control Unit Autotracking Mode

    Science.gov (United States)

    2015-10-20

    y is the logarithm of odds, or log-odds, also known as the logit of probability. Our model derives the logit of probabilities as the linear...partitioned over the control set predictors. This linearity of the logit vs. predictor is an assumption essential to our model. Not only can we...412TW-PA-15240 Logistic Regression Model on Antenna Control Unit Autotracking Mode DANIEL T. LAIRD AIR FORCE TEST CENTER EDWARDS AFB, CA

  5. Predicting company growth using logistic regression and neural networks

    Directory of Open Access Journals (Sweden)

    Marijana Zekić-Sušac

    2016-12-01

    Full Text Available The paper aims to establish an efficient model for predicting company growth by leveraging the strengths of logistic regression and neural networks. A real dataset of Croatian companies was used which described the relevant industry sector, financial ratios, income, and assets in the input space, with a dependent binomial variable indicating whether a company had high-growth if it had annualized growth in assets by more than 20% a year over a three-year period. Due to a large number of input variables, factor analysis was performed in the pre-processing stage in order to extract the most important input components. Building an efficient model with a high classification rate and explanatory ability required application of two data mining methods: logistic regression as a parametric and neural networks as a non-parametric method. The methods were tested on the models with and without variable reduction. The classification accuracy of the models was compared using statistical tests and ROC curves. The results showed that neural networks produce a significantly higher classification accuracy in the model when incorporating all available variables. The paper further discusses the advantages and disadvantages of both approaches, i.e. logistic regression and neural networks in modelling company growth. The suggested model is potentially of benefit to investors and economic policy makers as it provides support for recognizing companies with growth potential, especially during times of economic downturn.

  6. Classifying machinery condition using oil samples and binary logistic regression

    Science.gov (United States)

    Phillips, J.; Cripps, E.; Lau, John W.; Hodkiewicz, M. R.

    2015-08-01

    The era of big data has resulted in an explosion of condition monitoring information. The result is an increasing motivation to automate the costly and time consuming human elements involved in the classification of machine health. When working with industry it is important to build an understanding and hence some trust in the classification scheme for those who use the analysis to initiate maintenance tasks. Typically, "black box" approaches such as artificial neural networks (ANN) and support vector machines (SVM) are difficult to interpret. In contrast, this paper argues that logistic regression offers easy interpretability to industry experts, providing insight to the drivers of the human classification process and to the ramifications of potential misclassification. Of course, accuracy is of foremost importance in any automated classification scheme, so we also provide a comparative study based on predictive performance of logistic regression, ANN and SVM. A real world oil analysis data set from engines on mining trucks is presented and using cross-validation we demonstrate that logistic regression out-performs the ANN and SVM approaches in terms of prediction for healthy/not healthy engines.

  7. Exploring the Characteristics of Personal Victims Using the National Crime Victimization Survey

    National Research Council Canada - National Science Library

    Jairam, Shashi

    1998-01-01

    .... Two statistical methods were used to investigate these hypotheses, logistic regression for victimization prevalence, and negative binomial regression for victimization incidence and concentration...

  8. Prediction of unwanted pregnancies using logistic regression, probit regression and discriminant analysis.

    Science.gov (United States)

    Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon

    2015-01-01

    Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended.
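
    The comparison indicators named in this abstract (sensitivity, specificity, percentage of correct predictions, and area under the ROC curve) can be computed from predicted probabilities as in the sketch below. The labels and scores are invented toy values, the 0.5 cut-off is an assumption, and the function names `confusion_rates` and `roc_auc` are hypothetical; the study itself used SPSS.

```python
def confusion_rates(y_true, y_score, threshold=0.5):
    """Return (sensitivity, specificity, accuracy) at a given probability cut-off."""
    tp = fn = tn = fp = 0
    for t, s in zip(y_true, y_score):
        pred = 1 if s >= threshold else 0
        if t == 1 and pred == 1:
            tp += 1
        elif t == 1:
            fn += 1
        elif pred == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn)   # share of true positives that are detected
    specificity = tn / (tn + fp)   # share of true negatives that are detected
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def roc_auc(y_true, y_score):
    """Area under the ROC curve via the Mann-Whitney pair-counting identity.

    It equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count one half).
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = event, 0 = no event; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]

sens, spec, acc = confusion_rates(y_true, y_score)
auc = roc_auc(y_true, y_score)
```

    Unlike sensitivity and specificity, the AUC does not depend on the chosen cut-off, which is why it is convenient for comparing models such as logistic regression, probit regression, and discriminant analysis.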

  9. Robust Logistic and Probit Methods for Binary and Multinomial Regression.

    Science.gov (United States)

    Tabatabai, M A; Li, H; Eby, W M; Kengwoung-Keumo, J J; Manne, U; Bae, S; Fouad, M; Singh, K P

    In this paper we introduce new robust estimators for the logistic and probit regressions for binary, multinomial, nominal and ordinal data and apply these models to estimate the parameters when outliers or influential observations are present. Maximum likelihood estimates don't behave well when outliers or influential observations are present. One remedy is to remove influential observations from the data and then apply the maximum likelihood technique on the deleted data. Another approach is to employ a robust technique that can handle outliers and influential observations without removing any observations from the data sets. The robustness of the method is tested using real and simulated data sets.

  10. Determining the Probability of Study Program Graduate Quality Using Logistic Regression

    Directory of Open Access Journals (Sweden)

    Maxsi Ary

    2016-03-01

    Full Text Available Abstract – Human resources (HR) are one of the success factors in the economic field, namely how to create human resources that are qualified, skilled, and highly competitive in global competition. The educational level of the existing labor force is still relatively low. The educational structure of the Indonesian workforce is still dominated by basic education, at about 63.2%. The issue raised is to determine the probability of a study program being good or not by looking at several ratios of the number of graduates to the number of students per cohort and at the class quota size (large or small), using logistic regression models. Data were obtained from a search of data on the number of study program students and graduates in 2010. The data were processed using SPSS. The analysis assesses model fit, and results are given for each assessment, starting with the hypothesis for assessing model fit, the -2LogL statistic, Cox and Snell's R Square, Hosmer and Lemeshow's Goodness of Fit Test, and the classification table. The analysis, using SPSS as a tool, is aimed at measuring the quality of graduates of study programs at a university, college, or academy (good or not) based on the ratio of the number of graduates and class quotas. Keywords: Quota Class, Probability, Logistic Regression

  11. Logistic regression in estimates of femoral neck fracture by fall

    Directory of Open Access Journals (Sweden)

    Jaroslava Wendlová

    2010-04-01

    Full Text Available Jaroslava Wendlová, Derer’s University Hospital and Policlinic, Osteological Unit, Bratislava, Slovakia. Abstract: The latest methods in estimating the probability (absolute risk) of osteoporotic fractures include several logistic regression models, based on qualitative risk factors plus bone mineral density (BMD), and the probability estimate of fracture in the future. The Slovak logistic regression model, in contrast to other models, is created from quantitative variables of the proximal femur (in International System of Units) and estimates the probability of fracture by fall. Objectives: The first objective of this study was to order selected independent variables according to the intensity of their influence (statistical significance) upon the occurrence of values of the dependent variable: femur strength index (FSI). The second objective was to determine, using logistic regression, whether the odds of FSI acquiring a pathological value (femoral neck fracture by fall) increased or declined if the value of the variables (T-score total hip, BMI, alpha angle, theta angle and HAL) were raised by one unit. Patients and methods: Bone densitometer measurements using dual energy X-ray absorptiometry (DXA) (Prodigy, Primo, GE, USA) of the left proximal femur were obtained from 3 216 East Slovak women with primary or secondary osteoporosis or osteopenia, aged 20–89 years (mean age 58.9; 95% CI: 58.42; 59.38). The following variables were measured: FSI, T-score total hip BMD, body mass index (BMI), as were the geometrical variables of proximal femur alpha angle (α angle), theta angle (θ angle), and hip axis length (HAL). Statistical analysis: Logistic regression was used to measure the influence of the independent variables (T-score total hip, alpha angle, theta angle, HAL, BMI) upon the dependent variable (FSI). Results: The order of independent variables according to the intensity of their influence (greatest to least) upon the occurrence of values of the

  12. Landslide Hazard Mapping in Rwanda Using Logistic Regression

    Science.gov (United States)

    Piller, A.; Anderson, E.; Ballard, H.

    2015-12-01

    Landslides in the United States cause more than $1 billion in damages and 50 deaths per year (USGS 2014). Globally, figures are much more grave, yet monitoring, mapping and forecasting of these hazards are less than adequate. Seventy-five percent of the population of Rwanda earns a living from farming, mostly subsistence. Loss of farmland, housing, or life, to landslides is a very real hazard. Landslides in Rwanda have an impact at the economic, social, and environmental level. In a developing nation that faces challenges in tracking, cataloging, and predicting the numerous landslides that occur each year, satellite imagery and spatial analysis allow for remote study. We have focused on the development of a landslide inventory and a statistical methodology for assessing landslide hazards. Using logistic regression on approximately 30 test variables (i.e. slope, soil type, land cover, etc.) and a sample of over 200 landslides, we determine which variables are statistically most relevant to landslide occurrence in Rwanda. A preliminary predictive hazard map for Rwanda has been produced, using the variables selected from the logistic regression analysis.

  13. The intermediate endpoint effect in logistic and probit regression.

    Science.gov (United States)

    MacKinnon, D P; Lockwood, C M; Brown, C H; Wang, W; Hoffman, J M

    2007-01-01

    An intermediate endpoint is hypothesized to be in the middle of the causal sequence relating an independent variable to a dependent variable. The intermediate variable is also called a surrogate or mediating variable and the corresponding effect is called the mediated, surrogate endpoint, or intermediate endpoint effect. Clinical studies are often designed to change an intermediate or surrogate endpoint and through this intermediate change influence the ultimate endpoint. In many intermediate endpoint clinical studies the dependent variable is binary, and logistic or probit regression is used. The purpose of this study is to describe a limitation of a widely used approach to assessing intermediate endpoint effects and to propose an alternative method, based on products of coefficients, that yields more accurate results. The intermediate endpoint model for a binary outcome is described for a true binary outcome and for a dichotomization of a latent continuous outcome. Plots of true values and a simulation study are used to evaluate the different methods. Distorted estimates of the intermediate endpoint effect and incorrect conclusions can result from the application of widely used methods to assess the intermediate endpoint effect. The same problem occurs for the proportion of an effect explained by an intermediate endpoint, which has been suggested as a useful measure for identifying intermediate endpoints. A solution to this problem is given based on the relationship between latent variable modeling and logistic or probit regression. More complicated intermediate variable models are not addressed in the study, although the methods described in the article can be extended to these more complicated models. Researchers are encouraged to use an intermediate endpoint method based on the product of regression coefficients. A common method based on difference in coefficient methods can lead to distorted conclusions regarding the intermediate effect.

  14. Naive Bayes vs Logistic Regression: Theory, Implementation and Experimental Validation

    Directory of Open Access Journals (Sweden)

    Tapan Kumar Bhowmik

    2015-12-01

    Full Text Available This article presents the theoretical derivation as well as practical steps for implementing Naive Bayes (NB) and Logistic Regression (LR) classifiers. A generative learning under Gaussian Naive Bayes assumption and two discriminative learning techniques based on gradient ascent and Newton-Raphson methods are described to estimate the parameters of LR. Some limitations of the learning techniques and implementation issues are discussed as well. A set of experiments is performed for both classifiers under different learning circumstances and their performances are compared. From the experiments, it is observed that LR learning with the gradient ascent technique outperforms the general NB classifier. However, under the Gaussian Naive Bayes assumption, both classifiers, NB and LR, perform similarly.
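
    Of the two discriminative techniques mentioned in this abstract, Newton-Raphson is the less obvious to implement. The sketch below fits a logistic regression with an intercept and a single predictor, so the 2x2 linear system can be solved by hand. It is an illustration under stated assumptions, not the article's code: the data are invented, and the name `fit_logistic_newton`, the iteration cap, and the tolerance are hypothetical choices.

```python
import math


def fit_logistic_newton(x, y, n_iter=25, tol=1e-10):
    """Maximize the log-likelihood of logit(p) = b0 + b1 * x by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Information matrix X'WX with weights p * (1 - p).
        wts = [pi * (1.0 - pi) for pi in p]
        h00 = sum(wts)
        h01 = sum(wi * xi for wi, xi in zip(wts, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(wts, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (X'WX) * delta = score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Invented, deliberately non-separable data, so a finite maximum exists.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [0, 0, 1, 0, 1, 1]

b0, b1 = fit_logistic_newton(x, y)
```

    Gradient ascent would use only the score vector with a fixed step size; Newton-Raphson rescales the step by the inverse information matrix, which usually converges in a handful of iterations but fails when the classes are perfectly separable.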

  15. Logistic Regression in the Identification of Hazards in Construction

    Science.gov (United States)

    Drozd, Wojciech

    2017-10-01

    The construction site and its elements create circumstances that are conducive to the formation of risks to safety during the execution of works. Analysis indicates the critical importance of these factors in the set of characteristics that describe the causes of accidents in the construction industry. This article attempts to analyse the characteristics related to the construction site, in order to indicate their importance in defining the circumstances of accidents at work. The study includes sites inspected in 2014 - 2016 by the employees of the District Labour Inspectorate in Krakow (Poland). The analysed set of detailed (disaggregated) data includes both quantitative and qualitative characteristics. The substantive task focused on classification modelling in the identification of hazards in construction and identifying those of the analysed characteristics that are important in an accident. In terms of methodology, resource data analysis using statistical classifiers, in the form of logistic regression, was the method used.

  16. Forecast Model of Urban Stagnant Water Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Liu Pan

    2017-01-01

    Full Text Available With the development of information technology, the construction of water resource system has been gradually carried out. In the background of big data, the work of water information needs to carry out the process of quantitative to qualitative change. Analyzing the correlation of data and exploring the deep value of data are the key to research on water information. On the basis of the research on the water big data and the traditional data warehouse architecture, we try to find out the connection between different data sources. According to the temporal and spatial correlation of stagnant water and rainfall, we use spatial interpolation to integrate data on stagnant water and rainfall from different data sources and different sensors, then use logistic regression to find out the relationship between them.

  17. Predictions of flood warning threshold exceedance computed with logistic regression

    Science.gov (United States)

    Diomede, Tommaso; Marsigli, Chiara; Stefania Tesini, Maria

    2017-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at different lead times (from +6h up to +42h). The aim of the study is to provide a valuable tool for the issue of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation ("state of the catchment"), (ii) the last 24 hours, which may be relevant for the current water level in the river ("state of the river"), and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs ("state of the atmosphere"). Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the current "state of the river" can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18-42 hours, or the previous 6-12h, according to the forecast lead time. The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over several catchments in the Emilia-Romagna Region, northern Italy, whose dimensions range from 100 to 1000 km². A statistical analysis in terms of false alarms, misses and related scores was carried out by using

  18. Application of Bayesian logistic regression to mining biomedical data.

    Science.gov (United States)

    Avali, Viji R; Cooper, Gregory F; Gopalakrishnan, Vanathi

    2014-01-01

    Mining high dimensional biomedical data with existing classifiers is challenging and the predictions are often inaccurate. We investigated the use of Bayesian Logistic Regression (B-LR) for mining such data to predict and classify various disease conditions. The analysis was done on twelve biomedical datasets with binary class variables and the performance of B-LR was compared to those from other popular classifiers on these datasets with 10-fold cross validation using the WEKA data mining toolkit. The statistical significance of the results was analyzed by paired two tailed t-tests and non-parametric Wilcoxon signed-rank tests. We observed overall that B-LR with non-informative Gaussian priors performed on par with other classifiers in terms of accuracy, balanced accuracy and AUC. These results suggest that it is worthwhile to explore the application of B-LR to predictive modeling tasks in bioinformatics using informative biological prior probabilities. With informative prior probabilities, we conjecture that the performance of B-LR will improve.

  19. Autoregressive logistic regression applied to atmospheric circulation patterns

    Science.gov (United States)

    Guanche, Y.; Mínguez, R.; Méndez, F. J.

    2014-01-01

    Autoregressive logistic regression models have been successfully applied in medical and pharmacology research fields, and in simple models to analyze weather types. The main purpose of this paper is to introduce a general framework to study atmospheric circulation patterns capable of dealing simultaneously with: seasonality, interannual variability, long-term trends, and autocorrelation of different orders. To show its effectiveness on modeling performance, daily atmospheric circulation patterns identified from observed sea level pressure fields over the Northeastern Atlantic, have been analyzed using this framework. Model predictions are compared with probabilities from the historical database, showing very good fitting diagnostics. In addition, the fitted model is used to simulate the evolution over time of atmospheric circulation patterns using Monte Carlo method. Simulation results are statistically consistent with respect to the historical sequence in terms of (1) probability of occurrence of the different weather types, (2) transition probabilities and (3) persistence. The proposed model constitutes an easy-to-use and powerful tool for a better understanding of the climate system.

  20. Sample size determination for logistic regression on a logit-normal distribution.

    Science.gov (United States)

    Kim, Seongho; Heath, Elisabeth; Heilbrun, Lance

    2017-06-01

    Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination (R²) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for R² for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

  1. A Comparative Study of Cox Regression vs. Log-Logistic ...

    African Journals Online (AJOL)

    Colorectal cancer is a common and lethal disease with different incidence rates in different parts of the world, and is considered the third cause of cancer-related deaths. In the present study, using non-parametric Cox model and parametric Log-logistic model, factors influencing survival of patients with colorectal ...

  2. The effect of high leverage points on the logistic ridge regression estimator having multicollinearity

    Science.gov (United States)

    Ariffin, Syaiba Balqish; Midi, Habshah

    2014-06-01

    This article is concerned with the performance of the logistic ridge regression estimation technique in the presence of multicollinearity and high leverage points. In logistic regression, multicollinearity exists among predictors and in the information matrix. The maximum likelihood estimator suffers a huge setback in the presence of multicollinearity, which causes regression estimates to have unduly large standard errors. To remedy this problem, a logistic ridge regression estimator is put forward. It is evident that the logistic ridge regression estimator outperforms the maximum likelihood approach for handling multicollinearity. The effect of high leverage points on the performance of the logistic ridge regression estimator is then investigated through a real data set and a simulation study. The findings signify that the logistic ridge regression estimator fails to provide better parameter estimates in the presence of both high leverage points and multicollinearity.

  3. On the Usefulness of a Multilevel Logistic Regression Approach to Person-Fit Analysis

    Science.gov (United States)

    Conijn, Judith M.; Emons, Wilco H. M.; van Assen, Marcel A. L. M.; Sijtsma, Klaas

    2011-01-01

    The logistic person response function (PRF) models the probability of a correct response as a function of the item locations. Reise (2000) proposed to use the slope parameter of the logistic PRF as a person-fit measure. He reformulated the logistic PRF model as a multilevel logistic regression model and estimated the PRF parameters from this…

  4. A New Hybrid Method Logistic Regression and Feedforward Neural Network for Lung Cancer Data

    OpenAIRE

    Taner Tunç

    2012-01-01

    Logistic regression (LR) is a conventional statistical technique used for data classification problems. Logistic regression is a model-based method, and it uses a nonlinear model structure. Another technique used for classification is feedforward artificial neural networks. A feedforward artificial neural network is a data-based method which can model nonlinear models through its activation function. In this study, a hybrid approach of model-based logistic regression technique and data-based artif...

  5. Estimating Engineering and Manufacturing Development Cost Risk Using Logistic and Multiple Regression

    National Research Council Canada - National Science Library

    Bielecki, John

    2003-01-01

    .... Previous research has demonstrated that a two-step logistic and multiple regression methodology for predicting cost growth produces desirable results versus traditional single-step regression...

  6. An Entropy-Based Measure for Assessing Fuzziness in Logistic Regression

    Science.gov (United States)

    Weiss, Brandi A.; Dardick, William

    2016-01-01

    This article introduces an entropy-based measure of data-model fit that can be used to assess the quality of logistic regression models. Entropy has previously been used in mixture-modeling to quantify how well individuals are classified into latent classes. The current study proposes the use of entropy for logistic regression models to quantify…

  7. Power and Sample Size Calculations for Logistic Regression Tests for Differential Item Functioning

    Science.gov (United States)

    Li, Zhushan

    2014-01-01

    Logistic regression is a popular method for detecting uniform and nonuniform differential item functioning (DIF) effects. Theoretical formulas for the power and sample size calculations are derived for likelihood ratio tests and Wald tests based on the asymptotic distribution of the maximum likelihood estimators for the logistic regression model.…

  8. Comparison of standard maximum likelihood classification and polytomous logistic regression used in remote sensing

    Science.gov (United States)

    John Hogland; Nedret Billor; Nathaniel Anderson

    2013-01-01

    Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...

  9. What Are the Odds of that? A Primer on Understanding Logistic Regression

    Science.gov (United States)

    Huang, Francis L.; Moon, Tonya R.

    2013-01-01

    The purpose of this Methodological Brief is to present a brief primer on logistic regression, a commonly used technique when modeling dichotomous outcomes. Using data from the National Education Longitudinal Study of 1988 (NELS:88), logistic regression techniques were used to investigate student-level variables in eighth grade (i.e., enrolled in a…

  10. Comparing Methodologies for Developing an Early Warning System: Classification and Regression Tree Model versus Logistic Regression. REL 2015-077

    Science.gov (United States)

    Koon, Sharon; Petscher, Yaacov

    2015-01-01

    The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…

  11. Dynamic logistic regression and dynamic model averaging for binary classification.

    Science.gov (United States)

    McCormick, Tyler H; Raftery, Adrian E; Madigan, David; Burd, Randall S

    2012-03-01

    We propose an online binary classification procedure for cases when there is uncertainty about the model to use and parameters within a model change over time. We account for model uncertainty through dynamic model averaging, a dynamic extension of Bayesian model averaging in which posterior model probabilities may also change with time. We apply a state-space model to the parameters of each model and we allow the data-generating model to change over time according to a Markov chain. Calibrating a "forgetting" factor accommodates different levels of change in the data-generating mechanism. We propose an algorithm that adjusts the level of forgetting in an online fashion using the posterior predictive distribution, and so accommodates various levels of change at different times. We apply our method to data from children with appendicitis who receive either a traditional (open) appendectomy or a laparoscopic procedure. Factors associated with which children receive a particular type of procedure changed substantially over the 7 years of data collection, a feature that is not captured using standard regression modeling. Because our procedure can be implemented completely online, future data collection for similar studies would require storing sensitive patient information only temporarily, reducing the risk of a breach of confidentiality. © 2011, The International Biometric Society.

  12. Logistic regression models for polymorphic and antagonistic pleiotropic gene action on human aging and longevity

    DEFF Research Database (Denmark)

    Tan, Qihua; Bathum, L; Christiansen, L

    2003-01-01

    In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...

  13. Analysis of Success in General Chemistry Based on Diagnostic Testing Using Logistic Regression

    Science.gov (United States)

    Legg, Margaret J.; Legg, Jason C.; Greenbowe, Thomas J.

    2001-08-01

    Several chemistry diagnostic and placement exams are used to help place chemistry students in an appropriate course or to determine strengths and weaknesses for specific topics in chemistry or math. The purpose of obtaining pre-course measurements is to increase students' academic success. Often these tests are used to predict the chance a student has in passing a course. This paper discusses the statistical methods of logistic regression applied to predicting the probability of passing a course, based on the scores on the California Chemistry Diagnostic Test at two different institutions with two different instructors over multiple years. This technique describes the relation of a test score (a continuous variable) to the probability of passing the class (a binary variable). Many papers in the Journal of Chemical Education have used a simple linear regression technique to correlate placement test scores with the proportion of students passing a course. The model assumptions are difficult to satisfy when using simple linear regression. Simple linear regression is useful when continuous predictor variables predict a continuous response, whereas logistic regression is useful when continuous predictor variables predict a binary response. Differences between simple linear regression and logistic regression and methods for evaluating linear regression model assumptions are discussed in detail. The fundamental concepts behind regression are described, with the caveats of using regression equations for predictions. By using logistic regression, instructors will be able to provide students with an estimate of their probability of passing the course.
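
    As a minimal sketch of the idea described above (a continuous test score predicting a binary pass/fail outcome), the following fits a one-predictor logistic regression by gradient ascent on the log-likelihood. The scores and outcomes are invented for illustration and are not data from the paper; the function names, learning rate and iteration count are likewise assumptions.

```python
import math


def fit_logistic(x, y, lr=1.0, n_iter=5000):
    """Fit P(y=1) = 1 / (1 + exp(-(b0 + b1*x))) by gradient ascent.

    The predictor is standardized internally for numerical stability;
    the returned intercept and slope are on the original score scale.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    z = [(v - mean) / sd for v in x]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * zi))) for zi in z]
        # Gradient of the mean log-likelihood with respect to b0 and b1.
        g0 = sum(yi - pi for yi, pi in zip(y, p)) / n
        g1 = sum((yi - pi) * zi for yi, pi, zi in zip(y, p, z)) / n
        b0 += lr * g0
        b1 += lr * g1
    slope = b1 / sd
    intercept = b0 - b1 * mean / sd
    return intercept, slope


def pass_probability(score, intercept, slope):
    """Estimated probability of passing for a given test score."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


# Hypothetical diagnostic-test scores and pass (1) / fail (0) outcomes.
scores = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]
passed = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

intercept, slope = fit_logistic(scores, passed)
```

    Calling pass_probability(score, intercept, slope) then gives the kind of per-student estimate the abstract describes; unlike a straight-line fit to pass proportions, the result always lies between 0 and 1.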

  14. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    OpenAIRE

    Soyoung Park; Se-Yeong Hamm; Hang-Tak Jeon; Jinsoo Kim

    2017-01-01

    This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16...

  15. A New Hybrid Method Logistic Regression and Feedforward Neural Network for Lung Cancer Data

    Directory of Open Access Journals (Sweden)

    Taner Tunç

    2012-01-01

    Logistic regression (LR) is a conventional statistical technique used for data classification problems. Logistic regression is a model-based method, and it uses a nonlinear model structure. Another technique used for classification is feedforward artificial neural networks. A feedforward artificial neural network is a data-based method which can model nonlinear models through its activation function. In this study, a hybrid approach of model-based logistic regression technique and data-based artificial neural network was proposed for classification purposes. The proposed approach was applied to lung cancer data, and obtained results were compared. It was seen that the proposed hybrid approach was superior to logistic regression and feedforward artificial neural networks with respect to many criteria.

  16. How to deal with continuous and dichotomic outcomes in epidemiological research: linear and logistic regression analyses

    NARCIS (Netherlands)

    Tripepi, Giovanni; Jager, Kitty J.; Stel, Vianda S.; Dekker, Friedo W.; Zoccali, Carmine

    2011-01-01

    Because of some limitations of stratification methods, epidemiologists frequently use multiple linear and logistic regression analyses to address specific epidemiological questions. If the dependent variable is a continuous one (for example, systolic pressure and serum creatinine), the researcher

  17. Logistic Regression Analysis of Operational Errors and Routine Operations Using Sector Characteristics

    National Research Council Canada - National Science Library

    Pfleiderer, Elaine M; Scroggins, Cheryl L; Manning, Carol A

    2009-01-01

    Two separate logistic regression analyses were conducted for low- and high-altitude sectors to determine whether a set of dynamic sector characteristics variables could reliably discriminate between operational error (OE...

  18. Logistic regression models of factors influencing the location of bioenergy and biofuels plants

    Science.gov (United States)

    T.M. Young; R.L. Zaretzki; J.H. Perdue; F.M. Guess; X. Liu

    2011-01-01

    Logistic regression models were developed to identify significant factors that influence the location of existing wood-using bioenergy/biofuels plants and traditional wood-using facilities. Logistic models provided quantitative insight for variables influencing the location of woody biomass-using facilities. Availability of "thinnings to a basal area of 31.7m2/ha...

  19. Machine Learning, Linear and Bayesian Models for Logistic Regression in Failure Detection Problems

    OpenAIRE

    Pavlyshenko, B.

    2016-01-01

    In this work, we study the use of logistic regression in manufacturing failures detection. As a data set for the analysis, we used the data from Kaggle competition Bosch Production Line Performance. We considered the use of machine learning, linear and Bayesian models. For machine learning approach, we analyzed XGBoost tree based classifier to obtain high scored classification. Using the generalized linear model for logistic regression makes it possible to analyze the influence of the factors...

  20. Factors Influencing Drug Injection History among Prisoners: A Comparison between Classification and Regression Trees and Logistic Regression Analysis.

    Science.gov (United States)

    Rastegari, Azam; Haghdoost, Ali Akbar; Baneshi, Mohammad Reza

    2013-01-01

    Due to the importance of medical studies, researchers of this field should be familiar with various types of statistical analyses to select the most appropriate method based on the characteristics of their data sets. Classification and regression trees (CARTs) can be complementary to regression models. We compared the performance of a logistic regression model and a CART in predicting drug injection among prisoners. Data of 2720 Iranian prisoners was studied to determine the factors influencing drug injection. The collected data was divided into training and testing groups. A logistic regression model and a CART were applied on training data. The performance of the two models was then evaluated on testing data. The regression model and the CART had 8 and 4 significant variables, respectively. Overall, heroin use, history of imprisonment, age at first drug use, and marital status were important factors in determining the history of drug injection. Subjects without the history of heroin use or heroin users with short-term imprisonment were at lower risk of drug injection. Among heroin addicts with long-term imprisonment, individuals with higher age at first drug use and married subjects were at lower risk of drug injection. Although the logistic regression model was more sensitive than the CART, the two models had the same levels of specificity and classification accuracy. In this study, both sensitivity and specificity were important. While the logistic regression model had better performance, the graphical presentation of the CART simplifies the interpretation of the results. In general, a combination of different analytical methods is recommended to explore the effects of variables.

  1. Mortality risk prediction in burn injury: Comparison of logistic regression with machine learning approaches.

    Science.gov (United States)

    Stylianou, Neophytos; Akbarov, Artur; Kontopantelis, Evangelos; Buchan, Iain; Dunn, Ken W

    2015-08-01

    Predicting mortality from burn injury has traditionally employed logistic regression models. Alternative machine learning methods have been introduced in some areas of clinical prediction as the necessary software and computational facilities have become accessible. Here we compare logistic regression and machine learning predictions of mortality from burn. An established logistic mortality model was compared to machine learning methods (artificial neural network, support vector machine, random forests and naïve Bayes) using a population-based (England & Wales) case-cohort registry. Predictive evaluation used: area under the receiver operating characteristic curve; sensitivity; specificity; positive predictive value and Youden's index. All methods had comparable discriminatory abilities, similar sensitivities, specificities and positive predictive values. Although some machine learning methods performed marginally better than logistic regression the differences were seldom statistically significant and clinically insubstantial. Random forests were marginally better for high positive predictive value and reasonable sensitivity. Neural networks yielded slightly better prediction overall. Logistic regression gives an optimal mix of performance and interpretability. The established logistic regression model of burn mortality performs well against more complex alternatives. Clinical prediction with a small set of strong, stable, independent predictors is unlikely to gain much from machine learning outside specialist research contexts. Copyright © 2015 Elsevier Ltd and ISBI. All rights reserved.
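
    The threshold-based evaluation measures named above (sensitivity, specificity, positive predictive value and Youden's index) can be sketched from the four cells of a confusion matrix. The counts below are hypothetical and are not taken from the burn registry; the function name is an assumption.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion counts.

    tp/fn: events (e.g. deaths) predicted correctly / missed.
    tn/fp: non-events predicted correctly / falsely flagged.
    """
    sensitivity = tp / (tp + fn)   # share of true events detected
    specificity = tn / (tn + fp)   # share of non-events correctly cleared
    ppv = tp / (tp + fp)           # share of positive predictions that are right
    youden = sensitivity + specificity - 1.0  # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "youden": youden,
    }


# Hypothetical counts, chosen only for illustration.
metrics = classification_metrics(tp=40, fp=5, tn=45, fn=10)
```

    Area under the ROC curve, the remaining measure in the abstract, summarizes these quantities over all possible thresholds rather than at a single one, so it is not computed here.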

  2. Analysis of Differential Item Functioning (DIF) Using Hierarchical Logistic Regression Models.

    Science.gov (United States)

    Swanson, David B.; Clauser, Brian E.; Case, Susan M.; Nungester, Ronald J.; Featherman, Carol

    2002-01-01

    Outlines an approach to differential item functioning (DIF) analysis using hierarchical linear regression that makes it possible to combine results of logistic regression analyses across items to identify consistent sources of DIF, to quantify the proportion of explained variation in DIF coefficients, and to compare the predictive accuracy of…

  3. Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.

    Science.gov (United States)

    Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai

    2017-04-01

    This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The prediction of the impact on the exhibits during certain pollution scenarios (environmental impact) was calculated by a mathematical model based on the binary logistic regression; it allows the identification of those environmental parameters from a multitude of possible parameters with a significant impact on exhibitions and ranks them according to their severity effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was used on 794 data combinations (715 to develop the model and 79 to validate it) by a Statistical Package for Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that from six parameters taken into consideration, four of them present a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model, developed in this study, correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space. The paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive

  4. Modelling of binary logistic regression for obesity among secondary students in a rural area of Kedah

    Science.gov (United States)

    Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.

    2014-07-01

    Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used in selecting important predictor variables, diagnostics are used to check that assumptions are valid which include independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographics profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
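
    The statement that the log odds are a linear combination of the predictors implies that exponentiating a coefficient gives an odds ratio. A minimal sketch, assuming a single binary predictor (family history of obesity) and made-up coefficient values that are not estimates from the study:

```python
import math


def inv_logit(eta):
    """Convert log odds (the linear predictor) into a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


# Hypothetical coefficients: intercept and effect of family history.
b0 = -1.5
b_family = 0.9

p_without = inv_logit(b0)            # family_history = 0
p_with = inv_logit(b0 + b_family)    # family_history = 1

# The ratio of the two odds equals exp(b_family), whatever b0 is.
odds_ratio = odds(p_with) / odds(p_without)
```

    Because the intercept cancels in the ratio, the odds ratio describes the association between the predictor and the outcome without reference to the baseline risk.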

  5. Modeling Governance KB with CATPCA to Overcome Multicollinearity in the Logistic Regression

    Science.gov (United States)

    Khikmah, L.; Wijayanto, H.; Syafitri, U. D.

    2017-04-01

    Multicollinearity is a problem often encountered in logistic regression modeling. Multicollinearity between explanatory variables results in biased parameter estimates and also leads to classification errors. In general, stepwise regression is used to overcome multicollinearity in regression. Another method, which involves all variables in prediction, is Principal Component Analysis (PCA). However, classical PCA applies only to numeric data. When the data are categorical, one method to solve the problem is Categorical Principal Component Analysis (CATPCA). The data used in this research were part of the Demographic and Population Survey Indonesia (IDHS) 2012. This research focuses on the characteristics of women using contraceptive methods. Classification results were evaluated using Area Under Curve (AUC) values; the higher the AUC value, the better. Based on AUC values, classification of contraceptive method use with the stepwise method (58.66%) is better than with the logistic regression model (57.39%) and CATPCA (57.39%). Evaluating the results using sensitivity shows the opposite: the CATPCA method (99.79%) is better than the logistic regression method (92.43%) and stepwise (92.05%). Because this study focuses on classifying the major class (using a contraceptive method), the selected model is CATPCA, as it can raise the accuracy for the major class.

  6. Using the Logistic Regression model in supporting decisions of establishing marketing strategies

    Directory of Open Access Journals (Sweden)

    Cristinel CONSTANTIN

    2015-12-01

    This paper presents instrumental research on the use of the Logistic Regression model for data analysis in marketing research. Decision makers inside different organisations need relevant information to support their decisions regarding marketing strategies. The data provided by marketing research can be processed in various ways, but multivariate data analysis models can enhance the utility of the information. Among these models is the Logistic Regression model, which is used for dichotomous variables. Our research explains the utility of this model and the interpretation of the resulting information in order to help practitioners and researchers use it in their future investigations.

  7. Construction of risk prediction model of type 2 diabetes mellitus based on logistic regression

    Directory of Open Access Journals (Sweden)

    Li Jian

    2017-01-01

    Objective: to construct a multi-factor prediction model for the individual risk of T2DM, and to explore new ideas for early warning, prevention and personalized health services for T2DM. Methods: logistic regression techniques were used to screen the risk factors for T2DM and construct the risk prediction model. Results: the risk prediction equation for males was logit(P) = BMI × 0.735 + vegetables × (−0.671) + age × 0.838 + diastolic pressure × 0.296 + physical activity × (−2.287) + sleep × (−0.009) + smoking × 0.214; the risk prediction equation for females was logit(P) = BMI × 1.979 + vegetables × (−0.292) + age × 1.355 + diastolic pressure × 0.522 + physical activity × (−2.287) + sleep × (−0.010). For males, the area under the ROC curve was 0.83, the sensitivity was 0.72 and the specificity was 0.86; for females, the area under the ROC curve was 0.84, the sensitivity was 0.75 and the specificity was 0.90. Conclusion: the data for this model come from a nested case comparison study; the risk prediction model was built using the mature logistic regression technique, and it shows high predictive sensitivity, specificity and stability.
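
    As a rough sketch of how an equation of this form is evaluated, the following uses the male coefficients quoted in the abstract. Two things are assumptions: the abstract does not say how each predictor is coded, so the input values below are placeholders, and no intercept is reported, so it is exposed as a parameter defaulting to 0. The resulting number is therefore illustrative only and is not a usable risk estimate.

```python
import math

# Coefficients as quoted in the abstract for the male equation.
MALE_COEFS = {
    "bmi": 0.735,
    "vegetables": -0.671,
    "age": 0.838,
    "diastolic_pressure": 0.296,
    "physical_activity": -2.287,
    "sleep": -0.009,
    "smoking": 0.214,
}


def logit_score(values, coefs, intercept=0.0):
    """Linear predictor: intercept plus the sum of coefficient * coded value."""
    return intercept + sum(coefs[name] * values[name] for name in coefs)


def to_probability(score):
    """Inverse-logit transformation of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-score))


# Placeholder coded values (the real coding scheme is not given).
example = {
    "bmi": 1,
    "vegetables": 0,
    "age": 1,
    "diastolic_pressure": 0,
    "physical_activity": 1,
    "sleep": 0,
    "smoking": 0,
}

score = logit_score(example, MALE_COEFS)
probability = to_probability(score)
```

    The negative coefficients (vegetables, physical activity, sleep) lower the linear predictor and hence the estimated risk, while the positive ones raise it.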

  8. Multiple Logistic Regression Analysis of Cigarette Use among High School Students

    Science.gov (United States)

    Adwere-Boamah, Joseph

    2011-01-01

    A binary logistic regression analysis was performed to predict high school students' cigarette smoking behavior from selected predictors from 2009 CDC Youth Risk Behavior Surveillance Survey. The specific target student behavior of interest was frequent cigarette use. Five predictor variables included in the model were: a) race, b) frequency of…

  9. A Comparison of Logistic Regression, Neural Networks, and Classification Trees Predicting Success of Actuarial Students

    Science.gov (United States)

    Schumacher, Phyllis; Olinsky, Alan; Quinn, John; Smith, Richard

    2010-01-01

    The authors extended previous research by 2 of the authors who conducted a study designed to predict the successful completion of students enrolled in an actuarial program. They used logistic regression to determine the probability of an actuarial student graduating in the major or dropping out. They compared the results of this study with those…

  10. Detecting DIF in Polytomous Items Using MACS, IRT and Ordinal Logistic Regression

    Science.gov (United States)

    Elosua, Paula; Wells, Craig

    2013-01-01

    The purpose of the present study was to compare the Type I error rate and power of two model-based procedures, the mean and covariance structure model (MACS) and the item response theory (IRT), and an observed-score based procedure, ordinal logistic regression, for detecting differential item functioning (DIF) in polytomous items. A simulation…

  11. A Note on Three Statistical Tests in the Logistic Regression DIF Procedure

    Science.gov (United States)

    Paek, Insu

    2012-01-01

    Although logistic regression became one of the well-known methods in detecting differential item functioning (DIF), its three statistical tests, the Wald, likelihood ratio (LR), and score tests, which are readily available under the maximum likelihood, do not seem to be consistently distinguished in DIF literature. This paper provides a clarifying…

  12. Comparison of IRT Likelihood Ratio Test and Logistic Regression DIF Detection Procedures

    Science.gov (United States)

    Atar, Burcu; Kamata, Akihito

    2011-01-01

    The Type I error rates and the power of IRT likelihood ratio test and cumulative logit ordinal logistic regression procedures in detecting differential item functioning (DIF) for polytomously scored items were investigated in this Monte Carlo simulation study. For this purpose, 54 simulation conditions (combinations of 3 sample sizes, 2 sample…

  13. Propensity Score Estimation with Data Mining Techniques: Alternatives to Logistic Regression

    Science.gov (United States)

    Keller, Bryan S. B.; Kim, Jee-Seon; Steiner, Peter M.

    2013-01-01

    Propensity score analysis (PSA) is a methodological technique which may correct for selection bias in a quasi-experiment by modeling the selection process using observed covariates. Because logistic regression is well understood by researchers in a variety of fields and easy to implement in a number of popular software packages, it has…

  14. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Science.gov (United States)

    Rudner, Lawrence

    2016-01-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows…

  15. Using ROC curves to compare neural networks and logistic regression for modeling individual noncatastrophic tree mortality

    Science.gov (United States)

    Susan L. King

    2003-01-01

    The performance of two classifiers, logistic regression and neural networks, is compared for modeling noncatastrophic individual tree mortality for 21 species of trees in West Virginia. The output of the classifier is usually a continuous number between 0 and 1. A threshold is selected between 0 and 1 and all of the trees below the threshold are classified as...

  16. Comparing Linear Discriminant Function with Logistic Regression for the Two-Group Classification Problem.

    Science.gov (United States)

    Fan, Xitao; Wang, Lin

    The Monte Carlo study compared the performance of predictive discriminant analysis (PDA) and that of logistic regression (LR) for the two-group classification problem. Prior probabilities were used for classification, but the cost of misclassification was assumed to be equal. The study used a fully crossed three-factor experimental design (with…

  17. Mapping the probability of ripened subsoils using Bayesian logistic regression with informative priors

    NARCIS (Netherlands)

    Steinbuch, Luc; Brus, Dick J.; Heuvelink, Gerard B.M.

    2018-01-01

    One of the first soil forming processes in marine and fluviatile clay soils is ripening, the irreversible change of physical and chemical soil properties, especially consistency, under influence of air. We used Bayesian binomial logistic regression (BBLR) to update the map showing unripened

  18. Semi-parametric estimation of random effects in a logistic regression model using conditional inference

    DEFF Research Database (Denmark)

    Petersen, Jørgen Holm

    2016-01-01

    This paper describes a new approach to the estimation in a logistic regression model with two crossed random effects where special interest is in estimating the variance of one of the effects while not making distributional assumptions about the other effect. A composite likelihood is studied...

  19. Large scale identification and categorization of protein sequences using structured logistic regression

    DEFF Research Database (Denmark)

    Pedersen, Bjørn Panella; Ifrim, Georgiana; Liboriussen, Poul

    2014-01-01

    Background: Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well...

  20. The use of logistic regression in modelling the distributions of bird ...

    African Journals Online (AJOL)

    The method of logistic regression was used to model the observed geographical distribution patterns of bird species in Swaziland in relation to a set of environmental variables. Reporting rates derived from bird atlas data are used as an index of population densities. This is justified in part by the success of the modelling ...

  1. Landslide susceptibility mapping along road corridors in the Indian Himalayas using Bayesian logistic regression models

    Science.gov (United States)

    Das, Iswar; Stein, Alfred; Kerle, Norman; Dadhwal, Vinay K.

    2012-12-01

    Landslide susceptibility mapping (LSM) along road corridors in the Indian Himalayas is an essential exercise that helps planners and decision makers in determining the severity of probable slope failure areas. Logistic regression is commonly applied for this purpose, as it is a robust and straightforward technique that is relatively easy to handle. Ordinary logistic regression as a data-driven technique, however, does not allow inclusion of prior information. This study presents Bayesian logistic regression (BLR) for landslide susceptibility assessment along road corridors. The methodology is tested in a landslide-prone area in the Bhagirathi river valley in the Indian Himalayas. Parameter estimates from BLR are compared with those obtained from ordinary logistic regression. By means of iterative Markov Chain Monte Carlo simulation, BLR provides a rich set of results on parameter estimation. We assessed model performance by receiver operating characteristic (ROC) curve analysis, and validated the model using 50% of the landslide cells kept apart for testing and validation. The study concludes that BLR performs better in posterior parameter estimation in general and the uncertainty estimation in particular.
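    The idea of combining a prior with a logistic likelihood and sampling the posterior by Markov Chain Monte Carlo can be sketched as follows. This is a minimal illustration, not the authors' implementation: the random-walk Metropolis sampler, the synthetic single-predictor data, the normal priors and all function names are assumptions made for the example.

```python
import math
import random

def log_posterior(beta, xs, ys, prior_mean, prior_sd):
    """Log posterior (up to a constant): Bernoulli-logit likelihood plus normal priors."""
    b0, b1 = beta
    lp = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        # y*z - log(1 + exp(z)), written in an overflow-safe form
        lp += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    for b, m, s in zip(beta, prior_mean, prior_sd):
        lp += -0.5 * ((b - m) / s) ** 2  # independent normal prior per coefficient
    return lp

def metropolis(xs, ys, prior_mean, prior_sd, n_iter=5000, burn_in=1000, step=0.2, seed=0):
    """Random-walk Metropolis sampler; returns post-burn-in draws of (intercept, slope)."""
    rng = random.Random(seed)
    beta = [0.0, 0.0]
    lp = log_posterior(beta, xs, ys, prior_mean, prior_sd)
    samples = []
    for i in range(n_iter):
        proposal = [b + rng.gauss(0.0, step) for b in beta]
        lp_new = log_posterior(proposal, xs, ys, prior_mean, prior_sd)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            beta, lp = proposal, lp_new
        if i >= burn_in:
            samples.append(tuple(beta))
    return samples

# Synthetic data (hypothetical): one predictor, true intercept -0.5, true slope 1.5
data_rng = random.Random(1)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [1 if data_rng.random() < 1.0 / (1.0 + math.exp(-(-0.5 + 1.5 * x))) else 0 for x in xs]

# Hypothetical informative prior on the slope: Normal(1.0, 1.0)
samples = metropolis(xs, ys, prior_mean=[0.0, 1.0], prior_sd=[2.0, 1.0])
slopes = sorted(s[1] for s in samples)
post_mean = sum(slopes) / len(slopes)
ci_95 = (slopes[int(0.025 * len(slopes))], slopes[int(0.975 * len(slopes))])
print("posterior mean slope:", round(post_mean, 3), "95% interval:", ci_95)
```

    The sorted posterior draws give an interval for each coefficient directly, which is the kind of uncertainty summary that a single maximum likelihood fit does not provide.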

  2. Monte Carlo Evaluation of Two-Level Logistic Regression for Assessing Person Fit

    Science.gov (United States)

    Woods, Carol M.

    2008-01-01

    Person fit is the degree to which an item response model fits for individual examinees. Reise (2000) described how two-level logistic regression can be used to detect heterogeneity in person fit, evaluate potential predictors of person fit heterogeneity, and identify potentially aberrant individuals. The method has apparently never been applied to…

  3. Mixed-Effects Logistic Regression for Estimating Transitional Probabilities in Sequentially Coded Observational Data

    Science.gov (United States)

    Ozechowski, Timothy J.; Turner, Charles W.; Hops, Hyman

    2007-01-01

    This article demonstrates the use of mixed-effects logistic regression (MLR) for conducting sequential analyses of binary observational data. MLR is a special case of the mixed-effects logit modeling framework, which may be applied to multicategorical observational data. The MLR approach is motivated in part by G. A. Dagne, G. W. Howe, C. H.…

  4. Fitting multistate transition models with autoregressive logistic regression : Supervised exercise in intermittent claudication

    NARCIS (Netherlands)

    de Vries, S O; Fidler, Vaclav; Kuipers, Wietze D; Hunink, Maria G M

    1998-01-01

    The purpose of this study was to develop a model that predicts the outcome of supervised exercise for intermittent claudication. The authors present an example of the use of autoregressive logistic regression for modeling observed longitudinal data. Data were collected from 329 participants in a

  5. Logistic Regression and Probability of Business School Alumni Donations: Micro-Data Evidence.

    Science.gov (United States)

    Okunade, Albert Ade

    1993-01-01

    Analyzes propensity of business school alumni to give cash donations to their alma mater. Estimates a utility maximization model, using logistic regression and survey sample data of 1956-90 graduates of a large U.S. public research university. Giving probability is strongly correlated with specific majors, time since graduation, other factors.…

  6. A Predictive Logistic Regression Model of World Conflict Using Open Source Data

    Science.gov (United States)

    2015-03-26

    Europe & Central Asia - 28 Nations Angola Bangladesh Afghanistan Benin Bhutan Albania Botswana Brunei Darussalam Armenia Burkina Faso Cambodia Azerbaijan...Board of Trustees of the Colorado School of Mines 90.7. Colorado School of Mines. Linear V. Logistic Regression. (n.d.). Retrieved 09 13, 2014, from

  7. Strategies for Testing Statistical and Practical Significance in Detecting DIF with Logistic Regression Models

    Science.gov (United States)

    Fidalgo, Angel M.; Alavi, Seyed Mohammad; Amirian, Seyed Mohammad Reza

    2014-01-01

    This study examines three controversial aspects in differential item functioning (DIF) detection by logistic regression (LR) models: first, the relative effectiveness of different analytical strategies for detecting DIF; second, the suitability of the Wald statistic for determining the statistical significance of the parameters of interest; and…

  8. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression.

    Science.gov (United States)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K; Bottomley, Andrew; de Graeff, Alexander; Groenvold, Mogens; Gundy, Chad; Koller, Michael; Petersen, Morten A; Sprangers, Mirjam A G

    2010-08-04

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application. A review of logistic regression DIF analyses in HRQoL was undertaken. Methodological articles from other fields and using other DIF methods were also included if considered relevant. There are many competing approaches for the conduct of DIF analyses and many criteria for determining what constitutes significant DIF. DIF in short scales, as commonly found in HRQoL instruments, may be more difficult to interpret. Qualitative methods may aid interpretation of such DIF analyses. A number of methodological choices must be made when applying logistic regression for DIF analyses, and many of these affect the results. We provide recommendations based on reviewing the current evidence. Although the focus is on logistic regression, many of our results should be applicable to DIF analyses in general. There is a need for more empirical and theoretical work in this area.

  9. Classification and regression tree analysis vs. multivariable linear and logistic regression methods as statistical tools for studying haemophilia.

    Science.gov (United States)

    Henrard, S; Speybroeck, N; Hermans, C

    2015-11-01

    Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.

  10. The alarming problems of confounding equivalence using logistic regression models in the perspective of causal diagrams

    Directory of Open Access Journals (Sweden)

    Yuanyuan Yu

    2017-12-01

    Background: Confounders can produce spurious associations between exposure and outcome in observational studies. For the majority of epidemiologists, adjusting for confounders using a logistic regression model is the habitual method, though it has some problems in accuracy and precision. It is, therefore, important to highlight the problems of logistic regression and search for an alternative method. Methods: Four causal diagram models were defined to summarize confounding equivalence. Both theoretical proofs and simulation studies were performed to verify whether conditioning on different confounding equivalence sets had the same bias-reducing potential and then to select the optimum adjusting strategy, in which the logistic regression model and the inverse probability weighting based marginal structural model (IPW-based-MSM) were compared. The “do-calculus” was used to calculate the true causal effect of exposure on outcome, then the bias and standard error were used to evaluate the performances of different strategies. Results: Adjusting for different sets of confounding equivalence, as judged by identical Markov boundaries, produced different bias-reducing potential in the logistic regression model. For the sets that satisfied G-admissibility, adjusting for the set including all the confounders reduced the equivalent bias to the one containing the parent nodes of the outcome, while the bias after adjusting for the parent nodes of exposure was not equivalent to them. In addition, all causal effect estimations through logistic regression were biased, although the estimation after adjusting for the parent nodes of exposure was nearest to the true causal effect. However, conditioning on different confounding equivalence sets had the same bias-reducing potential under IPW-based-MSM. Compared with logistic regression, the IPW-based-MSM could obtain unbiased causal effect estimation when the adjusted confounders satisfied G-admissibility and the optimal

  11. Predicting 30-day Hospital Readmission with Publicly Available Administrative Database. A Conditional Logistic Regression Modeling Approach.

    Science.gov (United States)

    Zhu, K; Lou, Z; Zhou, J; Ballester, N; Kong, N; Parikh, P

    2015-01-01

    This article is part of the Focus Theme of Methods of Information in Medicine on "Big Data and Analytics in Healthcare". Hospital readmissions raise healthcare costs and cause significant distress to providers and patients. It is, therefore, of great interest to healthcare organizations to predict which patients are at risk of being readmitted to their hospitals. However, current logistic regression based risk prediction models have limited prediction power when applied to hospital administrative data. Meanwhile, although decision trees and random forests have been applied, they tend to be too complex for hospital practitioners to understand. The objective was to explore the use of conditional logistic regression to increase the prediction accuracy. We analyzed an HCUP statewide inpatient discharge record dataset, which includes patient demographics, clinical and care utilization data from California. We extracted records of heart failure Medicare beneficiaries who had inpatient experience during an 11-month period. We corrected the data imbalance issue with under-sampling. In our study, we first applied standard logistic regression and decision tree to obtain influential variables and derive practically meaningful decision rules. We then stratified the original data set accordingly and applied logistic regression on each data stratum. We further explored the effect of interacting variables in the logistic regression modeling. We conducted cross validation to assess the overall prediction performance of conditional logistic regression (CLR) and compared it with standard classification models. The developed CLR models outperformed several standard classification models (e.g., straightforward logistic regression, stepwise logistic regression, random forest, support vector machine). For example, the best CLR model improved the classification accuracy by nearly 20% over the straightforward logistic regression model. Furthermore, the developed CLR models tend to achieve better sensitivity of
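    The abstract mentions correcting class imbalance with under-sampling but does not say which variant was used. A minimal sketch of one common choice, random under-sampling of the majority class, is given here; the function name, the record layout and the toy counts are hypothetical.

```python
import random

def undersample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes have the same size."""
    rng = random.Random(seed)
    pos = [r for r in rows if label_of(r) == 1]
    neg = [r for r in rows if label_of(r) == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = rng.sample(majority, len(minority))  # sample without replacement
    balanced = minority + kept
    rng.shuffle(balanced)
    return balanced

# Toy records (hypothetical): 10 readmitted (label 1) and 90 not readmitted (label 0)
rows = [{"id": i, "readmitted": 1 if i < 10 else 0} for i in range(100)]
balanced = undersample(rows, label_of=lambda r: r["readmitted"])
print(len(balanced), sum(r["readmitted"] for r in balanced))
```

    Under-sampling changes the class proportions seen by the model, so predicted probabilities from a model trained on the balanced set are no longer calibrated to the original readmission rate.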

  12. Use of Logistic Regression for Forecasting Short-Term Volcanic Activity

    Directory of Open Access Journals (Sweden)

    Mark T. Woods

    2012-08-01

    An algorithm that forecasts volcanic activity using an event tree decision making framework and logistic regression has been developed, characterized, and validated. The suite of empirical models that drive the system were derived from a sparse and geographically diverse dataset comprised of source modeling results, volcano monitoring data, and historic information from analog volcanoes. Bootstrapping techniques were applied to the training dataset to allow for the estimation of robust logistic model coefficients. Probabilities generated from the logistic models increase with positive modeling results, escalating seismicity, and rising eruption frequency. Cross validation yielded a series of receiver operating characteristic curves with areas ranging between 0.78 and 0.81, indicating that the algorithm has good forecasting capabilities. Our results suggest that the logistic models are highly transportable and can compete with, and in some cases outperform, non-transportable empirical models trained with site specific information.
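    Bootstrapping logistic model coefficients, as described in this abstract, can be illustrated with a short sketch. Everything here is an assumption for the example: the single synthetic predictor, the plain gradient-descent fit, the number of resamples and the function names. It is not the authors' algorithm.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(xs, ys, lr=0.5, steps=200):
    """Fit P(y=1) = sigmoid(w*x + b) by gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def bootstrap_slopes(xs, ys, n_boot=100, seed=0):
    """Refit the model on n_boot resamples drawn with replacement; return sorted slopes."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        w, _b = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        slopes.append(w)
    return sorted(slopes)

# Small synthetic training set (hypothetical), true slope 1.5
data_rng = random.Random(42)
xs = [data_rng.gauss(0.0, 1.0) for _ in range(80)]
ys = [1 if data_rng.random() < sigmoid(1.5 * x) else 0 for x in xs]

slopes = bootstrap_slopes(xs, ys)
median_slope = slopes[len(slopes) // 2]
interval = (slopes[2], slopes[97])  # rough 95% percentile interval from 100 resamples
print("median slope:", round(median_slope, 3), "interval:", interval)
```

    With a sparse dataset, the spread of the resampled coefficients gives a direct picture of how sensitive the fitted model is to individual observations.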

  13. Robust logistic regression to narrow down the winner's curse for rare and recessive susceptibility variants.

    Science.gov (United States)

    Kesselmeier, Miriam; Lorenzo Bermejo, Justo

    2017-11-01

    Logistic regression is the most common technique used for genetic case-control association studies. A disadvantage of standard maximum likelihood estimators of the genotype relative risk (GRR) is their strong dependence on outlier subjects, for example, patients diagnosed at unusually young age. Robust methods are available to constrain outlier influence, but they are scarcely used in genetic studies. This article provides a non-intimidating introduction to robust logistic regression, and investigates its benefits and limitations in genetic association studies. We applied the bounded Huber and extended the R package 'robustbase' with the re-descending Hampel functions to down-weight outlier influence. Computer simulations were carried out to assess the type I error rate, mean squared error (MSE) and statistical power according to major characteristics of the genetic study and investigated markers. Simulations were complemented with the analysis of real data. Both standard and robust estimation controlled type I error rates. Standard logistic regression showed the highest power but standard GRR estimates also showed the largest bias and MSE, in particular for associated rare and recessive variants. For illustration, a recessive variant with a true GRR=6.32 and a minor allele frequency=0.05 investigated in a 1000 case/1000 control study by standard logistic regression resulted in power=0.60 and MSE=16.5. The corresponding figures for Huber-based estimation were power=0.51 and MSE=0.53. Overall, Hampel- and Huber-based GRR estimates did not differ much. Robust logistic regression may represent a valuable alternative to standard maximum likelihood estimation when the focus lies on risk prediction rather than identification of susceptibility variants. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.

  14. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Science.gov (United States)

    Goo, Yeong-Jia James; Shen, Zone-De

    2014-01-01

    As fraudulent financial statements become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study uses logistic regression, support vector machine, and decision tree to construct classification models and compares them. The study adopts financial and nonfinancial variables to help build the forecasting model. The research objects are companies that experienced fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance, at 85.71%. PMID:25302338

  15. A Hybrid Approach of Stepwise Regression, Logistic Regression, Support Vector Machine, and Decision Tree for Forecasting Fraudulent Financial Statements

    Directory of Open Access Journals (Sweden)

    Suduan Chen

    2014-01-01

    As fraudulent financial statements become an increasingly serious problem, establishing a valid model for forecasting fraudulent financial statements has become an important question for academic research and financial practice. After screening the important variables using stepwise regression, the study uses logistic regression, support vector machine, and decision tree to construct classification models and compares them. The study adopts financial and nonfinancial variables to help build the forecasting model. The research objects are companies that experienced fraudulent and nonfraudulent financial statements between 1998 and 2012. The findings are that financial and nonfinancial information can be used effectively to distinguish fraudulent financial statements, and that the decision tree C5.0 has the best classification performance, at 85.71%.

  16. Use of multiple linear regression and logistic regression models to investigate changes in birthweight for term singleton infants in Scotland.

    Science.gov (United States)

    Bonellie, Sandra R

    2012-10-01

    To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies, particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there is still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term 'singleton births' from Scottish hospitals between 1994 and 2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 standard deviations lower (95% confidence interval 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic birthweight differences. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby has important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
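    Odds ratios and their confidence intervals, as reported in this abstract, come from exponentiating a logistic regression coefficient and its Wald limits. The sketch below shows that relationship using the reported deprivation figures (odds ratio 1.30, 95% confidence interval 1.21-1.40). The standard error is backed out from the reported interval under the assumption of a symmetric Wald interval on the log-odds scale; the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def odds_ratio_with_ci(beta, se, z=Z_95):
    """Turn a log-odds coefficient and its standard error into (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=Z_95):
    """Recover the log-scale standard error from a reported Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Reported in the abstract: OR 1.30 (95% CI 1.21-1.40)
beta = math.log(1.30)       # coefficient on the log-odds scale
se = se_from_ci(1.21, 1.40) # assumption: the reported interval is a Wald interval
or_, lo, hi = odds_ratio_with_ci(beta, se)
print(round(or_, 2), round(lo, 2), round(hi, 2))
```

    Because the interval is symmetric on the log scale rather than on the odds-ratio scale, the reported limits sit unevenly around 1.30.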

  17. Using logistic regression to describe the length of breastfeeding: a study in Guadalajara, Mexico.

    Science.gov (United States)

    Gonzalez-Perez, G J; Vega-Lopez, M G; Cabrera-Pivaral, C

    1998-12-01

    This study seeks, through a logistic regression model, to describe the pattern of breastfeeding duration in Guadalajara, Mexico, during 1993. A multistage random sample of children under 1 year of age (n = 1036) was studied; observational data regarding breastfeeding duration, obtained through a "status quo" procedure, were compared with prevalence rates obtained from the logistic regression model. Modeling the duration of breastfeeding during the first year of life rather than only analyzing observational data helps researchers to understand this process in a dynamic and quantitative way. For example, uncommon indicators of breastfeeding were derived from the model. These indicators are impossible to obtain from observational data. The prevalence curve estimated through the logistic model was adequately fitted to observed data: there were no significant differences between the number or distribution of breastfed infants observed and those predicted by the model. Moreover, the model revealed that less than 40% of the children were breastfed in the fourth month of life; the median age for weaning was 39.3 days; 55% of the potential breastfeeding in the first 4 months did not occur; and the greatest abandonment of breastfeeding in the first 4 months was observed in the first 60 days. Thus, logistic regression seems a suitable option to construct a population-based model that describes breastfeeding duration during the first year of life. The indicators derived from the model offer health care providers valuable information for developing programs that promote breastfeeding.

  18. Metallomics study using hair mineral analysis and multiple logistic regression analysis: relationship between cancer and minerals.

    Science.gov (United States)

    Yasuda, Hiroshi; Yoshida, Kazuya; Segawa, Mitsuru; Tokuda, Ryoichi; Tsutsui, Toyoharu; Yasuda, Yuichi; Magara, Shunichi

    2009-09-01

    The objective of this metallomics study is to investigate comprehensively some relationships between cancer risk and minerals, including essential and toxic metals. Twenty-four minerals including essential and toxic metals in scalp hair samples from 124 solid-cancer patients and 86 control subjects were measured with inductively coupled plasma mass spectrometry (ICP-MS), and the association of cancer with minerals was statistically analyzed with multiple logistic regression analysis. Multiple logistic regression analysis demonstrated that several minerals are significantly correlated to cancer, positively or inversely. The most cancer-correlated mineral was iodine (I) with the highest correlation coefficient of r = 0.301, followed by arsenic (As; r = 0.267), zinc (Zn; r = 0.261), and sodium (Na; r = 0.190), with p ... Multiple linear regression value was highly significantly correlated with probability of cancer (R² = 0.437, p ... analysis and the chi-square test, the precision of discrimination for cancer was estimated to be 0.871 (chi-square = 99.1, p ... study using multiple logistic regression analysis is a useful tool for estimating cancer risk.

  19. Mixed-Effects Logistic Regression Models for Indirectly Observed Discrete Outcome Variables.

    Science.gov (United States)

    Vermunt, Jeroen K

    2005-01-01

    A well-established approach to modeling clustered data introduces random effects in the model of interest. Mixed-effects logistic regression models can be used to predict discrete outcome variables when observations are correlated. An extension of the mixed-effects logistic regression model is presented in which the dependent variable is a latent class variable. This method makes it possible to deal simultaneously with the problems of correlated observations and measurement error in the dependent variable. As is shown, maximum likelihood estimation is feasible by means of an EM algorithm with an E step that makes use of the special structure of the likelihood function. The new model is illustrated with an example from organizational psychology.

  20. Nowcasting of Low-Visibility Procedure States with Ordered Logistic Regression at Vienna International Airport

    Science.gov (United States)

    Kneringer, Philipp; Dietz, Sebastian; Mayr, Georg J.; Zeileis, Achim

    2017-04-01

    Low-visibility conditions have a large impact on aviation safety and economic efficiency of airports and airlines. To support decision makers, we develop a statistical probabilistic nowcasting tool for the occurrence of capacity-reducing operations related to low visibility. The probabilities of four different low visibility classes are predicted with an ordered logistic regression model based on time series of meteorological point measurements. Potential predictor variables for the statistical models are visibility, humidity, temperature and wind measurements at several measurement sites. A stepwise variable selection method indicates that visibility and humidity measurements are the most important model inputs. The forecasts are tested with a 30 minute forecast interval up to two hours, which is a sufficient time span for tactical planning at Vienna Airport. The ordered logistic regression models outperform persistence and are competitive with human forecasters.
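    An ordered logistic regression model turns a single linear predictor into probabilities for several ordered classes through a set of increasing thresholds. The sketch below shows that step for four classes, as in the abstract. The threshold values, the predictor values and the function names are hypothetical and are not taken from the study.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def ordered_logit_probs(eta, thresholds):
    """Class probabilities under the proportional-odds model.

    P(Y <= k) = sigmoid(threshold_k - eta); class probabilities are the
    differences of consecutive cumulative probabilities. Thresholds must
    be strictly increasing.
    """
    cumulative = [sigmoid(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs

# Three hypothetical thresholds separate four ordered low-visibility classes
thresholds = [-1.0, 0.5, 2.0]
probs_low = ordered_logit_probs(0.3, thresholds)   # smaller linear predictor
probs_high = ordered_logit_probs(3.0, thresholds)  # larger linear predictor
print([round(p, 3) for p in probs_low])
print([round(p, 3) for p in probs_high])
```

    A larger linear predictor shifts probability mass toward the higher classes while the thresholds stay fixed, which is what makes the model suitable for ordered categories.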

  1. Determination of sex using cephalo-facial dimensions by discriminant function and logistic regression equations

    Directory of Open Access Journals (Sweden)

    Twisha Shah

    2016-06-01

    The aim is to bring together the new anthropological techniques and knowledge about populations that are least known. The present study was performed on 901 healthy Gujarati volunteers (676 males, 225 females) within the age group of 21–50 years with the aim to examine whether any correlation exists between sex determination and cephalofacial measures, namely maximum head length, maximum head breadth, bizygomatic breadth, bigonial diameter, morphological facial length, physiognomic facial length, biocular breadth and total cephalofacial height. Also, discriminant function and logistic regression methods were compared to identify the best accuracy level for sex determination. Mean values of cephalofacial dimensions were higher in males than in females. The most reliable results were obtained by using logistic regression equations in males (92%) and discriminant function in females (80.9%). Our study conclusively establishes the existence of a definite statistically significant sexual dimorphism in Gujarati population using cephalo-facial dimensions.

  2. Rock-profile correlations through logistic regression; Correlacao rocha-perfil atraves de regressao logistica

    Energy Technology Data Exchange (ETDEWEB)

    Castro, Wagner Barbosa de Mello

    1998-02-01

    Logistic regression models were generated starting from lithofacies described in cores and in well logs for two wells of Campos Basin. The main objective was to verify the applicability of the technique in reservoir geology. The models were used to estimate the occurrence of reservoir facies in the wells. Results obtained were compared to the results of a previous discriminant analysis with the objective of determining the accuracy of the two techniques as tools to estimate reservoir facies. Although discriminant analysis proved more accurate in the estimate of reservoir facies, the use of logistic regression should not be discarded. Its independence of the normal distribution hypothesis makes this technique, at least in theory, more robust than the discriminant analysis. (author)

  3. A Novel Imbalanced Data Classification Approach Based on Logistic Regression and Fisher Discriminant

    Directory of Open Access Journals (Sweden)

    Baofeng Shi

    2015-01-01

    We introduce an imbalanced data classification approach based on logistic regression significant discriminant and Fisher discriminant. First of all, a key indicators extraction model based on logistic regression significant discriminant and correlation analysis is derived to extract features for customer classification. Secondly, on the basis of the linear weighted utilizing Fisher discriminant, a customer scoring model is established. And then, a customer rating model where the customer number of all ratings follows normal distribution is constructed. The performance of the proposed model and the classical SVM classification method are evaluated in terms of their ability to correctly classify consumers as default customer or nondefault customer. Empirical results using the data of 2157 customers in financial engineering suggest that the proposed approach performs better than the SVM model in dealing with imbalanced data classification. Moreover, our approach contributes to locating the qualified customers for the banks and the bond investors.

  4. Differential item functioning (DIF) analyses of health-related quality of life instruments using logistic regression

    DEFF Research Database (Denmark)

    Scott, Neil W; Fayers, Peter M; Aaronson, Neil K

    2010-01-01

    Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....

  5. PARAMETRIC AND NON PARAMETRIC (MARS: MULTIVARIATE ADDITIVE REGRESSION SPLINES) LOGISTIC REGRESSIONS FOR PREDICTION OF A DICHOTOMOUS RESPONSE VARIABLE WITH AN EXAMPLE FOR PRESENCE/ABSENCE OF AMPHIBIANS

    Science.gov (United States)

    The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...

  6. A comparative study on entrepreneurial attitudes modeled with logistic regression and Bayes nets.

    Science.gov (United States)

    López Puga, Jorge; García García, Juan

    2012-11-01

    Entrepreneurship research is receiving increasing attention in our context, as entrepreneurs are key social agents involved in economic development. We compare the success of the dichotomic logistic regression model and the Bayes simple classifier to predict entrepreneurship, after manipulating the percentage of missing data and the level of categorization in predictors. A sample of undergraduate university students (N = 1230) completed five scales (motivation, attitude towards business creation, obstacles, deficiencies, and training needs) and we found that each of them predicted different aspects of the tendency to business creation. Additionally, our results show that the receiver operating characteristic (ROC) curve is affected by the rate of missing data in both techniques, but logistic regression seems to be more vulnerable when faced with missing data, whereas Bayes nets underperform slightly when categorization has been manipulated. Our study sheds light on the potential entrepreneur profile and we propose to use Bayesian networks as an additional alternative to overcome the weaknesses of logistic regression when missing data are present in applied research.

  7. Extreme Sparse Multinomial Logistic Regression: A Fast and Robust Framework for Hyperspectral Image Classification

    Science.gov (United States)

    Cao, Faxian; Yang, Zhijing; Ren, Jinchang; Ling, Wing-Kuen; Zhao, Huimin; Marshall, Stephen

    2017-12-01

    Although the sparse multinomial logistic regression (SMLR) has provided a useful tool for sparse classification, it suffers from inefficacy in dealing with high dimensional features and manually set initial regressor values. This has significantly constrained its applications for hyperspectral image (HSI) classification. In order to tackle these two drawbacks, an extreme sparse multinomial logistic regression (ESMLR) is proposed for effective classification of HSI. First, the HSI dataset is projected to a new feature space with randomly generated weight and bias. Second, an optimization model is established by the Lagrange multiplier method and the dual principle to automatically determine a good initial regressor for SMLR via minimizing the training error and the regressor value. Furthermore, the extended multi-attribute profiles (EMAPs) are utilized for extracting both the spectral and spatial features. A combinational linear multiple features learning (MFL) method is proposed to further enhance the features extracted by ESMLR and EMAPs. Finally, the logistic regression via the variable splitting and the augmented Lagrangian (LORSAL) is adopted in the proposed framework for reducing the computational time. Experiments are conducted on two well-known HSI datasets, namely the Indian Pines dataset and the Pavia University dataset, which have shown the fast and robust performance of the proposed ESMLR framework.

  8. Landslide susceptibility mapping on a global scale using the method of logistic regression

    Science.gov (United States)

    Lin, Le; Lin, Qigen; Wang, Ying

    2017-08-01

    This paper proposes a statistical model for mapping global landslide susceptibility based on logistic regression. After investigating explanatory factors for landslides in the existing literature, five factors were selected to model landslide susceptibility: relative relief, extreme precipitation, lithology, ground motion and soil moisture. When building the model, 70 % of landslide and nonlandslide points were randomly selected for logistic regression, and the others were used for model validation. To evaluate the accuracy of predictive models, this paper adopts several criteria including a receiver operating characteristic (ROC) curve method. Logistic regression experiments found all five factors to be significant in explaining landslide occurrence on a global scale. During the modeling process, the percentage correct in the confusion matrix of landslide classification was approximately 80 % and the area under the curve (AUC) was nearly 0.87. During the validation process, the above statistics were about 81 % and 0.88, respectively. Such a result indicates that the model has strong robustness and stable performance. This model found that at a global scale, soil moisture can be dominant in the occurrence of landslides and topographic factors may be secondary.
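
    The validation scheme described in this abstract (a random 70/30 split followed by an AUC check) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `split_70_30` and `auc` are hypothetical, and the AUC is computed with the standard rank-sum identity rather than with whatever software the study used.

```python
import random

def split_70_30(rows, seed=0):
    """Shuffle rows and return (training 70%, validation 30%)."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(round(0.7 * len(rows)))
    return rows[:cut], rows[cut:]

def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

    In such a workflow the model would be fitted on the first part of the split, and `auc` would be evaluated on predicted probabilities for both parts, which is how the paired figures (0.87 and 0.88) in the abstract would arise.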

  9. Landslide susceptibility mapping using logistic statistical regression in Babaheydar Watershed, Chaharmahal Va Bakhtiari Province, Iran

    Directory of Open Access Journals (Sweden)

    Ebrahim Karimi Sangchini

    2015-01-01

    Landslides are among the most damaging natural hazards in mountainous regions. Every year, hundreds of people all over the world lose their lives in landslides; furthermore, these events have large impacts on the local and global economy. In this study, landslide hazard zonation in the Babaheydar watershed was conducted using logistic regression to determine landslide hazard areas. First, the landslide inventory map was prepared using aerial photograph interpretation and field surveys. Next, ten landslide conditioning factors, namely altitude, slope percentage, slope aspect, lithology, distance from faults, rivers, settlements and roads, land use, and precipitation, were chosen as effective factors on landsliding in the study area. Subsequently, a landslide susceptibility map was constructed using the logistic regression model in a Geographic Information System (GIS). The ROC and pseudo-R2 indexes were used for model assessment. Results showed that the logistic regression model provided fairly high prediction accuracy for landslide susceptibility maps in the Babaheydar Watershed, with ROC equal to 0.876. Furthermore, the results revealed that about 44% of the watershed area was located in the high and very high hazard classes. The resultant landslide susceptibility maps can be useful in appropriate watershed management practices and for sustainable development in the region.

  10. Comparison of cranial sex determination by discriminant analysis and logistic regression.

    Science.gov (United States)

    Amores-Ampuero, Anabel; Alemán, Inmaculada

    2016-04-05

    Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).

  11. LOGISTIC REGRESSION AS A TOOL FOR DETERMINATION OF THE PROBABILITY OF DEFAULT FOR ENTERPRISES

    Directory of Open Access Journals (Sweden)

    Erika SPUCHLAKOVA

    2017-12-01

    In a rapidly changing world it is necessary to adapt to new conditions, and approaches can vary from day to day. For the proper management of a company it is essential to know its financial situation. Assessment of a company's financial health can be carried out by financial analysis, which provides a number of methods for evaluating it. Analysis indicators are often used in company assessment and in obtaining bank loans and other financial resources to ensure the functioning of the company. As a company focuses on the future and its planning, it is essential to forecast its future financial situation. Based on the results of the financial health prediction, the company decides whether to extend or limit its business. How the information obtained from financial analysis is used in practice depends mainly on the capabilities of the company's management. The findings of logistic regression methods were first published in the 1960s, as an alternative to the least squares method. The essence of logistic regression is to determine the relationship between the explained (dependent) variable and the explanatory (independent) variables. The basic principle of this statistical method is based on regression analysis, but unlike linear regression, it can predict the probability that a phenomenon has occurred or not. The aim of this paper is to determine the probability of bankruptcy of enterprises.

  12. A logistic normal multinomial regression model for microbiome compositional data analysis.

    Science.gov (United States)

    Xia, Fan; Chen, Jun; Fung, Wing Kam; Li, Hongzhe

    2013-12-01

    Changes in human microbiome are associated with many human diseases. Next generation sequencing technologies make it possible to quantify the microbial composition without the need for laboratory cultivation. One important problem of microbiome data analysis is to identify the environmental/biological covariates that are associated with different bacterial taxa. Taxa count data in microbiome studies are often over-dispersed and include many zeros. To account for such an over-dispersion, we propose to use an additive logistic normal multinomial regression model to associate the covariates to bacterial composition. The model can naturally account for sampling variabilities and zero observations and also allow for a flexible covariance structure among the bacterial taxa. In order to select the relevant covariates and to estimate the corresponding regression coefficients, we propose a group ℓ1 penalized likelihood estimation method for variable selection and estimation. We develop a Monte Carlo expectation-maximization algorithm to implement the penalized likelihood estimation. Our simulation results show that the proposed method outperforms the group ℓ1 penalized multinomial logistic regression and the Dirichlet multinomial regression models in variable selection. We demonstrate the methods using a data set that links human gut microbiome to micro-nutrients in order to identify the nutrients that are associated with the human gut microbiome enterotype. © 2013, The International Biometric Society.

  13. Predictors of postoperative outcomes of cubital tunnel syndrome treatments using multiple logistic regression analysis.

    Science.gov (United States)

    Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki

    2017-05-01

    This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.

  14. Multiple Imputation of a Randomly Censored Covariate Improves Logistic Regression Analysis.

    Science.gov (United States)

    Atem, Folefac D; Qian, Jing; Maye, Jacqueline E; Johnson, Keith A; Betensky, Rebecca A

    2016-01-01

    Randomly censored covariates arise frequently in epidemiologic studies. The most commonly used methods, including complete case and single imputation or substitution, suffer from inefficiency and bias. They make strong parametric assumptions or they consider limit of detection censoring only. We employ multiple imputation, in conjunction with semi-parametric modeling of the censored covariate, to overcome these shortcomings and to facilitate robust estimation. We develop a multiple imputation approach for randomly censored covariates within the framework of a logistic regression model. We use the non-parametric estimate of the covariate distribution or the semiparametric Cox model estimate in the presence of additional covariates in the model. We evaluate this procedure in simulations, and compare its operating characteristics to those from the complete case analysis and a survival regression approach. We apply the procedures to an Alzheimer's study of the association between amyloid positivity and maternal age of onset of dementia. Multiple imputation achieves lower standard errors and higher power than the complete case approach under heavy and moderate censoring and is comparable under light censoring. The survival regression approach achieves the highest power among all procedures, but does not produce interpretable estimates of association. Multiple imputation offers a favorable alternative to complete case analysis and ad hoc substitution methods in the presence of randomly censored covariates within the framework of logistic regression.
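
    The abstract does not say how the results from the individual imputed datasets were combined. A minimal sketch of the standard pooling step (Rubin's rules) is given below; that the authors used exactly this step is an assumption, and the function name `pool_rubin` is hypothetical.

```python
def pool_rubin(estimates, variances):
    """Combine m per-imputation estimates and their squared standard errors.

    Returns (pooled estimate, total variance), where the total variance is
    within-imputation variance plus (1 + 1/m) times between-imputation variance.
    """
    m = len(estimates)
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, within + (1.0 + 1.0 / m) * between
```

    Each entry of `estimates` would be a logistic regression coefficient fitted on one imputed dataset, and each entry of `variances` its squared standard error.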

  15. The assessment of groundwater nitrate contamination by using logistic regression model in a representative rural area

    Science.gov (United States)

    Ko, K.; Cheong, B.; Koh, D.

    2010-12-01

    Groundwater has been used as a main source of drinking water in rural areas of Korea that have no regional potable water supply system. More than 50 percent of rural area residents depend on groundwater as drinking water. Thus, research on predicting groundwater pollution is needed for sustainable groundwater usage and protection from potential pollutants. This study was carried out to assess the vulnerability of groundwater to nitrate contamination, reflecting the effect of land use, in Nonsan city, a representative rural area of South Korea. About 47% of the study area is occupied by cultivated land that is highly vulnerable to groundwater nitrate contamination, because its nitrogen fertilizer input of 62.3 tons/km² is higher than the country's average of 44.0 tons/km². Two vulnerability assessment methods, logistic regression and the DRASTIC model, were tested and compared to identify the more suitable technique for the assessment of groundwater nitrate contamination in the Nonsan area. The groundwater quality data were acquired from analyses of 111 samples from small potable supply systems in the study area. The analyzed nitrate values were classified by land use: resident, upland, paddy, and field area. One dependent and two independent variables were used for the logistic regression analysis. The dependent variable was binary (0 or 1), indicating whether or not nitrate exceeded thresholds of 1 through 10 mg/L. The independent variables were one continuous variable, slope, indicating topography, and one multi-category variable, land use, classified as resident, upland, paddy, and field area. The results of Levene's test and the t-test for slope and land use showed significant differences in mean values among groups at the 95% confidence level. From the logistic regression, we could identify a negative correlation between slope and nitrate, which was caused by the decrease of contaminants inputs into groundwater with

  16. Predictive factors for sorafenib-induced hand-foot skin reaction using ordered logistic regression analysis.

    Science.gov (United States)

    Kanbayashi, Yuko; Hosokawa, Toyoshi; Yasui, Kohichiroh; Hongo, Fumiya; Yamaguchi, Kanji; Moriguchi, Michihisa; Miki, Tsuneharu; Itoh, Yoshito

    2016-01-01

    Predictive factors for sorafenib-induced hand-foot skin reaction (HFSR) using ordered logistic regression analysis were studied. This retrospective analysis evaluated patients admitted to a university hospital in Japan from May 2008 through October 2013. Patients age 20 years or older with relapsed or metastatic renal cell carcinoma, unresectable hepatocellular carcinoma, or gastrointestinal stromal tumor resistant to imatinib and sunitinib were included. Data were manually collected from patients' clinical records and included sex, age, Eastern Cooperative Oncology Group (ECOG) performance status, initial daily dose of sorafenib, duration of sorafenib use, concomitant medications, number of metastases, sites of metastases, physical examination findings, and type of cancer. Laboratory test values related to the patient's medical condition that seemed to influence HFSR or the absorption and pharmacologic effects of sorafenib were also collected. HFSR severity was also assessed. Univariate ordered logistic analysis was performed for HFSR severity outcomes and each candidate independent variable. A multivariate ordered logistic regression model was then constructed using a stepwise forward selection procedure. Data were screened for multicollinearity. Data from 113 patients were evaluated. This analysis identified duration of sorafenib use (odds ratio [OR], 0.0531), use of a proton pump inhibitor (PPI) (OR, 0.351), ECOG performance status (OR, 0.555), C-reactive protein level (OR, 17.74), and male sex (OR, 0.403) as significant factors for the occurrence of HFSR. Multivariate logistic regression analysis revealed that short duration of sorafenib use, avoidance of PPIs, good ECOG performance status, high C-reactive protein level, and female sex were predictive factors for the development of HFSR. Copyright © 2016 by the American Society of Health-System Pharmacists, Inc. All rights reserved.

  17. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

    This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.

  18. Analysis of sparse data in logistic regression in medical research: A newer approach

    Directory of Open Access Journals (Sweden)

    S Devika

    2016-01-01

    Background and Objective: In the analysis of a dichotomous response variable, logistic regression is usually used. However, the performance of logistic regression in the presence of sparse data is questionable. In such a situation, a common problem is the presence of high odds ratios (ORs) with very wide 95% confidence interval (CI) (OR: >999.999, 95% CI: 999.999). In this paper, we addressed this issue by using the penalized logistic regression (PLR) method. Materials and Methods: Data from a case-control study on hyponatremia and hiccups conducted in Christian Medical College, Vellore, Tamil Nadu, India were used. The outcome variable was the presence/absence of hiccups and the main exposure variable was the status of hyponatremia. Simulation datasets were created with different sample sizes and with different numbers of covariates. Results: A total of 23 cases and 50 controls were used for the analysis of ordinary and PLR methods. The main exposure variable hyponatremia was present in nine (39.13%) of the cases and in four (8.0%) of the controls. Of the 23 hiccup cases, all were males and among the controls, 46 (92.0%) were males. Thus, the complete separation between gender and the disease group led to an infinite OR with 95% CI (OR: >999.999, 95% CI: 999.999), whereas there was a finite and consistent regression coefficient for gender (OR: 5.35; 95% CI: 0.42, 816.48) using PLR. After adjusting for all the confounding variables, hyponatremia entailed 7.9 (95% CI: 2.06, 38.86) times higher risk for the development of hiccups as was found using PLR, whereas there was an overestimation of risk, OR: 10.76 (95% CI: 2.17, 53.41), using the conventional method. The simulation experiment shows that the estimated coverage probability of this method is near the nominal level of 95% even for small sample sizes and for a large number of covariates. Conclusions: PLR is almost equal to the ordinary logistic regression when the sample size is large and is superior in small cell
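
    The separation problem reported in this abstract can be reproduced directly from the counts it gives. The sketch below computes crude (unadjusted) odds ratios from 2x2 tables; the function name `odds_ratio` is hypothetical, and the crude values are not expected to match the adjusted estimates quoted above.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Crude odds ratio from a 2x2 table; infinite when a denominator cell is empty."""
    num = exposed_cases * unexposed_controls
    den = unexposed_cases * exposed_controls
    return math.inf if den == 0 else num / den

# Hyponatremia: present in 9 of 23 cases and 4 of 50 controls.
print(odds_ratio(9, 14, 4, 46))   # roughly 7.39 (crude, unadjusted)

# Male sex: 23 of 23 cases and 46 of 50 controls, so one cell is empty.
print(odds_ratio(23, 0, 46, 4))   # inf: ordinary estimation breaks down here
```

    The empty cell (no female cases) is what drives the ordinary maximum likelihood estimate to infinity; a penalized method such as the PLR used in the study returns a finite estimate instead.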

  19. Predicting smear negative pulmonary tuberculosis with classification trees and logistic regression: a cross-sectional study

    Directory of Open Access Journals (Sweden)

    Kritski Afrânio

    2006-02-01

    Background: Smear negative pulmonary tuberculosis (SNPT) accounts for 30% of pulmonary tuberculosis cases reported yearly in Brazil. This study aimed to develop a prediction model for SNPT for outpatients in areas with scarce resources. Methods: The study enrolled 551 patients with clinical-radiological suspicion of SNPT, in Rio de Janeiro, Brazil. The original data was divided into two equivalent samples for generation and validation of the prediction models. Symptoms, physical signs and chest X-rays were used for constructing logistic regression and classification and regression tree models. From the logistic regression, we generated a clinical and radiological prediction score. The area under the receiver operator characteristic curve, sensitivity, and specificity were used to evaluate the model's performance in both generation and validation samples. Results: It was possible to generate predictive models for SNPT with sensitivity ranging from 64% to 71% and specificity ranging from 58% to 76%. Conclusion: The results suggest that those models might be useful as screening tools for estimating the risk of SNPT, optimizing the utilization of more expensive tests, and avoiding costs of unnecessary anti-tuberculosis treatment. Those models might be cost-effective tools in a health care network with hierarchical distribution of scarce resources.

  20. [Calculating Pearson residual in logistic regressions: a comparison between SPSS and SAS].

    Science.gov (United States)

    Xu, Hao; Zhang, Tao; Li, Xiao-song; Liu, Yuan-yuan

    2015-01-01

    To compare the results of Pearson residual calculations in logistic regression models using SPSS and SAS. We reviewed Pearson residual calculation methods, and used two sets of data to test logistic models constructed by SPSS and STATA. One model contained a small number of covariates compared to the number of observed. The other contained a similar number of covariates as the number of observed. The two software packages produced similar Pearson residual estimates when the models contained a similar number of covariates as the number of observed, but the results differed when the number of observed was much greater than the number of covariates. The two software packages produce different results of Pearson residuals, especially when the models contain a small number of covariates. Further studies are warranted.
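
    For orientation, the usual textbook definition of the Pearson residual is sketched below, together with one way in which two calculations can legitimately differ: computing the residual per subject versus per covariate pattern (subjects sharing the same covariate values pooled together). Whether this is the source of the discrepancy the authors observed is an assumption; the function name `pearson_residual` is hypothetical.

```python
import math

def pearson_residual(y, n, p):
    """(observed - expected) / binomial standard deviation, for y events in n trials."""
    return (y - n * p) / math.sqrt(n * p * (1.0 - p))

# Three subjects share one covariate pattern with fitted probability 0.5;
# two of them had the event.
per_subject = [pearson_residual(y, 1, 0.5) for y in (1, 1, 0)]
per_pattern = pearson_residual(2, 3, 0.5)
print(per_subject, per_pattern)
```

    The per-subject residuals are 1, 1 and -1, while the pooled residual for the same three subjects is about 0.58, so the two conventions give different numbers for identical data and an identical fitted model.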

  1. A review of logistic regression models used to predict post-fire tree mortality of western North American conifers

    Science.gov (United States)

    Travis Woolley; David C. Shaw; Lisa M. Ganio; Stephen Fitzgerald

    2012-01-01

    Logistic regression models used to predict tree mortality are critical to post-fire management, planning prescribed burns and understanding disturbance ecology. We review literature concerning post-fire mortality prediction using logistic regression models for coniferous tree species in the western USA. We include synthesis and review of: methods to develop, evaluate...

  2. A secure distributed logistic regression protocol for the detection of rare adverse drug events.

    Science.gov (United States)

    El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat

    2013-05-01

    There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for correlations among patients within sites through

  3. Modeling of geogenic radon in Switzerland based on ordered logistic regression.

    Science.gov (United States)

    Kropat, Georg; Bochud, François; Murith, Christophe; Palacios Gruson, Martha; Baechler, Sébastien

    2017-01-01

    The estimation of the radon hazard of a future construction site should ideally be based on the geogenic radon potential (GRP), since this estimate is free of anthropogenic influences and building characteristics. The goal of this study was to evaluate terrestrial gamma dose rate (TGD), geology, fault lines and topsoil permeability as predictors for the creation of a GRP map based on logistic regression. Soil gas radon measurements (SRC) are more suited for the estimation of GRP than indoor radon measurements (IRC) since the former do not depend on ventilation and heating habits or building characteristics. However, SRC have only been measured at a few locations in Switzerland. In former studies a good correlation between spatial aggregates of IRC and SRC has been observed. We therefore used IRC measurements aggregated on a 10 km × 10 km grid to calibrate an ordered logistic regression model for GRP. As predictors we took into account terrestrial gamma dose rate, regrouped geological units, fault line density and the permeability of the soil. The classification success rate of the model amounts to 56% when all 4 predictor variables are included. Our results suggest that terrestrial gamma dose rate and regrouped geological units are more suited to model GRP than fault line density and soil permeability. Ordered logistic regression is a promising tool for the modeling of GRP maps due to its simplicity and fast computation time. Future studies should account for additional variables to improve the modeling of high radon hazard in the Jura Mountains of Switzerland. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
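
    How an ordered logistic model turns predictors into class probabilities can be sketched as follows, assuming the common proportional-odds (cumulative logit) form. The function name `ordered_logit_probs`, the cut-points and the linear predictor value are illustrative assumptions, not values from the study.

```python
import math

def ordered_logit_probs(eta, cutpoints):
    """Category probabilities under a proportional-odds model.

    eta is the linear predictor; cutpoints must be increasing.
    P(Y <= k) = 1 / (1 + exp(-(cutpoints[k] - eta))).
    """
    cum = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints] + [1.0]
    probs, prev = [], 0.0
    for c in cum:
        probs.append(c - prev)  # probability of falling in exactly this class
        prev = c
    return probs

# Three ordered hazard classes (e.g. low, medium, high) need two cut-points.
print(ordered_logit_probs(0.0, [-1.0, 1.0]))
```

    A grid cell would be assigned to the class with the largest probability, and a classification success rate such as the 56% quoted above is the share of cells for which that class matches the observed one.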

  4. A logistic regression model of Coronary Artery Disease among Male Patients in Punjab

    Directory of Open Access Journals (Sweden)

    Sohail Chand

    2005-07-01

    This is a cross-sectional retrospective study of 308 male patients who presented for the first time for coronary angiography at the Punjab Institute of Cardiology. The mean age of the male patients was 50.97 ± 9.9. As the response variable, coronary artery disease (CAD), was binary, a logistic regression model was fitted to predict CAD with the help of significant risk factors. Age, chest pain, diabetes mellitus, smoking and lipids emerged as significant risk factors associated with CAD in the male population.

  5. Bringing balance and technical accuracy to reporting odds ratios and the results of logistic regression analyses

    Directory of Open Access Journals (Sweden)

    Jason W. Osborne

    2006-10-01

    Logistic regression and odds ratios (ORs) are powerful tools recently becoming more common in the social sciences. Yet few understand the technical challenges of correctly interpreting an odds ratio, and often it is done incorrectly in a variety of different ways. The goal of this brief note is to review the correct interpretation of the odds ratio, how to transform it into the more easily understood and intuitive relative risk (RR) estimate, and a suggestion for dealing with odds ratios or relative risk estimates that are below 1.0 so that perceptually their magnitude is equivalent to that of ORs or RRs greater than 1.0.
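
    The two operations mentioned in this abstract can be illustrated with a short Python sketch. It uses the widely cited conversion RR = OR / ((1 - p0) + p0 * OR), where p0 is the risk in the reference group, and the reciprocal as a way to put ratios below 1.0 on the same scale as ratios above 1.0. That these are exactly the formulas recommended in the article is an assumption; the function names are hypothetical.

```python
def or_to_rr(odds_ratio, baseline_risk):
    """Convert an odds ratio to a relative risk given the risk in the reference group."""
    return odds_ratio / ((1.0 - baseline_risk) + baseline_risk * odds_ratio)

def mirror_below_one(ratio):
    """Express a ratio below 1.0 on the same scale as ratios above 1.0 via its reciprocal."""
    return 1.0 / ratio if ratio < 1.0 else ratio

# With a 50% baseline risk, an odds ratio of 2 is a relative risk of only about 1.33.
print(or_to_rr(2.0, 0.5))
# An odds ratio of 0.25 is as strong an association as an odds ratio of 4.
print(mirror_below_one(0.25))
```

    The first call shows why reading an OR as if it were an RR overstates the effect when the outcome is common; the two coincide only as the baseline risk approaches zero.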

  6. Estimating bias and variances in bootstrap logistic regression for Umaru and impact data

    Science.gov (United States)

    Fitrianto, Anwar; Cing, Ng Mei

    2014-12-01

    We employed the random-x bootstrap in a binary logistic regression model. We investigated the effect of sample size and the number of bootstrap replications on the bias and variance. The performance of the estimated coefficients is measured based on the bias, variance, and confidence interval of the bootstrap estimates. In addition, we also focus on the length of the confidence interval of the bootstrap estimates. We found that bias and variance decrease for larger sample sizes. We noticed that the length of the confidence intervals decreases as the sample size and the number of bootstrap replications become large. The results show that the estimated coefficient is more precise as the sample size increases.
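
    The random-x bootstrap resamples whole (x, y) cases with replacement and refits the model on each resample. The sketch below does this for a single-predictor logistic model fitted by Newton-Raphson in plain Python. It is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, the data are made up, and degenerate resamples (for example, perfectly separated ones) are simply redrawn.

```python
import math
import random

def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x; returns (b0, b1) or None."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            z = max(-30.0, min(30.0, b0 + b1 * x))  # clamp to avoid overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-10:
            return None  # information matrix is numerically singular
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

def bootstrap_slope(xs, ys, reps=200, seed=1):
    """Random-x bootstrap of the slope: returns (estimate, bias, variance)."""
    rng = random.Random(seed)
    n = len(xs)
    b1_hat = fit_logistic(xs, ys)[1]
    draws = []
    while len(draws) < reps:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases, not residuals
        fit = fit_logistic([xs[i] for i in idx], [ys[i] for i in idx])
        if fit is not None:  # assumption: degenerate resamples are redrawn
            draws.append(fit[1])
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return b1_hat, mean - b1_hat, var

# Made-up data: the event becomes more likely as x increases.
xs = [i - 9.5 for i in range(20)]
ys = [0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1]
b1_hat, bias, var = bootstrap_slope(xs, ys)
```

    Repeating the call with larger samples or more replications is the kind of experiment the abstract describes; the bias is estimated as the mean of the bootstrap slopes minus the slope fitted on the original sample.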

  7. Mediation analysis for logistic regression with interactions: Application of a surrogate marker in ophthalmology.

    Science.gov (United States)

    Jensen, Signe M; Hauger, Hanne; Ritz, Christian

    2018-01-01

    Mediation analysis is often based on fitting two models, one including and another excluding a potential mediator, and subsequently quantifying the mediated effects by combining parameter estimates from these two models. Standard errors of such derived parameters may be approximated using the delta method. For a study evaluating a treatment effect on visual acuity, a binary outcome, we demonstrate how mediation analysis may conveniently be carried out by means of marginally fitted logistic regression models in combination with the delta method. Several metrics of mediation are estimated and results are compared to findings using existing methods.

  8. Coordinate Descent Based Hierarchical Interactive Lasso Penalized Logistic Regression and Its Application to Classification Problems

    Directory of Open Access Journals (Sweden)

    Jin-Jia Wang

    2014-01-01

    We present the hierarchical interactive lasso penalized logistic regression using the coordinate descent algorithm based on the hierarchy theory and variable interactions. We define the interaction model based on the geometric algebra and hierarchical constraint conditions and then use the coordinate descent algorithm to solve for the coefficients of the hierarchical interactive lasso model. We provide the results of some experiments based on UCI datasets, Madelon datasets from NIPS2003, and daily activities of the elderly. The experimental results show that the variable interactions and hierarchy contribute significantly to the classification. The hierarchical interactive lasso has the advantages of the lasso and interactive lasso.

  9. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Directory of Open Access Journals (Sweden)

    Lawrence Rudner

    2016-06-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows that the conclusion also applies to the probabilities estimated from short subtests of mental abilities and that small samples can yield excellent accuracy. The calculated Bayes probabilities can be used to provide meaningful examinee feedback regardless of whether the test was originally designed to be unidimensional.

  10. Accuracy of Bayes and Logistic Regression Subscale Probabilities for Educational and Certification Tests

    Directory of Open Access Journals (Sweden)

    Lawrence Rudner

    2016-07-01

    In the machine learning literature, it is commonly accepted as fact that as calibration sample sizes increase, Naïve Bayes classifiers initially outperform Logistic Regression classifiers in terms of classification accuracy. Applied to subtests from an on-line final examination and from a highly regarded certification examination, this study shows that the conclusion also applies to the probabilities estimated from short subtests of mental abilities and that small samples can yield excellent accuracy. The calculated Bayes probabilities can be used to provide meaningful examinee feedback regardless of whether the test was originally designed to be unidimensional.

  11. Estimating the causes of traffic accidents using logistic regression and discriminant analysis.

    Science.gov (United States)

    Karacasu, Murat; Ergül, Barış; Altin Yavuz, Arzu

    2014-01-01

    Factors that affect traffic accidents have been analysed in various ways. In this study, we use the methods of logistic regression and discriminant analysis to determine the damages due to injury and non-injury accidents in the Eskisehir Province. Data were obtained from the accident reports of the General Directorate of Security in Eskisehir; 2552 traffic accidents between January and December 2009 were investigated regarding whether they resulted in injury. According to the results, the effects of traffic accidents were reflected in the variables. These results provide a wealth of information that may aid future measures toward the prevention of undesired results.

  12. Inverse estimation of multiple muscle activations based on linear logistic regression.

    Science.gov (United States)

    Sekiya, Masashi; Tsuji, Toshiaki

    2017-07-01

    This study deals with a technology to estimate muscle activity from movement data using a statistical model. A linear regression (LR) model and artificial neural networks (ANN) have been known as statistical models for such use. Although ANN has a high estimation capability, in clinical applications a lack of data often leads to performance deterioration. On the other hand, the LR model has a limitation in generalization performance. We therefore propose a muscle activity estimation method that improves the generalization performance through the use of a linear logistic regression model. The proposed method was compared with the LR model and ANN in a verification experiment with 7 participants. As a result, the proposed method showed better generalization performance than the conventional methods in various tasks.

  13. Predicting pilot-error incidents of US airline pilots using logistic regression.

    Science.gov (United States)

    McFadden, K L

    1997-06-01

    In a population of 70,164 airline pilots obtained from the Federal Aviation Administration, 475 males and 22 females had pilot-error incidents in the years 1986-1992. A simple chi-squared test revealed that female pilots employed by major airlines had a significantly greater likelihood of pilot-error incidents than their male colleagues. In order to control for age, experience (total flying hours), risk exposure (recent flying hours) and employer (major/non-major airline) simultaneously, the author built a model of male pilot-error incidents using logistic regression. The regression analysis indicated that youth, inexperience and non-major airline employer were independent contributors to the increased risk of pilot-error incidents. The results also provide further support to the literature that pilot performance does not differ significantly between male and female airline pilots.

  14. Mapping of the DLQI scores to EQ-5D utility values using ordinal logistic regression.

    Science.gov (United States)

    Ali, Faraz Mahmood; Kay, Richard; Finlay, Andrew Y; Piguet, Vincent; Kupfer, Joerg; Dalgard, Florence; Salek, M Sam

    2017-11-01

    The Dermatology Life Quality Index (DLQI) and the European Quality of Life-5 Dimension (EQ-5D) are separate measures that may be used to gather health-related quality of life (HRQoL) information from patients. The EQ-5D is a generic measure from which health utility estimates can be derived, whereas the DLQI is a specialty-specific measure to assess HRQoL. To reduce the burden of multiple measures being administered and to enable a more disease-specific calculation of health utility estimates, we explored an established mathematical technique known as ordinal logistic regression (OLR) to develop an appropriate model to map DLQI data to EQ-5D-based health utility estimates. Retrospective data from 4010 patients were randomly divided five times into two groups for the derivation and testing of the mapping model. Split-half cross-validation was utilized, resulting in a total of ten ordinal logistic regression models for each of the five EQ-5D dimensions against age, sex, and all ten items of the DLQI. Using Monte Carlo simulation, predicted health utility estimates were derived and compared against those observed. This method was repeated for both OLR and a previously tested mapping methodology based on linear regression. The model was shown to be highly predictive and its repeated fitting demonstrated a stable model using OLR as well as linear regression. The mean differences between OLR-predicted health utility estimates and observed health utility estimates ranged from 0.0024 to 0.0239 across the ten modeling exercises, with an average overall difference of 0.0120 (a 1.6% underestimate, not of clinical importance). The modeling framework developed in this study will enable researchers to calculate EQ-5D health utility estimates from a specialty-specific study population, reducing patient and economic burden.

  15. Comparative study of biodegradability prediction of chemicals using decision trees, functional trees, and logistic regression.

    Science.gov (United States)

    Chen, Guangchao; Li, Xuehua; Chen, Jingwen; Zhang, Ya-Nan; Peijnenburg, Willie J G M

    2014-12-01

    Biodegradation is the principal environmental dissipation process of chemicals. As such, it is a dominant factor determining the persistence and fate of organic chemicals in the environment, and is therefore of critical importance to chemical management and regulation. In the present study, the authors developed in silico methods assessing biodegradability based on a large heterogeneous set of 825 organic compounds, using the techniques of the C4.5 decision tree, the functional inner regression tree, and logistic regression. External validation was subsequently carried out by 2 independent test sets of 777 and 27 chemicals. As a result, the functional inner regression tree exhibited the best predictability with predictive accuracies of 81.5% and 81.0%, respectively, on the training set (825 chemicals) and test set I (777 chemicals). Performance of the developed models on the 2 test sets was subsequently compared with that of the Estimation Program Interface (EPI) Suite Biowin 5 and Biowin 6 models, which also showed a better predictability of the functional inner regression tree model. The model built in the present study exhibits a reasonable predictability compared with existing models while possessing a transparent algorithm. Interpretation of the mechanisms of biodegradation was also carried out based on the models developed. © 2014 SETAC.

  16. Evaluating penalized logistic regression models to predict Heat-Related Electric grid stress days

    Energy Technology Data Exchange (ETDEWEB)

    Bramer, L. M.; Rounds, J.; Burleyson, C. D.; Fortin, D.; Hathaway, J.; Rice, J.; Kraucunas, I.

    2017-11-01

    Understanding the conditions associated with stress on the electricity grid is important in the development of contingency plans for maintaining reliability during periods when the grid is stressed. In this paper, heat-related grid stress and its relationship with weather conditions are examined using data from the eastern United States. Penalized logistic regression models were developed and applied to predict stress on the electric grid using weather data. The inclusion of other weather variables, such as precipitation, in addition to temperature improved model performance. Several candidate models and datasets were examined. A penalized logistic regression model fit at the operation-zone level was found to provide predictive value and interpretability. Additionally, the importance of different weather variables observed at different time scales was examined. Maximum temperature and precipitation were identified as important across all zones, while the importance of other weather variables was zone specific. The methods presented in this work are extensible to other regions and can be used to aid in planning and development of the electrical grid.

  17. Logistic regression function for detection of suspicious performance during baseline evaluations using concussion vital signs.

    Science.gov (United States)

    Hill, Benjamin David; Womble, Melissa N; Rohling, Martin L

    2015-01-01

    This study utilized logistic regression to determine whether performance patterns on Concussion Vital Signs (CVS) could differentiate known groups with either genuine or feigned performance. For the embedded measure development group (n = 174), clinical patients and undergraduate students categorized as feigning obtained significantly lower scores on the overall test battery mean for the CVS, Shipley-2 composite score, and California Verbal Learning Test-Second Edition subtests than did genuinely performing individuals. The final full model of 3 predictor variables (Verbal Memory immediate hits, Verbal Memory immediate correct passes, and Stroop Test complex reaction time correct) was significant and correctly classified individuals in their known group 83% of the time (sensitivity = .65; specificity = .97) in a mixed sample of young-adult clinical cases and simulators. The CVS logistic regression function was applied to a separate undergraduate college group (n = 378) that was asked to perform genuinely and identified 5% as having possibly feigned performance indicating a low false-positive rate. The failure rate was 11% and 16% at baseline cognitive testing in samples of high school and college athletes, respectively. These findings have particular relevance given the increasing use of computerized test batteries for baseline cognitive testing and return-to-play decisions after concussion.
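
    The classification figures quoted in this abstract (percent correctly classified, sensitivity, specificity) all derive from a 2x2 table of known group against predicted group. A minimal sketch of that calculation follows; the labels, predicted probabilities, and the 0.5 cut-off are made-up illustrative values, not data from the study.

```python
def classification_summary(labels, probabilities, threshold=0.5):
    """Summarise a binary classifier against known group membership.

    labels: 1 for the target group (e.g. feigned), 0 otherwise.
    probabilities: predicted probability of belonging to the target group.
    threshold: cut-off for calling a case positive (0.5 is an assumption).
    """
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probabilities):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return {
        "sensitivity": tp / (tp + fn),   # true positives among actual positives
        "specificity": tn / (tn + fp),   # true negatives among actual negatives
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Hypothetical example data, for illustration only.
    labels = [1, 1, 1, 0, 0, 0, 0]
    probabilities = [0.9, 0.6, 0.3, 0.2, 0.4, 0.7, 0.1]
    print(classification_summary(labels, probabilities))
```

    A high specificity paired with a moderate sensitivity, as reported above, means that few genuine performers are flagged, at the cost of missing some feigned performances.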

  18. Integration of logistic regression, Markov chain and cellular automata models to simulate urban expansion

    Science.gov (United States)

    Jokar Arsanjani, Jamal; Helbich, Marco; Kainz, Wolfgang; Darvishi Boloorani, Ali

    2013-04-01

    This research analyses the suburban expansion in the metropolitan area of Tehran, Iran. A hybrid model consisting of a logistic regression model, Markov chain (MC), and cellular automata (CA) was designed to improve the performance of the standard logistic regression model. Environmental and socio-economic variables dealing with urban sprawl were operationalised to create a probability surface of spatiotemporal states of built-up land use for the years 2006, 2016, and 2026. For validation, the model was evaluated by means of relative operating characteristic values for different sets of variables. The approach was calibrated for 2006 by cross-comparing actual and simulated land use maps. The achieved outcomes represent a match of 89% between simulated and actual maps of 2006, which was satisfactory to approve the calibration process. Thereafter, the calibrated hybrid approach was implemented for forthcoming years. Finally, future land use maps for 2016 and 2026 were predicted by means of this hybrid approach. The simulated maps illustrate a new wave of suburban development in the vicinity of Tehran at the western border of the metropolis during the next decades.

  19. Euler Elastica regularized Logistic Regression for whole-brain decoding of fMRI data.

    Science.gov (United States)

    Zhang, Chuncheng; Yao, Li; Song, Sutao; Wen, Xiaotong; Zhao, Xiaojie; Long, Zhiying

    2017-09-25

    Multivariate pattern analysis (MVPA) methods have been widely applied to functional magnetic resonance imaging (fMRI) data to decode brain states. Because fMRI data have many features and few samples, machine learning methods are commonly regularized in various ways to avoid overfitting. Total variation (TV), which uses the gradients of images, and Euler's elastica (EE), which uses the gradient and the curvature of images, are two popular regularizations with spatial structure. In contrast to TV, EE regularization is able to overcome the disadvantage of TV regularization, which favors piecewise constant images over piecewise smooth images. In this study, we introduced EE to fMRI-based decoding for the first time and proposed the EE regularized multinomial logistic regression (EELR) algorithm for multi-class classification. We performed experimental tests on both simulated and real fMRI data to investigate the feasibility and robustness of EELR. The performance of EELR was compared with sparse logistic regression (SLR) and TV regularized LR (TVLR). The results showed that EELR was more robust to noise and showed significantly higher classification performance than TVLR and SLR. Moreover, the forward models and weight patterns revealed that EELR detected larger brain regions that were discriminative to each task and activated by each task than TVLR. The results suggest that EELR not only performs well in brain decoding but also reveals meaningful discriminative and activation patterns. This study demonstrated that EELR showed promising potential in brain decoding and discriminative/activation pattern detection.

  20. Logistic regression analysis of psychosocial correlates associated with recovery from schizophrenia in a Chinese community.

    Science.gov (United States)

    Tse, Samson; Davidson, Larry; Chung, Ka-Fai; Yu, Chong Ho; Ng, King Lam; Tsoi, Emily

    2015-02-01

    More mental health services are adopting the recovery paradigm. This study adds to prior research by (a) using measures of stages of recovery and elements of recovery that were designed and validated in a non-Western, Chinese culture and (b) testing which demographic factors predict advanced recovery and whether placing importance on certain elements predicts advanced recovery. We examined recovery and factors associated with recovery among 75 Hong Kong adults who were diagnosed with schizophrenia and assessed to be in clinical remission. Data were collected on socio-demographic factors, recovery stages and elements associated with recovery. Logistic regression analysis was used to identify variables that could best predict stages of recovery. Receiver operating characteristic curves were used to detect the classification accuracy of the model (i.e. rates of correct classification of stages of recovery). Logistic regression results indicated that stages of recovery could be distinguished with reasonable accuracy for Stage 3 ('living with disability', classification accuracy = 75.45%) and Stage 4 ('living beyond disability', classification accuracy = 75.50%). However, there was no sufficient information to predict Combined Stages 1 and 2 ('overwhelmed by disability' and 'struggling with disability'). It was found that having a meaningful role and age were the most important differentiators of recovery stage. Preliminary findings suggest that adopting salient life roles personally is important to recovery and that this component should be incorporated into mental health services. © The Author(s) 2014.

  1. GIS-based rare events logistic regression for mineral prospectivity mapping

    Science.gov (United States)

    Xiong, Yihui; Zuo, Renguang

    2018-02-01

    Mineralization is a special type of singularity event, and can be considered as a rare event, because within a specific study area the number of prospective locations (1s) is considerably smaller than the number of non-prospective locations (0s). In this study, GIS-based rare events logistic regression (RELR) was used to map the mineral prospectivity in the southwestern Fujian Province, China. An odds ratio was used to measure the relative importance of the evidence variables with respect to mineralization. The results suggest that formations, granites, and skarn alterations, followed by faults and aeromagnetic anomalies, are the most important indicators for the formation of Fe-related mineralization in the study area. The prediction rate and the area under the curve (AUC) values show that areas with higher probability have a strong spatial relationship with the known mineral deposits. Comparing the results with original logistic regression (OLR) demonstrates that the GIS-based RELR performs better than OLR. The prospectivity map obtained in this study benefits the search for skarn Fe-related mineralization in the study area.

  2. A logistic regression of risk factors for disease occurrence on Asian shrimp farms.

    Science.gov (United States)

    Leung, P; Tran, L T; Fast, A W

    2000-05-25

    Serious shrimp-disease outbreaks have reduced shrimp production and slowed industry growth since 1991. This paper tests factors such as farm siting and design, and farm-management practices for relationships with disease occurrence. Logistic regression is used to analyze farm-level data from 3951 shrimp farms in 13 Asian countries. Disease occurrence is modeled as a 0-1 variable where 1 = disease loss of ≥ 20% to any 1 crop, and 0 = losses of [...] shrimp culture intensity, i.e. extensive, semi-intensive, and intensive. Attempts to apply logistic regression models to each country were not successful due to insufficient data for most countries. Factors affecting disease occurrence were quite different for different farming intensities. Farms that had larger pond production areas, had a larger number of farms discharging effluent into their water supply canals, and removed silt had greater disease occurrence. On the other hand, farms that practiced polyculture and took water from the sea through a canal had lower disease occurrence.

  3. Predicting students' success at pre-university studies using linear and logistic regressions

    Science.gov (United States)

    Suliman, Noor Azizah; Abidin, Basir; Manan, Norhafizah Abdul; Razali, Ahmad Mahir

    2014-09-01

    The study aimed to find the most suitable model for predicting students' success in the medical pre-university studies at the Centre for Foundation in Science, Languages and General Studies of Cyberjaya University College of Medical Sciences (CUCMS). The predictors under investigation were achievements in the national high school exit examination, Sijil Pelajaran Malaysia (SPM), namely Biology, Chemistry, Physics, Additional Mathematics, Mathematics, English and Bahasa Malaysia results, as well as gender and high school background. The outcomes showed a significant difference by gender in the final CGPA and in the Biology and Mathematics subjects at pre-university, and a significant difference by high school background in the Mathematics subject. In general, the correlation between academic achievement at high school and at medical pre-university is moderately significant at an α-level of 0.05, except for language subjects. It was also found that logistic regression techniques gave better prediction models than the multiple linear regression technique for this data set. The developed logistic models were able to give probabilities that closely matched the real cases. Hence, they could be used to identify successful students who are qualified to enter the CUCMS medical faculty before accepting any students to its foundation program.

  4. Classification of Effective Soil Depth by Using Multinomial Logistic Regression Analysis

    Science.gov (United States)

    Chang, C. H.; Chan, H. C.; Chen, B. A.

    2016-12-01

    Classification of effective soil depth is a task of determining the slopeland utilizable limitation in Taiwan. The "Slopeland Conservation and Utilization Act" categorizes the slopeland into agriculture and husbandry land, land suitable for forestry and land for enhanced conservation according to factors including average slope, effective soil depth, soil erosion and parental rock. However, site investigation of the effective soil depth requires cost-effective field work. This research aimed to classify the effective soil depth by using multinomial logistic regression with environmental factors. The Wen-Shui Watershed located in central Taiwan was selected as the study area. The multinomial logistic regression analysis was performed with the assistance of a Geographic Information System (GIS). The effective soil depth was categorized into four levels: deeper, deep, shallow and shallower. The environmental factors of slope, aspect, digital elevation model (DEM), curvature and normalized difference vegetation index (NDVI) were selected for classifying the soil depth. An error matrix was then used to assess the model accuracy. The results showed an overall accuracy of 75%. Finally, a map of effective soil depth was produced to help planners and decision makers determine the slopeland utilizable limitation in the study area.

  5. Screening for ketosis using multiple logistic regression based on milk yield and composition.

    Science.gov (United States)

    Kayano, Mitsunori; Kataoka, Tomoko

    2015-11-01

    Multiple logistic regression was applied to milk yield and composition data for 632 records of healthy cows and 61 records of ketotic cows in Hokkaido, Japan. The purpose was to diagnose ketosis based on milk yield and composition simultaneously. The cows were divided into two groups: (1) multiparous, including 314 healthy cows and 45 ketotic cows and (2) primiparous, including 318 healthy cows and 16 ketotic cows, since nutritional status, milk yield and composition are affected by parity. Multiple logistic regression was applied to these groups separately. For multiparous cows, milk yield (kg/day/cow) and protein-to-fat (P/F) ratio in milk were significant factors (P [...] ketosis. For primiparous cows, lactose content (%), solid not fat (SNF) content (%) and milk urea nitrogen (MUN) content (mg/dl) were significantly associated with ketosis (P [...] ketosis, provided the sensitivity, specificity and AUC values of (1) 0.711, 0.726 and 0.781; and (2) 0.678, 0.767 and 0.738, respectively.

  6. Ordinal logistic regression analysis on the nutritional status of children in KarangKitri village

    Science.gov (United States)

    Ohyver, Margaretha; Yongharto, Kimmy Octavian

    2015-09-01

    Ordinal logistic regression is a statistical technique that can be used to describe the relationship between an ordinal response variable and one or more independent variables. This method has been used in various fields, including health. In this research, ordinal logistic regression is used to describe the relationship between the nutritional status of children and age, gender, height, and family status. The nutritional status of children in this research is divided into over nutrition, well nutrition, less nutrition, and malnutrition. The purpose of this research is to describe the characteristics of children in KarangKitri village and to determine the factors that influence their nutritional status. Three findings were obtained from this research. First, there are still children who are not categorized as having well nutritional status. Second, there are children from families at a sufficient economic level who are nevertheless not in the normal status. Third, the factors that affect the nutritional level of children are age, family status, and height.

  7. Predictive market segmentation model: An application of logistic regression model and CHAID procedure

    Directory of Open Access Journals (Sweden)

    Soldić-Aleksić Jasna

    2009-01-01

    Market segmentation presents one of the key concepts of modern marketing. The main goal of market segmentation is to create groups (segments) of customers that have similar characteristics, needs, wishes and/or similar behavior regarding the purchase of a concrete product/service. Companies can create a specific marketing plan for each of these segments and therefore gain short or long term competitive advantage on the market. Depending on the concrete marketing goal, different segmentation schemes and techniques may be applied. This paper presents a predictive market segmentation model based on the application of a logistic regression model and CHAID analysis. The logistic regression model was used for the purpose of variable selection (from the initial pool of eleven variables), identifying those that are statistically significant for explaining the dependent variable. Selected variables were afterwards included in the CHAID procedure that generated the predictive market segmentation model. The model results are presented on a concrete empirical example in the following form: summary model results, CHAID tree, Gain chart, Index chart, risk and classification tables.

  8. Sparse Logistic Regression for Diagnosis of Liver Fibrosis in Rat by Using SCAD-Penalized Likelihood

    Directory of Open Access Journals (Sweden)

    Fang-Rong Yan

    2011-01-01

    The objective of the present study is to find the quantitative relationship between the progression of liver fibrosis and the levels of certain serum markers using a mathematical model. We provide sparse logistic regression using the smoothly clipped absolute deviation (SCAD) penalized function to diagnose liver fibrosis in rats. Not only does it give a sparse solution with high accuracy, it also provides the users with the precise probabilities of classification with the class information. In the simulated case and the experimental case, the proposed method is comparable to stepwise linear discriminant analysis (SLDA) and sparse logistic regression with the least absolute shrinkage and selection operator (LASSO) penalty, by using receiver operating characteristic (ROC) with Bayesian bootstrap estimating area under the curve (AUC) diagnostic sensitivity for selected variable. Results show that the new approach provides a good correlation between the serum marker levels and the liver fibrosis induced by thioacetamide (TAA) in rats. Meanwhile, this approach might also be used in predicting the development of liver cirrhosis.
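
    For readers unfamiliar with the SCAD penalty named in this abstract, the sketch below evaluates its standard piecewise form for a single coefficient and compares it with the LASSO penalty. The default a = 3.7 is the value commonly recommended in the SCAD literature; the abstract does not say which value the authors used, so it is an assumption here, as are the function names.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty for one coefficient.

    Linear (LASSO-like) near zero, quadratic in the middle range, and
    constant for large |theta|, so large coefficients are not shrunk further.
    Requires lam >= 0 and a > 2.
    """
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    return lam * lam * (a + 1) / 2


def lasso_penalty(theta: float, lam: float) -> float:
    """LASSO (L1) penalty for one coefficient, for comparison."""
    return lam * abs(theta)


if __name__ == "__main__":
    for theta in (0.5, 2.0, 10.0):
        print(theta, scad_penalty(theta, 1.0), lasso_penalty(theta, 1.0))
```

    The two penalties agree for small coefficients, while for large ones the SCAD penalty levels off and the LASSO penalty keeps growing, which is the usual motivation for preferring SCAD when large effects should be estimated with little bias.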

  9. Predicting Factors of INSURE Failure in Low Birth Weight Neonates with RDS; A Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Bita Najafian

    2015-02-01

    Background: Respiratory distress syndrome (RDS) is the most common respiratory disease in premature neonates and the most important cause of death among them. We aimed to investigate factors that predict the success or failure of the INSURE method as a treatment for RDS. Methods: In a cohort study, 45 neonates with diagnosed RDS and birth weight lower than 1500 g were included, and they underwent INSURE followed by NCPAP (nasal continuous positive airway pressure). The patients were divided into failure and success groups, and factors that can predict the success of INSURE were investigated by logistic regression in SPSS version 16. Results: 29 and 16 neonates were observed in the success and failure groups, respectively. Birth weight was the only variable with a significant difference between the two groups (P=0.002). Finally, logistic regression showed that birth weight is the only predicting factor for success (P: 0.001, EXP[β]: 0.009, CI [95%]: 1.003-0.014) and mortality (P: 0.029, EXP[β]: 0.993, CI [95%]: 0.987-0.999) of neonates treated with the INSURE method. Conclusion: Identifying factors that affect the success rate of INSURE can be useful for treating neonates with RDS and reducing their charges, and birth weight was one of the factors affecting INSURE success in this study.

  11. A genetic algorithm for variable selection in logistic regression analysis of radiotherapy treatment outcomes

    Science.gov (United States)

    Gayou, Olivier; Das, Shiva K.; Zhou, Su-Min; Marks, Lawrence B.; Parda, David S.; Miften, Moyed

    2008-01-01

    A given outcome of radiotherapy treatment can be modeled by analyzing its correlation with a combination of dosimetric, physiological, biological, and clinical factors, through a logistic regression fit of a large patient population. The quality of the fit is measured by the combination of the predictive power of this particular set of factors and the statistical significance of the individual factors in the model. We developed a genetic algorithm (GA), in which a small sample of all the possible combinations of variables are fitted to the patient data. New models are derived from the best models, through crossover and mutation operations, and are in turn fitted. The process is repeated until the sample converges to the combination of factors that best predicts the outcome. The GA was tested on a data set that investigated the incidence of lung injury in NSCLC patients treated with 3DCRT. The GA identified a model with two variables as the best predictor of radiation pneumonitis: the V30 (p=0.048) and the ongoing use of tobacco at the time of referral (p=0.074). This two-variable model was confirmed as the best model by analyzing all possible combinations of factors. In conclusion, genetic algorithms provide a reliable and fast way to select significant factors in logistic regression analysis of large clinical studies. PMID:19175102
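
    The search loop described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the authors' implementation: the function name `genetic_select`, its parameters, and the toy scoring function are all assumptions made for the example. In the study, the score of a candidate subset would come from a logistic regression fit to patient data; here a stand-in score simply rewards two hypothetical "useful" variables and penalizes the others.

```python
import random


def genetic_select(n_vars, score, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Search for the variable subset (tuple of bools) with the highest score.

    Requires n_vars >= 2 and pop_size >= 4.
    Returns the best subset found and the best score seen per generation.
    """
    rng = random.Random(seed)
    # Start from a small random sample of all possible variable combinations.
    pop = [tuple(rng.random() < 0.5 for _ in range(n_vars)) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(pop, key=score, reverse=True)
        history.append(score(ranked[0]))
        parents = ranked[: pop_size // 2]      # keep the better half as parents
        children = [ranked[0]]                 # elitism: best model survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_vars)     # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each inclusion flag with probability p_mut.
            child = tuple((not g) if rng.random() < p_mut else g for g in child)
            children.append(child)
        pop = children
    best = max(pop, key=score)
    history.append(score(best))
    return best, history


# Hypothetical stand-in for "quality of the logistic fit" (assumption, not real data).
USEFUL = {0, 3}


def toy_score(mask):
    return sum(1.0 if i in USEFUL else -0.5 for i, keep in enumerate(mask) if keep)


best, history = genetic_select(6, toy_score)
print(best, history[-1])
```

    Because the best model of each generation is carried over unchanged, the best score never decreases from one generation to the next; whether the search reaches the true optimum depends on the population size, the mutation rate, and the scoring function.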

  12. Regularization Paths for Conditional Logistic Regression: The clogitL1 Package

    Directory of Open Access Journals (Sweden)

    Stephen Reid

    2014-07-01

    Full Text Available We apply the cyclic coordinate descent algorithm of Friedman, Hastie, and Tibshirani (2010) to the fitting of a conditional logistic regression model with lasso (ℓ1) and elastic net penalties. The sequential strong rules of Tibshirani, Bien, Hastie, Friedman, Taylor, Simon, and Tibshirani (2012) are also used in the algorithm and it is shown that these offer a considerable speed up over the standard coordinate descent algorithm with warm starts. Once implemented, the algorithm is used in simulation studies to compare the variable selection and prediction performance of the conditional logistic regression model against that of its unconditional (standard) counterpart. We find that the conditional model performs admirably on datasets drawn from a suitable conditional distribution, outperforming its unconditional counterpart at variable selection. The conditional model is also fit to a small real world dataset, demonstrating how we obtain regularization paths for the parameters of the model and how we apply cross validation for this method where natural unconditional prediction rules are hard to come by.
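
    For readers unfamiliar with coordinate descent under lasso and elastic net penalties, the core single-coefficient update can be sketched as below. This is the generic soft-thresholding update for a standardized predictor in the squared-error case, written in Python for illustration; it is not code from the clogitL1 package, and the function names are assumptions. For logistic-type likelihoods the same update is typically applied to a weighted quadratic approximation, so observation weights would enter the computation of `z` and the denominator.

```python
def soft_threshold(z, gamma):
    """Shrink z toward zero by gamma; values within [-gamma, gamma] become 0."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def enet_update(z, lam, alpha):
    """One coordinate update for an elastic net penalty.

    z     : inner product of the predictor with the current partial residual
            (divided by the sample size), assuming a standardized predictor
    lam   : overall penalty strength
    alpha : mixing weight (1.0 = pure lasso, 0.0 = pure ridge)
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))


print(soft_threshold(3.0, 1.0))    # lasso shrinkage of a large coefficient
print(soft_threshold(0.5, 1.0))    # small coefficient is set exactly to zero
print(enet_update(3.0, 1.0, 0.5))  # elastic net: shrink, then rescale
```

    The exact zeros produced by the thresholding step are what make the lasso useful for variable selection.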

  13. Application of logistic regression models in observational methodology: game formats in grassroots football in initiation into football

    Directory of Open Access Journals (Sweden)

    Daniel Lapresa

    2016-01-01

    Full Text Available This study shows how simple and multiple logistic regression can be used in observational methodology and more specifically, in the fields of physical activity and sport. We demonstrate this in a study designed to determine whether three-a-side futsal or five-a-side futsal is more suited to the needs and potential of children aged 6-to-8 years. We constructed a multiple logistic regression model to analyze use of space (depth of play) and three simple logistic regression models to determine which game format is more likely to potentiate effective technical and tactical performance.

  14. A comparison of discriminant analysis and logistic regression for the prediction of coliform mastitis in dairy cows.

    Science.gov (United States)

    Montgomery, M E; White, M E; Martin, S W

    1987-01-01

    Results from discriminant analysis and logistic regression were compared using two data sets from a study on predictors of coliform mastitis in dairy cows. Both techniques selected the same set of variables as important predictors and were of nearly equal value in classifying cows as having, or not having mastitis. The logistic regression model made fewer classification errors. The magnitudes of the effects were considerably different for some variables. Given the failure to meet the underlying assumptions of discriminant analysis, the coefficients from logistic regression are preferable. PMID:3453271

  15. Controlling Type I Error Rates in Assessing DIF for Logistic Regression Method Combined with SIBTEST Regression Correction Procedure and DIF-Free-Then-DIF Strategy

    Science.gov (United States)

    Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung

    2014-01-01

    The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…

  16. Binary logistic regression modelling: Measuring the probability of relapse cases among drug addict

    Science.gov (United States)

    Ismail, Mohd Tahir; Alias, Siti Nor Shadila

    2014-07-01

    For many years Malaysia has faced the issue of drug addiction. The most serious problem is relapse among treated drug addicts (addicts who have undergone the rehabilitation programme at the Narcotic Addiction Rehabilitation Centre, PUSPEN). Thus, the main objective of this study is to find the most significant factors that contribute to relapse. Binary logistic regression analysis was employed to model the relationship between the independent variables (predictors) and the dependent variable. The dependent variable is the status of the drug addict: relapse (Yes, coded as 1) or not (No, coded as 0). The predictors are age, age at first taking drugs, family history, education level, family crisis, community support and self-motivation. The sample comprises 200 cases, with data provided by AADK (National Antidrug Agency). The findings revealed that age and self-motivation are statistically significant predictors of relapse.

  17. ENHANCED PREDICTION OF STUDENT DROPOUTS USING FUZZY INFERENCE SYSTEM AND LOGISTIC REGRESSION

    Directory of Open Access Journals (Sweden)

    A. Saranya

    2016-01-01

    Full Text Available Predicting college and school dropouts is a major problem in the educational system and a complicated challenge because of data imbalance and multidimensionality, which can affect the low performance of students. In this paper, we collected databases from various colleges; among these, the 500 best real attributes were identified in order to find the factors affecting student dropout using a neural-based classification algorithm, and different mining techniques were implemented for data processing. We also propose a Dropout Prediction Algorithm (DPA) using a fuzzy logic and logistic regression based inference system, because the weighted average will improve the performance of the whole system. We compared our proposed work with other classification systems and documented it as giving the best outcomes. The aggregated data is given to decision trees for better dropout prediction. The accuracy of the overall system is 98.6%, which shows that the proposed work gives efficient prediction.

  18. Computational procedures for probing interactions in OLS and logistic regression: SPSS and SAS implementations.

    Science.gov (United States)

    Hayes, Andrew F; Matthes, Jörg

    2009-08-01

    Researchers often hypothesize moderated effects, in which the effect of an independent variable on an outcome variable depends on the value of a moderator variable. Such an effect reveals itself statistically as an interaction between the independent and moderator variables in a model of the outcome variable. When an interaction is found, it is important to probe the interaction, for theories and hypotheses often predict not just interaction but a specific pattern of effects of the focal independent variable as a function of the moderator. This article describes the familiar pick-a-point approach and the much less familiar Johnson-Neyman technique for probing interactions in linear models and introduces macros for SPSS and SAS to simplify the computations and facilitate the probing of interactions in ordinary least squares and logistic regression. A script version of the SPSS macro is also available for users who prefer a point-and-click user interface rather than command syntax.
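
    The pick-a-point approach mentioned in this abstract amounts to evaluating the conditional effect of the focal predictor at a chosen moderator value. The sketch below shows that arithmetic in Python for a model of the form b0 + b1*X + b2*M + b3*X*M (on the logit scale for logistic regression). It is an illustration only, not the SPSS or SAS macro; the function name and the numeric inputs are hypothetical, and in practice the coefficients, variances, and covariance would be taken from the fitted model's output.

```python
import math


def conditional_effect(b1, b3, var_b1, var_b3, cov_b1_b3, m):
    """Effect of X at moderator value m, with its standard error and z ratio.

    effect = b1 + b3 * m
    se     = sqrt(var(b1) + 2 * m * cov(b1, b3) + m**2 * var(b3))
    """
    effect = b1 + b3 * m
    se = math.sqrt(var_b1 + 2.0 * m * cov_b1_b3 + m * m * var_b3)
    return effect, se, effect / se


# Hypothetical estimates (assumptions for the example, not from the article).
for m in (-1.0, 0.0, 1.0, 2.0):
    effect, se, z = conditional_effect(0.5, 0.2, 0.04, 0.01, 0.0, m)
    print(m, round(effect, 3), round(se, 3), round(z, 2))
```

    The Johnson-Neyman technique goes one step further and solves for the moderator values at which this ratio crosses the chosen critical value, rather than evaluating it at a few selected points.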

  19. Modeling data for pancreatitis in presence of a duodenal diverticula using logistic regression

    Science.gov (United States)

    Dineva, S.; Prodanova, K.; Mlachkova, D.

    2013-12-01

    The presence of a periampullary duodenal diverticulum (PDD) is often observed during upper digestive tract barium meal studies and endoscopic retrograde cholangiopancreatography (ERCP). A few papers reported that the diverticulum had something to do with the incidence of pancreatitis. The aim of this study is to investigate if the presence of duodenal diverticula predisposes to the development of a pancreatic disease. A total 3966 patients who had undergone ERCP were studied retrospectively. They were divided into 2 groups-with and without PDD. Patients with a duodenal diverticula had a higher rate of acute pancreatitis. The duodenal diverticula is a risk factor for acute idiopathic pancreatitis. A multiple logistic regression to obtain adjusted estimate of odds and to identify if a PDD is a predictor of acute or chronic pancreatitis was performed. The software package STATISTICA 10.0 was used for analyzing the real data.

  20. AN APPLICATION OF THE LOGISTIC REGRESSION MODEL IN THE EXPERIMENTAL PHYSICAL CHEMISTRY

    Directory of Open Access Journals (Sweden)

    Elpidio Corral-López

    2015-06-01

    Full Text Available Calculating the intensive properties (molar volumes) of ethanol-water mixtures from experimental densities by the tangent method in the Physical Chemistry Laboratory presents the problem of manually drawing the curve of molar volume versus mole fraction and tracing the tangent line. Using a statistical model, logistic regression, on a Texas VOYAGE graphing calculator made it possible to trace the curve and the tangents in situ, and also to evaluate the students' work during the experimental session. The percentage error between the molar volumes calculated using literature data and those obtained with the statistical method is minimal, which validates the model. Using the calculator with this application as a teaching support tool is advantageous, reducing the evaluation time from 3 weeks to 3 hours.

  1. Using Logistic Regression to Model New York City Restaurant Grades Over a Two-Year Period

    Directory of Open Access Journals (Sweden)

    David Nadler

    2014-07-01

    Full Text Available A knowledge gap exists in the role of restaurant type on the prediction of attaining the highest grade possible from the local health inspection agency. This study identified disparities using logistic regression between the issuance of a Grade A and restaurant type and location. This study tested the eight most inspected types of restaurants within the City of New York and calculated the odds ratios of their receiving the highest inspection grade by the New York City Department of Health and Mental Hygiene. A fitted equation has been proposed for the prediction of receiving the highest inspection grade based upon the citywide results of these eight restaurant types from calendar years 2011 and 2012. The results suggest that certain styles of restaurants have lower odds of receiving the highest grade in comparison to American-style restaurants.
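
    As a reminder of what an odds ratio relative to a reference category means in a comparison like this one, the Python sketch below computes it from a 2x2 table together with an approximate 95% confidence interval. The counts are hypothetical and are not taken from the New York City inspection data; the function name is likewise an assumption.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval from a 2x2 table.

    a, b : restaurants of the type of interest with / without a Grade A
    c, d : reference-type restaurants with / without a Grade A
    All four counts must be positive.
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts: 60 of 100 restaurants of one type versus 80 of 100
# reference restaurants receive the highest grade.
print(odds_ratio(60, 40, 80, 20))
```

    An odds ratio below 1 with an interval that excludes 1 corresponds to the abstract's "lower odds of receiving the highest grade" relative to the reference type. In a logistic regression with one indicator variable per type, the exponentiated coefficient plays the same role.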

  2. Logistic Regression Analysis on Factors Affecting Adoption of Rice-Fish Farming in North Iran

    Directory of Open Access Journals (Sweden)

    Seyyed Ali NOORHOSSEINI-NIYAKI

    2012-06-01

    Full Text Available We evaluated the factors influencing the adoption of rice-fish farming in the Tavalesh region near the Caspian Sea in northern Iran. We conducted a survey with open-ended questions. Data were collected from 184 respondents (61 adopters and 123 non-adopters) randomly sampled from selected villages and analyzed using logistic regression and multi-response analysis. Family size, number of contacts with an extension agent, participation in extension-education activities, membership in social institutions and the presence of farm workers were the most important socio-economic factors for the adoption of rice-fish farming system. In addition, economic problems were the most common issue reported by adopters. Other issues such as lack of access to appropriate fish food, losses of fish, lack of access to high quality fish fingerlings and dehydration and poor water quality were also important to a number of farmers.

  3. Desertification Susceptibility Mapping Using Logistic Regression Analysis in the Djelfa Area, Algeria

    Directory of Open Access Journals (Sweden)

    Farid Djeddaoui

    2017-10-01

    Full Text Available The main goal of this work was to identify the areas that are most susceptible to desertification in a part of the Algerian steppe, and to quantitatively assess the key factors that contribute to this desertification. In total, 139 desertified zones were mapped using field surveys and photo-interpretation. We selected 16 spectral and geomorphic predictive factors, which a priori play a significant role in desertification. They were mainly derived from Landsat 8 imagery and Shuttle Radar Topographic Mission digital elevation model (SRTM DEM). Some factors, such as the topographic position index (TPI) and curvature, were used for the first time in this kind of study. For this purpose, we adapted the logistic regression algorithm for desertification susceptibility mapping, which has been widely used for landslide susceptibility mapping. The logistic model was evaluated using the area under the receiver operating characteristic (ROC) curve. The model accuracy was 87.8%. We estimated the model uncertainties using a bootstrap method. Our analysis suggests that the predictive model is robust and stable. Our results indicate that land cover factors, including normalized difference vegetation index (NDVI) and rangeland classes, play a major role in determining desertification occurrence, while geomorphological factors have a limited impact. The predictive map shows that 44.57% of the area is classified as highly to very highly susceptible to desertification. The developed approach can be used to assess desertification in areas with similar characteristics and to guide possible actions to combat desertification.
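
    The two evaluation steps named in this abstract, the area under the ROC curve and a bootstrap estimate of its uncertainty, can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the authors' workflow: the function names, the percentile interval, and the toy scores are all hypothetical, and the abstract does not say which bootstrap variant was used.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve as the share of (positive, negative) pairs
    in which the positive case has the higher score; ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=200, seed=0):
    """Percentile 95% interval for the AUC from resampling cases with replacement.

    Assumes both classes are present in the data; resamples that contain
    only one class are discarded and drawn again.
    """
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


# Toy predicted probabilities and observed classes (hypothetical).
scores = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.55]
labels = [0, 0, 1, 1, 1, 0, 1, 0]
print(auc(scores, labels), bootstrap_auc(scores, labels))
```

    A narrow bootstrap interval is one way to support a claim that a model is "robust and stable", although it does not replace validation on independent data.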

  4. Effective factors contraceptive use by logistic regression model in Tehran, 1996

    Directory of Open Access Journals (Sweden)

    Ramezani F

    1999-07-01

    Full Text Available Despite not wanting more children, about 30% of couples do not use any kind of contraception, which leads to unwanted pregnancy. In this clinical trial study, 4177 subjects who had at least one living child and delivered in one of the 12 university hospitals in Tehran were recruited. This study was conducted in 1996. The questionnaire included questions about contraceptive use and the subjects' attitudes about the wantedness or unwantedness of their current pregnancies. Data were analysed using a logistic regression model. Results showed that 20.3% of those who had no fertility intention did not use any contraception method, and 41.1% of the subjects who were using a contraception method before pregnancy had become pregnant unintentionally. Based on the logistic regression model, age, education, the woman's previous familiarity with contraception methods and the husband's education were the most significant factors in contraceptive use. Subjects aged 20 years or less or 35 years or more, and illiterate subjects, were at higher risk of not using contraception. This risk was not related to the gender of their children, which suggests a positive change in their perspectives towards sex and the number of children. It is suggested that health policymakers choose an appropriate model to enhance literacy, education and counseling for the correct usage of contraceptives and prevention of unwanted pregnancy.

  5. Logistic regression analysis of pedestrian casualty risk in passenger vehicle collisions in China.

    Science.gov (United States)

    Kong, Chunyu; Yang, Jikuang

    2010-07-01

    A large number of pedestrian fatalities were reported in China since the 1990s, however the exposure of pedestrians in public traffic has never been measured quantitatively using in-depth accident data. This study aimed to investigate the association between the impact speed and risk of pedestrian casualties in passenger vehicle collisions based on real-world accident cases in China. The cases were selected from a database of in-depth investigation of vehicle accidents in Changsha-IVAC. The sampling criteria were defined as (1) the accident was a frontal impact that occurred between 2003 and 2009; (2) the pedestrian age was above 14; (3) the injury according to the Abbreviated Injury Scale (AIS) was 1+; (4) the accident involved passenger cars, SUVs, or MPVs; and (5) the vehicle impact speed can be determined. The selected IVAC data set, which included 104 pedestrian accident cases, was weighted based on the national traffic accident data. The logistic regression models of the risks for pedestrian fatalities and AIS 3+ injuries were developed in terms of vehicle impact speed using the unweighted and weighted data sets. A multiple logistic regression model on the risk of pedestrian AIS 3+ injury was developed considering the age and impact speed as two variables. It was found that the risk of pedestrian fatality is 26% at 50 km/h, 50% at 58 km/h, and 82% at 70 km/h. At an impact speed of 80 km/h, the pedestrian rarely survives. The weighted risk curves indicated that the risks of pedestrian fatality and injury in China were higher than those in other high-income countries, whereas the risks of pedestrian casualty were lower than in these countries 30 years ago. The findings could contribute to a better understanding of the exposures of pedestrians in urban traffic in China, and provide background knowledge for the development of strategies for pedestrian protection. Copyright 2010 Elsevier Ltd. All rights reserved.
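
    The reported risk values can be checked for internal consistency with a single-predictor logistic curve. The Python sketch below back-calculates an intercept and slope from two of the reported points (26% at 50 km/h and 50% at 58 km/h) and then evaluates the curve at other speeds. These coefficients are derived here purely for illustration; they are not the authors' fitted estimates, and the function names are assumptions.

```python
import math


def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def line_through(x1, p1, x2, p2):
    """Intercept and slope of the logit-scale line through two (speed, risk) points."""
    slope = (logit(p2) - logit(p1)) / (x2 - x1)
    intercept = logit(p1) - slope * x1
    return intercept, slope


def risk(speed, intercept, slope):
    """Logistic curve: predicted probability at a given impact speed (km/h)."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * speed)))


intercept, slope = line_through(50.0, 0.26, 58.0, 0.50)
for speed in (50.0, 58.0, 70.0, 80.0):
    print(speed, round(risk(speed, intercept, slope), 3))
```

    Under this back-calculation the predicted fatality risk at 70 km/h comes out at roughly 0.83, close to the 82% quoted in the abstract, and the value at 80 km/h is about 0.95, in line with the statement that pedestrians rarely survive at that speed.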

  6. Optimization of Game Formats in U-10 Soccer Using Logistic Regression Analysis

    Directory of Open Access Journals (Sweden)

    Amatria Mario

    2016-12-01

    Full Text Available Small-sided games provide young soccer players with better opportunities to develop their skills and progress as individual and team players. There is, however, little evidence on the effectiveness of different game formats in different age groups, and furthermore, these formats can vary between and even within countries. The Royal Spanish Soccer Association replaced the traditional grassroots 7-a-side format (F-7) with the 8-a-side format (F-8) in the 2011-12 season and the country’s regional federations gradually followed suit. The aim of this observational methodology study was to investigate which of these formats best suited the learning needs of U-10 players transitioning from 5-a-side futsal. We built a multiple logistic regression model to predict the success of offensive moves depending on the game format and the area of the pitch in which the move was initiated. Success was defined as a shot at the goal. We also built two simple logistic regression models to evaluate how the game format influenced the acquisition of technical-tactical skills. It was found that the probability of a shot at the goal was higher in F-7 than in F-8 for moves initiated in the Creation Sector-Own Half (0.08 vs 0.07) and the Creation Sector-Opponent's Half (0.18 vs 0.16). The probability was the same (0.04) in the Safety Sector. Children also had more opportunities to control the ball and pass or take a shot in the F-7 format (0.24 vs 0.20), and these were also more likely to be successful in this format (0.28 vs 0.19).

  7. Modeling group size and scalar stress by logistic regression from an archaeological perspective.

    Directory of Open Access Journals (Sweden)

    Gianmarco Alberti

    Full Text Available Johnson's scalar stress theory, describing the mechanics of (and the remedies to) the increase in in-group conflictuality that parallels the increase in groups' size, provides scholars with a useful theoretical framework for the understanding of different aspects of the material culture of past communities (i.e., social organization, communal food consumption, ceramic style, architecture and settlement layout). Due to its relevance in archaeology and anthropology, the article aims at proposing a predictive model of critical level of scalar stress on the basis of community size. Drawing upon Johnson's theory and on Dunbar's findings on the cognitive constraints to human group size, a model is built by means of logistic regression on the basis of the data on colony fissioning among the Hutterites of North America. On the grounds of the theoretical framework sketched in the first part of the article, the absence or presence of colony fissioning is considered an expression of a not critical vs. critical level of scalar stress for the sake of the model building. The model, which is also tested against a sample of archaeological and ethnographic cases: a) confirms the existence of a significant relationship between critical scalar stress and group size, setting the issue on firmer statistical grounds; b) allows calculating the intercept and slope of the logistic regression model, which can be used at any time to estimate the probability that a community experienced a critical level of scalar stress; c) allows locating a critical scalar stress threshold at community size 127 (95% CI: 122-132), while the maximum probability of critical scalar stress is predicted at size 158 (95% CI: 147-170). The model ultimately provides grounds to assess, for the sake of any further archaeological/anthropological interpretation, the probability that a group reached a hot spot of size development critical for its internal cohesion.

  8. Evaluation of graphical diagnostics for assessing goodness of fit of logistic regression models.

    Science.gov (United States)

    Pavan Kumar, Venkata V; Duffull, Stephen B

    2011-04-01

    The aim of the current work was to evaluate graphical diagnostics for assessment of the fit of logistic regression models. Assessment of goodness of fit of a model to the data set is essential to ensure the model provides an acceptable description of the binary variables seen. For logistic regression the most common diagnostic used for this purpose is binning the data and comparing the empirical probability of the occurrence of a dependent variable with the model predicted probability against the mean covariate value in the bin. Although intuitively appealing this method, which we term simple binning, may not have consistent properties for diagnosing model problems. In this report we describe and evaluate two different diagnostic procedures, random binning and simplified Bayes marginal model plots. These procedures were assessed via simulation under three different designs. Design 1: studies which were balanced on binary variables and a continuous covariate. Design 2: studies that were balanced on binary variables but unbalanced on the continuous covariate. Design 3: studies that were unbalanced on both the binary variables and the covariate. Each simulated study consisted of 500 individuals. Thirty studies were simulated. The covariate of interest was dose which could range from 0 to 20 units. The data were simulated with the dose being related to the outcome according to an Emax model on the logit scale. A logit Emax model (correct model) and a logit linear model (wrong model) were fitted to all data sets. The performance of the above diagnostics, in addition to simple binning, was compared. For all designs the proposed diagnostics performed at least as well and in many instances better than simple binning. In case of design 1 random binning and simple binning are identical. In the case of designs 2 and 3 random binning and simplified Bayes marginal model plots were superior in assessing the model fit when compared to simple binning. For the examples tested…
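
    The "simple binning" diagnostic described in this abstract can be sketched in Python as below: cases are sorted by the covariate, split into bins, and the observed event proportion in each bin is set against the mean model-predicted probability and the mean covariate value. This is an illustrative sketch only; the function name, the equal-count binning rule, and the toy data are assumptions, and the random binning and Bayes marginal model plot procedures proposed by the authors are not reproduced here.

```python
def simple_binning(x, y, p_hat, n_bins):
    """Return one (mean covariate, observed proportion, mean predicted) row per bin.

    x      : covariate values (for example, dose)
    y      : observed binary outcomes coded 0/1
    p_hat  : model-predicted probabilities for the same cases
    n_bins : number of roughly equal-count bins along the sorted covariate
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # more bins than cases: skip empty bins
        k = len(idx)
        rows.append((
            sum(x[i] for i in idx) / k,
            sum(y[i] for i in idx) / k,
            sum(p_hat[i] for i in idx) / k,
        ))
    return rows


# Toy data (hypothetical): four cases split into two bins.
for row in simple_binning([0, 1, 2, 3], [0, 0, 1, 1], [0.1, 0.2, 0.8, 0.9], 2):
    print(row)
```

    Plotting the second and third columns against the first gives the usual visual check; large gaps between observed and predicted values in some bins suggest model misspecification, subject to the caveats about binning that the abstract raises.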

  9. Logistic regression model for diagnosis of transition zone prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh; Fujiwara, Taiki [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim; Emberton, Mark [University College London, Research Department of Urology, London (United Kingdom); Kirkham, Alex; Allen, Clare [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2014-09-17

    We aimed to develop logistic regression (LR) models for classifying prostate cancer within the transition zone on multi-parametric magnetic resonance imaging (mp-MRI). One hundred and fifty-five patients (training cohort, 70 patients; temporal validation cohort, 85 patients) underwent mp-MRI and transperineal-template-prostate-mapping (TPM) biopsy. Positive cores were classified by cancer definitions: (1) any-cancer; (2) definition-1 [≥Gleason 4 + 3 or ≥ 6 mm cancer core length (CCL)] [high risk significant]; and (3) definition-2 (≥Gleason 3 + 4 or ≥ 4 mm CCL) cancer [intermediate-high risk significant]. For each, logistic-regression mp-MRI models were derived from the training cohort and validated internally and with the temporal cohort. Sensitivity/specificity and the area under the receiver operating characteristic (ROC-AUC) curve were calculated. LR model performance was compared to radiologists' performance. Twenty-eight of 70 patients from the training cohort, and 25/85 patients from the temporal validation cohort had significant cancer on TPM. The ROC-AUC of the LR model for classification of cancer was 0.73/0.67 at internal/temporal validation. The radiologist A/B ROC-AUC was 0.65/0.74 (temporal cohort). For patients scored by radiologists as Prostate Imaging Reporting and Data System (Pi-RADS) score 3, sensitivity/specificity of radiologist A 'best guess' and LR model was 0.14/0.54 and 0.71/0.61, respectively; and radiologist B 'best guess' and LR model was 0.40/0.34 and 0.50/0.76, respectively. LR models can improve classification of Pi-RADS score 3 lesions similar to experienced radiologists. (orig.)

  10. Transmission Risks of Schistosomiasis Japonica: Extraction from Back-propagation Artificial Neural Network and Logistic Regression Model: e2123

    National Research Council Canada - National Science Library

    Jun-Fang Xu; Jing Xu; Shi-Zhu Li; Tia-Wu Jia; Xi-Bao Huang; Hua-Ming Zhang; Mei Chen; Guo-Jing Yang; Shu-Jing Gao; Qing-Yun Wang; Xiao-Nong Zhou

    2013-01-01

    ...) and logistic regression model in assessment of transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China...

  11. Transmission risks of schistosomiasis japonica: extraction from back-propagation artificial neural network and logistic regression model

    National Research Council Canada - National Science Library

    Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong

    2013-01-01

    ...) and logistic regression model in assessment of transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China...

  12. Genomic-Enabled Prediction of Ordinal Data with Bayesian Logistic Ordinal Regression.

    Science.gov (United States)

    Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Burgueño, Juan; Eskridge, Kent

    2015-08-18

    Most genomic-enabled prediction models developed so far assume that the response variable is continuous and normally distributed. The exception is the probit model, developed for ordered categorical phenotypes. In statistical applications, because of the easy implementation of the Bayesian probit ordinal regression (BPOR) model, Bayesian logistic ordinal regression (BLOR) is implemented rarely in the context of genomic-enabled prediction [sample size (n) is much smaller than the number of parameters (p)]. For this reason, in this paper we propose a BLOR model using the Pólya-Gamma data augmentation approach that produces a Gibbs sampler with similar full conditional distributions of the BPOR model and with the advantage that the BPOR model is a particular case of the BLOR model. We evaluated the proposed model by using simulation and two real data sets. Results indicate that our BLOR model is a good alternative for analyzing ordinal data in the context of genomic-enabled prediction with the probit or logit link. Copyright © 2015 Montesinos-López et al.

  13. Predicting the aquatic toxicity mode of action using logistic regression and linear discriminant analysis.

    Science.gov (United States)

    Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X

    2016-09-01

    The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through the influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

  14. Loss of Power in Logistic, Ordinal Logistic, and Probit Regression When an Outcome Variable Is Coarsely Categorized

    Science.gov (United States)

    Taylor, Aaron B.; West, Stephen G.; Aiken, Leona S.

    2006-01-01

    Variables that have been coarsely categorized into a small number of ordered categories are often modeled as outcome variables in psychological research. The authors employ a Monte Carlo study to investigate the effects of this coarse categorization of dependent variables on power to detect true effects using three classes of regression models:…

  15. [Formulation of combined predictive indicators using logistic regression model in predicting sepsis and prognosis].

    Science.gov (United States)

    Duan, Liwei; Zhang, Sheng; Lin, Zhaofen

    2017-02-01

    To explore the method and performance of using multiple indices to diagnose sepsis and to predict the prognosis of severely ill patients. Critically ill patients at first admission to the intensive care unit (ICU) of Changzheng Hospital, Second Military Medical University, from January 2014 to September 2015 were enrolled if the following conditions were satisfied: (1) patients were 18-75 years old; (2) the length of ICU stay was more than 24 hours; (3) all records of the patients were available. Patient data were collected by searching the electronic medical record system. A logistic regression model was formulated to create the new combined predictive indicator, and the receiver operating characteristic (ROC) curve for the new predictive indicator was built. The areas under the ROC curve (AUC) for the new indicator and the original ones were compared. The optimal cut-off point was obtained where the Youden index reached its maximum value. Diagnostic parameters such as sensitivity, specificity and predictive accuracy were also calculated for comparison. Finally, individual values were substituted into the equation to test the performance in predicting clinical outcomes. A total of 362 patients (218 males and 144 females) were enrolled in our study and 66 patients died. The average age was (48.3±19.3) years old. (1) For the predictive model containing only categorical covariates [including procalcitonin (PCT), lipopolysaccharide (LPS), infection, white blood cell count (WBC) and fever], increased PCT, increased WBC and fever were demonstrated to be independent risk factors for sepsis in the logistic equation. The AUC for the new combined predictive indicator was higher than that of any other indicator, including PCT, LPS, infection, WBC and fever (0.930 vs. 0.661, 0.503, 0.570, 0.837, 0.800). The optimal cut-off value for the new combined predictive indicator was 0.518.
Using the new indicator to diagnose sepsis, the sensitivity, specificity and diagnostic accuracy
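
    The cut-off rule mentioned in this abstract (choose the threshold where the Youden index J = sensitivity + specificity - 1 is largest) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the scores and labels are hypothetical, and a case is assumed to be called positive when its score is at or above the cut-off.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A case is predicted positive when score >= cutoff (an assumption of this sketch).
    Ties are resolved in favour of the smallest cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        true_neg = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        j = true_pos / n_pos + true_neg / n_neg - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical predicted probabilities and true outcomes (1 = event).
scores = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
cutoff, j = youden_cutoff(scores, labels)
print(cutoff, j)
```

    Scanning only the observed score values is sufficient, because sensitivity and specificity change only at those points.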

  16. Multiple logistic regression model of signalling practices of drivers on urban highways

    Science.gov (United States)

    Puan, Othman Che; Ibrahim, Muttaka Na'iya; Zakaria, Rozana

    2015-05-01

    Signalling is a way of informing other road users, especially conflicting drivers, of a driver's intention to change his/her course of movement. Other users are exposed to hazardous situations and the risk of accident if the driver who changes course fails to signal as required. This paper describes the application of a logistic regression model to the analysis of drivers' signalling practices on multilane highways, based on possible factors affecting the driver's decision such as the driver's gender, vehicle type, vehicle speed and traffic flow intensity. Data pertaining to the analysis of such factors were collected manually. More than 2000 drivers who performed a lane changing manoeuvre while driving on two sections of multilane highways were observed. Findings from the study show that a relatively large proportion of drivers failed to give any signal when changing lane. The result of the analysis indicates that, although the proportion of drivers who failed to signal prior to a lane changing manoeuvre is high, the degree of compliance of female drivers is better than that of male drivers. A binary logistic model was developed to represent the probability that a driver signals prior to a lane changing manoeuvre. The model indicates that the driver's gender, the type of vehicle driven, the speed of the vehicle and the traffic volume influence the driver's decision to signal prior to a lane changing manoeuvre on a multilane urban highway. In terms of the types of vehicles driven, about 97% of motorcyclists failed to comply with the signalling requirement. The proportion of non-compliant drivers under stable traffic flow conditions is much higher than when the flow is relatively heavy. This is consistent with the data, which indicate a high degree of non-compliance when the average speed of the traffic stream is relatively high.

  17. Determinants of unmet need for family planning in rural Burkina Faso: a multilevel logistic regression analysis.

    Science.gov (United States)

    Wulifan, Joseph K; Jahn, Albrecht; Hien, Hervé; Ilboudo, Patrick Christian; Meda, Nicolas; Robyn, Paul Jacob; Saidou Hamadou, T; Haidara, Ousmane; De Allegri, Manuela

    2017-12-19

    Unmet need for family planning has implications for women and their families, such as unsafe abortion, physical abuse, and poor maternal health. Contraceptive knowledge has increased across low-income settings, yet unmet need remains high with little information on the factors explaining it. This study assessed factors associated with unmet need among pregnant women in rural Burkina Faso. We collected data on pregnant women through a population-based survey conducted in 24 rural districts between October 2013 and March 2014. Multivariate multilevel logistic regression was used to assess the association between unmet need for family planning and a selection of relevant demand- and supply-side factors. Of the 1309 pregnant women covered in the survey, 239 (18.26%) reported experiencing unmet need for family planning. Pregnant women with more than three living children [OR = 1.80; 95% CI (1.11-2.91)], those with a child younger than 1 year [OR = 1.75; 95% CI (1.04-2.97)], pregnant women whose partners disapprove of contraceptive use [OR = 1.51; 95% CI (1.03-2.21)] and women who desired fewer children than their partners' preferred number of children [OR = 1.907; 95% CI (1.361-2.672)] were significantly more likely to experience unmet need for family planning, while health staff training in family planning logistics management [OR = 0.46; 95% CI (0.24-0.73)] was associated with a lower probability of experiencing unmet need for family planning. Findings suggest the need to strengthen family planning interventions in Burkina Faso to ensure greater uptake of contraceptive use and thus reduce unmet need for family planning.
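
    The odds ratios and confidence intervals quoted above follow the usual Wald construction for a logistic regression coefficient: OR = exp(beta), with limits exp(beta ± z·SE). The sketch below shows that arithmetic and its inverse. The function names are hypothetical, and the standard error is back-calculated from the reported interval for illustration only; it is not a figure given in the abstract, and the study's multilevel model may have computed its intervals differently.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Wald odds ratio and confidence limits from a coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def beta_se_from_or_ci(odds_ratio, lower, upper, z=Z_95):
    """Approximate coefficient and standard error implied by a reported OR and CI."""
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se


# Reported for women with more than three living children: OR = 1.80, 95% CI (1.11-2.91).
beta, se = beta_se_from_or_ci(1.80, 1.11, 2.91)
or_hat, ci_low, ci_high = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3), round(ci_low, 2), round(ci_high, 2))
```

    The round trip reproduces the reported limits only approximately, because published figures are rounded and the interval is symmetric on the log scale rather than on the odds-ratio scale.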

  18. Sample size matters: Investigating the optimal sample size for a logistic regression debris flow susceptibility model

    Science.gov (United States)

    Heckmann, Tobias; Gegg, Katharina; Becht, Michael

    2013-04-01

    Statistical approaches to landslide susceptibility modelling on the catchment and regional scale are used very frequently compared to heuristic and physically based approaches. In the present study, we deal with the problem of the optimal sample size for a logistic regression model. More specifically, a stepwise approach has been chosen in order to select those independent variables (from a number of derivatives of a digital elevation model and landcover data) that explain best the spatial distribution of debris flow initiation zones in two neighbouring central alpine catchments in Austria (used mutually for model calculation and validation). In order to minimise problems arising from spatial autocorrelation, we sample a single raster cell from each debris flow initiation zone within an inventory. In addition, as suggested by previous work using the "rare events logistic regression" approach, we take a sample of the remaining "non-event" raster cells. The recommendations given in the literature on the size of this sample appear to be motivated by practical considerations, e.g. the time and cost of acquiring data for non-event cases, which do not apply to the case of spatial data. In our study, we aim at finding empirically an "optimal" sample size in order to avoid two problems: First, a sample too large will violate the independent sample assumption as the independent variables are spatially autocorrelated; hence, a variogram analysis leads to a sample size threshold above which the average distance between sampled cells falls below the autocorrelation range of the independent variables. Second, if the sample is too small, repeated sampling will lead to very different results, i.e. the independent variables and hence the result of a single model calculation will be extremely dependent on the choice of non-event cells. Using a Monte-Carlo analysis with stepwise logistic regression, 1000 models are calculated for a wide range of sample sizes. For each sample size

  19. Appropriate assessment of neighborhood effects on individual health: integrating random and fixed effects in multilevel logistic regression

    DEFF Research Database (Denmark)

    Larsen, Klaus; Merlo, Juan

    2005-01-01

    The logistic regression model is frequently used in epidemiologic studies, yielding odds ratio or relative risk interpretations. Inspired by the theory of linear normal models, the logistic regression model has been extended to allow for correlated responses by introducing random effects. However, the model does not inherit the interpretational features of the normal model. In this paper, the authors argue that the existing measures are unsatisfactory (and some of them are even improper) when quantifying results from multilevel logistic regression analyses. The authors suggest a measure of heterogeneity, the median odds ratio, that quantifies cluster heterogeneity and facilitates a direct comparison between covariate effects and the magnitude of heterogeneity in terms of well-known odds ratios. Quantifying cluster-level covariates in a meaningful way is a challenge in multilevel logistic...
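
    For a random-intercept logistic model with normally distributed cluster effects, the median odds ratio is commonly computed as MOR = exp(sqrt(2·σ²)·Φ⁻¹(0.75)), where σ² is the cluster-level variance on the log-odds scale. The abstract does not state the formula, so the sketch below assumes this standard form; the function name and the example variances are hypothetical.

```python
import math
from statistics import NormalDist


def median_odds_ratio(cluster_variance):
    """Median odds ratio implied by a random-intercept variance on the logit scale.

    Assumes normally distributed cluster effects; the difference between two
    randomly chosen cluster effects then has variance 2 * cluster_variance.
    """
    z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
    return math.exp(math.sqrt(2.0 * cluster_variance) * z75)


for variance in (0.0, 0.25, 1.0):
    print(variance, round(median_odds_ratio(variance), 3))
```

    A value of 1 corresponds to no between-cluster variation, and larger values can be read on the same scale as the odds ratios of the fixed effects.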

  20. Flood susceptible analysis at Kelantan river basin using remote sensing and logistic regression model

    Science.gov (United States)

    Pradhan, Biswajeet

    Recently, in 2006 and 2007, heavy monsoon rainfall triggered floods along Malaysia's east coast as well as in the southern state of Johor. The hardest hit areas were along the east coast of peninsular Malaysia in the states of Kelantan, Terengganu and Pahang. In the south, the city of Johor was particularly hard hit. The floods cost nearly a billion ringgit in property and many lives. The extent of damage could have been reduced or minimized if an early warning system had been in place. This paper deals with flood susceptibility analysis using a logistic regression model. We have evaluated the flood susceptibility and the effect of flood-related factors along the Kelantan river basin using Geographic Information System (GIS) and remote sensing data. Previously flooded areas were extracted from archived RADARSAT images using image processing tools. Flood susceptibility mapping was conducted in the study area along the Kelantan River using RADARSAT imagery and then enlarged to 1:25,000 scale. Topographical, hydrological and geological data and satellite images were collected, processed, and constructed into a spatial database using GIS and image processing. The factors chosen as influencing flood occurrence were: topographic slope, topographic aspect, topographic curvature, DEM and distance from river drainage, all from the topographic database; flow direction and flow accumulation, extracted from the hydrological database; geology and distance from lineament, taken from the geologic database; land use from SPOT satellite images; soil texture from the soil database; and the vegetation index value from SPOT satellite images. Flood susceptible areas were analyzed and mapped using the probability-logistic regression model. Results indicate that mapping of flood prone areas can be performed at 1:25,000, which is comparable to some conventional flood hazard map scales. The flood prone areas delineated on these maps correspond to areas that would be inundated by significant flooding

  1. SU-F-R-22: Malignancy Classification for Small Pulmonary Nodules with Radiomics and Logistic Regression

    Energy Technology Data Exchange (ETDEWEB)

    Huang, W; Tu, S [Chang Gung University, Kwei-shan, Tao-Yuan, Taiwan (China)]

    2016-06-15

    Purpose: We conducted a retrospective study of Radiomics research for classifying malignancy of small pulmonary nodules. A machine learning algorithm of logistic regression and open research platform of Radiomics, IBEX (Imaging Biomarker Explorer), were used to evaluate the classification accuracy. Methods: The training set included 100 CT image series from cancer patients with small pulmonary nodules where the average diameter is 1.10 cm. These patients registered at Chang Gung Memorial Hospital and received a CT-guided operation of lung cancer lobectomy. The specimens were classified by experienced pathologists with a B (benign) or M (malignant). CT images with slice thickness of 0.625 mm were acquired from a GE BrightSpeed 16 scanner. The study was formally approved by our institutional internal review board. Nodules were delineated and 374 feature parameters were extracted from IBEX. We first used the t-test and p-value criteria to study which feature can differentiate between group B and M. Then we implemented a logistic regression algorithm to perform nodule malignancy classification. 10-fold cross-validation and the receiver operating characteristic curve (ROC) were used to evaluate the classification accuracy. Finally hierarchical clustering analysis, Spearman rank correlation coefficient, and clustering heat map were used to further study correlation characteristics among different features. Results: 238 features were found differentiable between group B and M based on whether their statistical p-values were less than 0.05. A forward search algorithm was used to select an optimal combination of features for the best classification and 9 features were identified. Our study found the best accuracy of classifying malignancy was 0.79±0.01 with the 10-fold cross-validation. The area under the ROC curve was 0.81±0.02. Conclusion: Benign nodules may be treated as a malignant tumor in low-dose CT and patients may undergo unnecessary surgeries or treatments. Our

  2. A logistic regression based approach for the prediction of flood warning threshold exceedance

    Science.gov (United States)

    Diomede, Tommaso; Trotter, Luca; Stefania Tesini, Maria; Marsigli, Chiara

    2016-04-01

    A method based on logistic regression is proposed for the prediction of river level threshold exceedance at short (+0-18h) and medium (+18-42h) lead times. The aim of the study is to provide a valuable tool for the issuing of warnings by the authority responsible for public safety in case of flood. The role of different precipitation periods as predictors for the exceedance of a fixed river level has been investigated, in order to derive significant information for flood forecasting. Based on catchment-averaged values, a separation of "antecedent" and "peak-triggering" rainfall amounts as independent variables is attempted. In particular, the following flood-related precipitation periods have been considered: (i) the period from 1 to n days before the forecast issue time, which may be relevant for the soil saturation, (ii) the last 24 hours, which may be relevant for the current water level in the river, and (iii) the period from 0 to x hours in advance with respect to the forecast issue time, when the flood-triggering precipitation generally occurs. Several combinations and values of these predictors have been tested to optimise the method implementation. In particular, the period for the precursor antecedent precipitation ranges between 5 and 45 days; the state of the river can be represented by the last 24-h precipitation or, as an alternative, by the current river level. The flood-triggering precipitation has been cumulated over the next 18 hours (for the short lead time) and 36-42 hours (for the medium lead time). The proposed approach requires a specific implementation of logistic regression for each river section and warning threshold. The method performance has been evaluated over the Santerno river catchment (about 450 km²) in the Emilia-Romagna Region, northern Italy. A statistical analysis in terms of false alarms, misses and related scores was carried out by using an 8-year long database. The results are quite satisfactory, with slightly better performances
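
    An evaluation "in terms of false alarms, misses and related scores" is typically built on a 2x2 contingency table of warnings against observed exceedances. The sketch below computes three widely used scores from such a table: probability of detection, false alarm ratio and critical success index. The abstract does not say which scores the authors used, so this choice, the function name and the example series are assumptions.

```python
def warning_scores(forecast, observed):
    """Contingency-table scores for yes/no threshold-exceedance warnings.

    forecast, observed: equal-length sequences of booleans (True = exceedance).
    Assumes at least one warning and at least one observed event.
    """
    hits = sum(1 for f, o in zip(forecast, observed) if f and o)
    misses = sum(1 for f, o in zip(forecast, observed) if not f and o)
    false_alarms = sum(1 for f, o in zip(forecast, observed) if f and not o)
    return {
        "hits": hits,
        "misses": misses,
        "false_alarms": false_alarms,
        "pod": hits / (hits + misses),                 # probability of detection
        "far": false_alarms / (hits + false_alarms),   # false alarm ratio
        "csi": hits / (hits + misses + false_alarms),  # critical success index
    }


# Hypothetical series: a warning is issued when the predicted probability is >= 0.5.
probabilities = [0.9, 0.7, 0.6, 0.4, 0.2, 0.8, 0.1, 0.55]
observed = [True, True, False, True, False, True, False, False]
forecast = [p >= 0.5 for p in probabilities]
scores = warning_scores(forecast, observed)
print(scores)
```

    Lowering the probability threshold usually raises the probability of detection at the price of more false alarms, which is why such scores are reported together.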

  3. Geographical variation of unmet medical needs in Italy: a multivariate logistic regression analysis.

    Science.gov (United States)

    Cavalieri, Marina

    2013-05-12

    Unmet health needs should be, in theory, a minor issue in Italy where a publicly funded and universally accessible health system exists. This, however, does not seem to be the case. Moreover, in the last two decades responsibilities for health care have been progressively decentralized to regional governments, which have differently organized health service delivery within their territories. Regional decision-making has affected the use of health care services, further increasing the existing geographical disparities in the access to care across the country. This study aims at comparing self-perceived unmet needs across Italian regions and assessing how the reported reasons - grouped into the categories of availability, accessibility and acceptability - vary geographically. Data from the 2006 Italian component of the European Union Statistics on Income and Living Conditions are employed to explore reasons and predictors of self-reported unmet medical needs among 45,175 Italian respondents aged 18 and over. Multivariate logistic regression models are used to determine adjusted rates for overall unmet medical needs and for each of the three categories of reasons. Results show that, overall, 6.9% of the Italian population stated having experienced at least one unmet medical need during the last 12 months. The unadjusted rates vary markedly across regions, thus resulting in a clear-cut north-south divide (4.6% in the North-East vs. 10.6% in the South). Among those reporting unmet medical needs, the leading reason was problems of accessibility related to cost or transportation (45.5%), followed by acceptability (26.4%) and availability due to the presence of too long waiting lists (21.4%). In the South, more than one out of two individuals with an unmet need refrained from seeing a physician due to economic reasons. In the northern regions, working and family responsibilities contribute relatively more to the underutilization of medical services. Logistic regression

  4. An epidemiological survey on road traffic crashes in Iran: application of the two logistic regression models.

    Science.gov (United States)

    Bakhtiyari, Mahmood; Mehmandar, Mohammad Reza; Mirbagheri, Babak; Hariri, Gholam Reza; Delpisheh, Ali; Soori, Hamid

    2014-01-01

    Risk factors of human-related traffic crashes are the most important and preventable challenges for community health due to their noteworthy burden, in developing countries in particular. The present study aims to investigate the role of human risk factors of road traffic crashes in Iran. Through a cross-sectional study using the COM 114 data collection forms, the police records of almost 600,000 crashes that occurred in 2010 are investigated. The binary logistic regression and proportional odds regression models are used. The odds ratio for each risk factor is calculated. These models are adjusted for known confounding factors including age, sex and driving time. The traffic crash reports of 537,688 men (90.8%) and 54,480 women (9.2%) are analysed. The mean age is 34.1 ± 14 years. Not maintaining eyes on the road (53.7%) and losing control of the vehicle (21.4%) are the main causes of drivers' deaths in traffic crashes within cities. Not maintaining eyes on the road is also the most frequent human risk factor for road traffic crashes outside cities. Sudden lane excursion (OR = 9.9, 95% CI: 8.2-11.9), seat belt non-compliance (OR = 8.7, CI: 6.7-10.1), exceeding authorised speed (OR = 17.9, CI: 12.7-25.1) and exceeding safe speed (OR = 9.7, CI: 7.2-13.2) are the most significant human risk factors for traffic crashes in Iran. The high mortality rate of 39 people per 100,000 population emphasises the importance of traffic crashes in Iran. Considering the important role of human risk factors in traffic crashes, sustained efforts are required to control dangerous driving behaviours such as speeding, illegal overtaking and not maintaining eyes on the road.

  5. Predicting the "graduate on time (GOT)" of PhD students using binary logistics regression model

    Science.gov (United States)

    Shariff, S. Sarifah Radiah; Rodzi, Nur Atiqah Mohd; Rahman, Kahartini Abdul; Zahari, Siti Meriam; Deni, Sayang Mohd

    2016-10-01

    The Malaysian government has recently set a new goal to produce 60,000 Malaysian PhD holders by the year 2023. As Malaysia's largest institution of higher learning in terms of size and population, offering more than 500 academic programmes in a conducive and vibrant environment, UiTM has taken several initiatives to fill the gap. Increasing the number of PhD graduates is a challenging process. On many occasions, it has been noted that the struggle to reach the target set is even more daunting, and that implementation is far from ideal. Progress has slowed further as the attrition rate increases. This study aims to apply the proposed model, which incorporates several factors, to predict the number of PhD students who will complete their PhD studies on time. A binary logistic regression model is proposed and applied to the data set to determine this number. The results show that only 6.8% of the 2014 PhD students are predicted to graduate on time, and the results are compared with the actual number for validation purposes.

  6. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression.

    Science.gov (United States)

    Shayan, Zahra; Mohammad Gholi Mezerji, Naser; Shayan, Leila; Naseri, Parisa

    2015-11-03

    Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect of sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups.

  7. CPAT: Coding-Potential Assessment Tool using an alignment-free logistic regression model.

    Science.gov (United States)

    Wang, Liguo; Park, Hyun Jung; Dasari, Surendra; Wang, Shengqin; Kocher, Jean-Pierre; Li, Wei

    2013-04-01

    Thousands of novel transcripts have been identified using deep transcriptome sequencing. This discovery of large and 'hidden' transcriptome rejuvenates the demand for methods that can rapidly distinguish between coding and noncoding RNA. Here, we present a novel alignment-free method, Coding Potential Assessment Tool (CPAT), which rapidly recognizes coding and noncoding transcripts from a large pool of candidates. To this end, CPAT uses a logistic regression model built with four sequence features: open reading frame size, open reading frame coverage, Fickett TESTCODE statistic and hexamer usage bias. CPAT software outperformed (sensitivity: 0.96, specificity: 0.97) other state-of-the-art alignment-based software such as Coding-Potential Calculator (sensitivity: 0.99, specificity: 0.74) and Phylo Codon Substitution Frequencies (sensitivity: 0.90, specificity: 0.63). In addition to high accuracy, CPAT is approximately four orders of magnitude faster than Coding-Potential Calculator and Phylo Codon Substitution Frequencies, enabling its users to process thousands of transcripts within seconds. The software accepts input sequences in either FASTA- or BED-formatted data files. We also developed a web interface for CPAT that allows users to submit sequences and receive the prediction results almost instantly.

  8. Knowledge and perception on tuberculosis transmission in Tanzania: Multinomial logistic regression analysis of secondary data.

    Science.gov (United States)

    Ismail, Abbas; Josephat, Peter

    2014-01-01

    Tuberculosis (TB) is one of the most important public health problems in Tanzania and was declared a national public health emergency in 2006. Community and individual knowledge and perceptions are critical factors in the control of the disease. The objective of this study was to analyze knowledge and perception of the transmission of TB in Tanzania. Multinomial logistic regression analysis was used to quantify the impact of knowledge and perception on TB. The data were taken as secondary data from a larger national survey, the 2007-08 Tanzania HIV/AIDS and Malaria Indicator Survey. The findings across groups revealed that knowledge of TB transmission increased with age and level of education. People in rural areas had less knowledge of tuberculosis transmission than those in urban areas [OR = 0.7]. People with access to radio [OR = 1.7] were more knowledgeable about tuberculosis transmission than those without access to radio. People who did not have a telephone [OR = 0.6] were less knowledgeable about the route of tuberculosis transmission than those who had a telephone. The findings showed that socio-demographic factors such as age, education, place of residence and owning a telephone or radio varied systematically with knowledge of tuberculosis transmission.

  9. Landslide Fissure Inference Assessment by ANFIS and Logistic Regression Using UAS-Based Photogrammetry

    Directory of Open Access Journals (Sweden)

    Ozgun Akcay

    2015-10-01

    Unmanned Aerial Systems (UAS) are now capable of gathering high-resolution data; therefore, landslides can be explored in detail at larger scales. In this research, 132 aerial photographs were captured, and 85,456 features were detected and matched automatically using UAS photogrammetry. The root mean square (RMS) values of the image coordinates of the Ground Control Points (GCPs) varied from 0.521 to 2.293 pixels, whereas the maximum RMS value of automatically matched features was calculated as 2.921 pixels. Using the 3D point cloud, which was acquired by aerial photogrammetry, the raster datasets of the aspect, slope, and maximally stable extremal regions (MSER) detecting visual uniformity were defined as three variables, in order to reason about fissure structures on the landslide surface. In this research, an Adaptive Neuro Fuzzy Inference System (ANFIS) and a Logistic Regression (LR) were implemented using training datasets to infer fissure data appropriately. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The experiments showed that high-resolution imagery is an indispensable data source to model and validate landslide fissures appropriately.

  10. The likelihood of achieving quantified road safety targets: a binary logistic regression model for possible factors.

    Science.gov (United States)

    Sze, N N; Wong, S C; Lee, C Y

    2014-12-01

    In the past several decades, many countries have set quantified road safety targets to motivate transport authorities to develop systematic road safety strategies and measures and to facilitate the achievement of continuous road safety improvement. Studies have been conducted to evaluate the association between the setting of quantified road safety targets and road fatality reduction, in both the short and long run, by comparing road fatalities before and after the implementation of a quantified road safety target. However, not much work has been done to evaluate whether the quantified road safety targets are actually achieved. In this study, we used a binary logistic regression model to examine the factors - including vehicle ownership, fatality rate, and national income, in addition to level of ambition and duration of target - that contribute to a target's success. We analyzed 55 quantified road safety targets set by 29 countries from 1981 to 2009, and the results indicate that targets that are in progress and have lower levels of ambition had a higher likelihood of eventually being achieved. Moreover, possible interaction effects on the association between level of ambition and the likelihood of success are also revealed. Copyright © 2014 Elsevier Ltd. All rights reserved.

  11. Fisher Scoring Method for Parameter Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Widyaningsih, Purnami; Retno Sari Saputro, Dewi; Nugrahani Putri, Aulia

    2017-06-01

    The GWOLR model combines the geographically weighted regression (GWR) and ordinal logistic regression (OLR) models. Its parameters are estimated by maximum likelihood. This estimation, however, yields a system of nonlinear equations that is difficult to solve, so a numerical approximation is required. The iterative approximation generally uses the Newton-Raphson (NR) method. The NR method has a disadvantage: its Hessian matrix consists of second derivatives evaluated at each iteration, so it does not always produce converging results. To address this, the NR method is modified by replacing its Hessian matrix with the Fisher information matrix, a modification termed Fisher scoring (FS). The present research seeks to determine the GWOLR model parameter estimates using the Fisher scoring method and to apply the estimation to data on the level of vulnerability to Dengue Hemorrhagic Fever (DHF) in Semarang. The research concludes that health facilities make the greatest contribution to the probability of the number of DHF sufferers in both villages. Based on the number of sufferers, the IR category of DHF in both villages can be determined.
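
    The Fisher scoring update described here, beta_new = beta + I(beta)⁻¹·U(beta) with score vector U and Fisher information I, can be illustrated on a much simpler model than GWOLR. The sketch below fits a weighted binary logistic regression with one predictor; it is not the authors' GWOLR estimator. The function name, the optional per-observation weights (standing in for geographic kernel weights at one location) and the data are all hypothetical. Note that for the binary logit link the expected and observed information coincide, so in this special case Fisher scoring and Newton-Raphson take identical steps; the two differ in models such as the ordinal one in the abstract.

```python
import math


def fisher_scoring_logistic(x, y, weights=None, max_iter=25, tol=1e-10):
    """Fit logit(p) = b0 + b1 * x by Fisher scoring; returns (b0, b1)."""
    if weights is None:
        weights = [1.0] * len(x)
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        u0 = u1 = 0.0          # score vector U
        i00 = i01 = i11 = 0.0  # Fisher information matrix I (symmetric 2x2)
        for xi, yi, wi in zip(x, y, weights):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            v = wi * p * (1.0 - p)
            u0 += wi * (yi - p)
            u1 += wi * (yi - p) * xi
            i00 += v
            i01 += v * xi
            i11 += v * xi * xi
        # Solve I * delta = U for the 2x2 system in closed form.
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if max(abs(d0), abs(d1)) < tol:
            break
    return b0, b1


# Hypothetical, non-separable data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fisher_scoring_logistic(x, y)
print(round(b0, 4), round(b1, 4))
```

    At convergence the score vector is numerically zero, which is a convenient check that the maximum likelihood equations have been solved.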

  12. Application of singularity theory and logistic regression model for tungsten polymetallic potential mapping

    Science.gov (United States)

    Liu, Y.; Xia, Q.; Cheng, Q.; Wang, X.

    2013-07-01

    Geo-anomalies with complex structures in the earth's crust may be defined as preferable hydrothermal ore-forming targets. The separation and explanation of anomalies from the geological background have a profound influence on the analysis of geological evolution and the ore-forming process. Usually one of the key steps in identifying favorable exploration targets is to determine the threshold that separates an anomaly from the geological background. In this paper, singularity theory and the concentration-area (C-A) fractal method were applied to determine the threshold of geo-anomalies. According to the thresholds, each of the four singularity maps can be divided into two segments, which correspond to the anomaly and the background of the geological objects (e.g., faults, fault intersections and contacts); this reveals that the various geo-anomalies can be characterized by their singularities. The results indicate that the local singularity method can be used to identify weak anomalies in a low background. A logistic regression model was used to combine geo-singularity maps for mineral potential mapping, which provides a useful input for further mineral exploration in the Nanling tungsten polymetallic belt, South China.

  13. The Web system for coronary disease risk estimation by logistic regression (“CHD risk”)

    Directory of Open Access Journals (Sweden)

    Ines Drenjančević-Perić

    2009-02-01

    Full Text Available Aim: Coronary and heart diseases (CHD) represent one of the greatest medical problems in the developed world. To facilitate the CHD probability estimation procedure, as well as to speed up the procedure of making and issuing a patient's final diagnosis, we developed the Risk estimation application (“CHD Risk”). Methods: Risk estimation is based upon a multivariate analysis of statistical data by using logistic regression as a method for probability estimation. The method estimates how the final outcome is influenced by every single factor. Risk factors represent independent variables of the model, while a coronary disease risk indicator is the dependent variable. Results: The “CHD Risk” was tested for three cases and showed application credibility. The system provides coronary disease risk probability estimation for the given risk factor values, as well as advice for factors whose values exceed the range of normal values. The system allows input of additional statistical data, which improves its learning properties. Conclusions: Although the “CHD Risk” system can neither make final decisions nor replace the medical professionals themselves, it fulfills the aim to develop the web based tool to help the physicians to monitor their patients' health condition, as well as to suggest preventive measures and therapy.

  14. Network Intrusion Detection through Discriminative Feature Selection by Using Sparse Logistic Regression

    Directory of Open Access Journals (Sweden)

    Reehan Ali Shah

    2017-11-01

    Full Text Available An intrusion detection system (IDS) is a well-known and effective component of network security that provides transactions upon the network systems with security and safety. Most earlier research has addressed difficulties such as overfitting, feature redundancy, high-dimensional features and a limited number of training samples, but not feature selection. We approach the problem of feature selection via sparse logistic regression (SPLR). In this paper, we propose a discriminative feature selection and intrusion classification based on SPLR for IDS. The SPLR is a recently developed technique for data analysis and processing via sparse regularized optimization that selects a small subset from the original feature variables to model the data for the purpose of classification. A linear SPLR model aims to select the discriminative features from the repository of datasets and learns the coefficients of the linear classifier. Compared with feature selection approaches such as filter (ranking) and wrapper methods, which separate the feature selection and classification problems, SPLR can combine feature selection and classification into a unified framework. The experiments in this correspondence demonstrate that the proposed method has better performance than most of the well-known techniques used for intrusion detection.
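    To make the idea of combining feature selection and classification concrete, here is a minimal sketch of L1-penalised logistic regression solved by proximal gradient descent (ISTA). It is an assumption-laden toy, not the SPLR solver used in the paper: the function names, step size, penalty value and six-row dataset are all hypothetical, and the second feature is constructed to carry no information about the label.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t*|v|: shrink toward zero and clip at zero."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def sparse_logistic(X, y, lam, step=0.1, iters=2000):
    """L1-penalised logistic regression without intercept, via ISTA."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        grad = [0.0] * d
        for row, target in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, row))
            p = 1.0 / (1.0 + math.exp(-z))
            for j in range(d):
                grad[j] += (p - target) * row[j] / n
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w

# Toy data: column 0 is informative, column 1 is uninformative by construction.
X = [[-2, 1], [-1, -1], [-1, 1], [1, 1], [1, -1], [2, 1]]
y = [0, 0, 1, 0, 1, 1]
print(sparse_logistic(X, y, lam=0.2))
```

    With a moderate penalty the uninformative coefficient stays at zero while the informative one is retained, which is the sense in which selection and classification happen in one optimization.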

  15. Determining the Impact of Residential Neighbourhood Crime on Housing Investment Using Logistic Regression

    Directory of Open Access Journals (Sweden)

    Sunday Emmanuel Olajide

    2016-12-01

    Full Text Available This paper discusses the impact of criminal activities on residential property value. With regard to criminal activities, the paper emphasizes the contribution of each component of property crime. One thousand (1000) structured questionnaires were administered to the residents of residential estates within the South Western States of Nigeria, of which 467 were considered useable after data screening. Purposive and systematic sampling techniques were used, while logistic regression was used to determine the impact of each of the components of residential property crime on housing investment. The results showed P-values of 0.000, 0.322, 0.335, 0.545 and 0.992 for violent crime, incivilities and street crime, burglary and theft, vandalism and robbery respectively. However, the R2, which represents the generalisation of the impact of neighbourhood crime on housing investment, was 44% and the aggregate P-value was 0.000. Using the Hosmer and Lemeshow (H-L) test of goodness of fit, the model had approximately 89% predictive probability, which is considered excellent. This indicates that the alternative hypothesis is upheld: residential neighbourhood crime is capable of impacting residential property value. The policy implication of this result is that no effort should be spared in combating residential neighbourhood crime in order to boost and encourage housing investment.

  16. Artificial neural networks versus bivariate logistic regression in prediction diagnosis of patients with hypertension and diabetes.

    Science.gov (United States)

    Adavi, Mehdi; Salehi, Masoud; Roudbari, Masoud

    2016-01-01

    Diabetes and hypertension are important non-communicable diseases and their prevalence is important for health authorities. The aim of this study was to determine the predictive precision of the bivariate Logistic Regression (LR) and Artificial Neural Network (ANN) in concurrent diagnosis of diabetes and hypertension. This cross-sectional study was performed with 12000 Iranian people in 2013 using stratified-cluster sampling. The research questionnaire included information on hypertension and diabetes and their risk factors. A perceptron ANN with two hidden layers was applied to the data. To build a joint LR model and ANN, SAS 9.2 and Matlab software were used. The AUC was used to find the more accurate model for predicting diabetes and hypertension. The variables of gender, type of cooking oil, physical activity, family history, age, passive smoking and obesity were entered into the LR model and ANN. The odds ratios of affliction with both diabetes and hypertension are high in females, users of solid oil, those with no physical activity, those with a positive family history, those aged 55 or older, passive smokers and those with obesity. The AUC for the LR model and ANN were 0.78 (p=0.039) and 0.86 (p=0.046), respectively. The best model for concurrent affliction with hypertension and diabetes is the ANN, which has higher accuracy than the bivariate LR model.

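  17. IDENTIFIKASI FAKTOR PREDIKSI DIAGNOSIS TINGKAT KEGANASAN KANKER PAYUDARA METODE STEPWISE BINARY LOGISTIC REGRESSION [Identification of predictive factors for diagnosing breast cancer malignancy level using the stepwise binary logistic regression method]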

    Directory of Open Access Journals (Sweden)

    Retno Aulia Vinarti

    2014-01-01

    Full Text Available The World Health Organization (WHO) reported that deaths caused by cancer worldwide have increased significantly over the last four years. This is also reflected in the increase in breast cancer cases. In Indonesia, the two cases are also the leading causes of adult female deaths. Based on the Hospital Information System, breast cancer patients in inpatient or outpatient care amounted to 28.7%. More than 40% of all cancers can reportedly be prevented through early detection. Information technology can contribute through data mining techniques that shorten diagnosis time and improve accuracy and the selection of factors for early detection of breast cancer. The stepwise binary logistic regression method has the advantage of adding and removing independent variables according to their level of significance in the model. Based on the weighting analysis, the four highest-weighted variables that deserve the most attention are the area of the cancer (area), fineness (smoothness), the number of dots or nuclei of the cancer (concave points), and the grey level of the cancer (texture). The accuracy and processing speed of diagnosing breast cancer severity can therefore be improved through this method.

  18. Propensity score matching of the gymnastics for diabetes mellitus using logistic regression

    Science.gov (United States)

    Otok, Bambang Widjanarko; Aisyah, Amalia; Purhadi, Andari, Shofi

    2017-12-01

    Diabetes Mellitus (DM) is a group of metabolic diseases characterized by abnormal blood glucose levels caused by pancreatic insulin deficiency, decreased insulin effectiveness, or both. A report from the ministry of health shows that DM prevalence in East Java province is 2.1%, while DM prevalence in Indonesia is only 1.5%. Given the high number of DM cases in East Java, preventive action is needed to control the factors causing DM complications. This study aims to determine the combination of factors causing DM complications while reducing bias from confounding variables using Propensity Score Matching (PSM), with propensity scores estimated by binary logistic regression. The data used in this study are medical records from As-Shafa clinic, consisting of 6 covariates and health complication as the response variable. The PSM analysis showed that 22 of 126 DM patients attending gymnastics were paired with patients who did not attend diabetes gymnastics. The Average Treatment of Treated (ATT) estimation results showed that the more patients who did not attend gymnastics, the higher the risk of patients having DM complications.
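    The matching step in PSM can be sketched as follows, assuming propensity scores have already been estimated (for example by a binary logistic regression, as in the abstract). The greedy 1:1 nearest-neighbour rule, the caliper of 0.05, the function name and the scores are all illustrative assumptions; the abstract does not say which matching algorithm the authors used.

```python
def match_nearest(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching on propensity scores, without replacement.

    Returns (treated_index, control_index) pairs whose scores differ by at most `caliper`.
    """
    available = set(range(len(control)))
    pairs = []
    for i, score in enumerate(treated):
        if not available:
            break
        # Closest still-unmatched control unit for this treated unit.
        j = min(available, key=lambda k: abs(control[k] - score))
        if abs(control[j] - score) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs

# Hypothetical scores: the third treated unit has no control within the caliper.
pairs = match_nearest([0.30, 0.62, 0.95], [0.28, 0.60, 0.75])
print(pairs)
```

    Treated units left unmatched are dropped from the comparison, which is one way a study can end up with far fewer matched pairs than treated patients.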

  19. Prediction of Depression in Cancer Patients With Different Classification Criteria, Linear Discriminant Analysis versus Logistic Regression

    Science.gov (United States)

    Shayan, Zahra; Mezerji, Naser Mohammad Gholi; Shayan, Leila; Naseri, Parisa

    2016-01-01

    Background: Logistic regression (LR) and linear discriminant analysis (LDA) are two popular statistical models for prediction of group membership. Although they are very similar, the LDA makes more assumptions about the data. When categorical and continuous variables are used simultaneously, the optimal choice between the two models is questionable. In most studies, classification error (CE) is used to discriminate between subjects in several groups, but this index is not suitable to predict the accuracy of the outcome. The present study compared LR and LDA models using classification indices. Methods: This cross-sectional study selected 243 cancer patients. Sample sets of different sizes (n = 50, 100, 150, 200, 220) were randomly selected and the CE, B, and Q classification indices were calculated by the LR and LDA models. Results: CE revealed a lack of superiority for one model over the other, but the results showed that LR performed better than LDA for the B and Q indices in all situations. No significant effect for sample size on CE was noted for selection of an optimal model. Assessment of the accuracy of prediction of real data indicated that the B and Q indices are appropriate for selection of an optimal model. Conclusion: The results of this study showed that LR performs better in some cases and LDA in others when based on CE. The CE index is not appropriate for classification, although the B and Q indices performed better and offered more efficient criteria for comparison and discrimination between groups. PMID:26925900

  20. [Logistic regression analysis on relationships between traditional Chinese medicine constitutional types and overweight or obesity].

    Science.gov (United States)

    Zhu, Yan-bo; Wang, Qi; Wu, Cheng-yu; Pang, Guo-ming; Zhao, Jian-xiong; Shen, Shi-lin; Xia, Zhong-yuan; Yan, Xue

    2010-11-01

    To explore the relationships between traditional Chinese medicine (TCM) constitutional types and overweight or obesity so as to provide evidence for adjusting constitutional bias and preventing and treating obesity. The data comes from a cross-sectional survey on TCM constitution of 18 805 samples aged above 18 in Beijing and 8 provinces (Jiangsu, Anhui, Gansu, Qinghai, Fujian, Jilin, Jiangxi and Henan) in China. The survey of TCM constitution was performed by standardized constitution in Chinese medicine questionnaire (CCMQ). Discriminatory analysis method was used to judge the individual's constitutional type (gentleness type, qi-deficiency type, yang-deficiency type, yin-deficiency type, phlegm-dampness type, dampness-heat type, blood-stasis type, qi-depression type and special diathesis type). The relationships between TCM constitution types and overweight or obesity was investigated by logistic regression analysis. Compared with gentleness type, the risk of overweight (OR, 2.05; 95% CI, 1.79-2.35) and obesity (OR, 4.34; 95% CI, 3.52-5.36) in phlegm-dampness type is significantly increased; the risk of obesity (OR, 1.60; 95% CI, 1.30-1.98) in qi-deficiency type is significantly higher; the risk of overweight and obesity in yang-deficiency type, blood-stasis type, and qi-depression type is significantly lower. Phlegm-dampness type and qi-deficiency type are the main constitutional risk factors of overweight or obesity.
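    The odds ratios and intervals quoted above relate to logistic regression coefficients in a simple way, sketched below. The sketch assumes a Wald interval, exp(beta ± 1.96·SE); the abstract does not state how its intervals were computed, so the standard error here is backed out from the reported overweight figures (OR 2.05; 95% CI 1.79-2.35) purely for illustration, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald confidence limits from a logit coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Back out the standard error of the log odds ratio from a reported interval."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

# Figures for overweight in the phlegm-dampness type, taken from the abstract.
beta = math.log(2.05)
se = se_from_ci(1.79, 2.35)
odds_ratio, lower, upper = odds_ratio_ci(beta, se)
print(odds_ratio, lower, upper)
```

    Because the interval is symmetric on the log scale, the reported odds ratio should sit near the geometric mean of the two limits, which is a quick consistency check on published figures.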

  1. Prediction of cannabis and cocaine use in adolescence using decision trees and logistic regression

    Directory of Open Access Journals (Sweden)

    Alfonso L. Palmer

    2010-01-01

    Full Text Available Spain is one of the European countries with the highest prevalence of cannabis and cocaine use among young people. The aim of this study was to investigate the factors related to the consumption of cocaine and cannabis among adolescents. A questionnaire was administered to 9,284 students between 14 and 18 years of age in Palma de Mallorca (47.1% boys and 52.9% girls) whose mean age was 15.59 years. Logistic regression and decision trees were carried out in order to model the consumption of cannabis and cocaine. The results show the use of legal substances and committing fraudulence or theft are the main variables that raise the odds of consuming cannabis. In boys, cannabis consumption and a family history of drug use increase the odds of consuming cocaine, whereas in girls the use of alcohol, behaviours of fraudulence or theft and difficulty in some personal skills influence their odds of consuming cocaine. Finally, ease of access to the substance greatly raises the odds of consuming cocaine and cannabis in both genders. Decision trees highlight the role of consuming other substances and committing fraudulence or theft. The results of this study gain importance when it comes to putting into practice effective prevention programmes.

  2. Comparison of logistic regression and artificial neural network in low back pain prediction: second national health survey.

    Science.gov (United States)

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    The purpose of this investigation was to compare empirically predictive ability of an artificial neural network with a logistic regression in prediction of low back pain. Data from the second national health survey were considered in this investigation. This data includes the information of low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the logistic regression was 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2Loglikelihood of the artificial neural network was 0.754 (0.004), 0.3770 and 14757.6, respectively. Based on these three criteria, artificial neural network would give better performance than logistic regression. Although, the difference is statistically significant, it does not seem to be clinically significant.

  3. Logistic Regression for Seismically Induced Landslide Predictions: Using Uniform Hazard and Geophysical Layers as Predictor Variables

    Science.gov (United States)

    Nowicki, M. A.; Hearne, M.; Thompson, E.; Wald, D. J.

    2012-12-01

    Seismically induced landslides present a costly and often fatal threat in many mountainous regions. Substantial effort has been invested to understand where seismically induced landslides may occur in the future. Both slope-stability methods and, more recently, statistical approaches to the problem are described throughout the literature. Though some regional efforts have succeeded, no uniformly agreed-upon method is available for predicting the likelihood and spatial extent of seismically induced landslides. For use in the U. S. Geological Survey (USGS) Prompt Assessment of Global Earthquakes for Response (PAGER) system, we would like to routinely make such estimates, in near-real time, around the globe. Here we use the recently produced USGS ShakeMap Atlas of historic earthquakes to develop an empirical landslide probability model. We focus on recent events, yet include any digitally-mapped landslide inventories for which well-constrained ShakeMaps are also available. We combine these uniform estimates of the input shaking (e.g., peak acceleration and velocity) with broadly available susceptibility proxies, such as topographic slope and surface geology. The resulting database is used to build a predictive model of the probability of landslide occurrence with logistic regression. The landslide database includes observations from the Northridge, California (1994); Wenchuan, China (2008); ChiChi, Taiwan (1999); and Chuetsu, Japan (2004) earthquakes; we also provide ShakeMaps for moderate-sized events without landslide for proper model testing and training. The performance of the regression model is assessed with both statistical goodness-of-fit metrics and a qualitative review of whether or not the model is able to capture the spatial extent of landslides for each event. Part of our goal is to determine which variables can be employed based on globally-available data or proxies, and whether or not modeling results from one region are transferrable to

  4. Predictors of work injury in underground mines - an application of a logistic regression model

    Energy Technology Data Exchange (ETDEWEB)

    P.S. Paul [Indian School of Mines University, Dhanbad (India). Department of Mining Engineering

    2009-05-15

    Mine accidents and injuries are complex and generally characterized by several factors starting from personal to technical, and technical to social characteristics. In this study, an attempt has been made to identify the various factors responsible for work related injuries in mines and to estimate the risk of work injury to mine workers. The prediction of work injury in mines was done by a step-by-step multivariate logistic regression modeling with an application to case study mines in India. In total, 18 variables were considered in this study. Most of the variables are not directly quantifiable. Instruments were developed to quantify them through a questionnaire type survey. Underground mine workers were randomly selected for the survey. Responses from 300 participants were used for the analysis. Four variables, age, negative affectivity, job dissatisfaction, and physical hazards bear significant discriminating power for risk of injury to the workers, comparing between cases and controls in a multivariate situation while controlling all the personal and socio-technical variables. The analysis reveals that negatively affected workers are 2.54 times more prone to injuries than the less negatively affected workers and this factor is a more important risk factor for the case-study mines. Long term planning through identification of the negative individuals, proper counseling regarding the adverse effects of negative behaviors and special training is urgently required. Care should be taken for the aged and experienced workers in terms of their job responsibility and training requirements. Management should provide a friendly atmosphere during work to increase the confidence of the injury prone miners. 44 refs., 4 tabs.

  5. THE ROLE AND PLACE OF LOGISTIC REGRESSION AND ROC ANALYSIS IN SOLVING MEDICAL DIAGNOSTIC TASK

    Directory of Open Access Journals (Sweden)

    S. G. Grigoryev

    2016-01-01

    Full Text Available Diagnostics, equally with prevention and treatment, is a basis of medical science and practice. Over its history, medicine has accumulated a great variety of diagnostic methods for different diseases and pathologic conditions. Nevertheless, new tests, methods and tools are being developed and recommended for application nowadays. Indicators such as sensitivity and specificity, which are defined on the basis of fourfold contingency table construction or the ROC-analysis method with ROC-curve modelling (receiver operating characteristic), are used as the methods to estimate diagnostic capability. A fourfold table is used to estimate a method which confirms or denies the diagnosis, i.e. a quality indicator. The ROC curve, being a graph, allows estimation of model quality by separating two classes on the basis of identifying the cut-off point of a continuous or discrete quantitative attribute. The logistic regression technique is introduced as a tool to develop a mathematical-statistical forecasting model of the probability of the event the researcher is interested in when there are two possible variants of the outcome. The method of ROC analysis is chosen and described in detail as a tool to estimate model quality. The capabilities of the named methods are demonstrated by a real example of the creation and efficiency estimation (sensitivity and specificity) of a forecasting model of the probability of complication development in the form of pyodermatitis in children with atopic dermatitis.
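    The two evaluation tools named above can be sketched in a few lines. Sensitivity and specificity come from the fourfold table at one cut-off, while the area under the ROC curve summarises all cut-offs and equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative one. The function names, scores and labels below are hypothetical and are not taken from the paper.

```python
def sensitivity_specificity(scores, labels, cutoff):
    """Sensitivity and specificity from the fourfold table at a given cut-off."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < cutoff)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities from a fitted logistic model.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
print(sensitivity_specificity(scores, labels, cutoff=0.5))
print(roc_auc(scores, labels))
```

    Moving the cut-off trades sensitivity against specificity, whereas the AUC is unchanged, which is why the two are usually reported together.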

  6. Statistical sex determination from craniometrics: Comparison of linear discriminant analysis, logistic regression, and support vector machines.

    Science.gov (United States)

    Santos, Frédéric; Guyomarc'h, Pierre; Bruzek, Jaroslav

    2014-12-01

    Accuracy of identification tools in forensic anthropology primarily relies upon the variations inherent in the data upon which they are built. Sex determination methods based on craniometrics are widely used and known to be specific to several factors (e.g. sample distribution, population, age, secular trends, measurement technique, etc.). The goal of this study is to discuss the potential variations linked to the statistical treatment of the data. Traditional craniometrics of four samples extracted from documented osteological collections (from Portugal, France, the U.S.A., and Thailand) were used to test three different classification methods: linear discriminant analysis (LDA), logistic regression (LR), and support vector machines (SVM). The Portuguese sample was set as a training model on which the other samples were applied in order to assess the validity and reliability of the different models. The tests were performed using different parameters: some included the selection of the best predictors; some included a strict decision threshold (sex assessed only if the related posterior probability was high, including the notion of indeterminate result); and some used an unbalanced sex-ratio. Results indicated that LR tends to perform slightly better than the other techniques and offers a better selection of predictors. Also, the use of a decision threshold (i.e. p>0.95) is essential to ensure an acceptable reliability of sex determination methods based on craniometrics. Although the Portuguese, French, and American samples share a similar sexual dimorphism, application of Western models on the Thai sample (that displayed a lower degree of dimorphism) was unsuccessful. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
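    The strict decision threshold described above amounts to a three-way rule on the posterior probability, sketched here. The function name and the convention that the input is the posterior probability of "male" are assumptions for illustration; only the 0.95 threshold comes from the abstract.

```python
def classify_with_threshold(p_male, threshold=0.95):
    """Assign a sex only when the posterior probability exceeds the threshold."""
    if p_male > threshold:
        return "male"
    if 1.0 - p_male > threshold:
        return "female"
    return "indeterminate"

# Hypothetical posterior probabilities from a fitted classifier.
for p in (0.97, 0.02, 0.80):
    print(p, classify_with_threshold(p))
```

    Raising the threshold lowers the error rate among classified individuals at the cost of leaving more individuals indeterminate.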

  7. Zone-specific logistic regression models improve classification of prostate cancer on multi-parametric MRI

    Energy Technology Data Exchange (ETDEWEB)

    Dikaios, Nikolaos; Halligan, Steve; Taylor, Stuart; Atkinson, David; Punwani, Shonit [University College London, Centre for Medical Imaging, London (United Kingdom); University College London Hospital, Departments of Radiology, London (United Kingdom); Alkalbani, Jokha; Sidhu, Harbir Singh [University College London, Centre for Medical Imaging, London (United Kingdom); Abd-Alazeez, Mohamed; Ahmed, Hashim U.; Emberton, Mark [University College London, Research Department of Urology, Division of Surgery and Interventional Science, London (United Kingdom); Kirkham, Alex [University College London Hospital, Departments of Radiology, London (United Kingdom); Freeman, Alex [University College London Hospital, Department of Histopathology, London (United Kingdom)

    2015-09-15

    To assess the interchangeability of zone-specific (peripheral-zone (PZ) and transition-zone (TZ)) multiparametric-MRI (mp-MRI) logistic-regression (LR) models for classification of prostate cancer. Two hundred and thirty-one patients (70 TZ training-cohort; 76 PZ training-cohort; 85 TZ temporal validation-cohort) underwent mp-MRI and transperineal-template-prostate-mapping biopsy. PZ and TZ uni/multi-variate mp-MRI LR-models for classification of significant cancer (any cancer-core-length (CCL) with Gleason > 3 + 3 or any grade with CCL ≥ 4 mm) were derived from the respective cohorts and validated within the same zone by leave-one-out analysis. Inter-zonal performance was tested by applying TZ models to the PZ training-cohort and vice-versa. Classification performance of TZ models for TZ cancer was further assessed in the TZ validation-cohort. ROC area-under-curve (ROC-AUC) analysis was used to compare models. The univariate parameters with the best classification performance were the normalised T2 signal (T2nSI) within the TZ (ROC-AUC = 0.77) and normalized early contrast-enhanced T1 signal (DCE-nSI) within the PZ (ROC-AUC = 0.79). Performance was not significantly improved by bi-variate/tri-variate modelling. PZ models that contained DCE-nSI performed poorly in classification of TZ cancer. The TZ model based solely on maximum-enhancement poorly classified PZ cancer. LR-models dependent on DCE-MRI parameters alone are not interchangeable between prostatic zones; however, models based exclusively on T2 and/or ADC are more robust for inter-zonal application. (orig.)

  8. Anomaly Detection Outperforms Logistic Regression in Predicting Outcomes in Trauma Patients.

    Science.gov (United States)

    Dezman, Zachary D W; Gao, Chen; Yang, Shiming; Hu, Peter; Yao, Li; Li, Hsiao-Chi; Chang, Chein-I; Mackenzie, Colin

    2017-01-01

    Recent advancements in trauma resuscitation have shown a great benefit of early identification and control of hemorrhage, which is the most common cause of death in injured patients. We introduce a new analytical approach, anomaly detection (AD), as an alternative method to the traditional logistic regression (LR) method in predicting which injured patients receive transfusions, intensive care, and other interventions. We abstracted routinely collected prehospital vital sign data from patient records (adult patients who survived more than 15 minutes after being directly admitted to a level 1 trauma center). The vital signs of the study cohort were analyzed using both LR and AD methods. Predictions on blood transfusions generated by these approaches were compared with hospital records using the respective areas under the receiver operating characteristic curves (AUROC). Of the patients seen at our trauma center between January 1, 2009, and December 31, 2010, 5,464 were included. AD significantly outperformed LR in identifying which patients would receive transfusions of uncrossmatched blood, transfusion of blood between the time of admission and 6 hours later, the need for intensive care, and in-hospital mortality (mean AUROC = 0.764 and 0.720, respectively). AD and LR provided similar predictions for the patients who would receive massive transfusion. Under the stratified 10 fold times 10 cross-validation test, AD also had significantly lower AUROC variance across subgroups than LR, suggesting AD is a more stable prediction model. AD provides enhanced predictions for clinically relevant outcomes in the trauma patient cohort studied and may assist providers in caring for acutely injured patients in the prehospital arena.

  9. Logistic regression modeling to assess groundwater vulnerability to contamination in Hawaii, USA.

    Science.gov (United States)

    Mair, Alan; El-Kadi, Aly I

    2013-10-01

    Capture zone analysis combined with a subjective susceptibility index is currently used in Hawaii to assess vulnerability to contamination of drinking water sources derived from groundwater. In this study, we developed an alternative objective approach that combines well capture zones with multiple-variable logistic regression (LR) modeling and applied it to the highly-utilized Pearl Harbor and Honolulu aquifers on the island of Oahu, Hawaii. Input for the LR models utilized explanatory variables based on hydrogeology, land use, and well geometry/location. A suite of 11 target contaminants detected in the region, including elevated nitrate (>1 mg/L), four chlorinated solvents, four agricultural fumigants, and two pesticides, was used to develop the models. We then tested the ability of the new approach to accurately separate groups of wells with low and high vulnerability, and the suitability of nitrate as an indicator of other types of contamination. Our results produced contaminant-specific LR models that accurately identified groups of wells with the lowest/highest reported detections and the lowest/highest nitrate concentrations. Current and former agricultural land uses were identified as significant explanatory variables for eight of the 11 target contaminants, while elevated nitrate was a significant variable for five contaminants. The utility of the combined approach is contingent on the availability of hydrologic and chemical monitoring data for calibrating groundwater and LR models. Application of the approach using a reference site with sufficient data could help identify key variables in areas with similar hydrogeology and land use but limited data. In addition, elevated nitrate may also be a suitable indicator of groundwater contamination in areas with limited data. 
    The objective LR modeling approach developed in this study is flexible enough to address a wide range of contaminants and represents a suitable addition to the current subjective approach.

  10. Multitask Coupled Logistic Regression and its Fast Implementation for Large Multitask Datasets.

    Science.gov (United States)

    Gu, Xin; Chung, Fu-Lai; Ishibuchi, Hisao; Wang, Shitong

    2015-09-01

    When facing multitask-learning problems, it is desirable that the learning method could find the correct input-output features and share the commonality among multiple domains and also scale-up for large multitask datasets. We introduce the multitask coupled logistic regression (LR) framework called LR-based multitask classification learning algorithm (MTC-LR), which is a new method for generating each classifier for each task, capable of sharing the commonality among multitask domains. The basic idea of MTC-LR is to use all individual LR based classifiers, each one appropriate for each task domain, but in contrast to other support vector machine (SVM)-based proposals, learning all the parameter vectors of all individual classifiers by using the conjugate gradient method, in a global way and without the use of kernel trick, and being easily extended into its scaled version. We theoretically show that the addition of a new term in the cost function of the set of LRs (that penalizes the diversity among multiple tasks) produces a coupling of multiple tasks that allows MTC-LR to improve the learning performance in a LR way. This finding can make us easily integrate it with a state-of-the-art fast LR algorithm called dual coordinate descent method (CDdual) to develop its fast version MTC-LR-CDdual for large multitask datasets. The proposed algorithm MTC-LR-CDdual is also theoretically analyzed. Our experimental results on artificial and real-datasets indicate the effectiveness of the proposed algorithm MTC-LR-CDdual in classification accuracy, speed, and robustness.

  11. Risk Factors Predicting Infectious Lactational Mastitis: Decision Tree Approach versus Logistic Regression Analysis.

    Science.gov (United States)

    Fernández, Leónides; Mediano, Pilar; García, Ricardo; Rodríguez, Juan M; Marín, María

    2016-09-01

    Objectives: Lactational mastitis frequently leads to a premature abandonment of breastfeeding; its development has been associated with several risk factors. This study aims to use a decision tree (DT) approach to establish the main risk factors involved in mastitis and to compare its performance for predicting this condition with a stepwise logistic regression (LR) model. Methods: Data from 368 cases (breastfeeding women with mastitis) and 148 controls were collected by a questionnaire about risk factors related to medical history of mother and infant, pregnancy, delivery, postpartum, and breastfeeding practices. The performance of the DT and LR analyses was compared using the area under the receiver operating characteristic (ROC) curve. Sensitivity, specificity and accuracy of both models were calculated. Results: Cracked nipples, antibiotics and antifungal drugs during breastfeeding, infant age, breast pumps, familial history of mastitis and throat infection were significant risk factors associated with mastitis in both analyses. Bottle-feeding and milk supply were related to mastitis for certain subgroups in the DT model. The areas under the ROC curves were similar for LR and DT models (0.870 and 0.835, respectively). The LR model had better classification accuracy and sensitivity than the DT model, but the latter had better specificity at the optimal threshold of each curve. Conclusions: The DT and LR models constitute useful and complementary analytical tools to assess the risk of lactational infectious mastitis. The DT approach identifies high-risk subpopulations that need specific mastitis prevention programs and, therefore, it could be used to make the most of public health resources.
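
    The comparison metrics named in this abstract (area under the ROC curve, sensitivity, specificity) can be computed from predicted scores as in the following minimal sketch. The function names and the toy scores in the usage note are hypothetical and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve in its rank (Mann-Whitney) form:
    the probability that a randomly chosen case scores higher than a
    randomly chosen control, with ties counted as one half."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

def sensitivity_specificity(scores, labels, threshold):
    """Classify as positive when score >= threshold, then return
    (sensitivity, specificity)."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)
```

    For example, auc([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0]) counts 5 of 6 case-control pairs as correctly ordered. Either model's predicted probabilities can be passed in as scores.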

  12. Large scale identification and categorization of protein sequences using structured logistic regression.

    Directory of Open Access Journals (Sweden)

    Bjørn P Pedersen

    BACKGROUND: Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well-suited for this task. The classification of P-type ATPases, a large family of ATP-driven membrane pumps transporting essential cations, was selected as a test-case that would generate important biological information as well as provide a proof-of-concept for the application of SLR to a large scale bioinformatics problem. RESULTS: Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known sequences, we analysed 9.3 million sequences in the UniProtKB and attempted to classify a large number of P-type ATPases. To examine the distribution of pumps on organisms, we also applied SLR to 1,123 complete genomes from the Entrez genome database. Finally, we analysed the predicted membrane topology of the identified P-type ATPases. CONCLUSIONS: Using the SLR-based classification tool we are able to run a large scale study of P-type ATPases. This study provides proof-of-concept for the application of SLR to a bioinformatics problem and the analysis of P-type ATPases pinpoints new and interesting targets for further biochemical characterization and structural analysis.

  13. Reporting quality of multivariable logistic regression in selected Indian medical journals.

    Science.gov (United States)

    Kumar, R; Indrayan, A; Chhabra, P

    2012-01-01

    Background: Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. Aims: To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Setting and Design: Analysis of published literature. Materials and Methods: Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between 1994 and 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. Statistical Analysis: SPSS 17 software and non-parametric tests (Kruskal-Wallis H, Mann-Whitney U, Spearman correlation). Results: One hundred and nine articles using MLR to analyze data were found in the selected eight journals. The number of such articles gradually increased after 2003, but the quality score remained almost the same over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR were reported in 75.2% of the articles, and sufficient cases (>10) per covariate of limiting sample size were reported in 58.7%. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of a statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Conclusions: Reporting of MLR in many Indian journals is incomplete. Only one of the 109 articles under review scored 8 out of 10; all others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician, may improve quality of reporting.
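
    One of the criteria mentioned above, sufficient cases (>10) per covariate of the limiting sample size, can be checked with a short calculation. The sketch below assumes that "limiting sample size" means the smaller of the two outcome groups, which is the usual reading of the events-per-variable rule; the function names are hypothetical.

```python
def cases_per_covariate(n_events, n_nonevents, n_covariates):
    """Cases in the limiting (smaller) outcome group per covariate."""
    limiting = min(n_events, n_nonevents)
    return limiting / n_covariates

def has_sufficient_cases(n_events, n_nonevents, n_covariates, minimum=10):
    # Mirrors the ">10" wording of the abstract: strictly more than `minimum`.
    return cases_per_covariate(n_events, n_nonevents, n_covariates) > minimum
```

    For instance, a model with 5 covariates fitted to 45 events and 155 non-events has 9 cases per covariate and would not meet the criterion.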

  14. Reporting quality of multivariable logistic regression in selected Indian medical journals

    Directory of Open Access Journals (Sweden)

    R Kumar

    2012-01-01

    Background: Use of multivariable logistic regression (MLR) modeling has steeply increased in the medical literature over the past few years. Testing of model assumptions and adequate reporting of MLR allow the reader to interpret results more accurately. Aims: To review the fulfillment of assumptions and reporting quality of MLR in selected Indian medical journals using established criteria. Setting and Design: Analysis of published literature. Materials and Methods: Medknow.com publishes 68 Indian medical journals with open access. Eight of these journals had at least five articles using MLR between 1994 and 2008. Articles from each of these journals were evaluated according to the previously established 10-point quality criteria for reporting and to test the MLR model assumptions. Statistical Analysis: SPSS 17 software and non-parametric tests (Kruskal-Wallis H, Mann-Whitney U, Spearman correlation). Results: One hundred and nine articles using MLR to analyze data were found in the selected eight journals. The number of such articles gradually increased after 2003, but the quality score remained almost the same over time. P value, odds ratio, and 95% confidence interval for coefficients in MLR were reported in 75.2% of the articles, and sufficient cases (>10) per covariate of limiting sample size were reported in 58.7%. No article reported the test for conformity of linear gradient for continuous covariates. Total score was not significantly different across the journals. However, involvement of a statistician or epidemiologist as a co-author improved the average quality score significantly (P=0.014). Conclusions: Reporting of MLR in many Indian journals is incomplete. Only one of the 109 articles under review scored 8 out of 10; all others scored less. Appropriate guidelines in instructions to authors, and pre-publication review of articles using MLR by a qualified statistician may improve quality of

  15. [Logistic regression model of noninvasive prediction for portal hypertensive gastropathy in patients with hepatitis B associated cirrhosis].

    Science.gov (United States)

    Wang, Qingliang; Li, Xiaojie; Hu, Kunpeng; Zhao, Kun; Yang, Peisheng; Liu, Bo

    2015-05-12

    To explore the risk factors of portal hypertensive gastropathy (PHG) in patients with hepatitis B associated cirrhosis and establish a logistic regression model of noninvasive prediction. The clinical data of 234 hospitalized patients with hepatitis B associated cirrhosis from March 2012 to March 2014 were analyzed retrospectively. The dependent variable was the occurrence of PHG while the independent variables were screened by binary logistic analysis. Multivariate logistic regression was used for further analysis of significant noninvasive independent variables. A logistic regression model was established and the odds ratio was calculated for each factor. The accuracy, sensitivity and specificity of the model were evaluated by the receiver operating characteristic (ROC) curve. According to univariate logistic regression, the risk factors included hepatic dysfunction, albumin (ALB), bilirubin (TB), prothrombin time (PT), platelet (PLT), white blood cell (WBC), portal vein diameter, spleen index, splenic vein diameter, diameter ratio, PLT to spleen volume ratio, esophageal varices (EV) and gastric varices (GV). Multivariate analysis showed that hepatic dysfunction (X1), TB (X2), PLT (X3) and splenic vein diameter (X4) were the major factors for the occurrence of PHG. The established regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4. The accuracy of the model for PHG was 79.1% with a sensitivity of 77.2% and a specificity of 80.8%. Hepatic dysfunction, TB, PLT and splenic vein diameter are risk factors for PHG and the noninvasive predicted logistic regression model was Logit P = -2.667 + 2.186X1 - 2.167X2 + 0.725X3 + 0.976X4.
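
    As an illustration of how a reported logit equation is turned into a predicted probability, the sketch below applies the inverse logit to the equation quoted in this abstract. The coefficients are copied from the abstract; how X1 to X4 are coded or scaled is not stated there, so any input values are hypothetical and the function names are illustrative only.

```python
import math

def phg_logit(x1, x2, x3, x4):
    """Linear predictor from the abstract:
    Logit P = -2.667 + 2.186*X1 - 2.167*X2 + 0.725*X3 + 0.976*X4.
    The coding of X1..X4 is not given in the abstract (assumption: the
    caller supplies values on whatever scale the authors used)."""
    return -2.667 + 2.186 * x1 - 2.167 * x2 + 0.725 * x3 + 0.976 * x4

def phg_probability(x1, x2, x3, x4):
    """Inverse logit: P = 1 / (1 + exp(-Logit P))."""
    return 1.0 / (1.0 + math.exp(-phg_logit(x1, x2, x3, x4)))
```

    With all four predictors at zero, the probability reduces to the inverse logit of the intercept alone; each coefficient then shifts the log-odds by its value per unit change in the corresponding predictor.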

  16. Education-Based Gaps in eHealth: A Weighted Logistic Regression Approach.

    Science.gov (United States)

    Amo, Laura

    2016-10-12

    Persons with a college degree are more likely to engage in eHealth behaviors than persons without a college degree, compounding the health disadvantages of undereducated groups in the United States. However, the extent to which quality of recent eHealth experience reduces the education-based eHealth gap is unexplored. The goal of this study was to examine how eHealth information search experience moderates the relationship between college education and eHealth behaviors. Based on a nationally representative sample of adults who reported using the Internet to conduct the most recent health information search (n=1458), I evaluated eHealth search experience in relation to the likelihood of engaging in different eHealth behaviors. I examined whether Internet health information search experience reduces the eHealth behavior gaps among college-educated and noncollege-educated adults. Weighted logistic regression models were used to estimate the probability of different eHealth behaviors. College education was significantly positively related to the likelihood of 4 eHealth behaviors. In general, eHealth search experience was negatively associated with health care behaviors, health information-seeking behaviors, and user-generated or content sharing behaviors after accounting for other covariates. Whereas Internet health information search experience has narrowed the education gap in terms of likelihood of using email or Internet to communicate with a doctor or health care provider and likelihood of using a website to manage diet, weight, or health, it has widened the education gap in the instances of searching for health information for oneself, searching for health information for someone else, and downloading health information on a mobile device. The relationship between college education and eHealth behaviors is moderated by Internet health information search experience in different ways depending on the type of eHealth behavior. After controlling for college

  17. The Application of the Cumulative Logistic Regression Model to Automated Essay Scoring

    Science.gov (United States)

    Haberman, Shelby J.; Sinharay, Sandip

    2010-01-01

    Most automated essay scoring programs use a linear regression model to predict an essay score from several essay features. This article applied a cumulative logit model instead of the linear regression model to automated essay scoring. Comparison of the performances of the linear regression model and the cumulative logit model was performed on a…

  18. Establishing the change in antibiotic resistance of Enterococcus faecium strains isolated from Dutch broilers by logistic regression and survival analysis

    NARCIS (Netherlands)

    Stegeman, J.A.; Vernooij, J.C.M.; Khalifa, O.A.; Broek, van den J.; Mevius, D.J.

    2006-01-01

    In this study, we investigated the change in the resistance of Enterococcus faecium strains isolated from Dutch broilers against erythromycin and virginiamycin in 1998, 1999 and 2001 by logistic regression analysis and survival analysis. The E. faecium strains were isolated from caecal samples that

  19. Use of genetic programming, logistic regression, and artificial neural nets to predict readmission after coronary artery bypass surgery.

    Science.gov (United States)

    Engoren, Milo; Habib, Robert H; Dooner, John J; Schwann, Thomas A

    2013-08-01

    As many as 14 % of patients undergoing coronary artery bypass surgery are readmitted within 30 days. Readmission is usually the result of morbidity and may lead to death. The purpose of this study is to develop and compare statistical and genetic programming models to predict readmission. Patients were divided into separate Construction and Validation populations. Using 88 variables, logistic regression, genetic programs, and artificial neural nets were used to develop predictive models. Models were first constructed and tested on the Construction populations, then validated on the Validation population. Areas under the receiver operator characteristic curves (AU ROC) were used to compare the models. Two hundred and two patients (7.6 %) in the 2,644 patient Construction group and 216 (8.0 %) of the 2,711 patient Validation group were re-admitted within 30 days of CABG surgery. Logistic regression predicted readmission with AU ROC = .675 ± .021 in the Construction group. Genetic programs significantly improved the accuracy, AU ROC = .767 ± .001, p genetic programming (AU ROC = .654 ± .001) was still trivially but statistically non-significantly better than that of the logistic regression (AU ROC = .644 ± .020, p = .61). Genetic programming and logistic regression provide alternative methods to predict readmission that are similarly accurate.

  20. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Science.gov (United States)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam

    2015-10-01

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.
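
    The statement that a response with q categories gives q-1 logit equations can be made concrete with a small sketch. It assumes the common baseline-category parameterization, in which each equation is the log-odds of one category against a reference category; the function name and coefficient layout are hypothetical and no values from the study are used.

```python
import math

def multinomial_probs(x, coefs):
    """Category probabilities from q-1 baseline-category logit equations.

    coefs: list of q-1 coefficient vectors, each [intercept, b1, b2, ...],
           one per non-reference category.
    Returns q probabilities, reference category first.
    """
    # One linear predictor per non-reference category:
    # eta_k = log(P(Y = k) / P(Y = reference)).
    etas = [c[0] + sum(b * v for b, v in zip(c[1:], x)) for c in coefs]
    # Subtract the maximum before exponentiating for numerical stability;
    # the reference category has linear predictor 0.
    m = max([0.0] + etas)
    exps = [math.exp(-m)] + [math.exp(e - m) for e in etas]
    total = sum(exps)
    return [e / total for e in exps]
```

    With three response categories (for example normal weight as reference, overweight, obese), coefs holds two coefficient vectors, and exponentiating a coefficient gives the odds ratio for that category against the reference.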

  1. Logistic regression accuracy across different spatial and temporal scales for a wide-ranging species, the marbled murrelet

    Science.gov (United States)

    Carolyn B. Meyer; Sherri L. Miller; C. John Ralph

    2004-01-01

    The scale at which habitat variables are measured affects the accuracy of resource selection functions in predicting animal use of sites. We used logistic regression models for a wide-ranging species, the marbled murrelet, (Brachyramphus marmoratus) in a large region in California to address how much changing the spatial or temporal scale of...

  2. The Overall Odds Ratio as an Intuitive Effect Size Index for Multiple Logistic Regression: Examination of Further Refinements

    Science.gov (United States)

    Le, Huy; Marcus, Justin

    2012-01-01

    This study used Monte Carlo simulation to examine the properties of the overall odds ratio (OOR), which was recently introduced as an index for overall effect size in multiple logistic regression. It was found that the OOR was relatively independent of study base rate and performed better than most commonly used R-square analogs in indexing model…

  3. Logistic regression and cellular automata-based modelling of retail, commercial and residential development in the city of Ahmedabad, India

    NARCIS (Netherlands)

    Munshi, T.; Zuidgeest, M.H.P.; Brussel, M.J.G.; van Maarseveen, M.F.A.M.

    2014-01-01

    This study presents a hybrid simulation model that combines logistic regression and cellular automata-based modelling to simulate future urban growth and development for the city of Ahmedabad in India. The model makes it possible to visualize the consequence of development projections in combination with

  4. Multinomial logistic regression modelling of obesity and overweight among primary school students in a rural area of Negeri Sembilan

    Energy Technology Data Exchange (ETDEWEB)

    Ghazali, Amirul Syafiq Mohd; Ali, Zalila; Noor, Norlida Mohd; Baharum, Adam [Pusat Pengajian Sains Matematik, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia amirul@unisel.edu.my, zalila@cs.usm.my, norlida@usm.my, adam@usm.my (Malaysia)

    2015-10-22

    Multinomial logistic regression is widely used to model the outcomes of a polytomous response variable, a categorical dependent variable with more than two categories. The model assumes that the conditional mean of the dependent categorical variables is the logistic function of an affine combination of predictor variables. Its procedure gives a number of logistic regression models that make specific comparisons of the response categories. When there are q categories of the response variable, the model consists of q-1 logit equations which are fitted simultaneously. The model is validated by variable selection procedures, tests of regression coefficients, a significance test of the overall model, goodness-of-fit measures, and validation of predicted probabilities using odds ratio. This study used the multinomial logistic regression model to investigate obesity and overweight among primary school students in a rural area on the basis of their demographic profiles, lifestyles, and diet and food intake. The results indicated that obesity and overweight of students are related to gender, religion, sleep duration, time spent on electronic games, breakfast intake in a week, with whom meals are taken, protein intake, and also, the interaction between breakfast intake in a week with sleep duration, and the interaction between gender and protein intake.

  5. Applying additive logistic regression to data derived from sensors monitoring behavioral and physiological characteristics of dairy cows to detect lameness

    NARCIS (Netherlands)

    Kamphuis, C.; Frank, E.; Burke, J.; Verkerk, G.A.; Jago, J.

    2013-01-01

    The hypothesis was that sensors currently available on farm that monitor behavioral and physiological characteristics have potential for the detection of lameness in dairy cows. This was tested by applying additive logistic regression to variables derived from sensor data. Data were collected

  6. Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis

    NARCIS (Netherlands)

    Eekhout, I.; Wiel, M.A. van de; Heymans, M.W.

    2017-01-01

    Background. Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin’s Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, to consider whether a categorical covariate with more than two levels

  7. Logistic regression model for identification of right ventricular dysfunction in patients with acute pulmonary embolism by means of computed tomography

    Energy Technology Data Exchange (ETDEWEB)

    Staskiewicz, Grzegorz, E-mail: grzegorz.staskiewicz@gmail.com [1st Department of Radiology, Medical University of Lublin, Lublin (Poland); Department of Human Anatomy, Medical University of Lublin, Lublin (Poland); Czekajska-Chehab, Elżbieta, E-mail: czekajska@gazeta.pl [1st Department of Radiology, Medical University of Lublin, Lublin (Poland); Uhlig, Sebastian, E-mail: uhligs@eranet.pl [1st Department of Radiology, Medical University of Lublin, Lublin (Poland); Przegalinski, Jerzy, E-mail: jerzy.przegalinski@umlub.pl [Department of Cardiology, Medical University of Lublin, Lublin (Poland); Maciejewski, Ryszard, E-mail: maciejewski.r@gmail.com [Department of Human Anatomy, Medical University of Lublin, Lublin (Poland); Drop, Andrzej, E-mail: andrzej.drop@umlub.pl [1st Department of Radiology, Medical University of Lublin, Lublin (Poland)

    2013-08-15

    Purpose: Diagnosis of right ventricular dysfunction in patients with acute pulmonary embolism (PE) is known to be associated with increased risk of mortality. The aim of the study was to calculate a logistic regression model for reliable identification of right ventricular dysfunction (RVD) in patients diagnosed with computed tomography pulmonary angiography. Material and methods: Ninety-seven consecutive patients with acute pulmonary embolism were divided into groups with and without RVD based on echocardiographic measurement of pulmonary artery systolic pressure (PASP). PE severity was graded with the pulmonary obstruction score. CT measurements of heart chambers and mediastinal vessels were performed; position of interventricular septum and presence of contrast reflux into the inferior vena cava were also recorded. The logistic regression model was prepared by means of stepwise logistic regression. Results: Among the parameters used, the final model consisted of pulmonary obstruction score, short axis diameter of right ventricle and diameter of inferior vena cava. The calculated model is characterized by 79% sensitivity and 81% specificity, and its performance was significantly better than single CT-based measurements. Conclusion: The logistic regression model identifies RVD significantly better than single CT-based measurements.

  8. Using Logistic Regression to Predict the Probability of Debris Flows in Areas Burned by Wildfires, Southern California, 2003-2006

    Science.gov (United States)

    Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.

    2008-01-01

    Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of

  9. Susceptibility assessments of landslide triggered by Wencuan earthquake at Longnan by rare events logistic regression analyses

    Science.gov (United States)

    Bai, Shibiao; Glade, Thomas; Bell, Rainer; Wang, Jian

    2010-05-01

    Earthquake triggered landslides are very common throughout the world. In particular the last events, e.g. in Pakistan and in China 2008, have demonstrated that this trigger should not be underestimated. In order to determine the most fragile landslide areas in the future for a similar earthquake, it is important to calculate landslide susceptibility maps for these areas. In this paper, firstly, the earthquake triggered landslide distribution inventory at Longnan, a case study in China, is built up by field investigation and interpretation of remote-sensing image data (SPOT 5 and ALOS). Then we present the approach for the analysis and modeling of landslide data using rare events logistic regression. Data include digital orthophotomaps (DOM), digital elevation models (DEM), topographical parameters (e.g. altitude, slope, aspect, profile curvature, plan curvature, sediment transport capacity index, stream power index, topographic wetness index), geological information and further different GIS layers including settlement, road network and rivers. Landslides were identified by monoscopic manual interpretation, and validated during the field investigation. The quality of susceptibility mapping was validated by splitting the study area into a training and a validation set. The prediction capability analysis showed that the landslide susceptibility map could be used for land planning in this region as well as emergency planning by local authorities. The study area of Longnan is located in southern Gansu province bordering Shanxi in the east and Sichuan in the south. The major geographic features in Longnan are the Qinba Mountains in the east, the Loess Plateau in the north, and the Tibetan Plateau in the west. It is part of the Central Han basin in the east and the Sichuan basin in the south. The geological environment is in particular determined by regional fault zones. Neotectonic movements are active, and seismic activities are frequent.
The length from east to west is
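
    The abstract above does not say which rare-events adjustment was applied. As one hedged illustration, the sketch below implements the prior correction of the intercept described by King and Zeng for samples in which events (here, landslide cells) are over-represented relative to the population. The function name and the rates in the usage note are hypothetical.

```python
import math

def prior_corrected_intercept(intercept, sample_rate, population_rate):
    """Prior correction for a logistic regression fitted on a sample in
    which events are over-represented.

    intercept:       intercept estimated from the sample
    sample_rate:     fraction of events in the sample (y-bar)
    population_rate: fraction of events in the population (tau)

    Corrected intercept = intercept
        - ln( ((1 - tau) / tau) * (y-bar / (1 - y-bar)) ).
    Slope coefficients are left unchanged by this correction.
    """
    odds_ratio = ((1.0 - population_rate) / population_rate) * (
        sample_rate / (1.0 - sample_rate)
    )
    return intercept - math.log(odds_ratio)
```

    For example, if landslide cells make up half of the training sample but only 1% of the mapped area, the intercept is lowered by ln(99), which scales the predicted susceptibility back to the population base rate.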

  10. A comparison of Cox and logistic regression for use in genome-wide association studies of cohort and case-cohort design

    Science.gov (United States)

    Staley, James R; Jones, Edmund; Kaptoge, Stephen; Butterworth, Adam S; Sweeting, Michael J; Wood, Angela M; Howson, Joanna M M

    2017-01-01

    Logistic regression is often used instead of Cox regression to analyse genome-wide association studies (GWAS) of single-nucleotide polymorphisms (SNPs) and disease outcomes with cohort and case-cohort designs, as it is less computationally expensive. Although Cox and logistic regression models have been compared previously in cohort studies, this work does not completely cover the GWAS setting nor extend to the case-cohort study design. Here, we evaluated Cox and logistic regression applied to cohort and case-cohort genetic association studies using simulated data and genetic data from the EPIC-CVD study. In the cohort setting, there was a modest improvement in power to detect SNP–disease associations using Cox regression compared with logistic regression, which increased as the disease incidence increased. In contrast, logistic regression had more power than (Prentice weighted) Cox regression in the case-cohort setting. Logistic regression yielded inflated effect estimates (assuming the hazard ratio is the underlying measure of association) for both study designs, especially for SNPs with greater effect on disease. Given logistic regression is substantially more computationally efficient than Cox regression in both settings, we propose a two-step approach to GWAS in cohort and case-cohort studies: first, analyse all SNPs with logistic regression to identify associated variants below a pre-defined P-value threshold; second, fit Cox regression (appropriately weighted in case-cohort studies) to those identified SNPs to ensure accurate estimation of association with disease. PMID:28594416

  11. CHALLENGES OF AGRICULTURAL ENTREPRENEURSHIP IN URBAN KANO, NIGERIA: A MULTINOMIAL LOGISTIC REGRESSION APPROACH

    OpenAIRE

    Ibrahim Musa Gwadabe

    2017-01-01

    This paper analysed the determinants of agricultural entrepreneurial intentions of the unemployed in urban Kano, Nigeria, using three different multinomial logistic models fitted to primary data obtained via structured questionnaire from 173 of the 200 targeted respondents. The results suggest that age explains the likelihood of starting or engaging in agricultural business. Gender and educational levels were not significant in explaining the likelihood of starting the business. Inade...

  12. Classification of EEG recordings in auditory brain activity via a logistic functional linear regression model.

    OpenAIRE

    Gannaz, Irène

    2014-01-01

    We want to analyse EEG recordings in order to investigate the phonemic categorization at a very early stage of auditory processing. This problem can be modelled by a supervised classification of functional data. Discrimination is explored via a logistic functional linear model, using a wavelet representation of the data. Different procedures are investigated, based on penalized likelihood and principal component reduction or partial least squares reduction.

  13. Adolescent sexual victimization

    DEFF Research Database (Denmark)

    Bramsen, Rikke Holm; Lasgaard, Mathias; Koss, Mary P

    2012-01-01

    at baseline and first time APSV during a 6-month period. Data analysis was a binary logistic regression analysis. Number of sexual partners and displaying sexual risk behaviors significantly predicted subsequent first time peer-on-peer sexual victimization, whereas a history of child sexual abuse, early...

  14. Applying models for ordinal logistic regression to the analysis of household electricity consumption classes in Rio de Janeiro, Brazil

    Energy Technology Data Exchange (ETDEWEB)

    Fuks, Mauricio [Programa de Planejamento Energetico (PPE/COPPE) Universidade Federal do Rio de Janeiro (Brazil); Salazar, Esther [Department of Statistical Methods of the Universidade Federal do Rio de Janeiro (Brazil)

    2008-07-15

    This study applies the proportional odds and partial proportional odds models for ordinal logistic regression to analyze household electricity consumption classes. Micro-data from households situated in the state of Rio de Janeiro during 2004 were used to measure the performance of the models in correctly classifying household electricity consumption classes via sociodemographic, electricity usage and dwelling characteristics. The strategy of using binary logistic regressions to test the main hypothesis of the proportional odds model, suggested by Bender and Grouven, was successful in identifying which of the independent variables could be estimated via the proportional odds assumption. Results indicate that the partial proportional odds model is slightly superior to the more restrictive approach. The study includes probabilistic examples to describe how changes in the independent variables affect the probability of a household belonging to specific classes of electricity consumption. Projections using the final model indicated that the approach may be useful for estimating aggregate household electricity consumption.
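
    For orientation, the two models named in this abstract are commonly written as below. The notation is an assumption made here (the abstract gives no formulas): J ordered classes, cut-points \alpha_j, covariates x with common slopes \beta, and a subset z of covariates whose effects \gamma_j are allowed to vary by cut-point. Sign conventions differ between textbooks and software.

```latex
\begin{align}
  % Proportional odds: one slope vector shared by all J-1 cumulative logits
  \operatorname{logit} P(Y \le j \mid x) &= \alpha_j - x^{\top}\beta,
      & j &= 1, \dots, J-1, \\
  % Partial proportional odds: covariates in z get cut-point-specific effects
  \operatorname{logit} P(Y \le j \mid x, z) &= \alpha_j - x^{\top}\beta - z^{\top}\gamma_j,
      & \gamma_1 &= 0 .
\end{align}
```

    Setting every \gamma_j to zero recovers the proportional odds model, which is why it is the more restrictive of the two.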

  15. Updated logistic regression equations for the calculation of post-fire debris-flow likelihood in the western United States

    Science.gov (United States)

    Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.

    2016-06-30

    Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.

  16. The application of GIS-based logistic regression for landslide susceptibility mapping in the Kakuda-Yahiko Mountains, Central Japan

    Science.gov (United States)

    Ayalew, Lulseged; Yamagishi, Hiromitsu

    2005-02-01

    As a first step forward in regional hazard management, multivariate statistical analysis in the form of logistic regression was used to produce a landslide susceptibility map in the Kakuda-Yahiko Mountains of Central Japan. There are different methods to prepare landslide susceptibility maps. The use of logistic regression in this study stemmed not only from the fact that this approach relaxes the strict assumptions required by other multivariate statistical methods, but also to demonstrate that it can be combined with bivariate statistical analyses (BSA) to simplify the interpretation of the model obtained at the end. In susceptibility mapping, the use of logistic regression is to find the best fitting function to describe the relationship between the presence or absence of landslides (dependent variable) and a set of independent parameters such as slope angle and lithology. Here, an inventory map of 87 landslides was used to produce a dependent variable, which takes a value of 0 for the absence and 1 for the presence of slope failures. Lithology, bed rock-slope relationship, lineaments, slope gradient, aspect, elevation and road network were taken as independent parameters. The effect of each parameter on landslide occurrence was assessed from the corresponding coefficient that appears in the logistic regression function. The interpretations of the coefficients showed that road network plays a major role in determining landslide occurrence and distribution. Among the geomorphological parameters, aspect and slope gradient have a more significant contribution than elevation, although field observations showed that the latter is a good estimator of the approximate location of slope cuts. Using a predicted map of probability, the study area was classified into five categories of landslide susceptibility: extremely low, very low, low, medium and high. The medium and high susceptibility zones make up 8.87% of the total study area and involve mid-altitude slopes in the

  17. Identifying determinants and estimating the risk of inadequate and excess gestational weight gain using a multinomial logistic regression model

    OpenAIRE

    Beyene, Joseph; Neupane, Binod; McDonald, Sarah

    2014-01-01

    Binod Neupane (1), Sarah D McDonald (1,2), Joseph Beyene (1). (1) Department of Clinical Epidemiology and Biostatistics, (2) Department of Obstetrics and Gynecology and Radiology, McMaster University, Hamilton, ON, Canada. Abstract: When there are three or more nominal categories of a response variable, the binomial logistic regression approach is widely used to model the relationships of exposure variables with different binomial responses one at a time. However, some of the separate binomial comparisons wo...

  18. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    Science.gov (United States)

    Saro, Lee; Woo, Jeon Seong; Kwan-Young, Oh; Moung-Jin, Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility through spatial analysis, applying a statistical methodology based on GIS. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built on the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility map by comparing it with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to analyse landslide susceptibility with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, 'slope' yielded the highest weight value (1.330), and 'aspect' yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.

  19. The spatial prediction of landslide susceptibility applying artificial neural network and logistic regression models: A case study of Inje, Korea

    Directory of Open Access Journals (Sweden)

    Saro Lee

    2016-02-01

    The aim of this study is to predict landslide susceptibility through spatial analysis, applying a statistical methodology based on GIS. Logistic regression models along with an artificial neural network were applied and validated to analyze landslide susceptibility in Inje, Korea. Landslide occurrence areas in the study were identified based on interpretations of optical remote sensing data (aerial photographs) followed by field surveys. A spatial database considering forest, geophysical, soil and topographic data was built on the study area using the Geographical Information System (GIS). These factors were analysed using artificial neural network (ANN) and logistic regression models to generate a landslide susceptibility map. The study validates the landslide susceptibility map by comparing it with landslide occurrence areas. The locations of landslide occurrence were divided randomly into a training set (50%) and a test set (50%). The training set was used to analyse landslide susceptibility with the artificial neural network and logistic regression models, and the test set was retained to validate the prediction map. The validation results revealed that the artificial neural network model (with an accuracy of 80.10%) was better at predicting landslides than the logistic regression model (with an accuracy of 77.05%). Of the weights used in the artificial neural network model, ‘slope’ yielded the highest weight value (1.330), and ‘aspect’ yielded the lowest value (1.000). This research applied two statistical analysis methods in a GIS and compared their results. Based on the findings, we were able to derive a more effective method for analyzing landslide susceptibility.

  20. A comparative study of slope failure prediction using logistic regression, support vector machine and least square support vector machine models

    Science.gov (United States)

    Zhou, Lim Yi; Shan, Fam Pei; Shimizu, Kunio; Imoto, Tomoaki; Lateh, Habibah; Peng, Koay Swee

    2017-08-01

    A comparative study of logistic regression, support vector machine (SVM) and least square support vector machine (LSSVM) models has been done to predict the slope failure (landslide) along East-West Highway (Gerik-Jeli). The effects of two monsoon seasons (southwest and northeast) that occur in Malaysia are considered in this study. Two related factors of occurrence of slope failure are included in this study: rainfall and underground water. For each method, two predictive models are constructed, namely SOUTHWEST and NORTHEAST models. Based on the results obtained from logistic regression models, two factors (rainfall and underground water level) contribute to the occurrence of slope failure. The accuracies of the three statistical models for two monsoon seasons are verified by using Relative Operating Characteristics curves. The validation results showed that all models produced prediction of high accuracy. For the results of SVM and LSSVM, the models using RBF kernel showed better prediction compared to the models using linear kernel. The comparative results showed that, for SOUTHWEST models, three statistical models have relatively similar performance. For NORTHEAST models, logistic regression has the best predictive efficiency whereas the SVM model has the second best predictive efficiency.

  1. Comparison of ν-support vector regression and logistic equation for ...

    African Journals Online (AJOL)

    Due to the complexity and high non-linearity of bioprocesses, most simple mathematical models fail to describe the exact behavior of biochemical systems. As a novel type of learning method, support vector regression (SVR) has a powerful capability to characterize problems with small samples, nonlinearity, and high dimension ...

  2. An alternative to evaluate the efficiency of in vitro culture medium using a logistic regression model

    Directory of Open Access Journals (Sweden)

    Daniel Furtado Ferreira

    2003-01-01

    The evaluation of a culture medium for the in vitro culture of a species is performed using its physical and/or chemical properties. However, the analysis of the experimental results makes it possible to evaluate its quality. In this sense, this work presents an alternative using a logistic model to evaluate the culture medium to be used in vitro. The probabilities provided by this model will be used as a medium evaluator index. The importance of this index is based on the formalization of a statistical criterion for the selection of the adequate culture medium to be used in in vitro culture without excluding its physical and/or chemical properties. To demonstrate this procedure, an experiment determining the ideal medium for the in vitro culture of primary explants of Ipeca [Psychotria ipecacuanha (Brot.) Stokes] was evaluated. The differentiation of the culture medium was based on the presence and absence of the growth regulator BAP (6-benzylaminopurine). A logistic model was fitted as a function of the weight of fresh and dry matter. Minimum, medium and maximum probabilities obtained with this model showed that the culture medium containing BAP was the most adequate for the explant growth. Due to the high discriminative power of these mediums, detected by the model, their use is recommended as an alternative to select culture medium for similar experiments.

  3. The quest for conditional independence in prospectivity modeling: weights-of-evidence, boost weights-of-evidence, and logistic regression

    Science.gov (United States)

    Schaeben, Helmut; Semmler, Georg

    2016-09-01

    The objective of prospectivity modeling is prediction of the conditional probability of the presence T = 1 or absence T = 0 of a target T given favorable or prohibitive predictors B, or construction of a two-class (0, 1) classification of T. A special case of logistic regression called weights-of-evidence (WofE) is geologists' favorite method of prospectivity modeling due to its apparent simplicity. However, the numerical simplicity is deceiving as it is implied by the severe mathematical modeling assumption of joint conditional independence of all predictors given the target. General weights of evidence are explicitly introduced which are as simple to estimate as conventional weights, i.e., by counting, but do not require conditional independence. Complementary to the regression view is the classification view on prospectivity modeling. Boosting is the construction of a strong classifier from a set of weak classifiers. From the regression point of view it is closely related to logistic regression. Boost weights-of-evidence (BoostWofE) was introduced into prospectivity modeling to counterbalance violations of the assumption of conditional independence even though relaxation of modeling assumptions with respect to weak classifiers was not the (initial) purpose of boosting. In the original publication of BoostWofE a fabricated dataset was used to "validate" this approach. Using the same fabricated dataset it is shown that BoostWofE cannot generally compensate for lacking conditional independence, whatever the consecutive processing order of predictors. Thus the alleged features of BoostWofE are disproved by way of counterexamples, while theoretical findings are confirmed that logistic regression including interaction terms can exactly compensate violations of joint conditional independence if the predictors are indicators.
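
    To make "estimating weights by counting" concrete, here is a minimal sketch of conventional weights of evidence for a single binary predictor B and target T. The function names and the eight-cell toy data are assumptions introduced here; the general weights and BoostWofE discussed in the abstract are not implemented.

```python
import math

def weights_of_evidence(b, t):
    """Return (prior_logit, w_plus, w_minus) for binary lists b and t."""
    n11 = sum(1 for bi, ti in zip(b, t) if bi == 1 and ti == 1)
    n10 = sum(1 for bi, ti in zip(b, t) if bi == 1 and ti == 0)
    n01 = sum(1 for bi, ti in zip(b, t) if bi == 0 and ti == 1)
    n00 = sum(1 for bi, ti in zip(b, t) if bi == 0 and ti == 0)
    n_t1, n_t0 = n11 + n01, n10 + n00
    # W+ = ln[P(B=1|T=1) / P(B=1|T=0)], W- = ln[P(B=0|T=1) / P(B=0|T=0)]
    w_plus = math.log((n11 / n_t1) / (n10 / n_t0))
    w_minus = math.log((n01 / n_t1) / (n00 / n_t0))
    prior_logit = math.log(n_t1 / n_t0)
    return prior_logit, w_plus, w_minus

def posterior_probability(prior_logit, weights):
    """Add the applicable weights to the prior logit and map back to (0, 1)."""
    z = prior_logit + sum(weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical toy cells: predictor pattern b and target occurrence t.
b = [1, 1, 1, 0, 0, 0, 0, 1]
t = [1, 1, 0, 0, 0, 1, 0, 1]
prior, w_plus, w_minus = weights_of_evidence(b, t)
p_given_b1 = posterior_probability(prior, [w_plus])
```

    With one predictor the result simply reproduces the observed fraction of target cells among cells with B = 1. With several predictors, summing their weights is justified only under the joint conditional independence assumption that the abstract criticises.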

  4. Logistic quantile regression provides improved estimates for bounded avian counts: A case study of California Spotted Owl fledgling production

    Science.gov (United States)

    Cade, Brian S.; Noon, Barry R.; Scherer, Rick D.; Keane, John J.

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical conditional distribution of a bounded discrete random variable. The logistic quantile regression model requires that counts are randomly jittered to a continuous random variable, logit transformed to bound them between specified lower and upper values, then estimated in conventional linear quantile regression, repeating the 3 steps and averaging estimates. Back-transformation to the original discrete scale relies on the fact that quantiles are equivariant to monotonic transformations. We demonstrate this statistical procedure by modeling 20 years of California Spotted Owl fledgling production (0−3 per territory) on the Lassen National Forest, California, USA, as related to climate, demographic, and landscape habitat characteristics at territories. Spotted Owl fledgling counts increased nonlinearly with decreasing precipitation in the early nesting period, in the winter prior to nesting, and in the prior growing season; with increasing minimum temperatures in the early nesting period; with adult compared to subadult parents; when there was no fledgling production in the prior year; and when percentage of the landscape surrounding nesting sites (202 ha) with trees ≥25 m height increased. Changes in production were primarily driven by changes in the proportion of territories with 2 or 3 fledglings. Average variances of the discrete cumulative distributions of the estimated fledgling counts indicated that temporal changes in climate and parent age class explained 18% of the annual variance in owl fledgling production, which was 34% of the total variance. Prior fledgling production explained as much of
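
    The three steps named in this abstract (jitter, logit transform, quantile estimation, then repeat and average) can be sketched for the simplest possible case. This is an assumption-laden illustration: it has no covariates, so the "quantile regression" step reduces to taking a sample quantile; the bounds lo and hi, the number of repetitions, and the toy counts are made up here; and the final mapping back to the discrete count scale is omitted.

```python
import math
import random

def logit_bounded(z, lo, hi):
    """Logit transform of z relative to the open interval (lo, hi)."""
    return math.log((z - lo) / (hi - z))

def inv_logit_bounded(v, lo, hi):
    """Inverse of logit_bounded, returning a value in (lo, hi)."""
    return (lo + hi * math.exp(v)) / (1.0 + math.exp(v))

def order_stat_quantile(values, tau):
    """Sample quantile defined as a single order statistic."""
    s = sorted(values)
    return s[min(int(tau * len(s)), len(s) - 1)]

def jittered_logit_quantile(counts, tau, lo=-0.001, hi=4.001, reps=50, seed=0):
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        jittered = [c + rng.random() for c in counts]                # step 1: jitter
        transformed = [logit_bounded(z, lo, hi) for z in jittered]   # step 2: logit
        estimates.append(order_stat_quantile(transformed, tau))      # step 3: quantile
    mean_logit = sum(estimates) / reps          # average over the repetitions
    return inv_logit_bounded(mean_logit, lo, hi)  # back to the jittered count scale

# Hypothetical fledgling counts per territory, bounded between 0 and 3.
toy_counts = [0, 0, 1, 2, 2, 2, 3, 3, 0, 1]
median_estimate = jittered_logit_quantile(toy_counts, tau=0.5)
```

    The back-transformation is legitimate because a quantile of the transformed values equals the transform of the quantile for any increasing function, which is the equivariance property the abstract relies on.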

  5. Application of Ordinal Logistic Regression and Artificial Neural Networks in a Study of Student Satisfaction

    Directory of Open Access Journals (Sweden)

    Meral Yay

    2009-06-01

    Measuring student satisfaction is an important issue, especially for university administration, in order to improve student services and opportunities. The major objective of this study is to provide a solution for this issue. Consequently, student satisfaction has been measured with an ordered five-point Likert scale. A student satisfaction questionnaire was applied to a total of 314 university students, consisting of 208 female and 106 male students, and satisfaction was measured by asking students to respond to 19 questionnaire items. Ordinal regression and artificial neural network methods were applied to the collected data which emphasized the differences between the two methods in terms of the correct classification percentages

  6. Modified Logistic Regression Approaches to Eliminating the Impact of Response Styles on DIF Detection in Likert-Type Scales.

    Science.gov (United States)

    Chen, Hui-Fang; Jin, Kuan-Yu; Wang, Wen-Chung

    2017-01-01

    Extreme response styles (ERS) are prevalent in Likert- or rating-type data, but previous research has not adequately addressed their impact on differential item functioning (DIF) assessments. This study aimed to fill this knowledge gap and examined their influence on the performance of logistic regression (LR) approaches in DIF detection, including the ordinal logistic regression (OLR) and the logistic discriminant functional analysis (LDFA). Results indicated that both the standard OLR and LDFA yielded severely inflated false positive rates as the magnitude of the differences in ERS increased between two groups. This study proposed a class of modified LR approaches to eliminating the ERS effect on DIF assessment. These proposed modifications showed satisfactory control of false positive rates when no DIF items existed and yielded a better control of false positive rates and more accurate true positive rates under DIF conditions than the conventional LR approaches did. In conclusion, the proposed modifications are recommended in survey research when there are multiple groups or cultural groups.

  7. The effect of high leverage points on the maximum estimated likelihood for separation in logistic regression

    Science.gov (United States)

    Ariffin, Syaiba Balqish; Midi, Habshah; Arasan, Jayanthi; Rana, Md Sohel

    2015-02-01

    This article is concerned with the performance of the maximum estimated likelihood estimator in the presence of separation in the space of the independent variables and high leverage points. The maximum likelihood estimator suffers from the problem of non-overlapping cases in the covariates, where the regression coefficients are not identifiable and the maximum likelihood estimator does not exist. Consequently, the iteration scheme fails to converge and gives faulty results. To remedy this problem, the maximum estimated likelihood estimator is put forward. It is evident that the maximum estimated likelihood estimator is resistant against separation and the estimates always exist. The effect of high leverage points on the performance of the maximum estimated likelihood estimator is then investigated through real data sets and a Monte Carlo simulation study. The findings signify that the maximum estimated likelihood estimator fails to provide better parameter estimates in the presence of both separation and high leverage points.
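
    Why the maximum likelihood estimator "does not exist" under separation can be seen in a small numerical sketch. The one-covariate, no-intercept model and the toy data below are assumptions made here for illustration; the maximum estimated likelihood estimator itself is not implemented.

```python
import math

def log_sigmoid(s):
    """Numerically stable log(1 / (1 + exp(-s)))."""
    if s >= 0:
        return -math.log1p(math.exp(-s))
    return s - math.log1p(math.exp(s))

def log_likelihood(beta, xs, ys):
    """Log-likelihood of the model P(y = 1 | x) = 1 / (1 + exp(-beta * x))."""
    return sum(log_sigmoid(beta * x if y == 1 else -beta * x)
               for x, y in zip(xs, ys))

xs = [-2.0, -1.0, 1.0, 2.0]
separated = [0, 0, 1, 1]      # the sign of x predicts y perfectly
overlapping = [0, 1, 0, 1]    # the two classes overlap in x

betas = [1.0, 5.0, 25.0]
ll_separated = [log_likelihood(b, xs, separated) for b in betas]
ll_overlapping = [log_likelihood(b, xs, overlapping) for b in betas]
```

    For the separated data the log-likelihood keeps rising toward zero as beta grows, so no finite coefficient maximises it and an iterative fit never settles. For the overlapping data, large coefficients are penalised and a finite maximum exists.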

  8. Mixed effects logistic regression models for longitudinal binary response data with informative drop-out.

    Science.gov (United States)

    Ten Have, T R; Kunselman, A R; Pulkstenis, E P; Landis, J R

    1998-03-01

    A shared parameter model with logistic link is presented for longitudinal binary response data to accommodate informative drop-out. The model consists of observed longitudinal and missing response components that share random effects parameters. To our knowledge, this is the first presentation of such a model for longitudinal binary response data. Comparisons are made to an approximate conditional logit model in terms of a clinical trial dataset and simulations. The naive mixed effects logit model that does not account for informative drop-out is also compared. The simulation-based differences among the models with respect to coverage of confidence intervals, bias, and mean squared error (MSE) depend on at least two factors: whether an effect is a between- or within-subject effect and the amount of between-subject variation as exhibited by variance components of the random effects distributions. When the shared parameter model holds, the approximate conditional model provides confidence intervals with good coverage for within-cluster factors but not for between-cluster factors. The converse is true for the naive model. Under a different drop-out mechanism, when the probability of drop-out is dependent only on the current unobserved observation, all three models behave similarly by providing between-subject confidence intervals with good coverage and comparable MSE and bias but poor within-subject confidence intervals, MSE, and bias. The naive model does more poorly with respect to the within-subject effects than do the shared parameter and approximate conditional models. The data analysis, which entails a comparison of two pain relievers and a placebo with respect to pain relief, conforms to the simulation results based on the shared parameter model but not on the simulation based on the outcome-driven drop-out process. This comparison between the data analysis and simulation results may provide evidence that the shared parameter model holds for the pain data.

  9. An Alternative Flight Software Trigger Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions Using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly M.; Gay, Robert S.; Stachowiak, Susan J.

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles.

  10. An investigation of the speeding-related crash designation through crash narrative reviews sampled via logistic regression.

    Science.gov (United States)

    Fitzpatrick, Cole D; Rakasi, Saritha; Knodler, Michael A

    2017-01-01

    Speed is one of the most important factors in traffic safety as higher speeds are linked to increased crash risk and higher injury severities. Nearly a third of fatal crashes in the United States are designated as "speeding-related", which is defined as either "the driver behavior of exceeding the posted speed limit or driving too fast for conditions." While many studies have utilized the speeding-related designation in safety analyses, no studies have examined the underlying accuracy of this designation. Herein, we investigate the speeding-related crash designation through the development of a series of logistic regression models that were derived from the established speeding-related crash typologies and validated using a blind review, by multiple researchers, of 604 crash narratives. The developed logistic regression model accurately identified crashes which were not originally designated as speeding-related but had crash narratives that suggested speeding as a causative factor. Only 53.4% of crashes designated as speeding-related contained narratives which described speeding as a causative factor. Further investigation of these crashes revealed that the driver contributing code (DCC) of "driving too fast for conditions" was being used in three separate situations. Additionally, this DCC was also incorrectly used when "exceeding the posted speed limit" would likely have been a more appropriate designation. Finally, it was determined that the responding officer only utilized one DCC in 82% of crashes not designated as speeding-related but containing a narrative indicating speed as a contributing causal factor. The use of logistic regression models based upon speeding-related crash typologies offers a promising method by which all possible speeding-related crashes could be identified. Published by Elsevier Ltd.

  11. Transmission Risks of Schistosomiasis Japonica: Extraction from Back-propagation Artificial Neural Network and Logistic Regression Model

    Science.gov (United States)

    Xu, Jun-Fang; Xu, Jing; Li, Shi-Zhu; Jia, Tia-Wu; Huang, Xi-Bao; Zhang, Hua-Ming; Chen, Mei; Yang, Guo-Jing; Gao, Shu-Jing; Wang, Qing-Yun; Zhou, Xiao-Nong

    2013-01-01

    Background: The transmission of schistosomiasis japonica in a local setting is still poorly understood in the lake regions of the People's Republic of China (P. R. China), and its transmission patterns are closely related to human, social and economic factors. Methodology/Principal Findings: We aimed to apply the integrated approach of artificial neural network (ANN) and logistic regression model in assessment of transmission risks of Schistosoma japonicum with epidemiological data collected from 2339 villagers from 1247 households in six villages of Jiangling County, P.R. China. By using the back-propagation (BP) of the ANN model, 16 factors out of 27 factors were screened, and the top five factors ranked by the absolute value of mean impact value (MIV) were mainly related to human behavior, i.e. integration of water contact history and infection history, family with past infection, history of water contact, infection history, and infection times. The top five factors screened by the logistic regression model were mainly related to the social economics, i.e. village level, economic conditions of family, age group, education level, and infection times. The risk of human infection with S. japonicum is higher in the population aged 15 or younger, with lower education, in villages with a higher infection rate, in poor families, and in those who have been infected more than once. Conclusion/Significance: Both the BP artificial neural network and the logistic regression model, established at a small scale, suggested that individual behavior and socioeconomic status are the most important risk factors in the transmission of schistosomiasis japonica. It was revealed that the young population (≤15) in higher-risk areas was the main target for intervention in disease transmission control. PMID:23556015

  12. Modeling probability-based injury severity scores in logistic regression models: the logit transformation should be used.

    Science.gov (United States)

    Moore, Lynne; Lavoie, André; Bergeron, Eric; Emond, Marcel

    2007-03-01

    The International Classification of Disease Injury Severity Score (ICISS) and the Trauma Registry Abbreviated Injury Scale Score (TRAIS) are trauma injury severity scores based on probabilities of survival. They are widely used in logistic regression models as raw probability scores to predict the logit of mortality. The aim of this study was to evaluate whether these severity indicators would offer a more accurate prediction of mortality if they were used with a logit transformation. Analyses were based on 25,111 patients from the trauma registries of the four Level I trauma centers in the province of Quebec, Canada, abstracted between 1998 and 2005. The ICISS and TRAIS were calculated using survival proportions from the National Trauma Data Bank. The performance of the ICISS and TRAIS in their widely used form, proportions varying from 0 to 1, was compared with a logit transformation of the scores in logistic regression models predicting in-hospital mortality. Calibration was assessed with the Hosmer-Lemeshow statistic. Neither the ICISS nor the TRAIS had a linear relation with the logit of mortality. A logit transformation of these scores led to a near-linear association and consequently improved model calibration. The Hosmer-Lemeshow statistic was 68 (35-192) and 69 (41-120) with the logit transformation compared with 272 (227-339) and 204 (166-266) with no transformation, for the ICISS and TRAIS, respectively. In logistic regression models predicting mortality, the ICISS and TRAIS should be used with a logit transformation. This study has direct implications for improving the validity of analyses requiring control for injury severity case mix.
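
    The logit transformation recommended in this abstract is simple to state in code. The sketch below is an illustration only: the function name and the clipping constant eps are assumptions introduced here (the abstract does not say how scores of exactly 0 or 1 were handled), and the example scores are made up.

```python
import math

def logit_score(p, eps=1e-6):
    """Map a survival-probability score in [0, 1] to the logit scale.

    eps is a hypothetical clipping constant (not from the source) that
    keeps scores of exactly 0 or 1 from producing infinite values.
    """
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

# Raw scores bunch up near 1, while their logits are spread out.
raw_scores = [0.50, 0.90, 0.99, 0.999]
logits = [logit_score(p) for p in raw_scores]
```

    The transformed score would then be entered as the covariate in place of the raw probability, so that the covariate and the modelled logit of mortality are on comparable scales.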

  13. Comparison of Logistic Regression and Random Forests techniques for shallow landslide susceptibility assessment in Giampilieri (NE Sicily, Italy)

    Science.gov (United States)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-11-01

    The aim of this work is to define reliable susceptibility models for shallow landslides using Logistic Regression and Random Forests multivariate statistical techniques. The study area, located in North-East Sicily, was hit on October 1st 2009 by a severe rainstorm (225 mm of cumulative rainfall in 7 h) which caused flash floods and more than 1000 landslides. Several small villages, such as Giampilieri, were hit with 31 fatalities, 6 missing persons and damage to buildings and transportation infrastructures. Landslides, mainly types such as earth and debris translational slides evolving into debris flows, were triggered on steep slopes and involved colluvium and regolith materials which cover the underlying metamorphic bedrock. The work has been carried out with the following steps: i) realization of a detailed event landslide inventory map through field surveys coupled with observation of high resolution aerial colour orthophoto; ii) identification of landslide source areas; iii) data preparation of landslide controlling factors and descriptive statistics based on a bivariate method (Frequency Ratio) to get an initial overview on existing relationships between causative factors and shallow landslide source areas; iv) choice of criteria for the selection and sizing of the mapping unit; v) implementation of 5 multivariate statistical susceptibility models based on Logistic Regression and Random Forests techniques and focused on landslide source areas; vi) evaluation of the influence of sample size and type of sampling on results and performance of the models; vii) evaluation of the predictive capabilities of the models using ROC curve, AUC and contingency tables; viii) comparison of model results and obtained susceptibility maps; and ix) analysis of temporal variation of landslide susceptibility related to input parameter changes. Models based on Logistic Regression and Random Forests have demonstrated excellent predictive capabilities. Land use and wildfire

  14. An Alternative Flight Software Paradigm: Applying Multivariate Logistic Regression to Sense Trigger Conditions using Inaccurate or Scarce Information

    Science.gov (United States)

    Smith, Kelly; Gay, Robert; Stachowiak, Susan

    2013-01-01

    In late 2014, NASA will fly the Orion capsule on a Delta IV-Heavy rocket for the Exploration Flight Test-1 (EFT-1) mission. For EFT-1, the Orion capsule will be flying with a new GPS receiver and new navigation software. Given the experimental nature of the flight, the flight software must be robust to the loss of GPS measurements. Once the high-speed entry is complete, the drogue parachutes must be deployed within the proper conditions to stabilize the vehicle prior to deploying the main parachutes. When GPS is available in nominal operations, the vehicle will deploy the drogue parachutes based on an altitude trigger. However, when GPS is unavailable, the navigated altitude errors become excessively large, driving the need for a backup barometric altimeter to improve altitude knowledge. In order to increase overall robustness, the vehicle also has an alternate method of triggering the parachute deployment sequence based on planet-relative velocity if both the GPS and the barometric altimeter fail. However, this backup trigger results in large altitude errors relative to the targeted altitude. Motivated by this challenge, this paper demonstrates how logistic regression may be employed to semi-automatically generate robust triggers based on statistical analysis. Logistic regression is used as a ground processor pre-flight to develop a statistical classifier. The classifier would then be implemented in flight software and executed in real-time. This technique offers improved performance even in the face of highly inaccurate measurements. Although the logistic regression-based trigger approach will not be implemented within EFT-1 flight software, the methodology can be carried forward for future missions and vehicles

  15. Differentiation of modern beach and coastal dune sands—a logistic regression approach using the parameters of the hyperbolic function

    Science.gov (United States)

    Vincent, Peter

    1986-10-01

    Particle-size data have been obtained from sieved samples of sand from beach and coastal dune environments at Ainsdale, northwest England. The four parameters of the hyperbolic distribution were calculated for each sample. These, together with additional derived parameters, were the basis of a binomial logit regression model which sought to distinguish the two sand environments. Seventy-five percent of the samples were correctly assigned by a logistic equation containing a single independent variable, π, which is a measure of the skewness of the hyperbolic distribution.

  16. Predictors of employment status of treated patients with DSM-III-R diagnosis. Can logistic regression model find a solution?

    Science.gov (United States)

    Daradkeh, T K; Karim, L

    1994-01-01

    To investigate the predictors of employment status of patients with DSM-III-R diagnosis, 55 patients were selected by a simple random technique from the main psychiatric clinic in Al Ain, United Arab Emirates. Structured and formal assessments were carried out to extract the potential predictors of outcome of schizophrenia. A logistic regression model revealed that being married, absence of schizoid personality, being free of symptoms or having minimal symptoms of the illness, later age of onset, and higher educational attainment were the most significant predictors of employment outcome. The implications of the results of this study are discussed in the text.

  17. Assessing the spatial variability of weights of landslide causal factors in different regions from Romania, using logistic regression

    Science.gov (United States)

    Margarint, M. C.; Grozavu, A.; Patriche, C. V.

    2012-04-01

    Landslides represent a significant natural hazard in hilly areas of Romania and cause considerable damage. Scientific interest in landslide susceptibility mapping is quite recent and standardized through legislation. However, the methodology needs improvement for the susceptibility maps to constitute a sound basis for territorial planning. Logistic regression is one of the main statistical methods used for assessing terrain susceptibility to landsliding. The weights assigned to landslide causal factors in the scientific literature vary widely. This study aims to identify the range of variation of landslide causal factors for different regions in Romania. The following factors were taken into consideration: slope angle, terrain altitude, terrain curvature (mean, plan and profile), soil type, lithologic class, land use, distance from drainage network and roads, and mean annual precipitation. Four square perimeters of 15x15 km were chosen from representative regions in terms of spatial extent of landslides: two situated in the central-northern part of the Moldavian Plateau, one in the Transylvania Depression and one in the Moldavian Subcarpathians. Logistic regression was applied separately for the four sectors. In order to monitor the differences in the final results, numerous attempts were made, starting from landslide polygons acquired from both the topographic maps at scale 1:25.000 (1984-1985 edition) and the orthophotoimages (2005-2006). The other elements were acquired from cartographic materials at appropriate scales, according to the international methodology. The data integration was accomplished in the georeferenced environment provided by the TNTMips 6.9, ArcGIS 9.3 and SAGA 2.0.8 software packages, while the statistical analysis was performed using Excel 2003 and XLSTAT 2010 trial version. Maps for all landslide causal factors were produced for each perimeter. The logistic

  18. Applicability of the Ricketts’ posteroanterior cephalometry for sex determination using logistic regression analysis in Hispano American Peruvians

    Science.gov (United States)

    Perez, Ivan; Chavez, Allison K.; Ponce, Dario

    2016-01-01

    Background: The Ricketts' posteroanterior (PA) cephalometry seems to be the most widely used and it has not been tested by multivariate statistics for sex determination. Objective: The objective was to determine the applicability of Ricketts' PA cephalometry for sex determination using the logistic regression analysis. Materials and Methods: The logistic models were estimated at distinct age cutoffs (all ages, 11 years, 13 years, and 15 years) in a database from 1,296 Hispano American Peruvians between 5 years and 44 years of age. Results: The logistic models were composed by six cephalometric measurements; the accuracy achieved by resubstitution varied between 60% and 70% and all the variables, with one exception, exhibited a direct relationship with the probability of being classified as male; the nasal width exhibited an indirect relationship. Conclusion: The maxillary and facial widths were present in all models and may represent a sexual dimorphism indicator. The accuracy found was lower than the literature and the Ricketts' PA cephalometry may not be adequate for sex determination. The indirect relationship of the nasal width in models with data from patients of 12 years of age or less may be a trait related to age or a characteristic in the studied population, which could be better studied and confirmed. PMID:27555732

  19. A Two-Stage Penalized Logistic Regression Approach to Case-Control Genome-Wide Association Studies

    Directory of Open Access Journals (Sweden)

    Jingyuan Zhao

    2012-01-01

    We propose a two-stage penalized logistic regression approach to case-control genome-wide association studies. This approach consists of a screening stage and a selection stage. In the screening stage, main-effect and interaction-effect features are screened by using L1-penalized logistic likelihoods. In the selection stage, the retained features are ranked by the logistic likelihood with the smoothly clipped absolute deviation (SCAD) penalty (Fan and Li, 2001) and Jeffrey’s Prior penalty (Firth, 1993), a sequence of nested candidate models is formed, and the models are assessed by a family of extended Bayesian information criteria (J. Chen and Z. Chen, 2008). The proposed approach is applied to the analysis of the prostate cancer data of the Cancer Genetic Markers of Susceptibility (CGEMS) project in the National Cancer Institute, USA. Simulation studies are carried out to compare the approach with the pair-wise multiple testing approach (Marchini et al. 2005) and the LASSO-patternsearch algorithm (Shi et al. 2007).

  20. Dynamic Network Logistic Regression: A Logistic Choice Analysis of Inter- and Intra-Group Blog Citation Dynamics in the 2004 US Presidential Election.

    Science.gov (United States)

    Almquist, Zack W; Butts, Carter T

    2013-10-01

    Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention-designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs.

  1. A Logistic Regression Model with a Hierarchical Random Error Term for Analyzing the Utilization of Public Transport

    Directory of Open Access Journals (Sweden)

    Chong Wei

    2015-01-01

    Logistic regression models have been widely used in previous studies to analyze public transport utilization. These studies have shown travel time to be an indispensable variable for such analysis and usually consider it to be a deterministic variable. This formulation does not allow us to capture travelers’ perception error regarding travel time, and recent studies have indicated that this error can have a significant effect on modal choice behavior. In this study, we propose a logistic regression model with a hierarchical random error term. The proposed model adds a new random error term for the travel time variable. This term structure enables us to investigate travelers’ perception error regarding travel time from a given choice behavior dataset. We also propose an extended model that allows constraining the sign of this error in the model. We develop two Gibbs samplers to estimate the basic hierarchical model and the extended model. The performance of the proposed models is examined using a well-known dataset.

  2. A general equation to obtain multiple cut-off scores on a test from multinomial logistic regression.

    Science.gov (United States)

    Bersabé, Rosa; Rivas, Teresa

    2010-05-01

    The authors derive a general equation to compute multiple cut-offs on a total test score in order to classify individuals into more than two ordinal categories. The equation is derived from the multinomial logistic regression (MLR) model, which is an extension of the binary logistic regression (BLR) model to accommodate polytomous outcome variables. From this analytical procedure, cut-off scores are established at the test score (the predictor variable) at which an individual is as likely to be in category j as in category j+1 of an ordinal outcome variable. The application of the complete procedure is illustrated by an example with data from an actual study on eating disorders. In this example, two cut-off scores on the Eating Attitudes Test (EAT-26) scores are obtained in order to classify individuals into three ordinal categories: asymptomatic, symptomatic and eating disorder. Diagnoses were made from the responses to a self-report (Q-EDD) that operationalises DSM-IV criteria for eating disorders. Alternatives to the MLR model to set multiple cut-off scores are discussed.
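
    As a sketch of the idea only (the notation below is an assumption and not necessarily the authors'): in a baseline-category multinomial logit with a single predictor x, the log-odds of category j against a reference category are linear in x. Setting the probabilities of two adjacent categories equal then yields the cut-off score x_c between them.

```latex
\[
\ln\frac{P(Y=j\mid x)}{P(Y=\mathrm{ref}\mid x)} = \alpha_j + \beta_j x
\]
\[
P(Y=j\mid x_c) = P(Y=j+1\mid x_c)
\;\Longleftrightarrow\;
\alpha_j + \beta_j x_c = \alpha_{j+1} + \beta_{j+1} x_c
\;\Longleftrightarrow\;
x_c = \frac{\alpha_j - \alpha_{j+1}}{\beta_{j+1} - \beta_j},
\qquad \beta_{j+1} \neq \beta_j .
\]
```

    For the reference category itself, the intercept and slope are both zero, so the same formula covers the cut-off between the reference category and its neighbour.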

  3. GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies.

    Science.gov (United States)

    Sikorska, Karolina; Lesaffre, Emmanuel; Groenen, Patrick F J; Eilers, Paul H C

    2013-05-28

    Genome-wide association studies have become very popular in identifying genetic contributions to phenotypes. Millions of SNPs are being tested for their association with diseases and traits using linear or logistic regression models. This conceptually simple strategy encounters the following computational issues: a large number of tests and very large genotype files (many gigabytes) which cannot be directly loaded into the software memory. One of the solutions applied on a grand scale is cluster computing involving large-scale resources. We show how to speed up the computations using matrix operations in pure R code. Computation time is reduced from 6 hours to 10-15 minutes. Our approach can handle an essentially unlimited number of covariates efficiently, using projections. Data files in GWAS are vast and reading them into computer memory becomes an important issue. However, much improvement can be made if the data is structured beforehand in a way allowing for easy access to blocks of SNPs. We propose several solutions based on the R packages ff and ncdf. We adapted the semi-parallel computations for logistic regression. We show that in a typical GWAS setting, where SNP effects are very small, we do not lose any precision and our computations are a few hundred times faster than standard procedures. We provide very fast algorithms for GWAS written in pure R code. We also show how to rearrange SNP data for fast access.
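
    The matrix-operation idea for the linear case can be illustrated as follows. This is a minimal NumPy sketch, not the authors' R code: the function names, the simulated data and the use of a QR-based projection are all assumptions made for illustration. It removes the covariates from the phenotype and from every SNP column in one step, after which each per-SNP slope is a ratio of sums.

```python
import numpy as np

def semi_parallel_linear(y, G, C):
    """Per-SNP slope estimates for y ~ C + G[:, j], for all SNPs at once.

    y: (n,) phenotype; G: (n, m) genotype block; C: (n, k) covariates,
    including an intercept column. All names are illustrative only.
    """
    # Project the covariates out of y and out of every SNP column in one step.
    Q, _ = np.linalg.qr(C)              # orthonormal basis of the covariate space
    y_res = y - Q @ (Q.T @ y)
    G_res = G - Q @ (Q.T @ G)
    # With covariates removed, each slope is a simple ratio of sums.
    return (G_res.T @ y_res) / np.sum(G_res ** 2, axis=0)

def per_snp_reference(y, G, C):
    """Slow reference: one least-squares fit per SNP."""
    betas = []
    for j in range(G.shape[1]):
        X = np.column_stack([C, G[:, j]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        betas.append(coef[-1])          # coefficient of the SNP column
    return np.array(betas)

# Simulated data (made up for this sketch).
rng = np.random.default_rng(0)
n, m = 200, 50
C = np.column_stack([np.ones(n), rng.normal(size=n)])
G = rng.integers(0, 3, size=(n, m)).astype(float)
y = 0.5 * C[:, 1] + 0.1 * G[:, 0] + rng.normal(size=n)

fast = semi_parallel_linear(y, G, C)
slow = per_snp_reference(y, G, C)
```

    Both functions return the same slopes; the vectorized version simply replaces the per-SNP loop with two matrix products. The logistic adaptation described in the abstract is not reproduced here.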

  4. GIS-based logistic regression for landslide susceptibility mapping of the Zhongxian segment in the Three Gorges area, China

    Science.gov (United States)

    Bai, Shi-Biao; Wang, Jian; Lü, Guo-Nian; Zhou, Ping-Gen; Hou, Sheng-Shan; Xu, Su-Ning

    2010-02-01

    A detailed landslide susceptibility map was produced using a logistic regression method with datasets developed for a geographic information system (GIS). Known as one of the most landslide-prone areas in China, the Zhongxian-Shizhu segment in the Three Gorges Reservoir region of China was selected as a suitable case to evaluate the frequency and distribution of landslides. The site covered an area of 260.9 km² with a landslide area of 5.3 km². Four data domains were used in this study: remote sensing products, thematic maps, geological maps, and topographical maps, all with 25 m × 25 m pixels or cells. Statistical relationships for landslide susceptibility were developed using landslide and landslide causative factor databases. We extended the application of logistic regression approaches to use all continuous variables as they are, while landslide density is used to transform nominal variables into numeric variables. According to the map, 2.8% of the study area was identified as an area with very high susceptibility, whereas very low-, low-, medium- and high-susceptibility zones covered 18.2%, 36.2%, 26.7%, and 16.1% of the area, respectively. The quality of susceptibility mapping was validated, and the correct classification percentage and root mean square error (RMSE) values for the validation data were 81.4% and 0.392, respectively.
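
    One plausible reading of the density transform, sketched here under stated assumptions: each nominal class (for example a lithology class) is replaced by the fraction of its cells that contain a landslide. The abstract does not give the exact definition, so the function name, the definition of density and the toy data below are all hypothetical.

```python
import numpy as np

def density_encode(classes, landslide):
    """Replace each nominal class label by the landslide density of that class.

    Density is taken here as (landslide cells in class) / (all cells in class);
    this definition is an assumption made for the sketch.
    """
    classes = np.asarray(classes)
    landslide = np.asarray(landslide, dtype=float)
    density = {c: landslide[classes == c].mean() for c in np.unique(classes)}
    encoded = np.array([density[c] for c in classes])
    return encoded, density

# Toy cells: class label and whether the cell contains a landslide (made up).
lithology = ["shale", "shale", "sandstone", "sandstone", "sandstone", "shale"]
has_slide = [1, 0, 0, 0, 1, 1]
encoded, table = density_encode(lithology, has_slide)
```

    The encoded column can then enter the logistic regression as an ordinary numeric predictor alongside the continuous variables.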

  5. Modeling Typhoon Event-Induced Landslides Using GIS-Based Logistic Regression: A Case Study of Alishan Forestry Railway, Taiwan

    Directory of Open Access Journals (Sweden)

    Sheng-Chuan Chen

    2013-01-01

    This study develops a model for evaluating the hazard level of landslides at Alishan Forestry Railway, Taiwan, by using logistic regression with the assistance of a geographical information system (GIS). A typhoon event-induced landslide inventory, independent variables, and a triggering factor were used to build the model. Environmental factors such as bedrock lithology from the geology database; topographic aspect, terrain roughness, profile curvature, and distance to river from the topographic database; and the vegetation index value from SPOT 4 satellite images were used as variables that influence landslide occurrence. The area under the curve (AUC) of a receiver operator characteristic (ROC) curve was used to validate the model. Effects of parameters on landslide occurrence were assessed from the corresponding coefficients in the logistic regression function. Thereafter, the model was applied to predict the probability of landslides for rainfall data of different return periods. Using a predicted map of probability, the study area was classified into four ranks of landslide susceptibility: low, medium, high, and very high. As a result, most high susceptibility areas are located in the western portion of the study area. Several train stations and railways are located on sites with a high susceptibility ranking.

  6. A comparison of susceptibility maps created with logistic regression and SINMAP for spatial planning in the Lanzhou City, China

    Science.gov (United States)

    Bai, Shibiao; Thiebes, Benni; Bell, Rainer; Glade, Thomas; Wang, Jian

    2010-05-01

    Lanzhou is the second largest city in north-western China, and its vicinity is known as one of the most landslide-prone areas in China. Thus, landslide risk must be reduced, e.g. by spatial planning strategies. Reliable landslide susceptibility maps are an essential part of such a strategy. The study area is located upstream of the Yellow River and varies extremely in topography, population density, and relevant geological and geomorphologic processes. Within this study, landslide susceptibility maps are produced by a) GIS-based logistic regression and b) stability index mapping (the SINMAP approach). A landslide inventory was set up and landslide characteristics such as frequency and distribution were analysed. The landslide inventory provides the basis for both modelling approaches. Herein, logistic regression (LR) is based on distance from drainage systems, faults and roads, slope angle and aspect, topographic elevation, topographical wetness index, land use and loess hydraulic and geotechnical parameters. SINMAP is a terrain stability model that combines steady state hydrology assumptions with the infinite slope stability model to assess susceptibility to shallow landslides. The quality of the landslide susceptibility maps is validated and final maps of the different approaches are compared. Landslide susceptibility maps can be used for planning of protection and mitigation measures and provide the basis of the Lanzhou city landslide risk assessment.

  7. Incorporating the effects of topographic amplification in the analysis of earthquake-induced landslide hazards using logistic regression

    Science.gov (United States)

    Lee, S. T.; Yu, T. T.; Peng, W. F.; Wang, C. L.

    2010-12-01

    Seismic-induced landslide hazards are studied using seismic shaking intensity based on the topographic amplification effect. The estimation of the topographic effect includes the theoretical topographic amplification factors and the corresponding amplified ground motion. Digital elevation models (DEM) with a 5-m grid space are used. The logistic regression model and the geographic information system (GIS) are used to perform the seismic landslide hazard analysis. The 99 Peaks area, located 3 km away from the ruptured fault of the Chi-Chi earthquake, is used to test the proposed hypothesis. An inventory map of earthquake-triggered landslides is used to produce a dependent variable that takes a value of 0 (no landslides) or 1 (landslides). A set of independent parameters is used, including lithology, elevation, slope gradient, slope aspect, terrain roughness, land use, and Arias intensity (Ia) with the topographic effect. Subsequently, logistic regression is used to find the best fitting function to describe the relationship between the occurrence and absence of landslides within an individual grid cell. The results of seismic landslide hazard analysis that includes the topographic effect (AUROC = 0.890) are better than those of the analysis without it (AUROC = 0.874).
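
    For readers unfamiliar with the AUROC figures quoted above, the quantity can be computed as the fraction of (landslide, no-landslide) cell pairs that the model ranks correctly. The following is a minimal sketch with made-up scores, not the authors' procedure; the function name is illustrative.

```python
import numpy as np

def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size)

# Toy example: 1 = landslide cell, 0 = stable cell (values are made up).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = auroc(labels, scores)  # 3 of the 4 pairs are ordered correctly -> 0.75
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise form is quadratic in the number of cells, so it suits small illustrations rather than full raster grids.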

  8. Risk factors for subclinical intramammary infection in dairy goats in two longitudinal field studies evaluated by Bayesian logistic regression.

    Science.gov (United States)

    Koop, Gerrit; Collar, Carol A; Toft, Nils; Nielen, Mirjam; van Werven, Tine; Bacon, Debora; Gardner, Ian A

    2013-03-01

    Identification of risk factors for subclinical intramammary infections (IMI) in dairy goats should contribute to improved udder health. Intramammary infection may be diagnosed by bacteriological culture or by somatic cell count (SCC) of a milk sample. Both bacteriological culture and SCC are imperfect tests, particularly lacking sensitivity, which leads to misclassification and thus to biased estimates of odds ratios in risk factor studies. The objective of this study was to evaluate risk factors for the true (latent) IMI status of major pathogens in dairy goats. We used Bayesian logistic regression models that accounted for imperfect measurement of IMI by both culture and SCC. Udder half milk samples were collected from 530 Dutch and 438 California dairy goats in 10 herds on 3 occasions during lactation. Udder halves were classified as positive or negative for isolation of a major pathogen (mostly Staphylococcus aureus) on bacteriological culture and as positive or negative for SCC (cut-off of 2000 × 10^3 cells/mL). Potentially controllable risk factors (udder conformation, teat size, teat shape, teat placement, teat-end shape, teat-end callosity thickness, teat-end callosity roughness, caprine arthritis encephalitis-virus infection status, and kidding season), and uncontrollable risk factors (parity, lactation stage, milk yield, pregnancy status, and breed) were measured in the Dutch study, the Californian study or in both studies. Bayesian logistic regression models were constructed in which the true (but latent) infection status was linked to the joint test results, as functions of test sensitivity and specificity. The latent IMI status was the dependent variable in the logistic regression model with risk factors as independent variables and with random herd and goat effects. For the combined data from both studies, the culture-based estimate of apparent prevalence of major pathogens in udder halves was 2.6% (137/5220) and the estimate of the apparent

  9. Demographic, Psychological, and School Environment Correlates of Bullying Victimization and School Hassles in Rural Youth

    OpenAIRE

    Smokowski, Paul R.; Cotter, Katie L.; Caroline Robertson; Shenyang Guo

    2013-01-01

    Little is known about bullying in rural areas. The participants in this study included 3,610 racially diverse youth (average age = 12.8) from 28 rural schools who completed the School Success Profile-Plus. Binary logistic regression models were created to predict bullying victimization in the past 12 months, and ordered logistic regression was used to predict school hassles in the past 12 months. Overall, 22.71% of the sample experienced bullying victimization and school victimization rates r...

  10. Remote sensing and GIS-based landslide hazard analysis and cross-validation using multivariate logistic regression model on three test areas in Malaysia

    Science.gov (United States)

    Pradhan, Biswajeet

    2010-05-01

    This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest (90%) prediction accuracy, whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross

  11. Exploring the performance of logistic regression model types on growth/no growth data of Listeria monocytogenes.

    Science.gov (United States)

    Gysemans, K P M; Bernaerts, K; Vermeulen, A; Geeraerd, A H; Debevere, J; Devlieghere, F; Van Impe, J F

    2007-03-20

    Several model types have already been developed to describe the boundary between growth and no growth conditions. In this article two types were thoroughly studied and compared, namely (i) the ordinary (linear) logistic regression model, i.e., with a polynomial on the right-hand side of the model equation (type I) and (ii) the (nonlinear) logistic regression model derived from a square root-type kinetic model (type II). The examination was carried out on the basis of the data described in Vermeulen et al. [Vermeulen, A., Gysemans, K.P.M., Bernaerts, K., Geeraerd, A.H., Van Impe, J.F., Debevere, J., Devlieghere, F., 2006-this issue. Influence of pH, water activity and acetic acid concentration on Listeria monocytogenes at 7 °C: data collection for the development of a growth/no growth model. International Journal of Food Microbiology.]. These data sets consist of growth/no growth data for Listeria monocytogenes as a function of water activity (0.960-0.990), pH (5.0-6.0) and acetic acid percentage (0-0.8% (w/w)), both for a monoculture and a mixed strain culture. Numerous replicates, namely twenty, were performed at closely spaced conditions. In this way detailed information was obtained about the position of the interface and the transition zone between growth and no growth. The main questions investigated were (i) which model type performs best on the monoculture and the mixed strain data, (ii) are there differences between the growth/no growth interfaces of monocultures and mixed strain cultures, (iii) which parameter estimation approach works best for the type II models, and (iv) how sensitive is the performance of these models to the values of their nonlinear-appearing parameters. The results showed that both type I and II models performed well on the monoculture data with respect to goodness-of-fit and predictive power. The type I models were, however, more sensitive to anomalous data points. The situation was different for the mixed strain culture. In

  12. Estimating the Influence of Accident Related Factors on Motorcycle Fatal Accidents using Logistic Regression (Case Study: Denpasar-Bali)

    Directory of Open Access Journals (Sweden)

    Wedagama D.M.P.

    2010-01-01

    In Denpasar, the capital of Bali Province, motorcycle accidents account for about 80% of total road accidents. Of those motorcycle accidents, 32% are fatal. This study investigates the influence of accident-related factors on fatal motorcycle accidents in the city of Denpasar during the period 2006-2008 using a logistic regression model. The study found that the fatality of collision with pedestrians and right angle accidents were respectively about 0.44 and 0.40 times lower than collision with other vehicles and accidents due to other factors. In contrast, the odds that a motorcycle accident will be fatal due to collision with heavy and light vehicles were 1.67 times those for collisions with other motorcycles. Collisions with pedestrians, right angle accidents, and collisions with heavy and light vehicles respectively accounted for 31%, 29%, and 63% of fatal motorcycle accidents.
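
    Odds ratios such as the 1.67 reported above are exponentiated logistic regression coefficients, and they multiply odds rather than probabilities. The short sketch below illustrates the difference; the baseline probability of 0.30 is a made-up value chosen for illustration and does not come from the study.

```python
import math

def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)

def apply_odds_ratio(p_ref, ratio):
    """Probability obtained by scaling the odds of a reference probability."""
    odds = p_ref / (1.0 - p_ref) * ratio
    return odds / (1.0 + odds)

beta = math.log(1.67)                # coefficient corresponding to OR = 1.67
p_ref = 0.30                         # hypothetical reference-category fatality probability
p_new = apply_odds_ratio(p_ref, odds_ratio(beta))
```

    With these illustrative numbers the probability rises from 0.30 to roughly 0.42, which is less than 1.67 times the baseline: an odds ratio is not a ratio of probabilities.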

  13. Risk factors for subclinical intramammary infection in dairy goats in two longitudinal field studies evaluated by Bayesian logistic regression

    DEFF Research Database (Denmark)

    Koop, Gerrit; Collar, Carol A.; Toft, Nils

    2013-01-01

    Identification of risk factors for subclinical intramammary infections (IMI) in dairy goats should contribute to improved udder health. Intramammary infection may be diagnosed by bacteriological culture or by somatic cell count (SCC) of a milk sample. Both bacteriological culture and SCC... (mostly Staphylococcus aureus) on bacteriological culture and as positive or negative for SCC (cut-off of 2000 × 10^3 cells/mL). Potentially controllable risk factors (udder conformation, teat size, teat shape, teat placement, teat-end shape, teat-end callosity thickness, teat-end callosity roughness, caprine arthritis encephalitis-virus infection status, and kidding season), and uncontrollable risk factors (parity, lactation stage, milk yield, pregnancy status, and breed) were measured in the Dutch study, the Californian study or in both studies. Bayesian logistic regression models were constructed...

  14. Optimizing landslide susceptibility zonation: Effects of DEM spatial resolution and slope unit delineation on logistic regression models

    Science.gov (United States)

    Schlögel, R.; Marchesini, I.; Alvioli, M.; Reichenbach, P.; Rossi, M.; Malet, J.-P.

    2018-01-01

    We perform landslide susceptibility zonation with slope units using three digital elevation models (DEMs) of varying spatial resolution of the Ubaye Valley (South French Alps). In so doing, we applied a recently developed algorithm automating slope unit delineation, given a number of parameters, in order to optimize simultaneously the partitioning of the terrain and the performance of a logistic regression susceptibility model. The method allowed us to obtain optimal slope units for each available DEM spatial resolution. For each resolution, we studied the susceptibility model performance by analyzing in detail the relevance of the conditioning variables. The analysis is based on landslide morphology data, considering either the whole landslide or only the source area outline as inputs. The procedure allowed us to select the most useful information, in terms of DEM spatial resolution, thematic variables and landslide inventory, in order to obtain the most reliable slope unit-based landslide susceptibility assessment.

  15. Binary Logistic Regression Modeling of Idle CO Emissions in Order to Estimate Predictors Influences in Old Vehicle Park

    Directory of Open Access Journals (Sweden)

    Branimir Milosavljević

    2015-01-01

    This paper experimentally determines the CO emissions at idle for 1,785 vehicles powered by spark-ignition engines, in order to verify the correctness of emission values on a representative sample of vehicles in Serbia. The permissible emission limits were considered for three (3) fitted binary logistic regression (BLR) models; the key reason for this analysis is to find the predictors that can crucially influence the accuracy of estimating whether such vehicles have correct emissions. Summarizing the research results, we found that vehicles produced in Serbia (hereinafter referred to as “domestic vehicles”) cause more pollution than imported cars (hereinafter referred to as “foreign vehicles”), although domestic vehicles are of lower average age and mileage. Another trend was observed: low-power vehicles and vehicles produced before 1992 are potentially more serious polluters.

  16. A comparison between Bayes discriminant analysis and logistic regression for prediction of debris flow in southwest Sichuan, China

    Science.gov (United States)

    Xu, Wenbo; Jing, Shaocai; Yu, Wenjuan; Wang, Zhaoxian; Zhang, Guoping; Huang, Jianxi

    2013-11-01

    In this study, the areas of Sichuan Province at high risk of debris flow, Panzhihua and Liangshan Yi Autonomous Prefecture, were taken as the study areas. Using rainfall and environmental factors as predictors, and based on different prior probability combinations of debris flows, the prediction of debris flows in these areas was compared between two statistical methods: logistic regression (LR) and Bayes discriminant analysis (BDA). The comprehensive analysis shows that (a) with a mid-range prior probability, the overall predictive accuracy of BDA is higher than that of LR; (b) with equal and extreme prior probabilities, the overall predictive accuracy of LR is higher than that of BDA; (c) regional debris flow prediction models with rainfall factors only perform worse than those that also include environmental factors, and the predictive accuracies for occurrence and nonoccurrence of debris flows change in opposite directions when this supplementary information is added.

  17. Predicting Success in Product Development: The Application of Principal Component Analysis to Categorical Data and Binomial Logistic Regression

    Directory of Open Access Journals (Sweden)

    Glauco H.S. Mendes

    2013-09-01

    Critical success factors in new product development (NPD) in Brazilian small and medium enterprises (SMEs) are identified and analyzed. Critical success factors are best practices that can be used to improve NPD management and performance in a company. Traditionally, these factors are identified through survey methods, and the collected data are then reduced through traditional multivariate analysis. The objective of this work is to develop a logistic regression model for predicting the success or failure of new product development. This model allows for an evaluation and prioritization of resource commitments. The results will be helpful for guiding management actions, as one way to improve NPD performance in those industries.

  18. Comparison of the performance of log-logistic regression and artificial neural networks for predicting breast cancer relapse.

    Science.gov (United States)

    Faradmal, Javad; Soltanian, Ali Reza; Roshanaei, Ghodratollah; Khodabakhshi, Reza; Kasaeian, Amir

    2014-01-01

    Breast cancer is the most common cancer in female populations. The exact cause is not known, but it is most likely a combination of genetic and environmental factors. The log-logistic model (LLM) is applied as a statistical method for predicting survival and its influencing factors. In recent decades, artificial neural network (ANN) models have been increasingly applied to predict survival data. The present research was conducted to compare log-logistic regression and artificial neural network models in the prediction of breast cancer (BC) survival. A historical cohort study was established with 104 patients suffering from BC from 1997 to 2005. To compare the ANN and LLM in our setting, we used the estimated areas under the receiver-operating characteristic (ROC) curve (AUC) and integrated AUC (iAUC). The data were analyzed using R statistical software. The AUCs for the first, second and third years after diagnosis are 0.918, 0.780 and 0.800 for the ANN, and 0.834, 0.733 and 0.616 for the LLM, respectively. The mean AUC for the ANN was statistically higher than that of the LLM (0.845 vs. 0.744). Hence, this study showed a significant difference in predictive performance between the ANN and the LLM, with the ANN predicting better than the LLM. Thus, the use of the ANN method for prediction of survival in the field of breast cancer is suggested.
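
    The AUC used for the comparison above can be read as the probability that a randomly chosen case with the event receives a higher predicted score than a randomly chosen case without it. The following is a minimal sketch of that rank-based definition only; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study, which used R and time-dependent AUCs.

```python
def roc_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 1/2."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = event observed, 0 = no event.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

    The pairwise loop is quadratic in the sample size, which is acceptable for a cohort of about a hundred patients; larger datasets would normally use a rank-sum formulation instead.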

  19. Modelling the spatial distribution of Fasciola hepatica in bovines using decision tree, logistic regression and GIS query approaches for Brazil.

    Science.gov (United States)

    Bennema, S C; Molento, M B; Scholte, R G; Carvalho, O S; Pritsch, I

    2017-11-01

    Fascioliasis is a condition caused by the trematode Fasciola hepatica. In this paper, the spatial distribution of F. hepatica in bovines in Brazil was modelled using a decision tree approach and a logistic regression, combined with a geographic information system (GIS) query. In the decision tree and the logistic model, isothermality had the strongest influence on disease prevalence. Also, the 50-year average precipitation in the warmest quarter of the year was included as a risk factor, having a negative influence on the parasite prevalence. The risk maps developed using both techniques showed a predicted higher prevalence mainly in the South of Brazil. The prediction performance seemed to be high, but both techniques failed to reach a high accuracy in predicting the medium and high prevalence classes for the entire country. The GIS query map, based on the range of isothermality, minimum temperature of the coldest month, precipitation of the warmest quarter of the year, altitude and the average daily land surface temperature, showed a possibility of presence of F. hepatica in a very large area. The risk maps produced using these methods can be used to focus activities of animal and public health programmes, even on non-evaluated F. hepatica areas.

  20. Investigation of expert rule bases, logistic regression, and non-linear machine learning techniques for predicting response to antiretroviral treatment.

    Science.gov (United States)

    Prosperi, Mattia C F; Altmann, Andre; Rosen-Zvi, Michal; Aharoni, Ehud; Borgulya, Gabor; Bazso, Fulop; Sönnerborg, Anders; Schülter, Eugen; Struck, Daniel; Ulivi, Giovanni; Vandamme, Anne-Mieke; Vercauteren, Jurgen; Zazzi, Maurizio

    2009-01-01

    The extreme flexibility of the HIV type-1 (HIV-1) genome makes it challenging to build the ideal antiretroviral treatment regimen. Interpretation of HIV-1 genotypic drug resistance is evolving from rule-based systems guided by expert opinion to data-driven engines developed through machine learning methods. The aim of the study was to investigate linear and non-linear statistical learning models for classifying short-term virological outcome of antiretroviral treatment. To optimize the model, different feature selection methods were considered. Robust extra-sample error estimation and different loss functions were used to assess model performance. The results were compared with widely used rule-based genotypic interpretation systems (Stanford HIVdb, Rega and ANRS). A set of 3,143 treatment change episodes were extracted from the EuResist database. The dataset included patient demographics, treatment history and viral genotypes. A logistic regression model using high order interaction variables performed better than rule-based genotypic interpretation systems (accuracy 75.63% versus 71.74-73.89%, area under the receiver operating characteristic curve [AUC] 0.76 versus 0.68-0.70) and was equivalent to a random forest model (accuracy 76.16%, AUC 0.77). However, when rule-based genotypic interpretation systems were coupled with additional patient attributes, and the combination was provided as input to the logistic regression model, the performance increased significantly, becoming comparable to the fully data-driven methods. Patient-derived supplementary features significantly improved the accuracy of the prediction of response to treatment, both with rule-based and data-driven interpretation systems. Fully data-driven models derived from large-scale data sources show promise as antiretroviral treatment decision support tools.

  1. Logistic regression analysis of prognostic factors in 106 acute-on-chronic liver failure patients with hepatic encephalopathy

    Directory of Open Access Journals (Sweden)

    CUI Yanping

    2014-10-01

    Objective: To analyze the prognostic factors in acute-on-chronic liver failure (ACLF) patients with hepatic encephalopathy (HE) and to explore the risk factors for prognosis. Methods: A retrospective analysis was performed on 106 ACLF patients with HE who were hospitalized in our hospital from January 2010 to July 2013. The patients were divided into an improved group and a deteriorated group. The univariate indicators, including age, sex, laboratory indicators [total bilirubin (TBil), albumin (Alb), alanine aminotransferase (ALT), aspartate aminotransferase (AST), and prothrombin time activity (PTA)], the stage of HE, complications [persistent hyponatremia, digestive tract bleeding, hepatorenal syndrome (HRS), ascites, infection, and spontaneous bacterial peritonitis (SBP)], and plasma exchange, were analyzed by chi-square test or t-test. Indicators with statistical significance were subsequently analyzed by binary logistic regression. Results: Univariate analysis showed that ALT (P=0.009), PTA (P=0.043), the stage of HE (P=0.000), and HRS (P=0.003) were significantly different between the two groups, whereas differences in age, sex, TBil, Alb, AST, persistent hyponatremia, digestive tract bleeding, ascites, infection, SBP, and plasma exchange were not statistically significant (P>0.05). Binary logistic regression demonstrated that PTA (b=-0.097, P=0.025, OR=0.908), HRS (b=2.279, P=0.007, OR=9.764), and the stage of HE (b=1.873, P=0.000, OR=6.510) were prognostic factors in ACLF patients with HE. Conclusion: The stage of HE, HRS, and PTA are independent influential factors for the prognosis in ACLF patients with HE. Reduced PTA, advanced HE stage, and the presence of HRS indicate worse prognosis.
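
    In a binary logistic regression, each odds ratio is the exponential of its coefficient, OR = exp(b), so the reported pairs can be checked against each other. The sketch below does only that arithmetic. The coefficients are taken as -0.097, 2.279 and 1.873, which is an assumption about decimal placement that the check itself supports; the dictionary and function names are hypothetical.

```python
import math

# (coefficient b, reported odds ratio) per predictor.
# Decimal placement of b for PTA and HE stage is an assumption.
reported = {
    "PTA": (-0.097, 0.908),
    "HRS": (2.279, 9.764),
    "HE stage": (1.873, 6.510),
}


def odds_ratio(b):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(b)


for name, (b, or_reported) in reported.items():
    print(f"{name}: exp({b}) = {odds_ratio(b):.3f}, reported OR = {or_reported}")
```

    An OR below 1 (PTA) means the odds of deterioration fall as the predictor rises, which matches the conclusion that reduced PTA indicates a worse prognosis.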

  2. Landslide susceptibility mapping for a part of North Anatolian Fault Zone (Northeast Turkey) using logistic regression model

    Science.gov (United States)

    Demir, Gökhan; aytekin, mustafa; banu ikizler, sabriye; angın, zekai

    2013-04-01

    The North Anatolian Fault is known as one of the most active and destructive fault zones, and it has produced many high-magnitude earthquakes. Along this fault zone, the morphology and the lithological features are prone to landsliding. Many earthquake-induced landslides have been recorded along this fault zone by several studies, and these landslides caused both injuries and loss of life. Therefore, a detailed landslide susceptibility assessment for this area is indispensable. In this context, this study carried out a landslide susceptibility assessment for a 1445 km2 area in the Kelkit River valley, a part of the North Anatolian Fault zone (Eastern Black Sea region of Turkey), and the results are summarized here. For this purpose, a geographical information system (GIS) and a bivariate statistical model were used. Initially, landslide inventory maps were prepared using landslide data determined by field surveys and landslide data taken from the General Directorate of Mineral Research and Exploration. The landslide conditioning factors considered are lithology, slope gradient, slope aspect, topographical elevation, distance to streams, distance to roads, distance to faults, drainage density and fault density. The ArcGIS package was used to manipulate and analyze all the collected data. The logistic regression method was applied to create a landslide susceptibility map. The landslide susceptibility maps were divided into five susceptibility classes: very low, low, moderate, high and very high. The result of the analysis was verified using the inventoried landslide locations and compared with the produced probability model. For this purpose, the area under the curve (AUC) approach was applied, and an AUC value was obtained. Based on this AUC value, the obtained landslide susceptibility map was judged to be satisfactory. Keywords: North Anatolian Fault Zone, Landslide susceptibility map, Geographical Information Systems, Logistic Regression Analysis.

  3. Evaluating the Proportion of Treatment Effect Explained by a Continuous Surrogate Marker in Logistic or Probit Regression Models.

    Science.gov (United States)

    Huang, Jie; Huang, Bin

    2010-05-01

    Using surrogate endpoints in clinical trials is desirable for drug development because the trials can be shortened and therefore more cost-effective. Validating a surrogate for the clinical endpoint is critical in this context. One of the key steps in statistical validation of a surrogate for a single trial is to estimate the proportion of treatment effect explained (PTE or PE) by a surrogate. Often the measure for PTE is estimated from the difference in coefficients of treatment from two models with or without adjusting for the surrogate for clinical endpoint. Inherent problems with the method are: the two models may not be valid simultaneously; and the estimate can often lie outside the interval [0, 1]. In this article, we provide alternative measures for evaluating the proportion of treatment effect explained by a surrogate in logistic or probit regression models. Our measures can be estimated easily with any statistical programs capable of binary linear regression modeling, and the interpretation of the measures can be illustrated using Ordinal Dominance (OD) curves. The concept can be visually understood by any practical user. Simulation shows our alternative measures yield more accurate estimates which are less biased, less variable, and with narrower confidence intervals. A clinical trial example is provided.

  4. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands.

    Science.gov (United States)

    Veazey, Lindsay M; Franklin, Erik C; Kelley, Christopher; Rooney, John; Frazer, L Neil; Toonen, Robert J

    2016-01-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30-180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta ("presence") threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai'i.

  5. The implementation of rare events logistic regression to predict the distribution of mesophotic hard corals across the main Hawaiian Islands

    Directory of Open Access Journals (Sweden)

    Lindsay M. Veazey

    2016-07-01

    Predictive habitat suitability models are powerful tools for cost-effective, statistically robust assessment of the environmental drivers of species distributions. The aim of this study was to develop predictive habitat suitability models for two genera of scleractinian corals (Leptoseris and Montipora) found within the mesophotic zone across the main Hawaiian Islands. The mesophotic zone (30–180 m) is challenging to reach, and therefore historically understudied, because it falls between the maximum limit of SCUBA divers and the minimum typical working depth of submersible vehicles. Here, we implement a logistic regression with rare events corrections to account for the scarcity of presence observations within the dataset. These corrections reduced the coefficient error and improved overall prediction success (73.6% and 74.3%) for both original regression models. The final models included depth, rugosity, slope, mean current velocity, and wave height as the best environmental covariates for predicting the occurrence of the two genera in the mesophotic zone. Using an objectively selected theta (“presence”) threshold, the predicted presence probability values (average of 0.051 for Leptoseris and 0.040 for Montipora) were translated to spatially-explicit habitat suitability maps of the main Hawaiian Islands at 25 m grid cell resolution. Our maps are the first of their kind to use extant presence and absence data to examine the habitat preferences of these two dominant mesophotic coral genera across Hawai‘i.

  6. Modeling multivariate binary responses with multiple levels of nesting based on alternating logistic regressions: an application to caries aggregation.

    Science.gov (United States)

    Ananth, C V; Kantor, M L

    2004-10-01

    Clustered binary responses are commonly encountered in dental research. Data analysis may include modeling both the marginal response probabilities (i.e., risk) and the dependence structure between pairs of responses (i.e., aggregation). While second-order generalized estimating equations (GEE2) is a well-known approach for such data, alternating logistic regressions (ALR) is a computationally efficient alternative method, especially for large clusters. We illustrate ALR with an application to caries aggregation using a dataset with 3 levels of nesting: tooth surfaces within an interproximal (IP) region, IP regions within a jaw, and jaws within a subject. Caries lesions appear to aggregate strongly within subjects with a spatially distributed risk. The minimum within-IP-region odds ratio (OR) was 2.25 (95% confidence interval 1.15, 4.41), and the within-IP-region ORs were always greater than the between-IP-region ORs. ALR is a convenient and useful regression technique for explicit modeling of the dependence structure, and may be applicable to other dental research problems involving clustered or nested responses.

  7. Assessment of susceptibility to earth-flow landslide using logistic regression and multivariate adaptive regression splines: A case of the Belice River basin (western Sicily, Italy)

    Science.gov (United States)

    Conoscenti, Christian; Ciaccio, Marilena; Caraballo-Arias, Nathalie Almaru; Gómez-Gutiérrez, Álvaro; Rotigliano, Edoardo; Agnesi, Valerio

    2015-08-01

    In this paper, terrain susceptibility to earth-flow occurrence was evaluated by using geographic information systems (GIS) and two statistical methods: Logistic regression (LR) and multivariate adaptive regression splines (MARS). LR has been already demonstrated to provide reliable predictions of earth-flow occurrence, whereas MARS, as far as we know, has never been used to generate earth-flow susceptibility models. The experiment was carried out in a basin of western Sicily (Italy), which extends for 51 km2 and is severely affected by earth-flows. In total, we mapped 1376 earth-flows, covering an area of 4.59 km2. To explore the effect of pre-failure topography on earth-flow spatial distribution, we performed a reconstruction of topography before the landslide occurrence. This was achieved by preparing a digital terrain model (DTM) where altitude of areas hosting landslides was interpolated from the adjacent undisturbed land surface by using the algorithm topo-to-raster. This DTM was exploited to extract 15 morphological and hydrological variables that, in addition to outcropping lithology, were employed as explanatory variables of earth-flow spatial distribution. The predictive skill of the earth-flow susceptibility models and the robustness of the procedure were tested by preparing five datasets, each including a different subset of landslides and stable areas. The accuracy of the predictive models was evaluated by drawing receiver operating characteristic (ROC) curves and by calculating the area under the ROC curve (AUC). The results demonstrate that the overall accuracy of LR and MARS earth-flow susceptibility models is from excellent to outstanding. However, AUC values of the validation datasets attest to a higher predictive power of MARS-models (AUC between 0.881 and 0.912) with respect to LR-models (AUC between 0.823 and 0.870). The adopted procedure proved to be resistant to overfitting and stable when changes of the learning and validation samples are

  8. Adverse events associated with incretin-based drugs in Japanese spontaneous reports: a mixed effects logistic regression model

    Directory of Open Access Journals (Sweden)

    Daichi Narushima

    2016-03-01

    Background: Spontaneous Reporting Systems (SRSs) are passive systems composed of reports of suspected Adverse Drug Events (ADEs), and are used for Pharmacovigilance (PhV), namely, drug safety surveillance. Exploration of analytical methodologies to enhance SRS-based discovery will contribute to more effective PhV. In this study, we proposed a statistical modeling approach for SRS data to address heterogeneity by a reporting time point. Furthermore, we applied this approach to analyze ADEs of incretin-based drugs such as DPP-4 inhibitors and GLP-1 receptor agonists, which are widely used to treat type 2 diabetes. Methods: SRS data were obtained from the Japanese Adverse Drug Event Report (JADER) database. Reported adverse events were classified according to the MedDRA High Level Terms (HLTs). A mixed effects logistic regression model was used to analyze the occurrence of each HLT. The model treated DPP-4 inhibitors, GLP-1 receptor agonists, hypoglycemic drugs, concomitant suspected drugs, age, and sex as fixed effects, while the quarterly period of reporting was treated as a random effect. Before application of the model, Fisher’s exact tests were performed for all drug-HLT combinations. Mixed effects logistic regressions were performed for the HLTs that were found to be associated with incretin-based drugs. Statistical significance was determined by a two-sided p-value <0.01 or a 99% two-sided confidence interval. Finally, the models with and without the random effect were compared based on Akaike’s Information Criteria (AIC), in which a model with a smaller AIC was considered satisfactory. Results: The analysis included 187,181 cases reported from January 2010 to March 2015. It showed that 33 HLTs, including pancreatic, gastrointestinal, and cholecystic events, were significantly associated with DPP-4 inhibitors or GLP-1 receptor agonists. In the AIC comparison, half of the HLTs reported with incretin-based drugs favored the random effect
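
    The model comparison described in the Methods rests on the standard definition AIC = 2k - 2 ln L, where k is the number of estimated parameters and L the maximized likelihood; the model with the smaller AIC is preferred. The sketch below illustrates the comparison only. The log-likelihoods and parameter counts are hypothetical placeholders, not values from the study.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical fits for one HLT: the random-effect model estimates one
# extra parameter (the variance of the quarterly reporting-period effect).
aic_fixed_only = aic(log_likelihood=-1250.0, n_params=6)
aic_with_random = aic(log_likelihood=-1240.0, n_params=7)

preferred = (
    "with random effect" if aic_with_random < aic_fixed_only else "fixed effects only"
)
print(aic_fixed_only, aic_with_random, preferred)
```

    The extra parameter costs 2 AIC units, so the random effect is favored only when it raises the log-likelihood by more than 1.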

  9. Methods for significance testing of categorical covariates in logistic regression models after multiple imputation: power and applicability analysis.

    Science.gov (United States)

    Eekhout, Iris; van de Wiel, Mark A; Heymans, Martijn W

    2017-08-22

    Multiple imputation is a recommended method to handle missing data. For significance testing after multiple imputation, Rubin's Rules (RR) are easily applied to pool parameter estimates. In a logistic regression model, different methods are available to consider whether a categorical covariate with more than two levels significantly contributes to the model. Examples include pooling chi-square tests with multiple degrees of freedom, pooling likelihood ratio test statistics, and pooling based on the covariance matrix of the regression model. These methods are more complex than RR and are not available in all mainstream statistical software packages. In addition, they do not always obtain optimal power levels. We argue that the median of the p-values from the overall significance tests from the analyses on the imputed datasets can be used as an alternative pooling rule for categorical variables. The aim of the current study is to compare different methods to test a categorical variable for significance after multiple imputation on applicability and power. In a large simulation study, we demonstrated the control of the type I error and power levels of different pooling methods for categorical variables. This simulation study showed that for non-significant categorical covariates the type I error is controlled and the statistical power of the median pooling rule was at least equal to current multiple parameter tests. An empirical data example showed similar results. It can therefore be concluded that using the median of the p-values from the imputed data analyses is an attractive and easy to use alternative method for significance testing of categorical variables.
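
    The median pooling rule proposed above is simple enough to sketch directly: run the overall significance test for the categorical covariate on each imputed dataset, then take the median of the resulting p-values. In the sketch below the function name `median_p_rule`, the significance level of 0.05 and the five p-values are all hypothetical illustrations, not material from the study.

```python
from statistics import median


def median_p_rule(p_values, alpha=0.05):
    """Pool per-imputation p-values by their median and compare to alpha."""
    pooled = median(p_values)
    return pooled, pooled < alpha


# Hypothetical p-values from the overall test on five imputed datasets.
p_values = [0.030, 0.048, 0.021, 0.060, 0.041]
pooled_p, significant = median_p_rule(p_values)
print(f"pooled p = {pooled_p}, significant = {significant}")
```

    Because only the per-dataset p-values are needed, the rule can be applied with any software that can fit the model on each imputed dataset.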

  10. Extension of the Peters-Belson method to estimate health disparities among multiple groups using logistic regression with survey data.

    Science.gov (United States)

    Li, Y; Graubard, B I; Huang, P; Gastwirth, J L

    2015-02-20

    Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters-Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance-covariance estimator that is based on the Taylor linearization variance-covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999-2004, are conducted.  Empirical results indicate that the Taylor linearization variance-covariance estimation is accurate and that the proposed Wald test maintains the nominal level. Copyright © 2014 John Wiley & Sons, Ltd.
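
    The core of the Peters-Belson idea described above can be sketched in a few lines: fit the model on the advantaged group (AG), predict each disadvantaged group (DG) member's outcome as if they were in the AG, and subtract the DG's observed mean from the mean prediction. The sketch below is a deliberately reduced illustration under stated assumptions: a binary outcome, a single covariate, one DG, no survey weights and no variance estimation. All function names and data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, iters=25):
    """Fit P(y=1|x) = sigmoid(b0 + b1*x) by Newton-Raphson (one covariate)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the information matrix X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta += (X'WX)^{-1} * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def peters_belson_disparity(ag_x, ag_y, dg_x, dg_y):
    """Mean AG-model prediction for DG members minus their observed mean."""
    b0, b1 = fit_logistic(ag_x, ag_y)
    predicted = [sigmoid(b0 + b1 * x) for x in dg_x]
    return sum(predicted) / len(predicted) - sum(dg_y) / len(dg_y)


# Hypothetical data: covariate level and binary outcome per person.
ag_x = [0] * 4 + [1] * 4 + [2] * 4 + [3] * 4
ag_y = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]
dg_x = [1, 2, 1, 2, 1, 2]
dg_y = [0, 0, 1, 0, 0, 1]

disparity = peters_belson_disparity(ag_x, ag_y, dg_x, dg_y)
print(f"unexplained disparity = {disparity:.4f}")
```

    A positive value here means DG members fare worse than the AG model predicts for people with the same covariate values. The survey-weighted estimators, the Taylor linearization variance and the Wald test from the paper are outside the scope of this sketch.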

  11. Extension of the Peters–Belson method to estimate health disparities among multiple groups using logistic regression with survey data

    Science.gov (United States)

    Li, Y.; Graubard, B. I.; Huang, P.; Gastwirth, J. L.

    2015-01-01

    Determining the extent of a disparity, if any, between groups of people, for example, race or gender, is of interest in many fields, including public health for medical treatment and prevention of disease. An observed difference in the mean outcome between an advantaged group (AG) and disadvantaged group (DG) can be due to differences in the distribution of relevant covariates. The Peters–Belson (PB) method fits a regression model with covariates to the AG to predict, for each DG member, their outcome measure as if they had been from the AG. The difference between the mean predicted and the mean observed outcomes of DG members is the (unexplained) disparity of interest. We focus on applying the PB method to estimate the disparity based on binary/multinomial/proportional odds logistic regression models using data collected from complex surveys with more than one DG. Estimators of the unexplained disparity, an analytic variance–covariance estimator that is based on the Taylor linearization variance–covariance estimation method, as well as a Wald test for testing a joint null hypothesis of zero for unexplained disparities between two or more minority groups and a majority group, are provided. Simulation studies with data selected from simple random sampling and cluster sampling, as well as the analyses of disparity in body mass index in the National Health and Nutrition Examination Survey 1999–2004, are conducted. Empirical results indicate that the Taylor linearization variance–covariance estimation is accurate and that the proposed Wald test maintains the nominal level. PMID:25382235

  12. Discrimination of Mine Seismic Events and Blasts Using the Fisher Classifier, Naive Bayesian Classifier and Logistic Regression

    Science.gov (United States)

    Dong, Longjun; Wesseloo, Johan; Potvin, Yves; Li, Xibing

    2016-01-01

    Seismic events and blasts generate seismic waveforms that have different characteristics. The challenge to confidently differentiate these two signatures is complex and requires the integration of physical and statistical techniques. In this paper, the different characteristics of blasts and seismic events were investigated by comparing probability density distributions of different parameters. Five typical parameters of blasts and events and the probability density functions of blast time, as well as probability density functions of origin time difference for neighbouring blasts were extracted as discriminant indicators. The Fisher classifier, naive Bayesian classifier and logistic regression were used to establish discriminators. Databases from three Australian and Canadian mines were established for training, calibrating and testing the discriminant models. The classification performances and discriminant precision of the three statistical techniques were discussed and compared. The proposed discriminators have explicit and simple functions which can be easily used by workers in mines or researchers. Back-test, applied results, cross-validated results and analysis of receiver operating characteristic curves in different mines have shown that the discriminator for one of the mines has a reasonably good discriminating performance.

  13. Construction the model on the breast cancer survival analysis use support vector machine, logistic regression and decision tree.

    Science.gov (United States)

    Chao, Cheng-Min; Yu, Ya-Wen; Cheng, Bor-Wen; Kuo, Yao-Lung

    2014-10-01

    The aim of the paper is to use data mining technology to establish a classification of breast cancer survival patterns and to offer a treatment decision-making reference for the survival ability of women diagnosed with breast cancer in Taiwan. We studied patients with breast cancer in a specific hospital in Central Taiwan to obtain 1,340 data sets. We employed a support vector machine (SVM), logistic regression, and a C5.0 decision tree to construct a classification model of breast cancer patients' survival rates, and used a 10-fold cross-validation approach to identify the model. The results show that the establishment of classification tools for the classification of the models yielded an average accuracy rate of more than 90% for both; the SVM provided the best method for constructing the three categories of the classification system for the survival mode. The results of the experiment show that the three methods used to create the classification system achieved a high accuracy rate, predicted the survival ability of women diagnosed with breast cancer more accurately, and could be used as a reference when creating a medical decision-making framework.
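
    The 10-fold cross-validation mentioned above partitions the records into ten folds, and each fold serves once as the test set while the other nine are used for training. The sketch below shows only the index bookkeeping for 1,340 records. The function name is hypothetical, and it assumes the records have already been shuffled; the abstract does not say how the folds were formed.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# 1,340 records as in the abstract; assumed to be in random order already.
folds = list(k_fold_indices(1340, k=10))
for i, (train_idx, test_idx) in enumerate(folds, start=1):
    print(f"fold {i}: {len(train_idx)} train, {len(test_idx)} test")
```

    Each classifier would be fitted on `train_idx` and scored on `test_idx`, and the ten accuracies averaged to give the reported rate.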

  14. Identifying Environmental and Social Factors Predisposing to Pathological Gambling Combining Standard Logistic Regression and Logic Learning Machine.

    Science.gov (United States)

    Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco

    2017-12-01

    Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG among habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.

  15. Factors associated with trait anger level of juvenile offenders in Hubei province: A binary logistic regression analysis.

    Science.gov (United States)

    Tang, Li-Na; Ye, Xiao-Zhou; Yan, Qiu-Ge; Chang, Hong-Juan; Ma, Yu-Qiao; Liu, De-Bin; Li, Zhi-Gen; Yu, Yi-Zhen

    2017-02-01

    The risk factors of high trait anger of juvenile offenders were explored through a questionnaire study in a youth correctional facility of Hubei province, China. A total of 1090 juvenile offenders in Hubei province were investigated by a self-compiled social-demographic questionnaire, the Childhood Trauma Questionnaire (CTQ), and the State-Trait Anger Expression Inventory-II (STAXI-II). The risk factors were analyzed by chi-square tests, correlation analysis, and binary logistic regression analysis with SPSS 19.0. A total of 1082 copies of valid questionnaires were collected. The high trait anger group (n=316) was defined as those who scored in the upper 27th percentile of the STAXI-II trait anger scale (TAS), and the rest were defined as the low trait anger group (n=766). The risk factors associated with a high level of trait anger included: childhood emotional abuse, childhood sexual abuse, step family, frequent drug abuse, and frequent internet use (P<0.05). It was suggested that traumatic experience in childhood and an unhealthy lifestyle may significantly increase the level of trait anger in adulthood. The risk factors of high trait anger and their effects should be taken into serious consideration.

  16. Gully erosion susceptibility assessment by means of GIS-based logistic regression: A case of Sicily (Italy)

    Science.gov (United States)

    Conoscenti, Christian; Angileri, Silvia; Cappadonia, Chiara; Rotigliano, Edoardo; Agnesi, Valerio; Märker, Michael

    2014-01-01

    This research aims at characterizing susceptibility conditions to gully erosion by means of GIS and multivariate statistical analysis. The study area is a 9.5 km² river catchment in central-northern Sicily, where agricultural activities are limited by intense erosion. By means of field surveys and interpretation of aerial images, we prepared a digital map of the spatial distribution of 260 gullies in the study area. In addition, from available thematic maps, a 5 m cell size digital elevation model and field checks, we derived 27 environmental attributes that describe the variability of lithology, land use, topography and road position. These attributes were selected for their potential influence on erosion processes, while the dependent variable was given by presence or absence of gullies within two different types of mapping units: 5 m grid cells and slope units (average size = 2.66 ha). The functional relationships between gully occurrence and the controlling factors were obtained from forward stepwise logistic regression to calculate the probability of hosting a gully for each mapping unit. In order to train and test the predictive models, three calibration and three validation subsets, of both grid cells and slope units, were randomly selected. Results of validation, based on ROC (receiver operating characteristic) curves, attest to acceptable to excellent accuracies of the models, showing better predictive skill and more stable performance of the susceptibility model based on grid cells.
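
    ROC-based validation of the kind described above summarises a model by the area under the ROC curve (AUC). The sketch below is illustrative only and is not the authors' GIS workflow: the function name roc_auc and the example scores are hypothetical. It uses the rank interpretation of the AUC, i.e. the probability that a randomly chosen positive unit (gully present) receives a higher predicted probability than a randomly chosen negative unit, with ties counted as one half.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    pos_scores: predicted probabilities for units where the event occurred.
    neg_scores: predicted probabilities for units where it did not.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one score in each class")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical predicted probabilities for three gully cells and two non-gully cells.
auc = roc_auc([0.9, 0.8, 0.3], [0.4, 0.2])
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of units, so a rank-based implementation would be preferable for large grids.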

  17. Analyzing factors associated with women's attitudes and behaviors toward screening mammography using design-based logistic regression.

    Science.gov (United States)

    Madadi, Mahboubeh; Zhang, Shengfan; Yeary, Karen H Kim; Henderson, Louise M

    2014-02-01

    We examined the factors associated with screening mammography adherence behaviors and influencing factors on women's attitudes toward mammography in non-adherent women. Design-based logistic regression models were developed to characterize the influencing factors, including socio-demographic, health related, behavioral characteristics, and knowledge of breast cancer/mammography, on women's compliance with and attitudes toward mammography using the 2003 Health Information National Trends Survey data. Findings indicate significant associations among adherence to mammography and marital status, income, health coverage, being advised by a doctor to have a mammogram, having had Pap smear before, perception of chance of getting breast cancer, and knowledge of mammography (frequency of doing mammogram) in both women younger than 65 and women aged 65 and older. However, number of visits to a healthcare provider per year and lifetime number of smoked cigarettes are only significant for women younger than 65. Factors significantly associated with attitudes toward mammography in non-adherent women are age, being advised by a doctor to have a mammogram, and seeking cancer information. To enhance adherence to mammography programs, physicians need to continue to advise their patients to obtain mammograms. In addition, increasing women's knowledge about the frequency and starting age for screening mammography may improve women's adherence. Financially related factors such as income and insurance are also shown to be significant factors. Hence, healthcare policies aimed at providing breast cancer screening services to underserved women will likely enhance mammography participation.

  18. A revised logistic regression equation and an automated procedure for mapping the probability of a stream flowing perennially in Massachusetts

    Science.gov (United States)

    Bent, Gardner C.; Steeves, Peter A.

    2006-01-01

    A revised logistic regression equation and an automated procedure were developed for mapping the probability of a stream flowing perennially in Massachusetts. The equation provides city and town conservation commissions and the Massachusetts Department of Environmental Protection a method for assessing whether streams are intermittent or perennial at a specific site in Massachusetts by estimating the probability of a stream flowing perennially at that site. This information could assist the environmental agencies who administer the Commonwealth of Massachusetts Rivers Protection Act of 1996, which establishes a 200-foot-wide protected riverfront area extending from the mean annual high-water line along each side of a perennial stream, with exceptions for some urban areas. The equation was developed by relating the observed intermittent or perennial status of a stream site to selected basin characteristics of naturally flowing streams (defined as having no regulation by dams, surface-water withdrawals, ground-water withdrawals, diversion, wastewater discharge, and so forth) in Massachusetts. This revised equation differs from the equation developed in a previous U.S. Geological Survey study in that it is solely based on visual observations of the intermittent or perennial status of stream sites across Massachusetts and on the evaluation of several additional basin and land-use characteristics as potential explanatory variables in the logistic regression analysis. The revised equation estimated more accurately the intermittent or perennial status of the observed stream sites than the equation from the previous study. Stream sites used in the analysis were identified as intermittent or perennial based on visual observation during low-flow periods from late July through early September 2001. 
    The database of intermittent and perennial streams included a total of 351 naturally flowing (no regulation) sites, of which 85 were observed to be intermittent and 266 perennial

  19. Measuring decision weights in recognition experiments with multiple response alternatives: comparing the correlation and multinomial-logistic-regression methods.

    Science.gov (United States)

    Dai, Huanping; Micheyl, Christophe

    2012-11-01

    Psychophysical "reverse-correlation" methods allow researchers to gain insight into the perceptual representations and decision weighting strategies of individual subjects in perceptual tasks. Although these methods have gained momentum, until recently their development was limited to experiments involving only two response categories. Recently, two approaches for estimating decision weights in m-alternative experiments have been put forward. One approach extends the two-category correlation method to m > 2 alternatives; the second uses multinomial logistic regression (MLR). In this article, the relative merits of the two methods are discussed, and the issues of convergence and statistical efficiency of the methods are evaluated quantitatively using Monte Carlo simulations. The results indicate that, for a range of values of the number of trials, the estimated weighting patterns are closer to their asymptotic values for the correlation method than for the MLR method. Moreover, for the MLR method, weight estimates for different stimulus components can exhibit strong correlations, making the analysis and interpretation of measured weighting patterns less straightforward than for the correlation method. These and other advantages of the correlation method, which include computational simplicity and a close relationship to other well-established psychophysical reverse-correlation methods, make it an attractive tool to uncover decision strategies in m-alternative experiments.

  20. Predicting Student Success in a Major's Introductory Biology Course via Logistic Regression Analysis of Scientific Reasoning Ability and Mathematics Scores

    Science.gov (United States)

    Thompson, E. David; Bowling, Bethany V.; Markle, Ross E.

    2017-02-01

    Studies over the last 30 years have considered various factors related to student success in introductory biology courses. While much of the available literature suggests that the best predictors of success in a college course are prior college grade point average (GPA) and class attendance, faculty often require a valuable predictor of success in those courses wherein the majority of students are in the first semester and have no previous record of college GPA or attendance. In this study, we evaluated the efficacy of the ACT Mathematics subject exam and Lawson's Classroom Test of Scientific Reasoning in predicting success in a major's introductory biology course. A logistic regression was utilized to determine the effectiveness of a combination of scientific reasoning (SR) scores and ACT math (ACT-M) scores to predict student success. In summary, we found that the model—with both SR and ACT-M as significant predictors—could be an effective predictor of student success and thus could potentially be useful in practical decision making for the course, such as directing students to support services at an early point in the semester.

  1. A nesting site suitability model for rock partridge (Alectoris graeca) in the Apennine Mountains using logistic regression

    Directory of Open Access Journals (Sweden)

    Lorenzo Boccia

    2010-01-01

    The rock partridge has undergone a decline throughout its entire distribution area, including the population of the central Italian Apennine Mountains. Areas of suitable habitat for this species have been reduced due to landscape fragmentation and the dynamics of domestic animal and wildlife management. The present study was conducted in the Province of Rieti, Lazio Region. Geographical and land use predictors were evaluated in a GIS environment to identify the most relevant factors influencing the presence of rock partridge during the nesting period. Logistic regression was then implemented to create a model, characterised by a good level of adequacy, for predicting rock partridge nesting site habitat characteristics. Correct predictions of presence and absence were made in 65.2% and 98.6% of cases, respectively. The ROC value was 0.771, which is statistically significant (P<0.001). The results show that, on a local scale, slope (log), distance from forests, and the presence of bare rocks were statistically significant factors. On a landscape scale, the percentage of forests, the presence of sparse vegetation (over 60%), and a negative Mean Shape Index (MSI) were found to be statistically significant.

  2. Integrated network analysis and logistic regression modeling identify stage-specific genes in Oral Squamous Cell Carcinoma.

    Science.gov (United States)

    Randhawa, Vinay; Acharya, Vishal

    2015-07-16

    Oral squamous cell carcinoma (OSCC) is associated with substantial mortality and morbidity but, OSCC can be difficult to detect at its earliest stage due to its molecular complexity and clinical behavior. Therefore, identification of key gene signatures at an early stage will be highly helpful. The aim of this study was to identify key genes associated with progression of OSCC stages. Gene expression profiles were classified into cancer stage-related modules, i.e., groups of genes that are significantly related to a clinical stage. For prioritizing the candidate genes, analysis was further restricted to genes with high connectivity and a significant association with a stage. To assess predictive power of these genes, a classification model was also developed and tested by 5-fold cross validation and on an independent dataset. The identified genes were enriched for significant processes and functional pathways, and various genes were found to be directly implicated in OSCC. Forward and stepwise, multivariate logistic regression analyses identified 13 key genes whose expression discriminated early- and late-stage OSCC with predictive accuracy (area under curve; AUC) of ~0.81 in a 5-fold cross-validation strategy. The proposed network-driven integrative analytical approach can identify multiple genes significantly related to an OSCC stage; the classification model that is developed with these genes may help to distinguish cancer stages. The proposed genes and model hold promise for monitoring of OSCC stage progression, and our findings may facilitate cancer detection at an earlier stage, resulting in improved treatment outcomes.

  3. Classification Models to Predict Survival of Kidney Transplant Recipients Using Two Intelligent Techniques of Data Mining and Logistic Regression.

    Science.gov (United States)

    Nematollahi, M; Akbari, R; Nikeghbalian, S; Salehnasab, C

    2017-01-01

    Kidney transplantation is the treatment of choice for patients with end-stage renal disease (ESRD). Prediction of the transplant survival is of paramount importance. The objective of this study was to develop a model for predicting survival in kidney transplant recipients. In a cross-sectional study, 717 patients with ESRD admitted to Nemazee Hospital during 2008-2012 for renal transplantation were studied and the transplant survival was predicted for 5 years. The multilayer perceptron of artificial neural networks (MLP-ANN), logistic regression (LR), Support Vector Machine (SVM), and evaluation tools were used to verify the determinant models of the predictions and determine the independent predictors. The accuracy, area under curve (AUC), sensitivity, and specificity of SVM, MLP-ANN, and LR models were 90.4%, 86.5%, 98.2%, and 49.6%; 85.9%, 76.9%, 97.3%, and 26.1%; and 84.7%, 77.4%, 97.5%, and 17.4%, respectively. Meanwhile, the independent predictors were discharge time creatinine level, recipient age, donor age, donor blood group, cause of ESRD, recipient hypertension after transplantation, and duration of dialysis before transplantation. SVM and MLP-ANN models could efficiently be used for determining survival prediction in kidney transplant recipients.
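
    The figures above combine high accuracy and sensitivity with low specificity, a pattern that typically arises when one outcome class is much more common than the other. The sketch below shows how the three metrics are computed from a confusion matrix. It is illustrative only: the function name and the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: positives predicted correctly / missed.
    tn/fp: negatives predicted correctly / wrongly flagged as positive.
    """
    return {
        "sensitivity": tp / (tp + fn),              # true positive rate
        "specificity": tn / (tn + fp),              # true negative rate
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical, heavily imbalanced sample: 100 positives, 10 negatives.
metrics = classification_metrics(tp=90, fn=10, tn=5, fp=5)
```

    With these hypothetical counts the classifier identifies only half of the negatives, yet overall accuracy stays high because negatives are rare. This is why the abstract reports AUC, sensitivity, and specificity alongside accuracy.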

  4. Spatial Analysis of Severe Fever with Thrombocytopenia Syndrome Virus in China Using a Geographically Weighted Logistic Regression Model

    Directory of Open Access Journals (Sweden)

    Liang Wu

    2016-11-01

    Severe fever with thrombocytopenia syndrome (SFTS) is caused by severe fever with thrombocytopenia syndrome virus (SFTSV), which has had a serious impact on public health in parts of Asia. There is no specific antiviral drug or vaccine for SFTSV and, therefore, it is important to determine the factors that influence the occurrence of SFTSV infections. This study aimed to explore the spatial associations between SFTSV infections and several potential determinants, and to predict the high-risk areas in mainland China. The analysis was carried out at the level of provinces in mainland China. The potential explanatory variables that were investigated consisted of meteorological factors (average temperature, average monthly precipitation and average relative humidity), the average proportion of rural population and the average proportion of primary industries over three years (2010–2012). We constructed a geographically weighted logistic regression (GWLR) model in order to explore the associations between the selected variables and confirmed cases of SFTSV. The study showed that: (1) meteorological factors have a strong influence on the SFTSV cover; (2) a GWLR model is suitable for exploring SFTSV cover in mainland China; (3) our findings can be used for predicting high-risk areas and highlighting when meteorological factors pose a risk in order to aid in the implementation of public health strategies.

  5. Development of synthetic velocity - depth damage curves using a Weighted Monte Carlo method and Logistic Regression analysis

    Science.gov (United States)

    Vozinaki, Anthi Eirini K.; Karatzas, George P.; Sibetheros, Ioannis A.; Varouchakis, Emmanouil A.

    2014-05-01

    Damage curves are the most significant component of the flood loss estimation models. Their development is quite complex. Two types of damage curves exist, historical and synthetic curves. Historical curves are developed from historical loss data from actual flood events. However, due to the scarcity of historical data, synthetic damage curves can be alternatively developed. Synthetic curves rely on the analysis of expected damage under certain hypothetical flooding conditions. A synthetic approach was developed and presented in this work for the development of damage curves, which are subsequently used as the basic input to a flood loss estimation model. A questionnaire-based survey took place among practicing and research agronomists, in order to generate rural loss data based on the responders' loss estimates, for several flood condition scenarios. In addition, a similar questionnaire-based survey took place among building experts, i.e. civil engineers and architects, in order to generate loss data for the urban sector. By answering the questionnaire, the experts were in essence expressing their opinion on how damage to various crop types or building types is related to a range of values of flood inundation parameters, such as floodwater depth and velocity. However, the loss data compiled from the completed questionnaires were not sufficient for the construction of workable damage curves; to overcome this problem, a Weighted Monte Carlo method was implemented, in order to generate extra synthetic datasets with statistical properties identical to those of the questionnaire-based data. The data generated by the Weighted Monte Carlo method were processed via Logistic Regression techniques in order to develop accurate logistic damage curves for the rural and the urban sectors. A Python-based code was developed, which combines the Weighted Monte Carlo method and the Logistic Regression analysis into a single code (WMCLR Python code). Each WMCLR code execution

  6. Multinomial Logistic Regression Predicted Probability Map To Visualize The Influence Of Socio-Economic Factors On Breast Cancer Occurrence in Southern Karnataka

    Science.gov (United States)

    Madhu, B.; Ashok, N. C.; Balasubramanian, S.

    2014-11-01

    Multinomial logistic regression analysis was used to develop a statistical model that can predict the probability of breast cancer in Southern Karnataka using the breast cancer occurrence data during 2007-2011. Independent socio-economic variables describing the breast cancer occurrence, like age, education, occupation, parity, type of family, health insurance coverage, residential locality and socioeconomic status of each case, were obtained. The models were developed as follows: i) Spatial visualization of the urban-rural distribution of breast cancer cases that were obtained from the Bharat Hospital and Institute of Oncology. ii) Socio-economic risk factors describing the breast cancer occurrences were compiled for each case. These data were then analysed using multinomial logistic regression analysis in SPSS statistical software, relations between the occurrence of breast cancer across the socio-economic status and the influence of other socio-economic variables were evaluated, and multinomial logistic regression models were constructed. iii) The model that best predicted the occurrence of breast cancer was identified. This multivariate logistic regression model was entered into a geographic information system, and maps showing the predicted probability of breast cancer occurrence in Southern Karnataka were created. This study demonstrates that multinomial logistic regression is a valuable tool for developing models that predict the probability of breast cancer occurrence in Southern Karnataka.

  7. Revealing Victimization: The Impact of Methodological Features in the National Crime Victimization Survey.

    Science.gov (United States)

    Owens, Jennifer Gatewood

    2017-08-01

    This study examines the impact of methodological features of the National Crime Victimization Survey (NCVS) on respondent willingness to report violent, serious violent, and property victimizations to the NCVS. Bounded and unbounded data from the 1999-2005 NCVS are used to create a longitudinal file of respondents, and survey-weighted logistic regression models are used to assess the factors associated with the reporting of victimization. Net of sociodemographic control variables, unbounded interviews produced higher estimates of serious violence (72%), violence (66%), and property victimization (67%). Mobile respondents reported higher estimates than nonmobile respondents of serious violence (48%), violence (35%), and property victimization (15%). Compared with in-person interviews, interviewing by telephone increased reporting for serious violence (7%), violence (12%), and property victimization (17%). This study highlights the importance of controlling for these factors in both longitudinal and cross-sectional analyses to estimate victimization risk.

  8. Stalking Victimization, Labeling, and Reporting: Findings From the NCVS Stalking Victimization Supplement.

    Science.gov (United States)

    Ménard, Kim S; Cox, Amanda K

    2016-05-01

    Using the National Crime Victimization Survey 2006 Stalking Victimization Supplement (NCVS-SVS) and guided by Greenberg and Ruback's social influence model, this study examines the effects of individual (e.g., severity, sex, victim-offender relationship) and contextual (e.g., location) factors on stalking victimization risk, victim labeling and help seeking, and victim and third-party police contacts. Logistic regression results suggest individual and contextual characteristics matter. Consistent with prior research and the theoretical model, the positive effects of severity and sex (female) were significant across all dependent variables, whereas the interaction effect of victim-offender relationship and location held only for third-party police contacts. © The Author(s) 2015.

  9. Big Five Personality Traits of Cybercrime Victims.

    Science.gov (United States)

    van de Weijer, Steve G A; Leukfeldt, E Rutger

    2017-07-01

    The prevalence of cybercrime has increased rapidly over the last decades and has become part of the everyday life of citizens. It is, therefore, of great importance to gain more knowledge on the factors related to an increased or decreased likelihood of becoming a cybercrime victim. The current study adds to the existing body of knowledge using a large representative sample of Dutch individuals (N = 3,648) to study the relationship between cybercrime victimization and the key traits from the Big Five model of personality (i.e., extraversion, agreeableness, conscientiousness, emotional stability, and openness to experience). First, multinomial logistic regression analyses were used to examine the associations between the personality traits and three victim groups, that is, cybercrime victims versus nonvictims, traditional crime victims versus nonvictims, and cybercrime victims versus traditional crime victims. Next, logistic regression analyses were performed to predict victimization of cyber-dependent crimes (i.e., hacking and virus infection) and cyber-enabled crimes (i.e., online intimidation, online consumer fraud, and theft from bank account). The analyses show that personality traits are not specifically associated with cybercrime victimization, but rather with victimization in general. Only those with higher scores on emotional stability were less likely to become a victim of cybercrime than traditional crime. Furthermore, the results indicate that there are little differences between personality traits related to victimization of cyber-enabled and cyber-dependent crimes. Only individuals with higher scores on openness to experience have higher odds of becoming a victim of cyber-enabled crimes.

  10. SU-F-BRD-01: A Logistic Regression Model to Predict Objective Function Weights in Prostate Cancer IMRT

    Energy Technology Data Exchange (ETDEWEB)

    Boutilier, J; Chan, T; Lee, T [University of Toronto, Toronto, Ontario (Canada); Craig, T; Sharpe, M [University of Toronto, Toronto, Ontario (Canada); The Princess Margaret Cancer Centre - UHN, Toronto, ON (Canada)

    2014-06-15

    Purpose: To develop a statistical model that predicts optimization objective function weights from patient geometry for intensity-modulation radiotherapy (IMRT) of prostate cancer. Methods: A previously developed inverse optimization method (IOM) is applied retrospectively to determine optimal weights for 51 treated patients. We use an overlap volume ratio (OVR) of bladder and rectum for different PTV expansions in order to quantify patient geometry in explanatory variables. Using the optimal weights as ground truth, we develop and train a logistic regression (LR) model to predict the rectum weight and thus the bladder weight. Post hoc, we fix the weights of the left femoral head, right femoral head, and an artificial structure that encourages conformity to the population average while normalizing the bladder and rectum weights accordingly. The population average of objective function weights is used for comparison. Results: The OVR at 0.7cm was found to be the most predictive of the rectum weights. The LR model performance is statistically significant when compared to the population average over a range of clinical metrics including bladder/rectum V53Gy, bladder/rectum V70Gy, and mean voxel dose to the bladder, rectum, CTV, and PTV. On average, the LR model predicted bladder and rectum weights that are both 63% closer to the optimal weights compared to the population average. The treatment plans resulting from the LR weights have, on average, a rectum V70Gy that is 35% closer to the clinical plan and a bladder V70Gy that is 43% closer. Similar results are seen for bladder V54Gy and rectum V54Gy. Conclusion: Statistical modelling from patient anatomy can be used to determine objective function weights in IMRT for prostate cancer. Our method allows the treatment planners to begin the personalization process from an informed starting point, which may lead to more consistent clinical plans and reduce overall planning time.

  11. Evaluation of Prevalence and Related Factors of Pediatric Asthma in Children Under Six Years Old With Logistic Regression and Probit

    Directory of Open Access Journals (Sweden)

    AR Rajaeifard

    2011-08-01

    Introduction & Objective: Asthma is a chronic inflammatory airway disease. Asthma affects one in 13 school age children and is a leading cause of school absenteeism. It seems that the prevalence of asthma is increasing worldwide. Many factors are identified and reported as factors related to asthma. This study was carried out to determine the prevalence of asthma and associated factors in 600 children under six years using logistic regression and probit. Materials & Methods: This cross-sectional study was conducted on 600 children under six years old. The questionnaire was constructed based on the ISAAC questionnaire and its reliability was determined with a pilot study and calculated by Cronbach's alpha, equal to 69 percent. Cluster sampling based on household records as clusters was performed. Questionnaires were completed by trained staff under the supervision of an expert and by interviewing parents and children. Results: The prevalence of asthma was estimated to be 3.10 (7.89 to 12.78) percent. Based on fitting models to data, factors such as gender, maternal nutrition, exclusive breast feeding to 6 months, smoking at home by a family member and having a history of respiratory allergy in families were significantly associated with asthma prevalence (p-value ≤ 0.05). The results also demonstrated that both models are almost identical in evaluating the data. Conclusion: This study showed that the estimated asthma prevalence is equal to the average prevalence reported in Iran. Protective factors, such as exclusive breast feeding, can be adopted as a strategy in children's health care programs and should be given much more consideration.
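
    The near-identical behaviour of the logistic and probit models reported above is expected in general, because the two link functions have almost the same shape once the linear predictor is rescaled. The sketch below checks this numerically. It is illustrative only: the rescaling constant 1.702 is a conventional approximation and is not taken from the abstract, and the function names are hypothetical.

```python
import math


def logistic_cdf(z: float) -> float:
    """Inverse logit link: P(y = 1) for linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


def normal_cdf(z: float) -> float:
    """Inverse probit link: standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


SCALE = 1.702  # conventional logit-to-probit rescaling (assumption)


def max_link_gap(lo: float = -6.0, hi: float = 6.0, steps: int = 1201) -> float:
    """Largest absolute gap between the two links on an evenly spaced grid."""
    gap = 0.0
    for i in range(steps):
        z = lo + (hi - lo) * i / (steps - 1)
        gap = max(gap, abs(logistic_cdf(SCALE * z) - normal_cdf(z)))
    return gap


gap = max_link_gap()  # stays below 0.01 over the whole grid
```

    Because the predicted probabilities differ by less than one percentage point, logit and probit fits usually select the same significant factors. The coefficients differ mainly by the scale factor, and only the logistic coefficients are interpretable as log odds ratios.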

  12. Classification of Urban Aerial Data Based on Pixel Labelling with Deep Convolutional Neural Networks and Logistic Regression

    Science.gov (United States)

    Yao, W.; Poleswki, P.; Krzystek, P.

    2016-06-01

    The recent success of deep convolutional neural networks (CNN) on a large number of applications can be attributed to large amounts of available training data and increasing computing power. In this paper, a semantic pixel labelling scheme for urban areas using multi-resolution CNN and hand-crafted spatial-spectral features of airborne remotely sensed data is presented. Both CNN and hand-crafted features are applied to image/DSM patches to produce per-pixel class probabilities with an L1-norm regularized logistic regression classifier. The evidence theory infers a degree of belief for pixel labelling from different sources to smooth regions by handling the conflicts present in both classifiers while reducing the uncertainty. The aerial data used in this study were provided by ISPRS as benchmark datasets for 2D semantic labelling tasks in urban areas, which consist of two data sources from LiDAR and color infrared camera. The test sites are parts of a city in Germany which is assumed to consist of typical object classes including impervious surfaces, trees, buildings, low vegetation, vehicles and clutter. The evaluation is based on the computation of pixel-based confusion matrices by random sampling. The performance of the strategy with respect to scene characteristics and method combination strategies is analyzed and discussed. The competitive classification accuracy could be explained not only by the nature of the input data sources (e.g. the above-ground height of the nDSM highlights the vertical dimension of houses, trees and even cars, and the near-infrared spectrum indicates vegetation), but also by the decision-level fusion of CNN's texture-based approach with multichannel spatial-spectral hand-crafted features based on the evidence combination theory.

  13. Landslide susceptibility mapping using decision-tree based CHi-squared automatic interaction detection (CHAID) and Logistic regression (LR) integration

    Science.gov (United States)

    Althuwaynee, Omar F.; Pradhan, Biswajeet; Ahmad, Noordin

    2014-06-01

    This article uses a methodology based on chi-squared automatic interaction detection (CHAID), a multivariate method with an automatic classification capacity, to analyse large numbers of landslide conditioning factors. This new algorithm was developed to overcome the subjectivity of the manual categorization of scale data of landslide conditioning factors, and to predict a rainfall-induced landslide susceptibility map for Kuala Lumpur city and surrounding areas using a geographic information system (GIS). The main objective of this article is to use the CHAID method to find the best classification fit for each conditioning factor and then to combine it with logistic regression (LR). The LR model was used to find the corresponding coefficients of the best-fitting function that assess the optimal terminal nodes. A cluster pattern of landslide locations was extracted in a previous study using the nearest neighbor index (NNI) and was then used to identify the range of clustered landslide locations. Clustered locations were used as model training data with 14 landslide conditioning factors such as topographic derived parameters, lithology, NDVI, and land use and land cover maps. The Pearson chi-squared value was used to find the best classification fit between the dependent variable and the conditioning factors. Finally, the relationships between conditioning factors were assessed and the landslide susceptibility map (LSM) was produced. The area under the curve (AUC) was used to test the model reliability and prediction capability with the training and validation landslide locations, respectively. This study demonstrated the efficiency and reliability of the decision tree (DT) model in landslide susceptibility mapping. It also provided a valuable scientific basis for spatial decision making in planning and urban management studies.

  14. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    Science.gov (United States)

    Mao, Hui-Fen; Chang, Ling-Hui; Tsai, Athena Yi-Jung; Huang, Wen-Ni; Wang, Jye

    2016-01-01

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.
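
    The AUC values reported above summarize how well predicted referral probabilities rank referred clients above non-referred ones. As a hedged illustration (the scores and labels below are made up, and `auc` is a hypothetical helper, not the authors' code), the AUC can be computed directly as the fraction of positive-negative pairs that are ranked correctly:

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up predicted probabilities and true referral decisions.
example_scores = [0.9, 0.2, 0.8, 0.1]
example_labels = [1, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

    The pairwise form is quadratic in the number of cases; it is shown here because it makes the interpretation of the statistic explicit.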

  15. Study of risk factors affecting both hypertension and obesity outcome by using multivariate multilevel logistic regression models

    Directory of Open Access Journals (Sweden)

    Sepedeh Gholizadeh

    2016-07-01

    Background: Obesity and hypertension are the most important non-communicable diseases; in many studies, their prevalence and risk factors have been examined univariately in each geographic region. The study of factors affecting both obesity and hypertension may play an important role and is addressed in this study. Materials & Methods: This cross-sectional study was conducted on 1000 men aged 20-70 living in Bushehr province. Blood pressure was measured three times and the average was considered as one of the response variables. Hypertension was defined as systolic blood pressure ≥140 and/or diastolic blood pressure ≥90, and obesity was defined as body mass index ≥25. Data were analyzed using a multilevel, multivariate logistic regression model in MLwiN software. Results: Intraclass correlations at the cluster level were 33% for high blood pressure and 37% for obesity, so a two-level model was fitted to the data. The prevalence of obesity and hypertension was 43.6% (95% CI: 40.6-46.5) and 29.4% (95% CI: 26.6-32.1), respectively. Age, gender, smoking, hyperlipidemia, diabetes, fruit and vegetable consumption and physical activity were the factors affecting blood pressure (p≤0.05). Age, gender, hyperlipidemia, diabetes, fruit and vegetable consumption, physical activity and place of residence affected obesity (p≤0.05). Conclusion: Multilevel models that take the distribution across levels into account provide more precise estimates. Since obesity and hypertension are major risk factors for cardiovascular disease, identifying the high-risk groups allows careful planning for the prevention of non-communicable diseases and the promotion of public health.

  16. Reproductive risk factors assessment for anaemia among pregnant women in India using a multinomial logistic regression model.

    Science.gov (United States)

    Perumal, Vanamail

    2014-07-01

    To assess reproductive risk factors for anaemia among pregnant women in urban and rural areas of India. The International Institute of Population Sciences, India, carried out third National Family Health Survey in 2005-2006 to estimate a key indicator from a sample of ever-married women in the reproductive age group 15-49 years. Data on various dimensions were collected using a structured questionnaire, and anaemia was measured using a portable HemoCue instrument. Anaemia prevalence among pregnant women was compared between rural and urban areas using chi-square test and odds ratio. Multinomial logistic regression analysis was used to determine risk factors. Anaemia prevalence was assessed among 3355 pregnant women from rural areas and 1962 pregnant women from urban areas. Moderate-to-severe anaemia in rural areas (32.4%) is significantly more common than in urban areas (27.3%) with an excess risk of 30%. Gestational age specific prevalence of anaemia significantly increases in rural areas after 6 months. Pregnancy duration is a significant risk factor in both urban and rural areas. In rural areas, increasing age at marriage and mass media exposure are significant protective factors of anaemia. However, more births in the last five years, alcohol consumption and smoking habits are significant risk factors. In rural areas, various reproductive factors and lifestyle characteristics constitute significant risk factors for moderate-to-severe anaemia. Therefore, intensive health education on reproductive practices and the impact of lifestyle characteristics are warranted to reduce anaemia prevalence. © 2014 John Wiley & Sons Ltd.
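
    As a rough check of the rural-urban comparison, the two prevalences quoted above can be turned into an unadjusted odds ratio. This is a sketch under the assumption that the "excess risk of 30%" refers to a crude odds ratio; the abstract does not say how the figure was derived, and the function names are hypothetical.

```python
def odds(p):
    # Convert a probability (prevalence) into odds.
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_reference):
    # Ratio of the odds in the exposed group to the odds in the reference group.
    return odds(p_exposed) / odds(p_reference)


# Prevalences of moderate-to-severe anaemia quoted in the abstract.
rural_prevalence = 0.324
urban_prevalence = 0.273
crude_or = odds_ratio(rural_prevalence, urban_prevalence)
print(f"crude odds ratio, rural vs urban: {crude_or:.2f}")
```

    The crude value comes out a little under 1.3, which is in line with an excess of roughly 30% in the odds for rural women.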

  17. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Science.gov (United States)

    Alavi, Seyyed Salman; Mohammadi, Mohammad Reza; Souri, Hamid; Mohammadi Kalhori, Soroush; Jannatifard, Fereshteh; Sepahbodi, Ghazal

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among drivers with accidents and those without road crashes. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (schizophrenia and affective disorders scale) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing descriptive statistics, t-tests, and multiple logistic regression analysis. P values less than 0.05 were considered statistically significant. Results: After controlling for the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy that neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license. PMID:28293047

  18. Developing a Referral Protocol for Community-Based Occupational Therapy Services in Taiwan: A Logistic Regression Analysis.

    Directory of Open Access Journals (Sweden)

    Hui-Fen Mao

    Because resources for long-term care services are limited, timely and appropriate referral for rehabilitation services is critical for optimizing clients' functions and successfully integrating them into the community. We investigated which client characteristics are most relevant in predicting Taiwan's community-based occupational therapy (OT) service referral based on experts' beliefs. Data were collected in face-to-face interviews using the Multidimensional Assessment Instrument (MDAI). Community-dwelling participants (n = 221) ≥ 18 years old who reported disabilities in the previous National Survey of Long-term Care Needs in Taiwan were enrolled. The standard for referral was the judgment and agreement of two experienced occupational therapists who reviewed the results of the MDAI. Logistic regressions and Generalized Additive Models were used for analysis. Two predictive models were proposed, one using basic activities of daily living (BADLs) and one using instrumental ADLs (IADLs). Dementia, psychiatric disorders, cognitive impairment, joint range-of-motion limitations, fear of falling, behavioral or emotional problems, expressive deficits (in the BADL-based model), and limitations in IADLs or BADLs were significantly correlated with the need for referral. Both models showed high area under the curve (AUC) values on receiver operating curve testing (AUC = 0.977 and 0.972, respectively). The probability of being referred for community OT services was calculated using the referral algorithm. The referral protocol facilitated communication between healthcare professionals to make appropriate decisions for OT referrals. The methods and findings should be useful for developing referral protocols for other long-term care services.

  19. Detecting maternal-fetal genotype interactions associated with conotruncal heart defects: a haplotype-based analysis with penalized logistic regression.

    Science.gov (United States)

    Li, Ming; Erickson, Stephen W; Hobbs, Charlotte A; Li, Jingyun; Tang, Xinyu; Nick, Todd G; Macleod, Stewart L; Cleves, Mario A

    2014-04-01

    Nonsyndromic congenital heart defects (CHDs) develop during embryogenesis as a result of a complex interplay between environmental exposures, genetics, and epigenetic causes. Genetic factors associated with CHDs may be attributed to either independent effects of maternal or fetal genes, or the intergenerational interactions between maternal and fetal genes. Detecting gene-by-gene interactions underlying complex diseases is a major challenge in genetic research. Detecting maternal-fetal genotype (MFG) interactions and differentiating them from the maternal/fetal main effects has presented additional statistical challenges due to correlations between maternal and fetal genomes. Traditionally, genetic variants are tested separately for maternal/fetal main effects and MFG interactions on a single-locus basis. We conducted a haplotype-based analysis with a penalized logistic regression framework to dissect the genetic effect associated with the development of nonsyndromic conotruncal heart defects (CTD). Our method allows simultaneous model selection and effect estimation, providing a unified framework to differentiate maternal/fetal main effect from the MFG interaction effect. In addition, the method is able to test multiple highly linked SNPs simultaneously with a configuration of haplotypes, which reduces the data dimensionality and the burden of multiple testing. By analyzing a dataset from the National Birth Defects Prevention Study (NBDPS), we identified seven genes (GSTA1, SOD2, MTRR, AHCYL2, GCLC, GSTM3, and RFC1) associated with the development of CTDs. Our findings suggest that MFG interactions between haplotypes in three of seven genes, GCLC, GSTM3, and RFC1, are associated with nonsyndromic conotruncal heart defects. © 2014 WILEY PERIODICALS, INC.

  20. Digital soil mapping at pilot sites in the northwest coast of Egypt: A multinomial logistic regression approach

    Directory of Open Access Journals (Sweden)

    Fawzy Hassan Abdel-Kader

    2011-06-01

    The study examines a digital soil mapping approach for the production of soil maps by using multinomial logistic regression on soil and terrain information at pilot sites in the Northwestern Coastal region of Egypt. The aim is to reproduce the original map and predict soil distribution in the adjacent landscape. Reference soil maps produced by conventional methods at Omayed and Nagamish areas were used. Spectral and terrain parameters were calculated and logit models of the soil classes were developed. Maps of the predicted soil classes were produced. The software packages IDRISI, SAGA, STATISTICA and SPSS were used. The terrain and spectral parameters were found to be significantly influential and the selection of the land surface predictors was satisfactory. The McFadden pseudo R-squares ranged from 0.473 to 0.496. The most significant terrain parameters influencing the spatial distribution of the soil classes were elevation, valley depth, multiresolution ridgetop flatness index, multiresolution valley-bottom flatness index, and SAGA wetness index. The most influential spectral parameters were the first two principal components of the six Landsat Enhanced Thematic Mapper bands. The overall accuracy of the predicted soil maps ranged from 72% to 74% with a Kappa Index ranging from 0.62 to 0.64. The developed probability models were successfully used to predict the spatial distribution of the soil mapping units at pixel resolutions of 28.5 m × 28.5 m and 90 m × 90 m at adjacent unvisited areas at Matrouh and Alamin. The developed methodology could contribute to the allocation, digital soil mapping and management of new expansion sites in remote desert areas of Egypt.
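
    The three summary statistics quoted in this abstract (McFadden pseudo R-squared, overall accuracy and the Kappa index) have simple closed forms. The sketch below uses made-up numbers and hypothetical function names; it assumes the standard definitions, with the Kappa index taken to be Cohen's kappa, and is not taken from the study itself.

```python
def mcfadden_r2(loglik_model, loglik_null):
    # McFadden pseudo R-squared: 1 - LL(fitted model) / LL(intercept-only model).
    return 1.0 - loglik_model / loglik_null


def overall_accuracy(confusion):
    # Share of cases on the diagonal of a square confusion matrix.
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    # Agreement corrected for the agreement expected by chance.
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    observed = overall_accuracy(confusion)
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Made-up two-class confusion matrix: rows = reference map, columns = prediction.
toy_confusion = [[40, 10],
                 [5, 45]]
toy_accuracy = overall_accuracy(toy_confusion)
toy_kappa = cohen_kappa(toy_confusion)
toy_r2 = mcfadden_r2(loglik_model=-50.0, loglik_null=-100.0)
```

    Because kappa discounts chance agreement, it is lower than the overall accuracy for the same matrix, which matches the pattern of the figures reported above (72% to 74% accuracy against a Kappa of 0.62 to 0.64).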

  1. Personality, Driving Behavior and Mental Disorders Factors as Predictors of Road Traffic Accidents Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Seyyed Salman Alavi

    2017-01-01

    Background: The aim of this study was to evaluate the effect of variables such as personality traits, driving behavior and mental illness on road traffic accidents among the drivers with accidents and those without road crash. Methods: In this cohort study, 800 bus and truck drivers were recruited. Participants were selected among drivers who referred to Imam Sajjad Hospital (Tehran, Iran) during 2013-2015. The Manchester driving behavior questionnaire (MDBQ), big five personality test (NEO personality inventory) and semi-structured interview (SADS) were used. After two years, we surveyed all accidents due to human factors that involved the recruited drivers. The data were analyzed using the SPSS software by performing the descriptive statistics, t-test, and multiple logistic regression analysis methods. P values less than 0.05 were considered statistically significant. Results: In terms of controlling the effective and demographic variables, the findings revealed significant differences between the two groups of drivers that were and were not involved in road accidents. In addition, it was found that depression and anxiety could increase the odds ratio (OR) of road accidents by 2.4- and 2.7-fold, respectively (P=0.04, P=0.004). It is noteworthy to mention that neuroticism alone can increase the odds of road accidents by 1.1-fold (P=0.009), but other personality factors did not have a significant effect on the equation. Conclusion: The results revealed that some mental disorders affect the incidence of road collisions. Considering the importance and sensitivity of driving behavior, it is necessary to evaluate multiple psychological factors influencing drivers before and after receiving or renewing their driver’s license.

  2. A brief conceptual tutorial of multilevel analysis in social epidemiology: using measures of clustering in multilevel logistic regression to investigate contextual phenomena

    DEFF Research Database (Denmark)

    Merlo, J; Chaix, B; Ohlsson, H

    2006-01-01

    STUDY OBJECTIVE: In social epidemiology, it is easy to compute and interpret measures of variation in multilevel linear regression, but technical difficulties exist in the case of logistic regression. The aim of this study was to present measures of variation appropriate for the logistic case...... in a didactic rather than a mathematical way. DESIGN AND PARTICIPANTS: Data were used from the health survey conducted in 2000 in the county of Scania, Sweden, that comprised 10 723 persons aged 18-80 years living in 60 areas. Conducting multilevel logistic regression different techniques were applied...... to investigate whether the individual propensity to consult private physicians was statistically dependent on the area of residence (that is, intraclass correlation (ICC), median odds ratio (MOR), the 80% interval odds ratio (IOR-80), and the sorting out index). RESULTS: The MOR provided more interpretable...
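
    Two of the clustering measures named in this abstract can be computed from the area-level variance of a two-level random-intercept logistic model. The sketch below assumes the commonly used definitions: the latent-variable ICC, which fixes the individual-level variance at π²/3, and the MOR as the exponential of √(2·variance) times the 75th percentile of the standard normal distribution. The function names and the example variance are hypothetical.

```python
import math
from statistics import NormalDist


def latent_icc(area_variance):
    # Latent-variable ICC: the level-1 variance of the logistic
    # distribution is fixed at pi^2 / 3.
    return area_variance / (area_variance + math.pi ** 2 / 3.0)


def median_odds_ratio(area_variance):
    # MOR = exp( sqrt(2 * variance) * z_0.75 ), with z_0.75 about 0.6745.
    z75 = NormalDist().inv_cdf(0.75)
    return math.exp(math.sqrt(2.0 * area_variance) * z75)


# Made-up area-level variance on the log-odds scale.
example_variance = 1.0
example_icc = latent_icc(example_variance)
example_mor = median_odds_ratio(example_variance)
```

    The MOR is expressed on the odds ratio scale, so it can be compared directly with the odds ratios of individual-level covariates. A MOR of 1 corresponds to no variation between areas.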

  3. Sample size matters: investigating the effect of sample size on a logistic regression debris flow susceptibility model

    Science.gov (United States)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2013-06-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial datasets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. 
In view of these results, we

  4. Sample size matters: investigating the effect of sample size on a logistic regression susceptibility model for debris flows

    Science.gov (United States)

    Heckmann, T.; Gegg, K.; Gegg, A.; Becht, M.

    2014-02-01

    Predictive spatial modelling is an important task in natural hazard assessment and regionalisation of geomorphic processes or landforms. Logistic regression is a multivariate statistical approach frequently used in predictive modelling; it can be conducted stepwise in order to select from a number of candidate independent variables those that lead to the best model. In our case study on a debris flow susceptibility model, we investigate the sensitivity of model selection and quality to different sample sizes in light of the following problem: on the one hand, a sample has to be large enough to cover the variability of geofactors within the study area, and to yield stable and reproducible results; on the other hand, the sample must not be too large, because a large sample is likely to violate the assumption of independent observations due to spatial autocorrelation. Using stepwise model selection with 1000 random samples for a number of sample sizes between n = 50 and n = 5000, we investigate the inclusion and exclusion of geofactors and the diversity of the resulting models as a function of sample size; the multiplicity of different models is assessed using numerical indices borrowed from information theory and biodiversity research. Model diversity decreases with increasing sample size and reaches either a local minimum or a plateau; even larger sample sizes do not further reduce it, and they approach the upper limit of sample size given, in this study, by the autocorrelation range of the spatial data sets. In this way, an optimised sample size can be derived from an exploratory analysis. Model uncertainty due to sampling and model selection, and its predictive ability, are explored statistically and spatially through the example of 100 models estimated in one study area and validated in a neighbouring area: depending on the study area and on sample size, the predicted probabilities for debris flow release differed, on average, by 7 to 23 percentage points. In
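
    The abstract describes measuring the "multiplicity of different models" across repeated samples with indices from information theory. One such index is the Shannon entropy of the distribution of selected variable sets. The following sketch assumes that choice, with invented geofactor names; the abstract does not state which indices the authors actually used.

```python
import math
from collections import Counter


def shannon_diversity(selected_models):
    """Shannon entropy (natural log) of the models chosen across resamples.

    Each model is given as a collection of variable names; two models count
    as identical when they contain the same set of variables.
    """
    counts = Counter(frozenset(model) for model in selected_models)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Invented outcome of stepwise selection on four random samples.
runs_small_sample = [
    ["slope", "curvature"],
    ["slope", "geology"],
    ["slope", "curvature", "geology"],
    ["curvature", "slope"],  # same set as the first run, different order
]
runs_large_sample = [["slope", "curvature", "geology"]] * 4

diversity_small = shannon_diversity(runs_small_sample)
diversity_large = shannon_diversity(runs_large_sample)
```

    An entropy of zero means every resample led to the same model. Larger values indicate that model selection is unstable at that sample size.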

  5. Building vulnerability to hydro-geomorphic hazards: Estimating damage probability from qualitative vulnerability assessment using logistic regression

    Science.gov (United States)

    Ettinger, Susanne; Mounaud, Loïc; Magill, Christina; Yao-Lafourcade, Anne-Françoise; Thouret, Jean-Claude; Manville, Vern; Negulescu, Caterina; Zuccaro, Giulio; De Gregorio, Daniela; Nardone, Stefano; Uchuchoque, Juan Alexis Luque; Arguedas, Anita; Macedo, Luisa; Manrique Llerena, Nélida

    2016-10-01

    bivariate analyses were applied to better characterize each vulnerability parameter. Multiple corresponding analyses revealed strong relationships between the "Distance to channel or bridges", "Structural building type", "Building footprint" and the observed damage. Logistic regression enabled quantification of the contribution of each explanatory parameter to potential damage, and determination of the significant parameters that express the damage susceptibility of a building. The model was applied 200 times on different calibration and validation data sets in order to examine performance. Results show that 90% of these tests have a success rate of more than 67%. Probabilities (at building scale) of experiencing different damage levels during a future event similar to the 8 February 2013 flash flood are the major outcomes of this study.

  6. Using ordinal logistic regression to evaluate the performance of laser-Doppler predictions of burn-healing time

    Directory of Open Access Journals (Sweden)

    Pape Sarah A

    2009-02-01

    Background: Laser-Doppler imaging (LDI) of cutaneous blood flow is beginning to be used by burn surgeons to predict the healing time of burn wounds; predicted healing time is used to determine wound treatment as either dressings or surgery. In this paper, we do a statistical analysis of the performance of the technique. Methods: We used data from a study carried out by five burn centers: LDI was done once between days 2 to 5 post burn, and healing was assessed at both 14 days and 21 days post burn. Random-effects ordinal logistic regression and other models such as the continuation ratio model were used to model healing-time as a function of the LDI data, and of demographic and wound history variables. Statistical methods were also used to study the false-color palette, which enables the laser-Doppler imager to be used by clinicians as a decision-support tool. Results: Overall performance is that diagnoses are over 90% correct. Related questions addressed were what was the best blood flow summary statistic and whether, given the blood flow measurements, demographic and observational variables had any additional predictive power (age, sex, race, % total body surface area burned (%TBSA), site and cause of burn, day of LDI scan, burn center). It was found that mean laser-Doppler flux over a wound area was the best statistic, and that, given the same mean flux, women recover slightly more slowly than men. Further, the likely degradation in predictive performance on moving to a patient group with larger %TBSA than those in the data sample was studied, and shown to be small. Conclusion: Modeling healing time is a complex statistical problem, with random effects due to multiple burn areas per individual, and censoring caused by patients missing hospital visits and undergoing surgery. This analysis applies state-of-the-art statistical methods such as the bootstrap and permutation tests to a medical problem of topical interest.
New medical findings are
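
    For readers unfamiliar with ordinal logistic regression, the sketch below shows how a proportional-odds model turns a linear predictor into probabilities for ordered categories (for instance healing within 14 days, within 21 days, or later). It deliberately omits the random effects used in the study, and the thresholds, linear predictor values and function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(thresholds, eta):
    """Category probabilities under a proportional-odds model.

    Uses P(Y <= k) = sigmoid(threshold_k - eta), where `thresholds` are
    increasing cut points and `eta` is the linear predictor.
    """
    cumulative = [sigmoid(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Hypothetical cut points separating three ordered healing-time categories.
cut_points = [-1.0, 1.0]
baseline = category_probabilities(cut_points, eta=0.0)
shifted = category_probabilities(cut_points, eta=1.5)
```

    With this sign convention, a larger linear predictor moves probability mass toward the higher categories, while the same cut points are shared by all cases.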

  7. Using Logistic Regression and Random Forests multivariate statistical methods for landslide spatial probability assessment in North-Est Sicily, Italy

    Science.gov (United States)

    Trigila, Alessandro; Iadanza, Carla; Esposito, Carlo; Scarascia-Mugnozza, Gabriele

    2015-04-01

    first phase of the work aimed to identify the spatial relationships between the landslide locations and the 13 related factors by using the Frequency Ratio bivariate statistical method. The analysis was then carried out by adopting a multivariate statistical approach, using the Logistic Regression and Random Forests techniques, which gave the best results in terms of AUC. The models were run and evaluated with different sample sizes, also taking into account the temporal variation of input variables such as areas burned by wildfire. The most significant outcomes of this work are the relevant influence of the sample size on the model results and the strong importance of some environmental factors (e.g. land use and wildfires) for the identification of the depletion zones of extremely rapid shallow landslides.

  8. Methodologies for the assessment of earthquake-triggered landslides hazard. A comparison of Logistic Regression and Artificial Neural Network models.

    Science.gov (United States)

    García-Rodríguez, M. J.; Malpica, J. A.; Benito, B.

    2009-04-01

    In recent years, interest in landslide hazard assessment studies has increased substantially. They are appropriate for evaluation and mitigation plan development in landslide-prone areas. There are several techniques available for landslide hazard research at a regional scale. Generally, they can be classified in two groups: qualitative and quantitative methods. Most of qualitative methods tend to be subjective, since they depend on expert opinions and represent hazard levels in descriptive terms. On the other hand, quantitative methods are objective and they are commonly used due to the correlation between the instability factors and the location of the landslides. Within this group, statistical approaches and new heuristic techniques based on artificial intelligence (artificial neural network (ANN), fuzzy logic, etc.) provide rigorous analysis to assess landslide hazard over large regions. However, they depend on qualitative and quantitative data, scale, types of movements and characteristic factors used. We analysed and compared an approach for assessing earthquake-triggered landslides hazard using logistic regression (LR) and artificial neural networks (ANN) with a back-propagation learning algorithm. One application has been developed in El Salvador, a country of Central America where the earthquake-triggered landslides are usual phenomena. In a first phase, we analysed the susceptibility and hazard associated to the seismic scenario of the 2001 January 13th earthquake. We calibrated the models using data from the landslide inventory for this scenario. These analyses require input variables representing physical parameters to contribute to the initiation of slope instability, for example, slope gradient, elevation, aspect, mean annual precipitation, lithology, land use, and terrain roughness, while the occurrence or non-occurrence of landslides is considered as dependent variable. 
The results of the landslide susceptibility analysis are checked using landslide

  9. Logistic quantile regression provides improved estimates for bounded avian counts: a case study of California Spotted Owl fledgling production

    Science.gov (United States)

    Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane

    2017-01-01

    Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
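
    Logistic quantile regression for a bounded response is usually described as a logit transform of the response between its bounds, a linear quantile regression on the transformed scale, and a back-transform of the fitted quantiles. The sketch below shows only the transform pair and why quantiles survive the round trip. The bounds and counts are invented, the regression step is omitted, and the truncated abstract does not confirm that the authors used exactly this form.

```python
import math
import statistics


def logit_transform(y, lower, upper):
    # Map a value strictly inside (lower, upper) onto the whole real line.
    return math.log((y - lower) / (upper - y))


def inverse_logit_transform(h, lower, upper):
    # Map a value on the real line back into (lower, upper).
    return (upper * math.exp(h) + lower) / (1.0 + math.exp(h))


# Invented fledgling counts in 0..3; the bounds are placed slightly outside
# the observed range so that the transform stays finite (an assumption).
counts = [0, 1, 1, 2, 3]
lower_bound, upper_bound = -0.5, 3.5

transformed = [logit_transform(c, lower_bound, upper_bound) for c in counts]
median_back = inverse_logit_transform(
    statistics.median(transformed), lower_bound, upper_bound
)
```

    Because the transform is monotone increasing, a quantile estimated on the transformed scale maps back to the same quantile on the original scale, and the back-transformed value always stays within the bounds.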

  10. The analysis of Dutch national data on heavy truck accidents: The necessary extension of traditional frequency counts with logistic regression analysis

    NARCIS (Netherlands)

    Vries, Y.W.R. de; Mooi, H.G.

    2001-01-01

    For the 1993-1997 Dutch national accident data, logistic regression analysis was used to find the most important factors, that influenced the outcome of an accident with a truck involved. Frequency counts were used to identify factors that occurred most frequently. The combination of these two

  11. Logistic regression and artificial neural network models for mapping of regional-scale landslide susceptibility in volcanic mountains of West Java (Indonesia)

    Science.gov (United States)

    Ngadisih, Bhandary, Netra P.; Yatabe, Ryuichi; Dahal, Ranjan K.

    2016-05-01

    West Java Province is the area with the highest landslide risk in Indonesia owing to extreme geo-morphological conditions, climatic conditions and densely populated settlements with immense completed and ongoing development activities. A landslide susceptibility map at regional scale in this province is therefore a fundamental tool for risk management and land-use planning. Logistic regression and Artificial Neural Network (ANN) models are the most frequently used tools for landslide susceptibility assessment, mainly because they are capable of handling the nature of landslide data. The main objective of this study is to apply logistic regression and ANN models and compare their performance for landslide susceptibility mapping in volcanic mountains of West Java Province. In addition, the model application is proposed to identify the factors that contribute most to landslide events in the study area. The spatial database built on a GIS platform consists of a landslide inventory, four topographical parameters (slope, aspect, relief, distance to river), three geological parameters (distance to volcano crater, distance to thrust and fault, geological formation), and two anthropogenic parameters (distance to road, land use). The logistic regression model in this study revealed that slope, geological formation, distance to road and distance to volcano crater are the most influential factors of landslide events, while the ANN model revealed that distance to volcano crater, geological formation, distance to road, and land use are the most important causal factors of landslides in the study area. Moreover, an evaluation of the models showed that the ANN model has a higher accuracy than the logistic regression model.

  12. Three Statistical Testing Procedures in Logistic Regression: Their Performance in Differential Item Functioning (DIF) Investigation. Research Report. ETS RR-09-35

    Science.gov (United States)

    Paek, Insu

    2009-01-01

    Three statistical testing procedures well-known in the maximum likelihood approach are the Wald, likelihood ratio (LR), and score tests. Although well-known, the application of these three testing procedures in the logistic regression method to investigate differential item functioning (DIF) has not yet been rigorously studied. Employing a variety of…

  13. Logistic regression analysis of factors associated with avascular necrosis of the femoral head following femoral neck fractures in middle-aged and elderly patients.

    Science.gov (United States)

    Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua

    2013-03-01

    Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). The middle-aged and elderly have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction is helpful to reduce avascular necrosis.

  14. Patterns and trends in occupational attainment of first jobs in the Netherlands, 1930–1995 : ordinary least squares regression versus conditional multinomial logistic regression

    NARCIS (Netherlands)

    Dessens, Jos A. G.; Jansen, Wim; Ganzeboom, Harry B. G.; Heijden, Peter G. M. van der

    2003-01-01

    This paper brings together the virtues of linear regression models for status attainment models formulated by second-generation social mobility researchers and the strengths of log-linear models formulated by third-generation researchers, into fourth-generation social mobility models, by using

  15. "Logits and Tigers and Bears, Oh My! A Brief Look at the Simple Math of Logistic Regression and How It Can Improve Dissemination of Results"

    Directory of Open Access Journals (Sweden)

    Jason W. Osborne

    2012-06-01

    Full Text Available Logistic regression is slowly gaining acceptance in the social sciences, and fills an important niche in the researcher's toolkit: being able to predict important outcomes that are not continuous in nature. While OLS regression is a valuable tool, it cannot routinely be used to predict outcomes that are binary or categorical in nature. These outcomes represent important social science lines of research: retention in, or dropout from school, using illicit drugs, underage alcohol consumption, antisocial behavior, purchasing decisions, voting patterns, risky behavior, and so on. The goal of this paper is to briefly lead the reader through the surprisingly simple mathematics that underpins logistic regression: probabilities, odds, odds ratios, and logits. Anyone with spreadsheet software or a scientific calculator can follow along, and in turn, this knowledge can be used to make much more interesting, clear, and accurate presentations of results (especially to non-technical audiences). In particular, I will share an example of an interaction in logistic regression, how it was originally graphed, and how the graph was made substantially more user-friendly by converting the original metric (logits) to a more readily interpretable metric (probability) through three simple steps.
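
    The quantities named in this abstract (probabilities, odds, odds ratios, and logits) are linked by a few one-line conversions. The sketch below is only an illustration of those standard relationships, not the author's own three-step procedure; the function names and the coefficients `b0` and `b1` are hypothetical.

```python
import math

def probability_to_odds(p):
    """Odds = p / (1 - p), defined for 0 <= p < 1."""
    return p / (1.0 - p)

def odds_to_logit(odds):
    """The logit is the natural log of the odds."""
    return math.log(odds)

def logit_to_probability(logit):
    """Invert the logit: p = 1 / (1 + exp(-logit))."""
    return 1.0 / (1.0 + math.exp(-logit))

def odds_ratio(p1, p0):
    """Ratio of the odds at probability p1 to the odds at probability p0."""
    return probability_to_odds(p1) / probability_to_odds(p0)

# Hypothetical fitted model on the logit scale: logit = b0 + b1 * x
b0, b1 = -1.0, 0.5
for x in (0, 1, 2):
    logit = b0 + b1 * x
    print(f"x={x}  logit={logit:+.2f}  probability={logit_to_probability(logit):.3f}")
```

    Plotting the probability column instead of the logit column is the kind of conversion the abstract describes for making graphs easier to read.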

  16. Hospital-level associations with 30-day patient mortality after cardiac surgery: a tutorial on the application and interpretation of marginal and multilevel logistic regression.

    Science.gov (United States)

    Sanagou, Masoumeh; Wolfe, Rory; Forbes, Andrew; Reid, Christopher Michael

    2012-03-12

    Marginal and multilevel logistic regression methods can estimate associations between hospital-level factors and patient-level 30-day mortality outcomes after cardiac surgery. However, it is not widely understood how the interpretation of hospital-level effects differs between these methods. The Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) registry provided data on 32,354 patients undergoing cardiac surgery in 18 hospitals from 2001 to 2009. The logistic regression methods related 30-day mortality after surgery to hospital characteristics with concurrent adjustment for patient characteristics. Hospital-level mortality rates varied from 1.0% to 4.1% of patients. Ordinary, marginal and multilevel regression methods differed with regard to point estimates and conclusions on statistical significance for hospital-level risk factors; ordinary logistic regression giving inappropriately narrow confidence intervals. The median odds ratio, MOR, from the multilevel model was 1.2 whereas ORs for most patient-level characteristics were of greater magnitude suggesting that unexplained between-hospital variation was not as relevant as patient-level characteristics for understanding mortality rates. For hospital-level characteristics in the multilevel model, 80% interval ORs, IOR-80%, supplemented the usual ORs from the logistic regression. The IOR-80% was (0.8 to 1.8) for academic affiliation and (0.6 to 1.3) for the median annual number of cardiac surgery procedures. The width of these intervals reflected the unexplained variation between hospitals in mortality rates; the inclusion of one in each interval suggested an inability to add meaningfully to explaining variation in mortality rates. Marginal and multilevel models take different approaches to account for correlation between patients within hospitals and they lead to different interpretations for hospital-level odds ratios.
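
    The median odds ratio (MOR) and 80% interval odds ratio (IOR-80%) mentioned above are usually computed from the between-cluster variance of a random-intercept logistic model. The sketch below uses the commonly cited definitions, MOR = exp(sqrt(2 var_u) z_0.75) and IOR-80% = exp(beta ± z_0.90 sqrt(2 var_u)); it is an assumption that the tutorial uses exactly these forms, and `var_u` and `beta` are hypothetical inputs rather than values from the paper.

```python
import math
from statistics import NormalDist

Z75 = NormalDist().inv_cdf(0.75)  # about 0.6745
Z90 = NormalDist().inv_cdf(0.90)  # about 1.2816

def median_odds_ratio(var_u):
    """MOR from the random-intercept variance (logit scale)."""
    return math.exp(math.sqrt(2.0 * var_u) * Z75)

def variance_from_mor(mor):
    """Invert the MOR formula to recover the random-intercept variance."""
    return (math.log(mor) / Z75) ** 2 / 2.0

def interval_odds_ratio_80(beta, var_u):
    """80% interval odds ratio for a cluster-level coefficient beta."""
    half_width = Z90 * math.sqrt(2.0 * var_u)
    return math.exp(beta - half_width), math.exp(beta + half_width)

# The abstract reports MOR = 1.2; the implied variance follows by inversion.
var_u = variance_from_mor(1.2)
print(f"implied between-hospital variance: {var_u:.4f}")
# beta = 0.2 is a hypothetical hospital-level log odds ratio.
print("IOR-80%:", interval_odds_ratio_80(0.2, var_u))
```

    An interval that contains one, as reported for both hospital-level factors in the abstract, indicates that the factor's effect is small relative to the unexplained variation between hospitals.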

  17. Hospital-level associations with 30-day patient mortality after cardiac surgery: a tutorial on the application and interpretation of marginal and multilevel logistic regression

    Directory of Open Access Journals (Sweden)

    Sanagou Masoumeh

    2012-03-01

    Full Text Available Abstract Background Marginal and multilevel logistic regression methods can estimate associations between hospital-level factors and patient-level 30-day mortality outcomes after cardiac surgery. However, it is not widely understood how the interpretation of hospital-level effects differs between these methods. Methods The Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) registry provided data on 32,354 patients undergoing cardiac surgery in 18 hospitals from 2001 to 2009. The logistic regression methods related 30-day mortality after surgery to hospital characteristics with concurrent adjustment for patient characteristics. Results Hospital-level mortality rates varied from 1.0% to 4.1% of patients. Ordinary, marginal and multilevel regression methods differed with regard to point estimates and conclusions on statistical significance for hospital-level risk factors; ordinary logistic regression giving inappropriately narrow confidence intervals. The median odds ratio, MOR, from the multilevel model was 1.2 whereas ORs for most patient-level characteristics were of greater magnitude suggesting that unexplained between-hospital variation was not as relevant as patient-level characteristics for understanding mortality rates. For hospital-level characteristics in the multilevel model, 80% interval ORs, IOR-80%, supplemented the usual ORs from the logistic regression. The IOR-80% was (0.8 to 1.8) for academic affiliation and (0.6 to 1.3) for the median annual number of cardiac surgery procedures. The width of these intervals reflected the unexplained variation between hospitals in mortality rates; the inclusion of one in each interval suggested an inability to add meaningfully to explaining variation in mortality rates. Conclusions Marginal and multilevel models take different approaches to account for correlation between patients within hospitals and they lead to different interpretations for hospital-level odds ratios.

  18. Multivariable logistic regression model: a novel mathematical model that predicts visual field sensitivity from macular ganglion cell complex thickness in glaucoma.

    Directory of Open Access Journals (Sweden)

    Daisuke Shiba

    Full Text Available PURPOSE: To design a mathematical model that can predict the relationship between the ganglion cell complex (GCC) thickness and visual field sensitivity (VFS) in glaucoma patients. DESIGN: Retrospective cross-sectional case series. METHOD: Within 3 months from VFS measurements by the Humphrey field analyzer 10-2 program, 83 eyes underwent macular GCC thickness measurements by spectral-domain optical coherence tomography (SD-OCT). Data were used to construct a multiple logistic model that depicted the relationship between the explanatory variables (GCC thickness, age, sex, and spherical equivalent of refractive errors) determined by a regression analysis and the mean VFS corresponding to the SD-OCT scanned area. Analyses were performed in half or 8 segmented local areas as well as in whole scanned areas. A simple logistic model that included GCC thickness as the single explanatory variable was also constructed. The ability of the logistic models to depict the real GCC thickness/VFS in SAP distribution was analyzed by the χ2 test of goodness-of-fit. The significance of the model effect was analyzed by analysis of variance (ANOVA). RESULTS: Scatter plots between the GCC thickness and the mean VFS showed sigmoid curves. The χ2 test of goodness-of-fit revealed that the multiple logistic models showed a good fit for the real GCC thickness/VFS distribution in all areas except the nasal-inferior-outer area. ANOVA revealed that all of the multiple logistic models significantly predicted the VFS based on the explanatory variables. Although simple logistic models also exhibited significant VFS predictability based on the GCC thickness, the model effect was less than that observed for the multiple logistic models. CONCLUSIONS: The currently proposed logistic models are useful methods for depicting relationships between the explanatory variables, including the GCC thickness, and the mean VFS in glaucoma patients.

  19. Relative accuracy of spatial predictive models for lynx Lynx canadensis derived using logistic regression-AIC, multiple criteria evaluation and Bayesian approaches

    Directory of Open Access Journals (Sweden)

    Shelley M. ALEXANDER

    2009-02-01

    Full Text Available We compared probability surfaces derived using one set of environmental variables in three Geographic Information Systems (GIS)-based approaches: logistic regression and Akaike's Information Criterion (AIC), Multiple Criteria Evaluation (MCE), and Bayesian Analysis (specifically Dempster-Shafer theory). We used lynx Lynx canadensis as our focal species, and developed our environment relationship model using track data collected in Banff National Park, Alberta, Canada, during winters from 1997 to 2000. The accuracy of the three spatial models was compared using a contingency table method. We determined the percentage of cases in which both presence and absence points were correctly classified (overall accuracy), the failure to predict a species where it occurred (omission error) and the prediction of presence where there was absence (commission error). Our overall accuracy showed the logistic regression approach was the most accurate (74.51%). The multiple criteria evaluation was intermediate (39.22%), while the Dempster-Shafer (D-S) theory model was the poorest (29.90%). However, omission and commission error tell us a different story: logistic regression had the lowest commission error, while D-S theory produced the lowest omission error. Our results provide evidence that habitat modellers should evaluate all three error measures when ascribing confidence in their model. We suggest that for our study area at least, the logistic regression model is optimal. However, where sample size is small or the species is very rare, it may also be useful to explore and/or use a more ecologically cautious modelling approach (e.g. Dempster-Shafer) that would over-predict, protect more sites, and thereby minimize the risk of missing critical habitat in conservation plans [Current Zoology 55(1): 28-40, 2009].
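
    The three error measures in this abstract can be read off a 2 × 2 contingency table of observed versus predicted presence. The sketch below is one plausible formulation, assuming omission error is expressed as a fraction of observed presences and commission error as a fraction of observed absences; the abstract does not state the denominators, and the example data are hypothetical.

```python
def error_measures(observed, predicted):
    """Return (overall accuracy, omission error, commission error).

    observed, predicted: equal-length sequences of 1 (presence) / 0 (absence).
    Assumed denominators: omission error uses observed presences,
    commission error uses observed absences.
    """
    tp = fn = fp = tn = 0
    for obs, pred in zip(observed, predicted):
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 1 and pred == 0:
            fn += 1  # species present but not predicted: omission
        elif obs == 0 and pred == 1:
            fp += 1  # species absent but predicted: commission
        else:
            tn += 1
    overall = (tp + tn) / (tp + fn + fp + tn)
    omission = fn / (tp + fn)
    commission = fp / (fp + tn)
    return overall, omission, commission

# Hypothetical presence/absence points and model predictions
observed = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
print(error_measures(observed, predicted))
```

    A model that over-predicts presence trades a higher commission error for a lower omission error, which is the trade-off the abstract raises for rare species.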

  20. Hospital-level associations with 30-day patient mortality after cardiac surgery: a tutorial on the application and interpretation of marginal and multilevel logistic regression

    OpenAIRE

    Sanagou Masoumeh; Wolfe Rory; Forbes Andrew; Reid Christopher

    2012-01-01

    Abstract Background Marginal and multilevel logistic regression methods can estimate associations between hospital-level factors and patient-level 30-day mortality outcomes after cardiac surgery. However, it is not widely understood how the interpretation of hospital-level effects differs between these methods. Methods The Australasian Society of Cardiac and Thoracic Surgeons (ASCTS) registry provided data on 32,354 patients undergoing cardiac surgery in 18 hospitals from 2001 to 2009. The lo...

  1. A Mathematical Tool for Inference in Logistic Regression with Small-Sized Data Sets: A Practical Application on ISW-Ridge Relationships

    Directory of Open Access Journals (Sweden)

    Tsung-Hao Chen

    2008-01-01

    Full Text Available The general approach to modeling binary data for the purpose of estimating the propagation of an internal solitary wave (ISW) is based on the maximum likelihood estimate (MLE) method. In cases where the number of observations in the data is small, any inferences made based on the asymptotic distribution of changes in the deviance may be unreliable for binary data (the model's lack of fit is described in terms of a quantity known as the deviance; the deviance for binary data is given by D. Collett (2003)). Logistic regression shows that the P-values for the likelihood ratio test and the score test are both <0.05. However, the null hypothesis is not rejected in the Wald test. The seeming discrepancies in P-values obtained between the Wald test and the other two tests are a sign that the large-sample approximation is not stable. We find that the parameters and the odds ratio estimates obtained via conditional exact logistic regression are different from those obtained via unconditional asymptotic logistic regression. Using exact results is a good idea when the sample size is small and the approximate P-values are <0.10. Thus in this study exact analysis is more appropriate.
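
    The disagreement between the Wald, score, and likelihood ratio tests in small samples can be seen even in the simplest case. The sketch below uses an intercept-only logistic model (testing whether the success probability equals `p0`), where all three statistics have closed forms; this is a deliberately simplified stand-in, not the ISW model from the paper, and the counts are hypothetical.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def three_tests(successes, n, p0):
    """Wald, score and likelihood ratio statistics for H0: p = p0.

    Intercept-only logistic regression; requires 0 < successes < n so that
    the maximum likelihood estimate is finite on the logit scale.
    """
    if not 0 < successes < n:
        raise ValueError("the MLE is infinite on the logit scale")
    p_hat = successes / n
    # Wald: squared distance on the logit scale times the information at the MLE
    wald = (logit(p_hat) - logit(p0)) ** 2 * n * p_hat * (1.0 - p_hat)
    # Score: squared score at the null divided by the information at the null
    score = (successes - n * p0) ** 2 / (n * p0 * (1.0 - p0))
    # Likelihood ratio: twice the log-likelihood difference
    lr = 2.0 * (successes * math.log(p_hat / p0)
                + (n - successes) * math.log((1.0 - p_hat) / (1.0 - p0)))
    return wald, score, lr

def chi2_pvalue_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical small sample: 9 successes in 10 trials, null value 0.5
for name, stat in zip(("Wald", "score", "LR"), three_tests(9, 10, 0.5)):
    print(f"{name:5s} statistic={stat:.3f}  p={chi2_pvalue_1df(stat):.4f}")
```

    With larger samples the three statistics converge; a wide spread between them is a warning that the asymptotic approximation may not hold.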

  2. An application in identifying high-risk populations in alternative tobacco product use utilizing logistic regression and CART: a heuristic comparison.

    Science.gov (United States)

    Lei, Yang; Nollen, Nikki; Ahluwahlia, Jasjit S; Yu, Qing; Mayo, Matthew S

    2015-04-09

    Other forms of tobacco use are increasing in prevalence, yet most tobacco control efforts are aimed at cigarettes. In light of this, it is important to identify individuals who are using both cigarettes and alternative tobacco products (ATPs). Most previous studies have used regression models. We conducted a traditional logistic regression model and a classification and regression tree (CART) model to illustrate and discuss the added advantages of using CART in the setting of identifying high-risk subgroups of ATP users among cigarette smokers. The data were collected from an online cross-sectional survey administered by Survey Sampling International between July 5, 2012 and August 15, 2012. Eligible participants self-identified as current smokers, African American, White, or Latino (of any race), were English-speaking, and were at least 25 years old. The study sample included 2,376 participants and was divided into independent training and validation samples for a hold-out validation. Logistic regression and CART models were used to examine the important predictors of cigarettes + ATP users. The logistic regression model identified nine important factors: gender, age, race, nicotine dependence, buying cigarettes or borrowing, whether the price of cigarettes influences the brand purchased, whether the participants set limits on cigarettes per day, alcohol use scores, and discrimination frequencies. The c-index of the logistic regression model was 0.74, indicating good discriminatory capability. The model also performed well in the validation cohort, with good discrimination (c-index = 0.73) and excellent calibration (R-square = 0.96 in the calibration regression). The parsimonious CART model identified gender, age, alcohol use score, race, and discrimination frequencies to be the most important factors. It also revealed interesting partial interactions. The c-index was 0.70 for the training sample and 0.69 for the validation sample. The misclassification

  3. Sociodemographic and Incident Variables as Predictors of Victim Injury From Intimate Partner Violence: Findings From Police Reports.

    Science.gov (United States)

    Karlsson, Marie E; Reid Quiñones, Kathryn; López, Cristina M; Andrews, Arthur R; Wallace, Megan M; Rheingold, Alyssa

    2017-11-01

    Predictors of victim injury from intimate partner violence (IPV) were investigated using 1,292 police reports collected in South Carolina in 2009/2010. All cases were opposite sex adults. Results from bivariate statistics showed that IPV cases with (n = 649) and without visible injuries (n = 643) differed on victim gender, victim race, type of relationship, and perpetrator's alcohol use. Results from a logistic regression analysis predicting victim injury showed higher odds ratios for males, Whites, and couples identified as cohabitants. Although most victims, including most injured victims, were Black women, males and Whites were overrepresented in the injured group.

  4. Victimization among female and male sexual minority status groups: evidence from the British Crime Survey 2007-2010.

    Science.gov (United States)

    Mahoney, Bere; Davies, Michelle; Scurlock-Evans, Laura

    2014-01-01

    International surveys of victims show crime rates in England and Wales, including hate crimes, are among the highest in Europe. Nevertheless, sexual minority status is a less considered risk factor in general victimization research. This study used sexual minority status and sex to predict victimization across British Crime Surveys from 2007-2010. Logistic regression analyses showed sexual minority status groups were more likely than heterosexuals to be victimized from any and some specific crimes. However, bisexuals rather than lesbians or gay men were more consistently victimized, notably by sexual attacks and within the household. Implications for understanding victimization among these groups are discussed.

  5. Multiple logistic regression modelling substantiates multifactor contributions associated with sick building syndrome in residential interiors in Mauritius.

    Science.gov (United States)

    Jowaheer, V; Subratty, A H

    2003-03-01

    This paper presents a mathematical model that depicts the relationship between the possibility of occurrence of common health problems and factors leading to Sick Building Syndrome symptoms in domestic interiors in Mauritius. The prevalence of upper respiratory symptoms (dry eyes, runny nose), central nervous system symptoms (headache, nervousness), and musculoskeletal symptoms (pain/stiffness in shoulders/neck) was found to be elevated when responses were statistically regressed on type of building and age of respondents. The model presented here will be useful in helping to identify and quantify the relative role of factors that contribute to Sick Building Syndrome. Thus it may be possible to evaluate the effectiveness of current building operation practices and to prioritise allocations of resources for reduction of risk associated with Indoor Environmental Air Quality.

  6. Tropical Forest Fire Susceptibility Mapping at the Cat Ba National Park Area, Hai Phong City, Vietnam, Using GIS-Based Kernel Logistic Regression

    Directory of Open Access Journals (Sweden)

    Dieu Tien Bui

    2016-04-01

    Full Text Available The Cat Ba National Park area (Vietnam) with its tropical forest is recognized as being part of the world biodiversity conservation by the United Nations Educational, Scientific and Cultural Organization (UNESCO) and is a well-known destination for tourists, with around 500,000 travelers per year. This area has been the site for many research projects; however, no project has been carried out for forest fire susceptibility assessment. Thus, protection of the forest including fire prevention is one of the main concerns of the local authorities. This work aims to produce a tropical forest fire susceptibility map for the Cat Ba National Park area, which may be helpful for the local authorities in forest fire protection management. To achieve this purpose, first, historical forest fires and related factors were collected from various sources to construct a GIS database. Then, a forest fire susceptibility model was developed using Kernel logistic regression. The quality of the model was assessed using the Receiver Operating Characteristic (ROC) curve, area under the ROC curve (AUC), and five statistical evaluation measures. The usability of the resulting model was further compared with a benchmark model, the support vector machine (SVM). The results show that the Kernel logistic regression model has a high level of performance in both the training and validation dataset, with a prediction capability of 92.2%. Since the Kernel logistic regression model outperforms the benchmark model, we conclude that the proposed model is a promising alternative tool that should also be considered for forest fire susceptibility mapping in other areas. The results of this study are useful for the local authorities in forest planning and management.
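
    The area under the ROC curve used to assess this model has a simple pairwise interpretation: it is the probability that a randomly chosen positive case (here, a burned location) receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition; the labels and scores are hypothetical, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: sequence of 1 (event) / 0 (non-event).
    scores: predicted probabilities or any monotone score.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical fire / no-fire labels with model susceptibility scores
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

    A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of events from non-events.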

  7. Discriminating between adaptive and carcinogenic liver hypertrophy in rat studies using logistic ridge regression analysis of toxicogenomic data: The mode of action and predictive models.

    Science.gov (United States)

    Liu, Shujie; Kawamoto, Taisuke; Morita, Osamu; Yoshinari, Kouichi; Honda, Hiroshi

    2017-03-01

    Chemical exposure often results in liver hypertrophy in animal tests, characterized by increased liver weight, hepatocellular hypertrophy, and/or cell proliferation. While most of these changes are considered adaptive responses, there is concern that they may be associated with carcinogenesis. In this study, we have employed a toxicogenomic approach using a logistic ridge regression model to identify genes responsible for liver hypertrophy and hypertrophic hepatocarcinogenesis and to develop a predictive model for assessing hypertrophy-inducing compounds. Logistic regression models have previously been used in the quantification of epidemiological risk factors. DNA microarray data from the Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System were used to identify hypertrophy-related genes that are expressed differently in hypertrophy induced by carcinogens and non-carcinogens. Data were collected for 134 chemicals (72 non-hypertrophy-inducing chemicals, 27 hypertrophy-inducing non-carcinogenic chemicals, and 15 hypertrophy-inducing carcinogenic compounds). After applying logistic ridge regression analysis, 35 genes for liver hypertrophy (e.g., Acot1 and Abcc3) and 13 genes for hypertrophic hepatocarcinogenesis (e.g., Asns and Gpx2) were selected. The predictive models built using these genes were 94.8% and 82.7% accurate, respectively. Pathway analysis of the genes indicates that, aside from a xenobiotic metabolism-related pathway as an adaptive response for liver hypertrophy, amino acid biosynthesis and oxidative responses appear to be involved in hypertrophic hepatocarcinogenesis. Early detection and toxicogenomic characterization of liver hypertrophy using our models may be useful for predicting carcinogenesis. In addition, the identified genes provide novel insight into discrimination between adverse hypertrophy associated with carcinogenesis and adaptive hypertrophy in risk assessment. Copyright © 2017 Elsevier Inc. All rights reserved.
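
    Logistic ridge regression adds an L2 penalty on the coefficients to the logistic log-likelihood, which keeps estimates stable when predictors are many or strongly correlated, as with microarray data. The sketch below is a minimal plain-Python illustration fitted by gradient descent; the data, the penalty value `lam`, the learning rate, and the step count are all hypothetical, and this is not the authors' pipeline.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic_ridge(X, y, lam, lr=0.1, steps=2000):
    """Minimise mean negative log-likelihood + (lam / 2) * ||w||^2.

    The intercept b is left unpenalised. Returns (b, w).
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        for j in range(d):
            w[j] -= lr * (grad_w[j] / n + lam * w[j])  # penalty gradient: lam * w
        b -= lr * grad_b / n
    return b, w

# Hypothetical expression levels for two genes; y = 1 marks the outcome class
X = [[0.0, 1.0], [1.0, 0.5], [2.0, 1.5], [3.0, 0.2], [4.0, 1.0], [5.0, 0.7]]
y = [0, 0, 0, 1, 1, 1]
print("unpenalised:", fit_logistic_ridge(X, y, lam=0.0))
print("ridge      :", fit_logistic_ridge(X, y, lam=1.0))
```

    The penalised coefficients are shrunk toward zero relative to the unpenalised fit, which is the behaviour that makes ridge regression usable when the classes are nearly separable.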

  8. Evaluating risk factors for endemic human Salmonella Enteritidis infections with different phage types in Ontario, Canada using multinomial logistic regression and a case-case study approach

    Directory of Open Access Journals (Sweden)

    Varga Csaba

    2012-10-01

    Full Text Available Abstract Background Identifying risk factors for Salmonella Enteritidis (SE) infections in Ontario will assist public health authorities to design effective control and prevention programs to reduce the burden of SE infections. Our research objective was to identify risk factors for acquiring SE infections with various phage types (PT) in Ontario, Canada. We hypothesized that certain PTs (e.g., PT8 and PT13a) have specific risk factors for infection. Methods Our study included endemic SE cases with various PTs whose isolates were submitted to the Public Health Laboratory-Toronto from January 20th to August 12th, 2011. Cases were interviewed using a standardized questionnaire that included questions pertaining to demographics, travel history, clinical symptoms, contact with animals, and food exposures. A multinomial logistic regression method using the Generalized Linear Latent and Mixed Model procedure and a case-case study design were used to identify risk factors for acquiring SE infections with various PTs in Ontario, Canada. In the multinomial logistic regression model, the outcome variable had three categories representing human infections caused by SE PT8, PT13a, and all other SE PTs (i.e., non-PT8/non-PT13a) as a referent category to which the other two categories were compared. Results In the multivariable model, SE PT8 was positively associated with contact with dogs (OR=2.17, 95% CI 1.01-4.68) and negatively associated with pepper consumption (OR=0.35, 95% CI 0.13-0.94), after adjusting for age categories and gender, and using exposure periods and health regions as random effects to account for clustering. Conclusions Our study findings offer interesting hypotheses about the role of phage type-specific risk factors. Multinomial logistic regression analysis and the case-case study approach are novel methodologies to evaluate associations among SE infections with different PTs and various risk factors.
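
    In a multinomial logistic model with a referent category, each non-referent category has its own coefficient vector, and exponentiating a coefficient gives the odds ratio of that category versus the referent. The sketch below illustrates this with three outcome categories. Only the odds ratio of 2.17 for dog contact comes from the abstract; the intercepts and the PT13a slope are hypothetical, and the random effects and adjustment covariates used in the study are omitted.

```python
import math

def multinomial_probabilities(x, coefs, reference="other"):
    """Category probabilities for a baseline-category logit model.

    x: list of covariate values.
    coefs: {category: [intercept, b1, b2, ...]} for non-reference categories.
    The reference category has a linear predictor fixed at zero.
    """
    etas = {cat: beta[0] + sum(b * v for b, v in zip(beta[1:], x))
            for cat, beta in coefs.items()}
    denom = 1.0 + sum(math.exp(eta) for eta in etas.values())
    probs = {cat: math.exp(eta) / denom for cat, eta in etas.items()}
    probs[reference] = 1.0 / denom
    return probs

def odds_vs_reference(probs, category, reference="other"):
    return probs[category] / probs[reference]

COEFS = {
    "PT8": [-0.5, math.log(2.17)],  # hypothetical intercept; slope = log of reported OR
    "PT13a": [-1.0, 0.0],           # hypothetical values
}

no_dog = multinomial_probabilities([0.0], COEFS)
dog = multinomial_probabilities([1.0], COEFS)
ratio = odds_vs_reference(dog, "PT8") / odds_vs_reference(no_dog, "PT8")
print(f"odds ratio for PT8 vs referent, dog contact vs none: {ratio:.2f}")
```

    The recovered ratio equals the exponentiated slope regardless of the intercepts, which is why odds ratios rather than probabilities are reported for case-case designs.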

  9. Extraction of potential debris source areas by logistic regression technique: a case study from Barla, Besparmak and Kapi mountains (NW Taurids, Turkey)

    Science.gov (United States)

    Tunusluoglu, M. C.; Gokceoglu, C.; Nefeslioglu, H. A.; Sonmez, H.

    2008-03-01

    Debris flow is one of the most destructive mass movements. Regional debris flow susceptibility or hazard assessments can sometimes be more difficult than those for other mass movements. Determination of debris accumulation zones and debris source areas, which is one of the most crucial stages in debris flow investigations, can be very difficult because of morphological restrictions. The main goal of the present study is to extract debris source areas by logistic regression analyses based on the data from the slopes of the Barla, Besparmak and Kapi Mountains in the SW part of the Taurids Mountain belt of Turkey, where formation of debris material is clearly evident and common. To achieve this goal, extensive field observations to identify the areal extent of debris source areas and debris material, air-photo studies to determine the debris source areas, and desk studies including Geographical Information System (GIS) applications and statistical assessments were performed. To ensure that the training data used in the logistic regression analyses were representative, a random sampling procedure was applied. By using the results of the logistic regression analysis, the debris source area probability map of the region is produced. However, according to the field experiences of the authors, the produced map yielded over-predicted results. On the basis of the field observations, the main source of the over-prediction is the structural relation between the bedding planes and slope aspects: for the generation of debris, the dip of the bedding planes must be taken into consideration with respect to the slope face. In order to eliminate this problem, an approach has been developed using the probability distribution of the aspect values. With the application of structural adjustment, the final adjusted debris source area probability map is obtained for the study area. The field observations revealed that the actual debris source areas in the field coincide with

  10. Construction of hazard maps of Hantavirus contagion using Remote Sensing, logistic regression and Artificial Neural Networks: case Araucanía Region, Chile

    CERN Document Server

    Alvarez, G; Salinas, R

    2016-01-01

    This research presents methods and computational results, based on statistical analysis, mathematical modelling and in situ data collection, for making a hazard map of Hanta Virus infection in the region of Araucanía, Chile. The work involves several elements, such as Landsat satellite images, biological information regarding seropositivity of Hanta Virus, and information concerning positive cases of infection detected in the region. All this information has been processed to find a function that models the danger of contagion in the region, through logistic regression analysis and Artificial Neural Networks

  11. Predicting hyperketonemia by logistic and linear regression using test-day milk and performance variables in early-lactation Holstein and Jersey cows.

    Science.gov (United States)

    Chandler, T L; Pralle, R S; Dórea, J R R; Poock, S E; Oetzel, G R; Fourdraine, R H; White, H M

    2018-03-01

    Although cowside testing strategies for diagnosing hyperketonemia (HYK) are available, many are labor intensive and costly, and some lack sufficient accuracy. Predicting milk ketone bodies by Fourier transform infrared spectrometry during routine milk sampling may offer a more practical monitoring strategy. The objectives of this study were to (1) develop linear and logistic regression models using all available test-day milk and performance variables for predicting HYK and (2) compare prediction methods (Fourier transform infrared milk ketone bodies, linear regression models, and logistic regression models) to determine which is the most predictive of HYK. Given the data available, a secondary objective was to evaluate differences in test-day milk and performance variables (continuous measurements) between Holsteins and Jerseys and between cows with or without HYK within breed. Blood samples were collected on the same day as milk sampling from 658 Holstein and 468 Jersey cows between 5 and 20 d in milk (DIM). Diagnosis of HYK was at a serum β-hydroxybutyrate (BHB) concentration ≥1.2 mmol/L. Concentrations of milk BHB and acetone were predicted by Fourier transform infrared spectrometry (Foss Analytical, Hillerød, Denmark). Thresholds of milk BHB and acetone were tested for diagnostic accuracy, and logistic models were built from continuous variables to predict HYK in primiparous and multiparous cows within breed. Linear models were constructed from continuous variables for primiparous and multiparous cows within breed that were 5 to 11 DIM or 12 to 20 DIM. Milk ketone body thresholds diagnosed HYK with 64.0 to 92.9% accuracy in Holsteins and 59.1 to 86.6% accuracy in Jerseys. Logistic models predicted HYK with 82.6 to 97.3% accuracy. Internally cross-validated multiple linear regression models diagnosed HYK of Holstein cows with 97.8% accuracy for primiparous and 83.3% accuracy for multiparous cows. Accuracy of Jersey models was 81.3% in primiparous and 83

  12. [Evaluation of combined determinations of Epstein-Barr virus antibodies for nasopharyngeal carcinoma assessed with receiver operating characteristic curve based on logistic regression].

    Science.gov (United States)

    Cai, Yong-lin; Zheng, Yu-ming; Cheng, Ji-ru; Li, Jun; Mo, Yong-kun; Zhong, Qing-yan

    2009-10-01

    This study aimed to investigate the diagnostic value of combined determination of Epstein-Barr virus (EBV) antibodies for nasopharyngeal carcinoma (NPC), including immunoglobulin (Ig) A against EBV capsid antigens (VCA), IgA against early antigens (EA), IgG against BRLF1 transcription activator (Rta) and IgA against EBV nuclear antigen-1 (EBNA1), assessed with receiver operating characteristic (ROC) curves based on logistic regression. Serum samples from 211 untreated patients with NPC and 203 non-NPC ENT patients were examined for the presence of VCA/IgA and EA/IgA by immunoenzymatic assay, and of Rta/IgG and EBNA1/IgA by enzyme-linked immunosorbent assay (ELISA). Separate logistic regression models were established for the various combined determinations of antibodies. Using the predicted probability as the analyzed variable, ROC curves were applied to evaluate the diagnostic accuracy of the different combined determinations. When the markers were tested singly, the sensitivity of VCA/IgA (98.1%) and the specificity of EA/IgA (98.5%) were the highest. The results analyzed by ROC curve based on logistic regression showed that the sensitivity and specificity were improved. Among two-marker combinations, VCA/IgA + Rta/IgG, whose area under the ROC curve (AUC) was 0.991, had the highest diagnostic accuracy, and its sensitivity, specificity and Youden index were 94.8%, 98.0% and 0.928, respectively. No significant differences in AUC were found when comparing VCA/IgA + Rta/IgG with VCA/IgA + Rta/IgG + EBNA1/IgA and the four-marker combination (P > 0.05), whose sensitivity, specificity and Youden index were 94.8%, 98.5%, 0.933 and 96.7%, 97.0%, 0.937, respectively. The approach of ROC curve based on logistic regression can improve the overall efficiency of combined determination of multiple markers. The combined determination of VCA/IgA and Rta/IgG, which have a complementary effect, is optimal for NPC serodiagnosis.
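
    The Youden index quoted in this abstract is conventionally defined as sensitivity + specificity - 1. The sketch below is an illustration only, not the authors' code: it recomputes the index from the sensitivity and specificity pairs reported above. The function name `youden_index` and the dictionary labels are assumptions introduced for this example.

```python
def youden_index(sensitivity, specificity):
    """Youden index J = sensitivity + specificity - 1 (inputs as fractions)."""
    return sensitivity + specificity - 1.0


# Sensitivity/specificity pairs as reported in the abstract (labels are ours).
combinations = {
    "VCA/IgA + Rta/IgG": (0.948, 0.980),
    "VCA/IgA + Rta/IgG + EBNA1/IgA": (0.948, 0.985),
    "four-marker combination": (0.967, 0.970),
}

for name, (sens, spec) in combinations.items():
    print(f"{name}: J = {youden_index(sens, spec):.3f}")
```

    Rounded to three decimals, the recomputed values match the 0.928, 0.933 and 0.937 given in the abstract.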

  13. Peer and self-reported victimization: Do non-victimized students give victimization nominations to classmates who are self-reported victims?

    Science.gov (United States)

    Oldenburg, Beau; Barrera, Davide; Olthof, Tjeert; Goossens, Frits; van der Meulen, Matty; Vermande, Marjolijn; Aleva, Elisabeth; Sentse, Miranda; Veenstra, René

    2015-08-01

    Using data from 2413 Dutch first-year secondary school students (M age=13.27, SD age=0.51, 49.0% boys), this study investigated to what extent students who according to their self-reports had not been victimized (referred to as reporters) gave victimization nominations to classmates who according to their self-reports had been victimized (referred to as receivers). Using a dyadic approach, characteristics of the reporter-receiver dyad (i.e., gender similarity) and of the reporter (i.e., reporters' behavior during bullying episodes) that were possibly associated with reporter-receiver agreement were investigated. Descriptive analyses suggested that numerous students who were self-reported victims were not perceived as victimized by their non-victimized classmates. Three-level logistic regression models (reporter-receiver dyads nested in reporters within classrooms) demonstrated greater reporter-receiver agreement in same-gender dyads, especially when the reporter and the receiver were boys. Furthermore, reporters who behaved as outsiders during bullying episodes (i.e., reporters who actively shied away from the bullying) were less likely to agree on the receiver's self-reported victimization, and in contrast, reporters who behaved as defenders (i.e., reporters who helped and supported victims) were more likely to agree on the victimization. Moreover, the results demonstrated that reporters gave fewer victimization nominations to receivers who reported they had been victimized sometimes than to receivers who reported they had been victimized often/very often. Finally, this study suggested that reporter-receiver agreement may not only depend on characteristics of the reporter-receiver dyad and of the reporter, but on classroom characteristics as well (e.g., the number of students in the classroom). Copyright © 2015 Society for the Study of School Psychology. Published by Elsevier Ltd. All rights reserved.

  14. Network exposure and homicide victimization in an African American community.

    Science.gov (United States)

    Papachristos, Andrew V; Wildeman, Christopher

    2014-01-01

    We estimated the association of an individual's exposure to homicide in a social network and the risk of individual homicide victimization across a high-crime African American community. Combining 5 years of homicide and police records, we analyzed a network of 3718 high-risk individuals that was created by instances of co-offending. We used logistic regression to model the odds of being a gunshot homicide victim by individual characteristics, network position, and indirect exposure to homicide. Forty-one percent of all gun homicides occurred within a network component containing less than 4% of the neighborhood's population. Network-level indicators reduced the association between individual risk factors and homicide victimization and improved the overall prediction of individual victimization. Network exposure to homicide was strongly associated with victimization: the closer one is to a homicide victim, the greater the risk of victimization. Regression models show that exposure diminished with social distance: each social tie removed from a homicide victim decreased one's odds of being a homicide victim by 57%. Risk of homicide in urban areas is even more highly concentrated than previously thought. We found that most of the risk of gun violence was concentrated in networks of identifiable individuals. Understanding these networks may improve prediction of individual homicide victimization within disadvantaged communities.
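
    As a rough illustration of the "57% per social tie" figure: in a logistic regression, a constant percentage change in the odds per unit of a predictor corresponds to a fixed odds ratio that compounds multiplicatively. The sketch below assumes the effect is constant across every additional tie, which the abstract does not spell out; the function name `odds_multiplier` is hypothetical.

```python
import math


def odds_multiplier(percent_decrease, steps):
    """Relative odds after `steps` units, assuming a constant per-unit decrease."""
    per_step_ratio = 1.0 - percent_decrease / 100.0  # e.g. 57% decrease -> OR of 0.43
    return per_step_ratio ** steps


# Odds ratio per additional tie of social distance, from the reported 57% decrease.
odds_ratio = odds_multiplier(57, 1)

# The logistic regression coefficient implied by that odds ratio (natural log).
implied_coefficient = math.log(odds_ratio)

for ties in range(1, 4):
    print(f"{ties} tie(s) removed: odds multiplied by {odds_multiplier(57, ties):.3f}")
```

    Under that assumption, someone two ties away from a victim has roughly 0.43 × 0.43 of the odds of someone directly tied to a victim.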

  15. Understanding data in clinical research: a simple graphical display for plotting data (up to four independent variables) after binary logistic regression analysis.

    Science.gov (United States)

    Mesa, José Luis

    2004-01-01

    In clinical research, suitable techniques for visualizing data after statistical analysis are crucial for researchers' and physicians' understanding. Logistic regression models are common statistical techniques for analyzing data in clinical research. Among these, the application of binary logistic regression analysis (LRA) has greatly increased during past years, due to its diagnostic accuracy and because scientists often want to analyze in a dichotomous way whether some event will occur or not. Such an analysis lacks a suitable, understandable, and widely used graphical display, instead providing a less understandable logit function based on a linear model for the natural logarithm of the odds in favor of the occurrence of the dependent variable, Y. By simple exponential transformation, such a logit equation can be transformed into a logistic function, resulting in predicted probabilities for the presence of the dependent variable, P(Y=1|X). This model can be used to generate a simple graphical display for binary LRA. For the case of a single predictor or explanatory (independent) variable, X, a plot can be generated with X represented by the abscissa (i.e., horizontal axis) and P(Y=1|X) represented by the ordinate (i.e., vertical axis). For the case of multiple predictor models, I propose here a relief 3D surface graphic in order to plot up to four independent variables (two continuous and two discrete). By using this technique, any researcher or physician would be able to transform a less understandable logit function into a figure easier to grasp, thus leading to a better knowledge and interpretation of data in clinical research. For this, a sophisticated statistical package is not necessary, because the graphical display may be generated by using any 2D or 3D surface plotter.
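
    The transformation described in this abstract can be sketched in a few lines. This is a minimal illustration and not the author's procedure: the intercept and slope below are made-up values, and the function names are hypothetical. It converts a logit (linear predictor) into P(Y=1|X) and tabulates the points one would plot for a single predictor X.

```python
import math


def predicted_probability(intercept, coefficients, values):
    """P(Y=1|X) from a fitted logit: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))


def probability_curve(intercept, slope, x_min, x_max, n_points=5):
    """(x, P(Y=1|x)) pairs for a single-predictor model, ready for any 2D plotter."""
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    return [(x, predicted_probability(intercept, [slope], [x])) for x in xs]


# Hypothetical coefficients, chosen only to show the S-shaped curve.
for x, p in probability_curve(intercept=-2.0, slope=0.5, x_min=0.0, x_max=8.0):
    print(f"X = {x:4.1f}  ->  P(Y=1|X) = {p:.3f}")
```

    For the multi-predictor display proposed above, the same function could be evaluated over a grid of two continuous variables, once for each combination of the discrete ones, and passed to a 3D surface plotter.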

  16. Ordinal logistic regression versus multiple binary logistic regression ...

    African Journals Online (AJOL)

    In this article, we modeled Higher Education Loans Board (HELB) loan application data from three public universities to determine whether the loan was ... It is expected that proper determination of the most accurate model will go a long way in minimizing the number of mis-classifications when awarding HELB loan.

  17. [Value of PI-RADS v2 scores combined with prostate specific antigen in diagnosis of peripheral zone prostate cancer: a logistic regression analysis].

    Science.gov (United States)

    Lei, Li-Zhi; Xu, Yi-Kai; Hou, Mei-Rong; He, Meng-Qi

    2017-08-20

    To assess the value of Prostate Imaging and Reporting and Data System: Version 2 (PI-RADS v2) combined with prostate specific antigen (PSA) in the diagnosis of peripheral zone (PZ) prostate cancer (PCa). The preoperative magnetic resonance imaging and PSA data were analyzed for 69 patients with pathologically confirmed PCa and 109 non-PCa patients. PI-RADS v2 scores (1-5) were used to evaluate the risk of PZ PCa. The total PSA (tPSA) level, free to total PSA ratio (f/t PSA), PSA density (PSAD), PZ-PSAD and PI-RADS v2 scores were compared between the PCa and non-PCa patients. Logistic regression models were established with parameters that differed significantly between the two groups. The receiver operating characteristic (ROC) curve was constructed based on the P values derived from the logistic regression models and PI-RADS scores to assess the diagnostic efficiency. PI-RADS v2 score, tPSA, f/t PSA, PSAD and PZ-PSAD differed significantly between the two groups (P […] PI-RADS v2 + 0.223tPSA (A), Logit P=-4.354+1.586PI-RADS v2-12.7841f/tPSA (B), Logit P=-8.993+1.630PI-RADS v2+17.091PSAD (C), and Logit P=-9.434+1.596PI-RADS v2+10.494PZ-PSAD (D), whose areas under the ROC curves were 0.908, 0.891, 0.944, and 0.961, respectively, all significantly greater than that of PI-RADS v2 score (P […] PI-RADS v2 score alone, the combination of PI-RADS v2 score and PSA in the logistic regression model can improve the diagnostic efficiency of PZ PCa and offers better confidence in the decision of biopsy in suspected cases.

  18. Uso de regressões logísticas múltiplas para mapeamento digital de solos no Planalto Médio do RS [Multiple logistic regression applied to soil survey in Rio Grande do Sul state, Brazil]

    Directory of Open Access Journals (Sweden)

    Samuel Ribeiro Figueiredo

    2008-12-01

    Logistic nominal regressions establish mathematical relations between continuous or discrete independent variables and discrete dependent variables. Their potential to predict the occurrence and distribution of soil classes in the region of the municipalities of Ibirubá and Quinze de Novembro, RS, Brazil was evaluated. Using a digital elevation model (DEM) with 90 m resolution, topographic terrain variables (elevation, slope, and curvature) and hydrographic terrain variables (distance to rivers, topographic wetness index, flow path length, and stream power index) were calculated. Multiple logistic regressions were then established between the soil classes of the region, based on a traditional survey at 1:80,000 scale, and the terrain variables. The regressions were used to calculate the probability of occurrence of each soil class, and the final estimated soil map was produced by assigning to each map cell the soil class with the highest probability of occurrence. An overall accuracy (OA) of 58% and a Cohen's Kappa coefficient of 38% were observed when comparing the original map with the estimated map at the original scale. Simplifying the scale raised map accuracy only slightly, to 61% OA and 39% Kappa. It was concluded that multiple logistic regressions showed predictive potential for use as tools in supervised soil mapping.
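
    The mapping rule in this abstract (give each cell the soil class with the highest predicted probability, then compare the estimated map with the reference map by overall accuracy and Cohen's Kappa) can be sketched as follows. This is an illustrative sketch only: the class labels "A" and "B", the per-cell probabilities and the function names are hypothetical, and the abstract does not state how the authors implemented these steps.

```python
def most_probable_class(cell_probabilities):
    """Return the class label with the highest predicted probability for one cell."""
    return max(cell_probabilities, key=cell_probabilities.get)


def overall_accuracy(reference, predicted):
    """Fraction of cells where the estimated class equals the reference class."""
    matches = sum(1 for r, p in zip(reference, predicted) if r == p)
    return matches / len(reference)


def cohens_kappa(reference, predicted):
    """Cohen's Kappa: agreement corrected for the agreement expected by chance."""
    n = len(reference)
    observed = overall_accuracy(reference, predicted)
    classes = set(reference) | set(predicted)
    expected = sum(
        (reference.count(c) / n) * (predicted.count(c) / n) for c in classes
    )
    # Undefined when expected == 1 (both maps contain a single identical class).
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-cell class probabilities from the fitted regressions.
cells = [
    {"A": 0.7, "B": 0.3},
    {"A": 0.4, "B": 0.6},
    {"A": 0.2, "B": 0.8},
    {"A": 0.1, "B": 0.9},
]
estimated_map = [most_probable_class(c) for c in cells]
reference_map = ["A", "A", "B", "B"]  # hypothetical reference survey

print(estimated_map)
print(overall_accuracy(reference_map, estimated_map))
print(cohens_kappa(reference_map, estimated_map))
```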

  19. A comparative study of frequency ratio, weights of evidence and logistic regression methods for landslide susceptibility mapping: Sultan Mountains, SW Turkey

    Science.gov (United States)

    Ozdemir, Adnan; Altural, Tolga

    2013-03-01

    This study evaluated and compared landslide susceptibility maps produced with three different methods (frequency ratio, weights of evidence, and logistic regression) by using validation datasets. The field surveys performed as part of this investigation mapped the locations of 90 landslides that had been identified in the Sultan Mountains of south-western Turkey. The landslide influence parameters used for this study are geology, relative permeability, land use/land cover, precipitation, elevation, slope, aspect, total curvature, plan curvature, profile curvature, wetness index, stream power index, sediment transportation capacity index, distance to drainage, distance to fault, drainage density, fault density, and spring density maps. The relationships between landslide distributions and these parameters were analysed using the three methods, and the results of these methods were then used to calculate the landslide susceptibility of the entire study area. The accuracy of the final landslide susceptibility maps was evaluated based on the landslides observed during the fieldwork, and the accuracy of the models was evaluated by calculating each model's relative operating characteristic curve. The predictive capability of each model was determined from the area under the relative operating characteristic curve. The areas under the curves obtained using the frequency ratio, logistic regression, and weights of evidence methods are 0.976, 0.952, and 0.937, respectively. These results indicate that the frequency ratio and weights of evidence models are relatively good estimators of landslide susceptibility in the study area. Specifically, the results of the correlation analysis show a high correlation between the frequency ratio and weights of evidence results, and the frequency ratio and logistic regression methods exhibit correlation coefficients of 0.771 and 0.727, respectively. The frequency ratio model is simple, and its input, calculation and output processes are

  20. Geographic information systems and logistic regression for high-resolution malaria risk mapping in a rural settlement of the southern Brazilian Amazon.

    Science.gov (United States)

    de Oliveira, Elaine Cristina; dos Santos, Emerson Soares; Zeilhofer, Peter; Souza-Santos, Reinaldo; Atanaka-Santos, Marina

    2013-11-15

    In Brazil, 99% of the cases of malaria are concentrated in the Amazon region, with a high level of transmission. The objectives of the study were to use geographic information systems (GIS) analysis and logistic regression as a tool to identify and analyse the relative likelihood and its socio-environmental determinants of malaria infection in the Vale do Amanhecer rural settlement, Brazil. A GIS database of georeferenced malaria cases, recorded in 2005, and multiple explanatory data layers was built, based on a multispectral Landsat 5 TM image, digital map of the settlement blocks and a SRTM digital elevation model. Satellite imagery was used to map the spatial patterns of land use and cover (LUC) and to derive spectral indices of vegetation density (NDVI) and soil/vegetation humidity (VSHI). A Euclidean distance operator was applied to measure proximity of domiciles to potential mosquito breeding habitats and gold mining areas. The malaria risk model was generated by multiple logistic regression, in which environmental factors were considered as independent variables and the number of cases, binarized by a threshold value, was the dependent variable. Out of a total of 336 cases of malaria, 133 positive slides were from inhabitants at Road 08, which corresponds to 37.60% of the notifications. The southern region of the settlement presented 276 cases and a greater number of domiciles in which more than ten cases/home were notified. From these, 102 (30.36%) cases were caused by Plasmodium falciparum and 174 (51.79%) cases by Plasmodium vivax. Malaria risk is the highest in the south of the settlement, associated with proximity to gold mining sites, intense land use, high levels of soil/vegetation humidity and low vegetation density. Mid-resolution, remote sensing data and GIS-derived distance measures can be successfully combined with digital maps of the housing location of (non-) infected inhabitants to predict relative likelihood of disease infection through the

  1. Linear and logistic regression analysis

    NARCIS (Netherlands)

    Tripepi, G.; Jager, K. J.; Dekker, F. W.; Zoccali, C.

    2008-01-01

    In previous articles of this series, we focused on relative risks and odds ratios as measures of effect to assess the relationship between exposure to risk factors and clinical outcomes and on control for confounding. In randomized clinical trials, the random allocation of patients is hoped to

  2. Comparison of Artificial Neural Network with Logistic Regression as Classification Models for Variable Selection for Prediction of Breast Cancer Patient Outcomes

    Directory of Open Access Journals (Sweden)

    Valérie Bourdès

    2010-01-01

    The aim of this study was to compare multilayer perceptron neural networks (NNs) with standard logistic regression (LR) to identify key covariates impacting on mortality from cancer causes, disease-free survival (DFS), and disease recurrence using Area Under Receiver-Operating Characteristics (AUROC) in breast cancer patients. From 1996 to 2004, 2,535 patients diagnosed with primary breast cancer entered into the study at a single French centre, where they received standard treatment. For specific mortality as well as DFS analysis, the ROC curves were greater with the NN models compared to LR model with better sensitivity and specificity. Four predictive factors were retained by both approaches for mortality: clinical size stage, Scarff Bloom Richardson grade, number of invaded nodes, and progesterone receptor. The results enhanced the relevance of the use of NN models in predictive analysis in oncology, which appeared to be more accurate in prediction in this French breast cancer cohort.

  3. Measurement equivalence of the KINDL questionnaire across child self-reports and parent proxy-reports: a comparison between item response theory and ordinal logistic regression.

    Science.gov (United States)

    Jafari, Peyman; Sharafi, Zahra; Bagheri, Zahra; Shalileh, Sara

    2014-06-01

    Measurement equivalence is a necessary assumption for meaningful comparison of pediatric quality of life rated by children and parents. In this study, differential item functioning (DIF) analysis is used to examine whether children and their parents respond consistently to the items in the KINDer Lebensqualitätsfragebogen (KINDL; in German, Children Quality of Life Questionnaire). Two DIF detection methods, graded response model (GRM) and ordinal logistic regression (OLR), were applied for comparability. The KINDL was completed by 1,086 school children and 1,061 of their parents. While the GRM revealed that 12 out of the 24 items were flagged with DIF, the OLR identified 14 out of the 24 items with DIF. Seven items with DIF and five items without DIF were common across the two methods, yielding a total agreement rate of 50 %. This study revealed that parent proxy-reports cannot be used as a substitute for a child's ratings in the KINDL.
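
    The 50 % agreement rate follows from counting the items on which the two methods give the same verdict: 7 flagged by both plus 5 flagged by neither, out of 24. The sketch below reproduces that arithmetic; the item numbers are placeholders chosen only to give the reported counts (12 flagged by GRM, 14 by OLR, 7 in common), not the actual KINDL items.

```python
def agreement_rate(flagged_a, flagged_b, n_items):
    """Share of items on which two DIF methods agree (both flag or both do not)."""
    both = len(flagged_a & flagged_b)
    neither = n_items - len(flagged_a | flagged_b)
    return (both + neither) / n_items


# Placeholder item numbers matching the reported counts, not the real items.
grm_flagged = set(range(1, 13))   # 12 items flagged by the graded response model
olr_flagged = set(range(6, 20))   # 14 items flagged by ordinal logistic regression

print(len(grm_flagged & olr_flagged))                # items flagged by both methods
print(24 - len(grm_flagged | olr_flagged))           # items flagged by neither method
print(agreement_rate(grm_flagged, olr_flagged, 24))  # total agreement rate
```

    The counts are internally consistent: 12 + 14 - 7 = 19 items flagged by at least one method, which leaves the 5 items flagged by neither.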

  4. Models of logistic regression analysis, support vector machine, and back-propagation neural network based on serum tumor markers in colorectal cancer diagnosis.

    Science.gov (United States)

    Zhang, B; Liang, X L; Gao, H Y; Ye, L S; Wang, Y G

    2016-05-13

    We evaluated the application of three machine learning algorithms, including logistic regression, support vector machine and back-propagation neural network, for diagnosing congenital heart disease and colorectal cancer. By inspecting related serum tumor marker levels in colorectal cancer patients and healthy subjects, early diagnosis models for colorectal cancer were built using three machine learning algorithms to assess their corresponding diagnostic values. Except for serum alpha-fetoprotein, the levels of 11 other serum markers of patients in the colorectal cancer group were higher than those in the benign colorectal cancer group (P […] model and back-propagation, a neural network diagnosis model was built with diagnostic accuracies of 82 and 75%, sensitivities of 85 and 80%, and specificities of 80 and 70%, respectively. Colorectal cancer diagnosis models based on the three machine learning algorithms showed high diagnostic value and can help obtain evidence for the early diagnosis of colorectal cancer.

  5. Spatio-temporal analyses of cropland degradation in the irrigated lowlands of Uzbekistan using remote-sensing and logistic regression modeling.

    Science.gov (United States)

    Dubovyk, Olena; Menz, Gunter; Conrad, Christopher; Kan, Elena; Machwitz, Miriam; Khamzina, Asia

    2013-06-01

    Advancing land degradation in the irrigated areas of Central Asia hinders sustainable development of this predominantly agricultural region. To support decisions on mitigating cropland degradation, this study combines linear trend analysis and spatial logistic regression modeling to expose a land degradation trend in the Khorezm region, Uzbekistan, and to analyze the causes. Time series of the 250-m MODIS NDVI, summed over the growing seasons of 2000-2010, were used to derive areas with an apparent negative vegetation trend; this was interpreted as an indicator of land degradation. About one third (161,000 ha) of the region's area experienced negative trends of different magnitude. The vegetation decline was particularly evident on the low-fertility lands bordering on the natural sandy desert, suggesting that these areas should be prioritized in mitigation planning. The results of logistic modeling indicate that the spatial pattern of the observed trend is mainly associated with the level of the groundwater table (odds = 330 %), land-use intensity (odds = 103 %), low soil quality (odds = 49 %), slope (odds = 29 %), and salinity of the groundwater (odds = 26 %). Areas, threatened by land degradation, were mapped by fitting the estimated model parameters to available data. The elaborated approach, combining remote-sensing and GIS, can form the basis for developing a common tool for monitoring land degradation trends in irrigated croplands of Central Asia.

  6. Using occupancy modeling and logistic regression to assess the distribution of shrimp species in lowland streams, Costa Rica: Does regional groundwater create favorable habitat?

    Science.gov (United States)

    Snyder, Marcia; Freeman, Mary C.; Purucker, S. Thomas; Pringle, Catherine M.

    2016-01-01

    Freshwater shrimps are an important biotic component of tropical ecosystems. However, they can have a low probability of detection when abundances are low. We sampled 3 of the most common freshwater shrimp species, Macrobrachium olfersii, Macrobrachium carcinus, and Macrobrachium heterochirus, and used occupancy modeling and logistic regression models to improve our limited knowledge of distribution of these cryptic species by investigating both local- and landscape-scale effects at La Selva Biological Station in Costa Rica. Local-scale factors included substrate type and stream size, and landscape-scale factors included presence or absence of regional groundwater inputs. Capture rates for 2 of the sampled species (M. olfersii and M. carcinus) were sufficient to compare the fit of occupancy models. Occupancy models did not converge for M. heterochirus, but M. heterochirus had high enough occupancy rates that logistic regression could be used to model the relationship between occupancy rates and predictors. The best-supported models for M. olfersii and M. carcinus included conductivity, discharge, and substrate parameters. Stream size was positively correlated with occupancy rates of all 3 species. High stream conductivity, which reflects the quantity of regional groundwater input into the stream, was positively correlated with M. olfersii occupancy rates. Boulder substrates increased occupancy rate of M. carcinus and decreased the detection probability of M. olfersii. Our models suggest that shrimp distribution is driven by factors that function at local (substrate and discharge) and landscape (conductivity) scales.

  7. Confirming the validity of the CONUT system for early detection and monitoring of clinical undernutrition: comparison with two logistic regression models developed using SGA as the gold standard.

    Science.gov (United States)

    González-Madroño, A; Mancha, A; Rodríguez, F J; Culebras, J; de Ulibarri, J I

    2012-01-01

    To ratify previous validations of the CONUT nutritional screening tool by developing two probabilistic models using the parameters included in the CONUT, to see whether the CONUT's effectiveness could be improved. This is a two-step prospective study. In Step 1, 101 patients were randomly selected, and SGA and CONUT assessments were performed. With the data obtained, an unconditional logistic regression model was developed, and two variants of CONUT were constructed: Model 1 was built by logistic regression, and Model 2 was built by dividing the probabilities of undernutrition obtained in Model 1 into seven regular intervals. In Step 2, 60 patients were selected and underwent the SGA, the original CONUT and the new models. The diagnostic efficacy of the original CONUT and the new models was tested by means of ROC curves. Samples 1 and 2 were pooled to measure the degree of agreement between the original CONUT and SGA, and diagnostic efficacy parameters were calculated. No statistically significant differences were found between samples 1 and 2 regarding age, sex and medical/surgical distribution, and undernutrition rates were similar (over 40%). The AUCs for the ROC curves were 0.862 for the original CONUT, and 0.839 and 0.874 for Models 1 and 2, respectively. The kappa index for the CONUT and SGA was 0.680. The CONUT, with the original scores assigned by the authors, is as good as the mathematical models and thus is a valuable tool, highly useful and efficient for clinical undernutrition screening.

  8. Regresión logística: Un ejemplo de su uso en Endocrinología [Logistic regression: An example of its use in Endocrinology]

    Directory of Open Access Journals (Sweden)

    Emma Domínguez Alonso

    2001-04-01

    An approach to logistic regression, as one of the most frequently used multivariate statistical techniques of recent decades, was made to guide its correct use. Practical questions such as the number of subjects necessary for its application, the situations in which it should be used, the type of variables to which it may be applied, the ways they may be included in the model, the interpretation of the results, etc., were taken into consideration. An example of the application of this technique in a study in the field of Endocrinology was given. It was concluded that logistic regression is very useful in any field of medical research when we need to determine the effect of a group of variables, considered potentially influential, on the occurrence of a certain process.

  9. Alcohol Involvement in Homicide Victimization in the U.S

    Science.gov (United States)

    Naimi, Timothy S.; Xuan, Ziming; Cooper, Susanna E.; Coleman, Sharon M.; Hadland, Scott E.; Swahn, Monica H.; Heeren, Timothy C.

    2016-01-01

    Background: Although the association between alcohol and homicide is well documented, there has been no recent study of alcohol involvement in homicide victimization in U.S. states. The objective of this paper was to determine the prevalence of alcohol involvement in homicide victimization and identify socio-demographic and other factors associated with alcohol involvement in homicide victimization. Methods: Data from homicide victims with a reported blood alcohol content (BAC) level were analyzed from 17 states from 2010–12 using the National Violent Death Reporting System. Logistic regression was used to investigate factors associated with the odds of homicide victims having a BAC ≥0.08%. Results: Among all homicide victims, 39.9% had a positive BAC including 13.7% with a BAC between 0.01%–0.79% and 26.2% of victims with a BAC ≥0.08%. Males were twice as likely as females to have a BAC ≥0.08% (29.1% vs. 15.2%; p homicide victims having a BAC ≥0.08 included male sex, American Indian/Alaska Native race, Hispanic ethnicity, history of intimate partner violence, and non-firearm homicides. Conclusions: Alcohol is present in a substantial proportion of homicide victims in the U.S., with substantial variation by state, demographic and circumstantial characteristics. Future studies should explore the relationships between state-level alcohol policies and alcohol-involvement among perpetrators and victims of homicide. PMID:27676334

  10. The impact of meteorology on the occurrence of waterborne outbreaks of vero cytotoxin-producing Escherichia coli (VTEC): a logistic regression approach.

    Science.gov (United States)

    O'Dwyer, Jean; Morris Downes, Margaret; Adley, Catherine C

    2016-02-01

    This study analyses the relationship between meteorological phenomena and outbreaks of waterborne-transmitted vero cytotoxin-producing Escherichia coli (VTEC) in the Republic of Ireland over an 8-year period (2005-2012). Data pertaining to the notification of waterborne VTEC outbreaks were extracted from the Computerised Infectious Disease Reporting system, which is administered through the national Health Protection Surveillance Centre as part of the Health Service Executive. Rainfall and temperature data were obtained from the national meteorological office and categorised as cumulative rainfall, heavy rainfall events in the previous 7 days, and mean temperature. The data were analysed using logistic regression (LR). The LR model was significant (p < 0.001), with all independent variables (cumulative rainfall, heavy rainfall and mean temperature) making a statistically significant contribution to the model. The study found that rainfall, particularly heavy rainfall in the 7 days preceding an outbreak, is a strong statistical indicator of a waterborne outbreak and that temperature also affects waterborne VTEC outbreak occurrence.

  11. Gene expression profiling of breast cancer survivability by pooled cDNA microarray analysis using logistic regression, artificial neural networks and decision trees.

    Science.gov (United States)

    Chou, Hsiu-Ling; Yao, Chung-Tay; Su, Sui-Lun; Lee, Chia-Yi; Hu, Kuang-Yu; Terng, Harn-Jing; Shih, Yun-Wen; Chang, Yu-Tien; Lu, Yu-Fen; Chang, Chi-Wen; Wahlqvist, Mark L; Wetter, Thomas; Chu, Chi-Ming

    2013-03-19

    Microarray technology can acquire information about thousands of genes simultaneously. We analyzed published breast cancer microarray databases to predict five-year recurrence and compared the performance of three data mining algorithms, artificial neural networks (ANN), decision trees (DT) and logistic regression (LR), and two composite models, DT-ANN and DT-LR. From the collection of microarray datasets in the Gene Expression Omnibus, four breast cancer datasets were pooled for predicting five-year breast cancer relapse. After data compilation, 757 subjects, 5 clinical variables and 13,452 genetic variables were aggregated. The bootstrap method, Mann-Whitney U test and 20-fold cross-validation were performed to identify candidate genes with the 100 most significant p-values. The predictive powers of the DT, LR and ANN models were assessed using accuracy and the area under the ROC curve. The associated genes were evaluated using Cox regression. The DT models exhibited the lowest predictive power and the poorest extrapolation when applied to the test samples. The ANN models displayed the best predictive power and showed the best extrapolation. The 21 most-associated genes, as determined by integration of each model, were analyzed using Cox regression and showed a 3.53-fold (95% CI: 2.24-5.58) increased risk of breast cancer five-year recurrence. The 21 selected genes can predict breast cancer recurrence. Among these genes, CCNB1, PLK1 and TOP2A are in the cell cycle G2/M DNA damage checkpoint pathway. With an understanding of gene expression profiles in breast cancer recurrence, oncologists can offer this genetic information to patients.

  12. A Multi-way Multi-task Learning Approach for Multinomial Logistic Regression*. An Application in Joint Prediction of Appointment Miss-opportunities across Multiple Clinics.

    Science.gov (United States)

    Alaeddini, Adel; Hong, Seung Hee

    2017-08-11

    Whether they have been engineered for it or not, most healthcare systems experience a variety of unexpected events, such as appointment miss-opportunities, that can have a significant impact on their revenue, cost and resource utilization. In this paper, a multi-way multi-task learning model based on multinomial logistic regression is proposed to jointly predict the occurrence of different types of miss-opportunities at multiple clinics. An extension of L1/L2 regularization is proposed to enable transfer of information among various types of miss-opportunities as well as different clinics. A proximal algorithm is developed to transform the convex but non-smooth likelihood function of the multi-way multi-task learning model into a convex and smooth optimization problem solvable using a gradient descent algorithm. A dataset of real attendance records of patients at four different clinics of a VA medical center is used to verify the performance of the proposed multi-task learning approach. Additionally, a simulation study investigating more general data situations is provided to highlight specific aspects of the proposed approach. Various individual and integrated multinomial logistic regression models with and without a LASSO penalty, along with a number of other common classification algorithms, are fitted and compared against the proposed multi-way multi-task learning approach. Fivefold cross-validation is used to estimate the compared models' parameters and their predictive accuracy. The multi-way multi-task learning framework enables the proposed approach to achieve a considerable rate of parameter shrinkage and superior prediction accuracy across various types of miss-opportunities and clinics. The proposed approach provides an integrated structure to effectively transfer knowledge among different miss-opportunities and clinics to reduce model size, increase estimation efficacy, and, more importantly, improve prediction results. The proposed framework can be
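
    The abstract mentions L1/L2 regularization handled by a proximal algorithm. For orientation only, here is a minimal sketch of the standard proximal operator of a group L2 penalty (block soft-thresholding), the usual building block for L1/L2 penalties. It is not the paper's extension; the function name and the `threshold` parameter (step size times penalty weight) are assumptions for illustration.

```python
import math


def prox_group_l2(v, threshold):
    """Block soft-thresholding: prox of threshold * ||v||_2 at v.

    Shrinks the whole coefficient group toward zero and sets it exactly
    to zero when its Euclidean norm does not exceed the threshold.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= threshold:
        return [0.0] * len(v)
    scale = 1.0 - threshold / norm
    return [scale * x for x in v]


if __name__ == "__main__":
    print(prox_group_l2([3.0, 4.0], 1.0))   # group is shrunk but kept
    print(prox_group_l2([0.3, 0.4], 1.0))   # group is zeroed out
```

    Zeroing whole groups at once is what produces the parameter shrinkage across tasks that the abstract reports.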

  13. Predictors of success of external cephalic version and cephalic presentation at birth among 1253 women with non-cephalic presentation using logistic regression and classification tree analyses.

    Science.gov (United States)

    Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana

    2017-08-01

    Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.

  14. Using hair and fingernails in binary logistic regression for bio-monitoring of heavy metals/metalloid in groundwater in intensively agricultural areas, Thailand.

    Science.gov (United States)

    Wongsasuluk, Pokkate; Chotpantarat, Srilert; Siriwong, Wattasit; Robson, Mark

    2018-04-01

    In this study, the hair and fingernails of the local people in an intensively cultivated agricultural area in Ubon Ratchathani province, Thailand, were used as biomarkers of exposure to arsenic (As) and heavy metals. The study area has shallow acidic groundwater that is contaminated with As and heavy metals. The local people often consume this shallow groundwater; thus, they are exposed to As and heavy metals. Hair and fingernail samples were collected to characterize the differences between shallow groundwater drinking (SGWD) and tap water drinking (TWD) residents. The concentrations of As and the heavy metals Cd, Pb and Hg were significantly higher in the hair samples from the SGWD group than those from the TWD group, especially for As (0.020-0.571 vs. 0.024-0.359 µg/g) and Cd (0.009-0.575 vs. 0.013-0.230 µg/g). Similarly, the concentrations of As and the heavy metals in the fingernail samples collected from the SGWD group were larger than those of the TWD group, especially for As (0.039-2.440 µg/g vs. 0.049-0.806 µg/g). The χ² statistic and binary logistic regression were used to find the associated factors and assess the associated probabilities. The regression results show that the factors associated with the concentrations of As and the heavy metals in the hair samples were drinking water source, rate of water consumption, gender, bathing water source, education, smoking and underlying disease, whereas the factors associated with the concentrations of these species in the fingernail samples were drinking water source, gender, occupation, work hours per day, alcohol consumption, and the use of pesticides. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Association of perceived stress with stressful life events, lifestyle and sociodemographic factors: a large-scale community-based study using logistic quantile regression.

    Science.gov (United States)

    Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza

    2012-01-01

    The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used to model perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable, as a function of the most important stressful life events as predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors' effects depending on an individual's location in the distribution of perceived stress. The results showed that among four stressful life events, family conflicts and social problems were more correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients decreased monotonically beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. High perceived stress was also associated with low or middle levels of income. The results of the current research suggested that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and a healthy lifestyle may have potential benefits for large parts of the population, most notably females and less educated people.
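
    Logistic quantile regression for a bounded outcome is commonly formulated by applying a logit-type transform to the outcome, fitting an ordinary quantile regression on the transformed scale, and back-transforming the fitted quantiles, which is valid because quantiles are preserved under monotone transformations. The sketch below shows only the transform pair; the bounds 0 and 12 and the small offset `eps` are hypothetical choices for illustration, not values taken from the study.

```python
import math


def logit_transform(y, y_min, y_max, eps=1e-3):
    """Map a bounded outcome y in [y_min, y_max] to the real line.

    eps widens the interval slightly so that outcomes sitting exactly
    on a bound still give a finite value (an illustrative assumption).
    """
    if not y_min <= y <= y_max:
        raise ValueError("y must lie within [y_min, y_max]")
    lo, hi = y_min - eps, y_max + eps
    return math.log((y - lo) / (hi - y))


def inverse_logit_transform(z, y_min, y_max, eps=1e-3):
    """Map a fitted quantile on the transformed scale back to the outcome scale."""
    lo, hi = y_min - eps, y_max + eps
    ez = math.exp(z)
    return (lo + hi * ez) / (1.0 + ez)


if __name__ == "__main__":
    z = logit_transform(9.0, 0.0, 12.0)
    print(z, inverse_logit_transform(z, 0.0, 12.0))
```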

  16. Development and validation of a logistic regression model to distinguish transition zone cancers from benign prostatic hyperplasia on multi-parametric prostate MRI

    Energy Technology Data Exchange (ETDEWEB)

    Iyama, Yuji [Kumamoto Chuo Hospital, Department of Diagnostic Radiology, Kumamoto, Kumamoto (Japan); Kumamoto University, Department of Diagnostic Radiology, Graduate School of Medical Sciences, Kumamoto, Kumamoto (Japan); Nakaura, Takeshi; Nagayama, Yasunori; Utsunomiya, Daisuke; Yamashita, Yasuyuki [Kumamoto University, Department of Diagnostic Radiology, Graduate School of Medical Sciences, Kumamoto, Kumamoto (Japan); Katahira, Kazuhiro; Oda, Seitaro [Kumamoto Chuo Hospital, Department of Diagnostic Radiology, Kumamoto, Kumamoto (Japan); Iyama, Ayumi [National Hospital Organization Kumamoto Medical Center, Department of Diagnostic Radiology, Kumamoto, Kumamoto (Japan)

    2017-09-15

    To develop a prediction model to distinguish between transition zone (TZ) cancers and benign prostatic hyperplasia (BPH) on multi-parametric prostate magnetic resonance imaging (mp-MRI). This retrospective study enrolled 60 patients with either BPH or TZ cancer who had undergone 3-T MRI. We generated ten parameters for T2-weighted images (T2WI), diffusion-weighted images (DWI) and dynamic MRI. Using a t-test and multivariate logistic regression (LR) analysis to evaluate the parameters' accuracy, we developed LR models. We calculated the area under the receiver operating characteristic curve (AUC) of the LR models by a leave-one-out cross-validation procedure, and the LR model's performance was compared with that of radiologists, based both on their own opinion and on the Prostate Imaging Reporting and Data System (Pi-RADS v2) score. Multivariate LR analysis showed that only standardized T2WI signal and mean apparent diffusion coefficient (ADC) maintained their independent values (P < 0.001). The validation analysis showed that the AUC of the final LR model was comparable to that of board-certified radiologists, and superior to that of Pi-RADS scores. Standardized T2WI signal and mean ADC were independent factors for distinguishing between BPH and TZ cancer. The performance of the LR model was comparable to that of experienced radiologists. (orig.)
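
    Several records in this listing report the area under the ROC curve (AUC). As a minimal sketch, the AUC can be computed as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half. In a leave-one-out setting the scores would be each patient's held-out prediction; the function name and the toy data below are illustrative assumptions, not study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 class labels; scores: predicted scores.
    Quadratic in sample size, which is fine for small clinical samples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a concordant pair
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example (hypothetical scores).
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```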

  17. Segmentation and profiling consumers in a multi-channel environment using a combination of self-organizing maps (SOM) method and logistic regression

    Directory of Open Access Journals (Sweden)

    Seyed Ali Akbar Afjeh

    2014-05-01

    Full Text Available Market segmentation plays an essential role in understanding people's interest in purchasing various products and services through various channels. This paper presents an empirical investigation to shed light on consumers' purchasing attitudes, as well as their information gathering, in a multi-channel environment. The study designed a questionnaire and distributed it among 800 people who were at least 18 years of age and had some experience of purchasing goods and services on the internet, from catalogs or in regular shopping centers. A self-organizing map (SOM) clustering technique was applied based on consumers' interest in gathering information and purchasing products through the internet, catalogs and shopping centers, and it identified four segments. The study used two types of questions. The first group covered participants' personal characteristics such as age, gender, income, etc. The second group was associated with participants' psychographic characteristics, including price consciousness, quality consciousness, time pressure, etc. Using a multinomial logistic regression technique, the study determines consumers' behaviors in each of the four segments.

  18. Shallow landslide susceptibility model for the Oria river basin, Gipuzkoa province (North of Spain). Application of the logistic regression and comparison with previous studies.

    Science.gov (United States)

    Bornaetxea, Txomin; Antigüedad, Iñaki; Ormaetxea, Orbange

    2016-04-01

    In the Oria river basin (885 km²) shallow landslides are very frequent; they produce roadblocks and damage to infrastructure and property, causing large economic losses every year. Considering that zoning the territory into different landslide susceptibility levels provides a useful tool for territorial planning and natural risk management, this study aims to identify the most landslide-prone places by applying an objective and reproducible methodology. To do so, a quantitative multivariate methodology, logistic regression, has been used. Landslide points from fieldwork and randomly selected stable points have been used along with the independent variables Lithology, Land Use, Distance to the transport infrastructure, Altitude, Senoidal Slope and Normalized Difference Vegetation Index (NDVI) to produce a landslide susceptibility map. The model has been validated by the prediction and success rate curves and their corresponding area under the curve (AUC). In addition, the result has been compared with two landslide susceptibility models previously applied to the study area at different scales, ELSUS1000 version 1 (2013) and the Landslide Susceptibility Map of Gipuzkoa (2007). Validation results show an excellent prediction capacity of the proposed model (AUC 0.962), and comparisons highlight big differences from previous studies.

  19. Supply and demand analysis for flood insurance by using logistic regression model: case study at Citarum watershed in South Bandung, West Java, Indonesia

    Science.gov (United States)

    Sidi, P.; Mamat, M.; Sukono; Supian, S.

    2017-01-01

    Floods have always occurred in the Citarum river basin. Their adverse effects can extend to all of the residents' property, including the destruction of houses. The impact of damage to residential buildings is usually not small. After each flood, the government and several social organizations do provide funds to repair buildings, but these donations are very limited and cannot cover the entire cost of the necessary repairs. The availability of insurance products for property damage caused by floods is therefore considered very important. However, is it also considered necessary by the public or not? In this paper, the factors that affect the supply and demand of insurance products for buildings damaged by floods are analyzed. The method used in this analysis is ordinal logistic regression. Based on the analysis, the factors that affect the supply and demand of insurance products for buildings damaged by floods include age, economic circumstances, family situation, insurance motivation, and lifestyle. The simultaneous effect of these factors on the supply and demand of insurance products for buildings damaged by floods amounted to 65.7%.

  20. Examination By Multinomial Logistic Regression Model Of The Factors Affecting The Types Of Domestic Violence Against Women A Case Of Turkey

    Directory of Open Access Journals (Sweden)

    Erkan Ari

    2015-08-01

    Full Text Available In this paper, factors affecting the types of domestic violence against women were determined by a multinomial logistic regression model. In this context, we used the data of the Research on Domestic Violence against Women in Turkey, which was conducted by the Turkish Statistical Institute in 2008. In the study, the type of domestic violence against women was used as the dependent variable, which has four levels. In addition, twelve independent variables were used after removing irrelevant variables from the data set via the chi-square test of independence. After that, the maximum likelihood estimates and the odds ratios of the variables of the model were obtained. The validity of the model was also tested by the likelihood ratio test. Finally, comparisons were made for three categories based on the odds ratios relative to the selected reference category. In terms of odds ratios, the variables education level of the woman and husband's work sector were statistically significant only in comparison one; the variables agnation with husband, education level of husband, frequency of seeing the husband drunk and frequency of the husband's gambling were statistically significant in both comparisons one and three; and the variables region, deceived by husband and common-law female for husband were statistically significant in all comparisons.

  1. A binary logistic regression model with complex sampling design of unmet need for family planning among all women aged (15-49) in Ethiopia.

    Science.gov (United States)

    Workie, Demeke Lakew; Zike, Dereje Tesfaye; Fenta, Haile Mekonnen; Mekonnen, Mulusew Admasu

    2017-09-01

    Unintended pregnancy related to unmet need is a worldwide problem that affects societies. The main objective of this study was to identify the prevalence and determinants of unmet need for family planning among women aged 15-49 in Ethiopia. Round 4 of the Performance Monitoring and Accountability 2020/Ethiopia survey was conducted in April 2016 among 7494 women using two-stage stratified sampling. Bi-variable and multi-variable binary logistic regression models with a complex sampling design were fitted. The prevalence of unmet need for family planning was 16.2% in Ethiopia. Women aged 15-24 years were 2.266 times more likely to have an unmet need for family planning than those above 35 years. Women who were currently married were about 8 times more likely to have an unmet need for family planning than never-married women. Women who had no under-five child were 0.125 times as likely to have an unmet need for family planning as those who had more than two under-five children. The key determinants of unmet need for family planning in Ethiopia were residence, age, marital status, education, household members, birth events and number of under-five children. Thus the Government of Ethiopia should take immediate steps to address the causes of the high unmet need for family planning among women.
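
    Figures such as "2.266 times" and "0.125 times" in logistic regression abstracts are usually odds ratios, that is, the exponential of a fitted coefficient, and they act on odds rather than on probabilities. The sketch below makes that distinction concrete; the baseline probability of 0.10 is a hypothetical value chosen for illustration and does not come from the study.

```python
import math


def odds_ratio(beta):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(beta)


def apply_odds_ratio(baseline_prob, ratio):
    """Probability obtained by multiplying the baseline odds by an odds ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    odds = baseline_prob / (1.0 - baseline_prob) * ratio
    return odds / (1.0 + odds)


if __name__ == "__main__":
    beta = math.log(2.266)             # coefficient matching the reported ratio
    print(odds_ratio(beta))
    # Hypothetical 10% baseline: the probability rises to roughly 20%,
    # not to 2.266 * 10% = 22.66%.
    print(apply_odds_ratio(0.10, 2.266))
```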

  2. Landslide-susceptibility analysis using light detection and ranging-derived digital elevation models and logistic regression models: a case study in Mizunami City, Japan

    Science.gov (United States)

    Wang, Liang-Jie; Sawada, Kazuhide; Moriguchi, Shuji

    2013-01-01

    To mitigate the damage caused by landslide disasters, different mathematical models have been applied to predict landslide spatial distribution characteristics. Although some researchers have achieved excellent results around the world, few studies take the spatial resolution of the database into account. Four types of digital elevation model (DEM) ranging from 2 to 20 m derived from light detection and ranging technology to analyze landslide susceptibility in Mizunami City, Gifu Prefecture, Japan, are presented. Fifteen landslide-causative factors are considered using a logistic-regression approach to create models for landslide potential analysis. Pre-existing landslide bodies are used to evaluate the performance of the four models. The results revealed that the 20-m model had the highest classification accuracy (71.9%), whereas the 2-m model had the lowest value (68.7%). In the 2-m model, 89.4% of the landslide bodies fit in the medium to very high categories. For the 20-m model, only 83.3% of the landslide bodies were concentrated in the medium to very high classes. When the cell size decreases from 20 to 2 m, the area under the relative operative characteristic increases from 0.68 to 0.77. Therefore, higher-resolution DEMs would provide better results for landslide-susceptibility mapping.

  3. Association of Perceived Stress with Stressful Life Events, Lifestyle and Sociodemographic Factors: A Large-Scale Community-Based Study Using Logistic Quantile Regression

    Science.gov (United States)

    Feizi, Awat; Aliyari, Roqayeh; Roohafza, Hamidreza

    2012-01-01

    Objective. The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used to model perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable, as a function of the most important stressful life events as predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors' effects depending on an individual's location in the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients decreased monotonically beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. High perceived stress was also associated with low or middle levels of income. Conclusions. The results of the current research suggested that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and a healthy lifestyle may have potential benefits for large parts of the population, most notably females and less educated people. PMID:23091560

  4. Morphological MRI criteria improve the detection of lymph node metastases in head and neck squamous cell carcinoma: multivariate logistic regression analysis of MRI features of cervical lymph nodes

    Energy Technology Data Exchange (ETDEWEB)

    Bondt, R.B.J. de; Bakers, F.; Hofman, P.A.M. [Maastricht University Medical Center, Department of Radiology, Maastricht (Netherlands); Nelemans, P.J. [Maastricht University Medical Center, Department of Epidemiology, Maastricht (Netherlands); Casselman, J.W. [AZ St. Jan Hospital, Department of Radiology, Bruges (Belgium); Peutz-Kootstra, C. [Maastricht University Medical Center, Department of Pathology, Maastricht (Netherlands); Kremer, B. [Maastricht University Medical Center, Department of Otolaryngology/ Head and Neck Surgery, Maastricht (Netherlands); Beets-Tan, R.G.H. [Academic Hospital Maastricht, Department of Radiology, Maastricht (Netherlands)

    2009-03-15

    The aim was to evaluate whether morphological criteria in addition to the size criterion result in better diagnostic performance of MRI for the detection of cervical lymph node metastases in patients with head and neck squamous cell carcinoma (HNSCC). Two radiologists evaluated 44 consecutive patients in whom lymph node characteristics were assessed with histopathological correlation as the gold standard. Assessed criteria were the short axial diameter and morphological criteria such as border irregularity and homogeneity of signal intensity on T2-weighted and contrast-enhanced T1-weighted images. Multivariate logistic regression analysis was performed: diagnostic odds ratios (DOR) with 95% confidence intervals (95% CI) and areas under the curve (AUCs) of receiver-operating characteristic (ROC) curves were determined. Border irregularity and heterogeneity of signal intensity on T2-weighted images showed significantly increased DORs. AUCs increased from 0.67 (95% CI: 0.61-0.73) using size only to 0.81 (95% CI: 0.75-0.87) using all four criteria for observer 1 and from 0.68 (95% CI: 0.62-0.74) to 0.96 (95% CI: 0.94-0.98) for observer 2 (p < 0.001). This study demonstrated that the morphological criteria border irregularity and heterogeneity of signal intensity on T2-weighted images, in addition to size, significantly improved the detection of cervical lymph node metastases. (orig.)

  5. Forest cover dynamics analysis and prediction modelling using logistic regression model (case study: forest cover at Indragiri Hulu Regency, Riau Province)

    Science.gov (United States)

    Nahib, Irmadi; Suryanta, Jaka

    2017-01-01

    Forest destruction, climate change and global warming could reduce an indirect benefit of forests, because forest is the largest carbon sink and plays a very important role in the global carbon cycle. To support the Reducing Emissions from Deforestation and Forest Degradation (REDD+) program, attention is paid to forest cover changes as the basis for calculating carbon stock changes. This study explores forest cover dynamics and a prediction model of forest cover in Indragiri Hulu Regency, Riau Province, Indonesia. The study aims to analyse various explanatory variables associated with forest conversion processes and to predict forest cover change using a logistic regression model (LRM). The main data used in this study are land use/cover maps (1990 - 2011). The performance of the developed model was assessed by comparing the predicted forest cover change with the actual forest cover in 2011. The analysis showed that forest cover decreased continuously between 1990 and 2011, with a total loss of 165,284.82 ha (35.19 %) of forest area. The LRM successfully predicted the forest cover for the period 2010 with reasonably high accuracy (ROC = 92.97 % and 70.26 %).

  6. Association of Perceived Stress with Stressful Life Events, Lifestyle and Sociodemographic Factors: A Large-Scale Community-Based Study Using Logistic Quantile Regression

    Directory of Open Access Journals (Sweden)

    Awat Feizi

    2012-01-01

    Full Text Available Objective. The present paper aimed at investigating the association between perceived stress and major stressful life events in the Iranian general population. Methods. In a cross-sectional large-scale community-based study, 4583 people aged 19 and older, living in Isfahan, Iran, were investigated. Logistic quantile regression was used to model perceived stress, measured by the GHQ questionnaire, as the bounded outcome (dependent) variable, as a function of the most important stressful life events as predictor variables, controlling for major lifestyle and sociodemographic factors. This model provides empirical evidence of the heterogeneity of the predictors' effects depending on an individual's location in the distribution of perceived stress. Results. The results showed that among four stressful life events, family conflicts and social problems were more correlated with the level of perceived stress. Higher levels of education were negatively associated with perceived stress, and its coefficients decreased monotonically beyond the 30th percentile. Also, higher levels of physical activity were associated with perception of low levels of stress. The pattern of gender's coefficient over the majority of quantiles implied that females are more affected by stressors. High perceived stress was also associated with low or middle levels of income. Conclusions. The results of the current research suggested that in a developing society with a high prevalence of stress, interventions targeted toward promoting financial and social equalities, social skills training, and a healthy lifestyle may have potential benefits for large parts of the population, most notably females and less educated people.

  7. Predictors of Latent Trajectory Classes of Dating Violence Victimization

    Science.gov (United States)

    Brooks-Russell, Ashley; Foshee, Vangie; Ennett, Susan

    2014-01-01

    This study identified classes of developmental trajectories of physical dating violence victimization from grades 8 to 12 and examined theoretically-based risk factors that distinguished among trajectory classes. Data were from a multi-wave longitudinal study spanning 8th through 12th grade (n = 2,566; 51.9% female). Growth mixture models were used to identify trajectory classes of physical dating violence victimization separately for girls and boys. Logistic and multinomial logistic regressions were used to identify situational and target vulnerability factors associated with the trajectory classes. For girls, three trajectory classes were identified: a low/non-involved class; a moderate class where victimization increased slightly until the 10th grade and then decreased through the 12th grade; and a high class where victimization started at a higher level in the 8th grade, increased substantially until the 10th grade, and then decreased until the 12th grade. For males, two classes were identified: a low/non-involved class, and a victimized class where victimization increased slightly until the 9th grade, decreased until the 11th grade, and then increased again through the 12th grade. In bivariate analyses, almost all of the situational and target vulnerability risk factors distinguished the victimization classes from the non-involved classes. However, when all risk factors and control variables were in the model, alcohol use (a situational vulnerability) was the only factor that distinguished membership in the moderate trajectory class from the non-involved class for girls; anxiety and being victimized by peers (target vulnerability factors) were the factors that distinguished the high from the non-involved classes for the girls; and victimization by peers was the only factor distinguishing the victimized from the non-involved class for boys. These findings contribute to our understanding of the heterogeneity in physical dating violence victimization during

  8. Peer victimization as reported by children, teachers, and parents in relation to children's health symptoms

    Directory of Open Access Journals (Sweden)

    Mæhle Magne

    2011-05-01

    Background: Victims of bullying in school may experience health problems later in life. We have assessed the prevalence of children's health symptoms according to whether peer victimization was reported by the children, by their teachers, or by their parents. Methods: In a cross-sectional study of 419 children in grades 1-10, the frequency of peer victimization was reported by children, teachers and parents. Emotional and somatic symptoms (sadness, anxiety, stomach ache, and headache) were reported by the children. Frequencies of victimization reported by different informants were compared by the marginal homogeneity test for paired ordinal data, concordance between informants by cross-tables and Spearman's rho, and associations of victimization with health symptoms were estimated by logistic regression. Results: The concordance of peer victimization reported by children, teachers, and parents varied from complete agreement to complete discordance, also for the highest frequency (weekly/daily) of victimization. Children's self-reported frequency of victimization was strongly and positively associated with their reports of emotional and somatic symptoms. Frequency of victimization reported by teachers or parents showed similar but weaker associations with the children's health symptoms. Conclusion: The agreement between children and significant adults in reporting peer victimization was low to moderate, and the associations of reported victimization with the children's self-reported health symptoms varied substantially between informants. It may be useful to assess prospectively the effects of employing different sources of information related to peer victimization.

  9. Peer victimization as reported by children, teachers, and parents in relation to children's health symptoms.

    Science.gov (United States)

    Løhre, Audhild; Lydersen, Stian; Paulsen, Bård; Mæhle, Magne; Vatten, Lars J

    2011-05-06

    Victims of bullying in school may experience health problems later in life. We have assessed the prevalence of children's health symptoms according to whether peer victimization was reported by the children, by their teachers, or by their parents. In a cross-sectional study of 419 children in grades 1-10, the frequency of peer victimization was reported by children, teachers and parents. Emotional and somatic symptoms (sadness, anxiety, stomach ache, and headache) were reported by the children. Frequencies of victimization reported by different informants were compared by the marginal homogeneity test for paired ordinal data, concordance between informants by cross-tables and Spearman's rho, and associations of victimization with health symptoms were estimated by logistic regression. The concordance of peer victimization reported by children, teachers, and parents varied from complete agreement to complete discordance, also for the highest frequency (weekly/daily) of victimization. Children's self-reported frequency of victimization was strongly and positively associated with their reports of emotional and somatic symptoms. Frequency of victimization reported by teachers or parents showed similar but weaker associations with the children's health symptoms. The agreement between children and significant adults in reporting peer victimization was low to moderate, and the associations of reported victimization with the children's self-reported health symptoms varied substantially between informants. It may be useful to assess prospectively the effects of employing different sources of information related to peer victimization.

  10. Disentangling the Effects of Violent Victimization, Violent Behavior, and Gun Carrying for Minority Inner-City Youth Living in Extreme Poverty

    Science.gov (United States)

    Spano, Richard; Bolland, John

    2013-01-01

    Two waves of longitudinal data were used to examine the sequencing between violent victimization, violent behavior, and gun carrying in a high-poverty sample of African American youth. Multivariate logistic regression results indicated that violent victimization T1 and violent behavior T1 increased the likelihood of initiation of gun carrying T2…

  11. Quasi-Likelihood Techniques in a Logistic Regression Equation for Identifying Simulium damnosum s.l. Larval Habitats Intra-cluster Covariates in Togo.

    Science.gov (United States)

    Jacob, Benjamin G; Novak, Robert J; Toe, Laurent; Sanfo, Moussa S; Afriyie, Abena N; Ibrahim, Mohammed A; Griffith, Daniel A; Unnasch, Thomas R

    2012-01-01

    The standard methods for regression analyses of clustered riverine larval habitat data of Simulium damnosum s.l., a major black-fly vector of onchocerciasis, postulate models relating observational ecological-sampled parameter estimators to prolific habitats without accounting for residual intra-cluster error correlation effects. Generally, this correlation comes from two sources: (1) the design of the random effects and their assumed covariance from the multiple levels within the regression model; and (2) the correlation structure of the residuals. Unfortunately, inconspicuous errors in residual intra-cluster correlation estimates can overstate precision in forecasted S. damnosum s.l. riverine larval habitat explanatory attributes regardless of how they are treated (e.g., independent, autoregressive, Toeplitz, etc.). In this research, the geographical locations for multiple riverine-based S. damnosum s.l. larval ecosystem habitats sampled from 2 pre-established epidemiological sites in Togo were identified and recorded from July 2009 to June 2010. Initially the data was aggregated into proc genmod. An agglomerative hierarchical residual cluster-based analysis was then performed. The sampled clustered study site data was then analyzed for statistical correlations using Monthly Biting Rates (MBR). Euclidean distance measurements and terrain-related geomorphological statistics were then generated in ArcGIS. A digital overlay was then performed, also in ArcGIS, using the georeferenced ground coordinates of high and low density clusters stratified by Annual Biting Rates (ABR). This data was overlain onto multitemporal sub-meter pixel resolution satellite data (i.e., QuickBird 0.61 m wavebands). Orthogonal spatial filter eigenvectors were then generated in SAS/GIS. Univariate and non-linear regression-based models (i.e., Logistic, Poisson and Negative Binomial) were also employed to determine probability distributions and to identify statistically significant parameter

  12. Landslide Susceptibility Analysis by the comparison and integration of Random Forest and Logistic Regression methods; application to the disaster of Nova Friburgo - Rio de Janeiro, Brasil (January 2011)

    Science.gov (United States)

    Esposito, Carlo; Barra, Anna; Evans, Stephen G.; Scarascia Mugnozza, Gabriele; Delaney, Keith

    2014-05-01

    The study of landslide susceptibility by multivariate statistical methods is based on finding a quantitative relationship between controlling factors and landslide occurrence. Such studies have become popular in the last few decades thanks to the development of geographic information systems (GIS) software and the related improved data management. In this work we applied a statistical approach to an area of high landslide susceptibility mainly due to its tropical climate and geological-geomorphological setting. The study area is located in the south-east region of Brazil, which has frequently been affected by flood and landslide hazards, especially because of heavy rainfall events during the summer season. In this work we studied a disastrous event that occurred on January 11th and 12th of 2011, which involved Região Serrana (the mountainous region of Rio de Janeiro State) and caused more than 5000 landslides and at least 904 deaths. In order to produce susceptibility maps, we focused our attention on an area of 93.6 km² that includes Nova Friburgo city. We utilized two different multivariate statistical methods: Logistic Regression (LR), already widely used in applied geosciences, and Random Forest (RF), which has only recently been applied to landslide susceptibility analysis. With reference to each mapping unit, the first method (LR) results in a probability of landslide occurrence, while the second one (RF) gives a prediction in terms of % of area susceptible to slope failure. With this aim in mind, a landslide inventory map (related to the studied event) has been drawn up through analyses of high-resolution GeoEye satellite images, in a GIS environment. Data layers of 11 causative factors have been created and processed in order to be used as continuous numerical or discrete categorical variables in statistical analysis. In particular, the logistic regression method has frequent difficulties in managing numerical continuous and discrete categorical variables

  13. Factors related to clinical pregnancy after vitrified-warmed embryo transfer: a retrospective and multivariate logistic regression analysis of 2313 transfer cycles.

    Science.gov (United States)

    Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi

    2013-07-01

    What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos in order to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly positively associated with a greater likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and freezing because of OHSS risk to be positively, and damaged blastomere to be negatively, significantly associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors affecting the clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experience and the literature. With the thresholds of α entry = α removal = 0.05 for both variable entry and variable removal, eight variables were chosen to contribute to the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under receiver operating characteristic

  14. Extension of an iterative hybrid ordinal logistic regression/item response theory approach to detect and account for differential item functioning in longitudinal data.

    Science.gov (United States)

    Mukherjee, Shubhabrata; Gibbons, Laura E; Kristjansson, Elizabeth; Crane, Paul K

    2013-04-01

    Many constructs are measured using multi-item data collection instruments. Differential item functioning (DIF) occurs when construct-irrelevant covariates interfere with the relationship between construct levels and item responses. DIF assessment is an active area of research, and several techniques are available to identify and account for DIF in cross-sectional settings. Many studies include data collected from individuals over time; yet appropriate methods for identifying and accounting for items with DIF in these settings are not widely available. We present an approach to this problem and apply it to longitudinal Modified Mini-Mental State Examination (3MS) data from English speakers in the Canadian Study of Health and Aging. We analyzed 3MS items for DIF with respect to sex, birth cohort and education. First, we focused on cross-sectional data from a subset of Canadian Study of Health and Aging participants who had complete data at all three data collection periods. We performed cross-sectional DIF analyses at each time point using an iterative hybrid ordinal logistic regression/item response theory (OLR/IRT) framework. We found that item-level findings differed at the three time points. We then developed and applied an approach to detecting and accounting for DIF using longitudinal data in which covariation within individuals over time is accounted for by clustering on person. We applied this approach to data for the "entire" dataset of English speaking participants including people who later dropped out or died. Accounting for longitudinal DIF modestly attenuated differences between groups defined by educational attainment. We conclude with a discussion of further directions for this line of research.

  15. An Objective Screening Method for Major Depressive Disorder Using Logistic Regression Analysis of Heart Rate Variability Data Obtained in a Mental Task Paradigm

    Directory of Open Access Journals (Sweden)

    Guanghao Sun

    2016-11-01

    Background and Objectives: Heart rate variability (HRV) has been intensively studied as a promising biological marker of major depressive disorder (MDD). Our previous study confirmed that autonomic activity and reactivity in depression revealed by HRV during rest and mental task (MT) conditions can be used as diagnostic measures and in clinical evaluation. In this study, logistic regression analysis (LRA) was utilized for the classification and prediction of MDD based on HRV data obtained in an MT paradigm. Methods: Power spectral analysis of HRV on R-R intervals before, during, and after an MT (random number generation) was performed in 44 drug-naïve patients with MDD and 47 healthy control subjects at the Department of Psychiatry in Shizuoka Saiseikai General Hospital. Logit scores of LRA determined by HRV indices and heart rates discriminated patients with MDD from healthy subjects. The high frequency (HF) component of HRV and the ratio of the low frequency (LF) component to the HF component (LF/HF) correspond to parasympathetic and sympathovagal balance, respectively. Results: The LRA achieved a sensitivity and specificity of 80.0% and 79.0%, respectively, at an optimum cutoff logit score (0.28). Misclassifications occurred only when the logit score was close to the cutoff score. Logit scores also correlated significantly with subjective self-rating depression scale scores (p < 0.05). Conclusion: HRV indices recorded during a mental task may be an objective tool for screening patients with MDD in psychiatric practice. The proposed method appears promising not only for objective and rapid MDD screening, but also for evaluation of its severity.
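    The screening step described in this abstract, thresholding a logit score at a cutoff and reporting sensitivity and specificity, can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the toy scores and the convention that a score at or above the cutoff is classified as MDD are all assumptions; only the cutoff value 0.28 comes from the abstract.

```python
def sensitivity_specificity(logit_scores, is_patient, cutoff):
    """Return (sensitivity, specificity) for a given logit-score cutoff.

    Assumption: a score >= cutoff is classified as positive (patient).
    """
    tp = fn = tn = fp = 0
    for score, patient in zip(logit_scores, is_patient):
        predicted_positive = score >= cutoff
        if patient and predicted_positive:
            tp += 1          # patient correctly flagged
        elif patient:
            fn += 1          # patient missed
        elif predicted_positive:
            fp += 1          # healthy subject wrongly flagged
        else:
            tn += 1          # healthy subject correctly cleared
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical toy data, not taken from the study.
scores = [1.2, 0.9, 0.1, -0.4, 0.5, -1.0]
labels = [True, True, True, False, False, False]
sens, spec = sensitivity_specificity(scores, labels, cutoff=0.28)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

    Sweeping the cutoff over the observed scores and recomputing both quantities is one common way to look for an "optimum" cutoff; which optimality criterion the study used is not stated in the abstract.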

  16. Adverse Childhood Experiences and School-Based Victimization and Perpetration.

    Science.gov (United States)

    Forster, Myriam; Gower, Amy L; McMorris, Barbara J; Borowsky, Iris W

    2017-01-01

    Retrospective studies using adult self-report data have demonstrated that adverse childhood experiences (ACEs) increase risk of violence perpetration and victimization. However, research examining the associations between adolescent reports of ACE and school violence involvement is sparse. The present study examines the relationship between adolescent reported ACE and multiple types of on-campus violence (bringing a weapon to campus, being threatened with a weapon, bullying, fighting, vandalism) for boys and girls as well as the risk of membership in victim, perpetrator, and victim-perpetrator groups. The analytic sample was comprised of ninth graders who participated in the 2013 Minnesota Student Survey (n ~ 37,000). Multinomial logistic regression models calculated the risk of membership for victim only, perpetrator only, and victim-perpetrator subgroups, relative to no violence involvement, for students with ACE as compared with those with no ACE. Separate logistic regression models assessed the association between cumulative ACE and school-based violence, adjusting for age, ethnicity, family structure, poverty status, internalizing symptoms, and school district size. Nearly 30% of students were exposed to at least one ACE. Students with ACE represent 19% of no violence, 38% of victim only, 40% of perpetrator only, and 63% of victim-perpetrator groups. There was a strong, graded relationship between ACE and the probability of school-based victimization: physical bullying for boys but not girls, being threatened with a weapon, and theft or property destruction (ps […] bullying and bringing a weapon to campus (ps […] effects of cumulative ACE. We recommend that schools systematically screen for ACE, particularly among younger adolescents involved in victimization and perpetration, and develop the infrastructure to increase access to trauma-informed intervention services. Future research priorities and implications are discussed.

  17. Cyberbullying perpetration and victimization among middle-school students.

    Science.gov (United States)

    Rice, Eric; Petering, Robin; Rhoades, Harmony; Winetrobe, Hailey; Goldbach, Jeremy; Plant, Aaron; Montoya, Jorge; Kordic, Timothy

    2015-03-01

    We examined correlations between gender, race, sexual identity, and technology use, and patterns of cyberbullying experiences and behaviors among middle-school students. We collected a probability sample of 1285 students alongside the 2012 Youth Risk Behavior Survey in Los Angeles Unified School District middle schools. We used logistic regressions to assess the correlates of being a cyberbully perpetrator, victim, and perpetrator-victim (i.e., bidirectional cyberbullying behavior). In this sample, 6.6% reported being a cyberbully victim, 5.0% reported being a perpetrator, and 4.3% reported being a perpetrator-victim. Cyberbullying behavior frequently occurred on Facebook or via text messaging. Cyberbully perpetrators, victims, and perpetrators-victims all were more likely to report using the Internet for at least 3 hours per day. Sexual-minority students and students who texted at least 50 times per day were more likely to report cyberbullying victimization. Girls were more likely to report being perpetrators-victims. Cyberbullying interventions should account for gender and sexual identity, as well as the possible benefits of educational interventions for intensive Internet users and frequent texters.

  18. Cyberbullying Perpetration and Victimization Among Middle-School Students

    Science.gov (United States)

    Rice, Eric; Rhoades, Harmony; Winetrobe, Hailey; Goldbach, Jeremy; Plant, Aaron; Montoya, Jorge; Kordic, Timothy

    2015-01-01

    Objectives. We examined correlations between gender, race, sexual identity, and technology use, and patterns of cyberbullying experiences and behaviors among middle-school students. Methods. We collected a probability sample of 1285 students alongside the 2012 Youth Risk Behavior Survey in Los Angeles Unified School District middle schools. We used logistic regressions to assess the correlates of being a cyberbully perpetrator, victim, and perpetrator–victim (i.e., bidirectional cyberbullying behavior). Results. In this sample, 6.6% reported being a cyberbully victim, 5.0% reported being a perpetrator, and 4.3% reported being a perpetrator–victim. Cyberbullying behavior frequently occurred on Facebook or via text messaging. Cyberbully perpetrators, victims, and perpetrators–victims all were more likely to report using the Internet for at least 3 hours per day. Sexual-minority students and students who texted at least 50 times per day were more likely to report cyberbullying victimization. Girls were more likely to report being perpetrators–victims. Conclusions. Cyberbullying interventions should account for gender and sexual identity, as well as the possible benefits of educational interventions for intensive Internet users and frequent texters. PMID:25602905

  19. Using Historical Data and Quasi-Likelihood Logistic Regression Modeling to Test Spatial Patterns of Channel Response to Peak Flows in a Mountain Watershed

    Science.gov (United States)

    Faustini, J. M.; Jones, J. A.

    2001-12-01

    This study used an empirical modeling approach to explore landscape controls on spatial variations in reach-scale channel response to peak flows in a mountain watershed. We used historical cross-section surveys spanning 20 years at five sites on 2nd to 5th-order channels and stream gaging records spanning up to 50 years. We related the observed proportion of cross-sections at a site exhibiting detectable change between consecutive surveys to the recurrence interval of the largest peak flow during the corresponding period using a quasi-likelihood logistic regression model. Stream channel response was linearly related to flood size or return period through the logit function, but the shape of the response function varied according to basin size, bed material, and the presence or absence of large wood. At the watershed scale, we hypothesized that the spatial scale and frequency of channel adjustment should increase in the downstream direction as sediment supply increases relative to transport capacity, resulting in more transportable sediment in the channel and hence increased bed mobility. Consistent with this hypothesis, cross sections from the 4th and 5th-order main stem channels exhibit more frequent detectable changes than those at two steep third-order tributary sites. Peak flows able to mobilize bed material sufficiently to cause detectable changes in 50% of cross-section profiles had an estimated recurrence interval of 3 years for the 4th and 5th-order channels and 4 to 6 years for the 3rd-order sites. This difference increased for larger magnitude channel changes; peak flows with recurrence intervals of about 7 years produced changes in 90% of cross sections at the main stem sites, but flows able to produce the same level of response at tributary sites were three times less frequent. At finer scales, this trend of increasing bed mobility in the downstream direction is modified by variations in the degree of channel confinement by bedrock and landforms, the

  20. The role of multicollinearity in landslide susceptibility assessment by means of Binary Logistic Regression: comparison between VIF and AIC stepwise selection

    Science.gov (United States)

    Cama, Mariaelena; Cristi Nicu, Ionut; Conoscenti, Christian; Quénéhervé, Geraldine; Maerker, Michael

    2016-04-01

    Landslide susceptibility can be defined as the likelihood of a landslide occurring in a given area on the basis of local terrain conditions. In recent decades much research has focused on its evaluation by means of stochastic approaches under the assumption that 'the past is the key to the future', which means that if a model is able to reproduce a known landslide spatial distribution, it will be able to predict the future locations of new (i.e. unknown) slope failures. Among the various stochastic approaches, Binary Logistic Regression (BLR) is one of the most used because it calculates the susceptibility in probabilistic terms and its results are easily interpretable from a geomorphological point of view. However, very often not much importance is given to multicollinearity assessment, whose effect is that the coefficient estimates are unstable, with opposite sign and therefore difficult to interpret. Therefore, it should be evaluated every time in order to make a model whose results are geomorphologically correct. In this study the effects of multicollinearity on the predictive performance and robustness of landslide susceptibility models are analyzed. In particular, the multicollinearity is estimated by means of the Variance Inflation Factor (VIF), which is also used as a selection criterion for the independent variables (VIF Stepwise Selection) and compared to the more commonly used AIC Stepwise Selection. The robustness of the results is evaluated through 100 replicates of the dataset. The study area selected to perform this analysis is the Moldavian Plateau, where landslides are among the most frequent geomorphological processes. This area has an increasing trend of urbanization and a very high potential regarding the cultural heritage, being the place of discovery of the largest settlement belonging to the Cucuteni Culture from Eastern Europe (that led to the development of the great complex Cucuteni-Trypillia). Therefore, identifying the areas susceptible to
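    The multicollinearity measure named in this record can be illustrated with a short sketch. It uses the textbook definition of the variance inflation factor, VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on all other predictors; the function name, the toy matrix and the use of NumPy are assumptions made for illustration, and the stepwise selection procedure of the study itself is not reproduced here.

```python
import numpy as np


def vif(X):
    """Variance inflation factor for each column of predictor matrix X.

    Each column is regressed (with an intercept) on the remaining columns
    by ordinary least squares. Constant columns are not handled.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        r_squared = 1.0 - residuals.var() / y.var()
        factors.append(1.0 / (1.0 - r_squared))
    return factors


# Hypothetical predictors: the first two columns are nearly collinear.
X_toy = [
    [1.0, 1.0, 1.0],
    [2.0, 2.1, 0.0],
    [3.0, 2.9, 1.0],
    [4.0, 4.2, 0.0],
    [5.0, 4.8, 1.0],
]
for name, value in zip(["x1", "x2", "x3"], vif(X_toy)):
    print(name, round(value, 2))
```

    A large VIF for a predictor signals that it is largely explained by the others; which threshold the authors applied in their VIF-based selection is not given in the abstract.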

  1. Personal, Social, and Game-Related Correlates of Active and Non-Active Gaming Among Dutch Gaming Adolescents: Survey-Based Multivariable, Multilevel Logistic Regression Analyses

    Science.gov (United States)

    de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes

    2014-01-01

    Background: Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games (active games) seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective: The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods: A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results: Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; P […] games (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; P […] game engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P […] 7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a

  2. Comparison between an artificial neural network and logistic regression in predicting acute graft-vs-host disease after unrelated donor hematopoietic stem cell transplantation in thalassemia patients.

    Science.gov (United States)

    Caocci, Giovanni; Baccoli, Roberto; Vacca, Adriana; Mastronuzzi, Angela; Bertaina, Alice; Piras, Eugenia; Littera, Roberto; Locatelli, Franco; Carcassi, Carlo; La Nasa, Giorgio

    2010-05-01

    There is growing interest in the development of prognostic models for predicting the occurrence of acute graft-vs-host disease (aGVHD) after unrelated donor hematopoietic stem cell transplantation. A high number of variables have been shown to play a role in aGVHD, but the search for a predictive algorithm is still ongoing. Artificial neural networks (ANNs) represent an attractive alternative to multivariate analysis for clinical prognosis. So far, no reports have investigated the ability of ANNs in predicting HSCT outcome. We compared the prognostic performance of ANNs with that of logistic regression (LR) in 78 beta-thalassemia major patients given unrelated donor hematopoietic stem cell transplantation. Twenty-four independent variables were analyzed for their potential impact on outcomes. Twenty-six patients (33.3%) developed grade II to IV aGVHD. In multivariate analysis, homozygosity for donor KIR haplotype A (p = 0.03), donor age (p = 0.05), and donor homozygosity for the deletion of the human leukocyte antigen-G 14-bp polymorphism (p = 0.05) were independently significantly correlated to aGVHD. The mean sensitivity of LR and ANNs (capability of predicting aGVHD in patients who developed aGVHD) in test datasets was 21.7% and 83.3%, respectively (p < 0.001); the mean specificity (capability of predicting absence of aGVHD in patients who did not develop aGVHD) was 80.5% and 90.1%, respectively (p = NS). Although ANNs are unable to calculate the weight of single variables on outcomes, they were found to have a better performance than LR. A combination of these two methods could be more efficient in predicting outcomes and help tailor GVHD prophylaxis regimens according to the predicted risk of each patient. Whether ANN technology will provide better predictive performance when applied to other datasets remains to be confirmed. 2010 ISEH-Society for Hematology and Stem Cells. Published by Elsevier Inc. All rights reserved.

  3. The impact of perceived childhood victimization and patriarchal gender ideology on intimate partner violence (IPV) victimization among Korean immigrant women in the USA.

    Science.gov (United States)

    Kim, Chunrye

    2017-08-01

    Childhood victimization experiences are common among intimate partner violence (IPV) victims. This study examines the link between childhood physical and sexual victimization experiences and adulthood IPV among Korean immigrant women in the USA. As Korean immigrants often use physical punishment to discipline their children, and reporting sexual abuse is discouraged due to stigmatization in this community, cultural factors (e.g. patriarchal values) related to childhood victimization and IPV were also examined. Survey data from Korean immigrant women in the USA were collected. Using a case-control design, we compared 64 Korean immigrant women who had experienced IPV in the past year with 63 Korean immigrant women who had never experienced IPV in their lifetime. The findings of this study reveal that IPV victims, compared with non-victims, experienced higher childhood victimization rates. Logistic regression analysis demonstrated that childhood victimization and patriarchal gender ideology strongly predict IPV victimization among Korean immigrants. However, patriarchal values did not moderate the relationship between childhood victimization and IPV. To prevent IPV among the Korean immigrant population, special efforts are needed to prevent childhood abuse and to change ingrained cultural attitudes about child physical and sexual abuse among immigrant communities through culturally sensitive programs. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part I: Effects of Random Error

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
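
    The two accuracy measures named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it assumes the usual 2x2 contingency-table definitions (percent correct as the share of correct yes/no forecasts, and the Hanssen-Kuipers discriminant as hit rate minus false alarm rate). The function names and the toy data are hypothetical and do not reproduce the paper's results.

```python
def contingency_counts(probs, observed, threshold):
    """Turn probabilities into yes/no forecasts and count the 2x2 table."""
    hits = misses = false_alarms = correct_negatives = 0
    for p, obs in zip(probs, observed):
        forecast_yes = p >= threshold
        if forecast_yes and obs:
            hits += 1
        elif forecast_yes:
            false_alarms += 1
        elif obs:
            misses += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def percent_correct(hits, misses, false_alarms, correct_negatives):
    """Percent correct (PC): share of all forecasts that were right."""
    total = hits + misses + false_alarms + correct_negatives
    return 100.0 * (hits + correct_negatives) / total


def hanssen_kuipers(hits, misses, false_alarms, correct_negatives):
    """Hanssen-Kuipers discriminant (HKD): hit rate minus false alarm rate."""
    hit_rate = hits / (hits + misses)
    false_alarm_rate = false_alarms / (false_alarms + correct_negatives)
    return hit_rate - false_alarm_rate


# Made-up probabilities and observations (1 = contrail observed), for illustration only.
probs = [0.9, 0.6, 0.45, 0.2, 0.1, 0.4, 0.05, 0.15, 0.42, 0.1]
observed = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# Two critical thresholds: a fixed 0.5 and the climatological frequency.
climatology = sum(observed) / len(observed)
counts_half = contingency_counts(probs, observed, 0.5)
counts_clim = contingency_counts(probs, observed, climatology)

print(percent_correct(*counts_half), hanssen_kuipers(*counts_half))
print(percent_correct(*counts_clim), hanssen_kuipers(*counts_clim))
```

    In this toy sample the lower, climatological threshold trades some correct negatives for extra hits, which raises HKD while lowering PC. That is only one way the two thresholds can diverge, and real data may behave differently.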

  5. Basic Diagnosis and Prediction of Persistent Contrail Occurrence using High-resolution Numerical Weather Analyses/Forecasts and Logistic Regression. Part II: Evaluation of Sample Models

    Science.gov (United States)

    Duda, David P.; Minnis, Patrick

    2009-01-01

    Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.

  6. Alcohol Involvement in Homicide Victimization in the United States.

    Science.gov (United States)

    Naimi, Timothy S; Xuan, Ziming; Cooper, Susanna E; Coleman, Sharon M; Hadland, Scott E; Swahn, Monica H; Heeren, Timothy C

    2016-12-01

    Although the association between alcohol and homicide is well documented, there has been no recent study of alcohol involvement in homicide victimization in U.S. states. The objective of this article was to determine the prevalence of alcohol involvement in homicide victimization and to identify sociodemographic and other factors associated with alcohol involvement in homicide victimization. Data from homicide victims with a reported blood alcohol content (BAC) level were analyzed from 17 states from 2010 to 2012 using the National Violent Death Reporting System. Logistic regression was used to investigate factors associated with the odds of homicide victims having a BAC ≥ 0.08%. Among all homicide victims, 39.9% had a positive BAC including 13.7% with a BAC between 0.01% and 0.79% and 26.2% of victims with a BAC ≥ 0.08%. Males were twice as likely as females to have a BAC ≥ 0.08% (29.1% vs. 15.2%; p […] homicide victims having a BAC ≥ 0.08% included male sex, American Indian/Alaska Native race, Hispanic ethnicity, history of intimate partner violence, and nonfirearm homicides. Alcohol is present in a substantial proportion of homicide victims in the United States, with substantial variation by state, demographic, and circumstantial characteristics. Future studies should explore the relationships between state-level alcohol policies and alcohol involvement among perpetrators and victims of homicide. Copyright © 2016 by the Research Society on Alcoholism.

  7. Tramvay Yolcu Memnuniyetinin Lojistik Regresyon Analiziyle Ölçülmesi: Estram Örneği (Measuring the Traveller Satisfaction of Tram Using Logistic Regression: A Case Study of Estram)

    Directory of Open Access Journals (Sweden)

    Nuray GİRGİNER

    2008-01-01

    This study investigates traveller satisfaction with the tram, a mass transportation vehicle, in the case of Eskisehir’s Tram System (Estram) using binomial logistic regression analysis. Students make up a large share of Eskisehir’s population, so their satisfaction as travellers is important. The sample therefore consisted of 300 students from Anatolia University and Eskisehir Osmangazi University, both in Eskisehir, selected by simple random sampling. Using a number of subjective and objective variables, the study investigates whether or not Estram satisfies these students. Treating the latent satisfaction variable as binary, binomial logistic regression was applied to student satisfaction. The analysis showed that all of the independent variables had a negative effect on the students' satisfaction with Estram.

  8. The social and emotional skills of bullies, victims, and bully-victims of Egyptian primary school children.

    Science.gov (United States)

    Habashy Hussein, Mohamed

    2013-01-01

    This study examined whether bullies, victims, bully-victims (who are both bullies and victims), and students who reported no or low levels of bullying and victimization differed in their levels of social and emotional skills. Data were collected from 623 children in fifth and sixth grades from four Egyptian elementary schools; their ages ranged from 10 to 12 years. K-means cluster analysis revealed four groups: bullies (n = 138), victims (n = 178), bully-victims (n = 59), and children who were not involved in bullying behaviour (n = 248). Data were analyzed using multinomial logistic regression. The findings indicated that boys were more involved in bullying behaviour than girls, and both bullies and bully-victims were less likely to adhere to social rules and politeness than children who were not involved in bullying. Both bullies and victims were less aware of the physiological reactions of their emotions than uninvolved children, and were less able to apply social rules in social interaction. Both victims and bully-victims reported less likeability than children not involved in bullying. Verbal sharing, attending to others' emotions, and analysis of emotions did not have a statistically significant relationship with the probabilities of classifying children to any bullying group versus children not involved in bullying. Social skills were more important than emotional awareness in predicting the likelihood of classifying children in one of the three bullying groups versus children who were not involved in bullying. The main conclusion is that social and emotional skills together may provide an effective means of intervention for bullying problems.

  9. Factors associated with long-stay nursing home admissions among the U.S. elderly population: comparison of logistic regression and the Cox proportional hazards model with policy implications for social work.

    Science.gov (United States)

    Cai, Qian; Salmon, J Warren; Rodgers, Mark E

    2009-01-01

    Two statistical methods were compared to identify key factors associated with long-stay nursing home (LSNH) admission among the U.S. elderly population. Social Work's interest in services to the elderly makes this research critical to the profession. Effectively transitioning the "baby boomer" population into appropriate long-term care will be a great societal challenge. It remains a challenge paramount to the practice of social work. Secondary data analyses using four waves (1995, 1998, 2000, and 2002) of the Health Retirement Study (HRS) coupled with the Assets and Health Dynamics among the Oldest Old (AHEAD) surveys were conducted. Multivariable logistic regression and Cox proportional hazards model were performed and compared. Older age, lower self-perceived health, worse instrumental activities of daily living (IADL), psychiatric problems, and living alone were found significantly associated with increased risk of LSNH admission. In contrast, being female, African American, or Hispanic; owning a home; and having lower level of cognitive impairment reduced the admission risk. Home ownership showed a significant effect in logistic regression, but a marginal effect in the Cox model. The Cox model generally provided more precise parameter estimates than logistic regression. Logistic regression, used frequently in analyses, can provide a good approximation to the Cox model in identifying factors of LSNH admission. However, the Cox model gives more information on how soon the LSNH admission may happen. Our analyses, based on two models, dually identified the factors associated with LSNH admission; therefore, results discussed confidently provide implications for both public and private long-term care policies, as well as improving the assessment capabilities of social work practitioners for development of screening programs among at-risk elderly. Given the predicted surge in this population, significant factors found from this study can be utilized in a strengths

  10. Association between bullying victimization and substance use among college students in Spain.

    Science.gov (United States)

    Caravaca Sánchez, Francisco; Navarro Zaragoza, Javier; Luna Ruiz-Cabello, Aurelio; Falcón Romero, María; Luna Maldonado, Aurelio

    2016-06-14

    The purpose of this study is to analyze the prevalence of victimization and substance use, and the association between them, among the university population in the southeast of Spain, using a sample of 543 randomly selected college students (405 females and 138 males with an average age of 22.6 years). In this cross-sectional study, data were collected through an anonymous survey to assess victimization and drug use over the last 12 months. Results indicated that 62.2% of college students reported bullying victimization and 82.9% consumed some type of psychoactive substance, with a statistically significant association between the two variables. Additionally, logistic regression analysis confirmed the association between psychoactive substance use and different types of victimization. Our findings confirm the need for preventive measures that address this relation between victimization and substance use.

  11. A comparison of three methods of assessing differential item functioning (DIF) in the Hospital Anxiety Depression Scale: ordinal logistic regression, Rasch analysis and the Mantel chi-square procedure.

    Science.gov (United States)

    Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C

    2014-12-01

    It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.

  12. The impact of loneliness on self-rated health symptoms among victimized school children

    Directory of Open Access Journals (Sweden)

    Løhre Audhild

    2012-05-01

    Background. Loneliness is associated with peer victimization, and the two adverse experiences are both related to ill health in childhood and adolescence. There is, however, a lack of knowledge on the importance of loneliness among victimized children. Therefore, possible modifying effects of loneliness on victimized school children’s self-rated health were assessed. Methods. A population-based cross-sectional study included 419 children in grades 1–10 from five schools. The prevalence of loneliness and victimization across grades was analyzed by linear test for trend, and associations of the adverse experiences with four health symptoms (sadness, anxiety, stomach ache, and headache) were estimated by logistic regression. Results. In crude regression analysis, both victimization and loneliness showed positive associations with all four health symptoms. However, in multivariable analysis, the associations of victimization with health symptoms were fully attenuated except for headache. In contrast, loneliness retained about the same strength of associations in the multivariable analysis as in the crude analysis. More detailed analyses demonstrated that children who reported both victimization and loneliness had three to seven times higher prevalence of health symptoms compared to children who reported neither victimization nor loneliness (the reference group). Rather surprisingly, victimized children who reported no loneliness did not have any higher prevalence of health symptoms than the reference group, whereas lonely children without experiences of victimization had almost the same prevalence of health symptoms (except for stomach ache) as children who were both victimized and lonely. Conclusions. Adverse effects of loneliness need to be highlighted, and for victimized children, experiences of loneliness may be an especially harsh risk factor related to ill health.

  13. Completing the Remedial Sequence and College-Level Credit-Bearing Math: Comparing Binary, Cumulative, and Continuation Ratio Logistic Regression Models

    Science.gov (United States)

    Davidson, J. Cody

    2016-01-01

    Mathematics is the most common subject area of remedial need and the majority of remedial math students never pass a college-level credit-bearing math class. The majority of studies that investigate this phenomenon are conducted at community colleges and use some type of regression model; however, none have used a continuation ratio model. The…

  14. Regression analysis by example

    National Research Council Canada - National Science Library

    Chatterjee, Samprit; Hadi, Ali S

    2012-01-01

    .... The emphasis continues to be on exploratory data analysis rather than statistical theory. The coverage offers in-depth treatment of regression diagnostics, transformation, multicollinearity, logistic regression, and robust regression...

  15. Risk Factors Associated with Peer Victimization and Bystander Behaviors among Adolescent Students.

    Science.gov (United States)

    Huang, Zepeng; Liu, Zhenni; Liu, Xiangxiang; Lv, Laiwen; Zhang, Yan; Ou, Limin; Li, Liping

    2016-07-27

    Despite the prevalence of peer victimization and bystander behaviors, few data have been generated to describe their relationships and risk factors. In this paper, a self-administered survey using a cross-sectional cluster-random sampling method in a sample of 5450 participants (2734 girls and 2716 boys) between 4th and 11th grades was conducted at six schools (two primary schools and four middle schools) located in Shantou, China. Self-reported peer victimization, bystander behaviors and information regarding parents' risky behaviors and individual behavioral factors were collected. Multinomial logistic regression analysis was applied to evaluate risk factors affecting peer victimization and bystander behaviors. The results indicated that urban participants were more likely to become bullying victims but less likely to become passive bystanders. Conversely, bullying victimization was related to an increase in passive bystander behaviors. Father drinking and mother smoking were independent risk factors for peer victimization. Participants who smoked or drank tended to be involved in both peer victimization and passive bystander behaviors. This study suggested that bystander behaviors and victims' and parents' education play a more important role in peer victimization than previously thought.

  16. Logistic regression analysis of damp-heat and cold-damp impeding syndrome of rheumatoid arthritis: a perspective in Chinese medicine.

    Science.gov (United States)

    Wang, Zhi-Zhong; Fang, Yong-Fei; Wang, Yong; Mu, Fang-Xiang; Chen, Jun; Zou, Qing-Hua; Zhong, Bing; Li, Jing-Yi; Bo, Gan-Ping; Zhang, Rong-Hua

    2012-08-01

    To investigate a method for quantitative differential diagnosis of damp-heat and cold-damp impeding syndrome of rheumatoid arthritis (RA) in Chinese medicine (CM). Laboratory parameters were collected from 306 patients with RA. The clinical symptoms and laboratory parameters were compared between patients with these two syndromes (158 with RA of damp-heat impeding syndrome, and 148 with RA of cold-damp impeding syndrome), and a regression equation was established to facilitate discrimination of the two RA syndromes. There were significant differences in disease activity score in 28 joints [DAS28(4)], erythrocyte sedimentation rate (ESR), white blood cell count (WBC), C-reactive protein (CRP), platelet count (PLT), albumin (ALB) and globulin (GLB) between the two syndromes of RA (P […] damp-heat from cold-damp impeding syndrome. The regression equation was as follows: $P = 1/\{1+\exp[-(3.0 - 0.021X_1 - 0.196X_2 - 0.163X_3 - 1.559X_4 + 1.504X_5 - 0.927X_6 - 1.039X_7 + 1.070X_8 + 1.330X_9)]\}$. The independent variables $X_1$ to $X_9$ were ESR, WBC, CRP, hot joint, cold joint, thirst, sweating, aversion to wind and cold, and cold limbs. A P value > 0.5 signified cold-damp impeding syndrome, and a lower P value signified damp-heat impeding syndrome. The accuracy was 90.2%. The regression equation may be useful for discriminating damp-heat from cold-damp impeding syndrome of RA.
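
    The regression equation in this abstract can be evaluated directly. The sketch below uses the intercept and coefficients exactly as printed. Everything else is an assumption made for illustration: the abstract does not state how the symptom variables are coded (0/1 coding is assumed here) or which units the laboratory values use, and the dictionary keys and function names are hypothetical.

```python
import math

# Intercept and coefficients as printed in the abstract (X1..X9, in order).
INTERCEPT = 3.0
COEFFICIENTS = {
    "esr": -0.021,
    "wbc": -0.196,
    "crp": -0.163,
    "hot_joint": -1.559,
    "cold_joint": 1.504,
    "thirst": -0.927,
    "sweating": -1.039,
    "aversion_to_wind_and_cold": 1.070,
    "cold_limbs": 1.330,
}


def cold_damp_probability(features):
    """Evaluate P = 1 / (1 + exp(-(b0 + sum(b_i * X_i))))."""
    linear = INTERCEPT + sum(
        coef * features[name] for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


def classify(features, threshold=0.5):
    """P above the threshold is read as cold-damp, otherwise damp-heat."""
    if cold_damp_probability(features) > threshold:
        return "cold-damp"
    return "damp-heat"


# Hypothetical patient; symptom variables are ASSUMED to be coded 0/1.
patient = {
    "esr": 60, "wbc": 9, "crp": 20,
    "hot_joint": 1, "cold_joint": 0, "thirst": 0,
    "sweating": 0, "aversion_to_wind_and_cold": 0, "cold_limbs": 0,
}
print(round(cold_damp_probability(patient), 4), classify(patient))
```

    The signs behave as the abstract suggests: higher inflammatory markers and a hot joint push P down, towards damp-heat, while cold joint, aversion to wind and cold, and cold limbs push P up, towards cold-damp.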

  17. Comprehensive Logistics

    CERN Document Server

    Gudehus, Timm

    2012-01-01

    Modern logistics comprises operative logistics, analytical logistics and management of logistic networks. The central task of operative logistics is the efficient supply of required goods at the right place at the right time. The tasks of analytical logistics are designing optimal networks and systems, developing strategies for planning, scheduling and operation, and organizing efficient order and performance processes. Logistic management plans, implements and operates logistic networks and schedules orders, stocks and resources. This reference book offers a unique survey of modern logistics. It contains proven strategies, rules and tools for the solution of a multitude of logistic problems. The analytically derived algorithms and formulas can be used for the computer-based planning of logistic systems and for the dynamic scheduling of orders and resources in supply networks. They enable significant improvements of performance, quality and costs. Their application is demonstrated by several examples from industr...

  18. Behavior in vitro of the dentin-enamel junction in human premolars submitted to high temperatures: prediction of the maximum temperature based on logistic regression analysis.

    Science.gov (United States)

    Mejía, C; Herrera, A; Sánchez, A I; Moreno, S; Moreno, F

    2016-07-01

    The aim of the study was to provide scientific evidence that would permit separation of the dentine-enamel junction (DEJ) to be used as a parameter to estimate the temperature to which burnt, carbonized or incinerated cadavers or human remains had been subjected. A descriptive pseudo-experimental study was carried out in vitro using cone beam tomography to determine the physical behavior of the dentine-enamel junction in 60 human premolars submitted to high temperatures (200°C, 400°C, 600°C, 800°C and 1000°C). Spearman's concordance and correlation index was used to determine the relationship between longitudinal separation of the dentine-enamel junction (mm) and temperature (°C), and a simple linear regression model was developed to show that, once micro- and macrostructural changes are initiated in the enamel and dentine, the dentine-enamel junction begins to separate from the cervical towards the occlusal as temperature increases.

  19. Satisfaction with life, victimization, and perception of insecurity in Morelos

    Directory of Open Access Journals (Sweden)

    Belén Martínez-Ferrer

    2016-01-01

    Objective. To examine the influence of victimization, perceived insecurity and restrictions on daily routines on life satisfaction. Materials and methods. Participants were 7 535 (50.2% men) aged between 12 and 60, selected by proportional stratified sampling. MANOVA and a polytomous logistic regression model were calculated. Results. We found significant differences in victimization, perceived insecurity and restrictions on daily routines in relation to life satisfaction levels. Also, physical protective measures, control of personal information, perception of insecurity in public areas and restrictions on daily routines were related to lower levels of satisfaction with life. Conclusions. The lowest levels of satisfaction with life were associated with victimization, perception of insecurity in public areas, and restrictions on daily routines.

  20. A critical appraisal of logistic regression-based nomograms, artificial neural networks, classification and regression-tree models, look-up tables and risk-group stratification models for prostate cancer.

    Science.gov (United States)

    Chun, Felix K-H; Karakiewicz, Pierre I; Briganti, Alberto; Walz, Jochen; Kattan, Michael W; Huland, Hartwig; Graefen, Markus

    2007-04-01

    To evaluate several methods of predicting prostate cancer-related outcomes, i.e. nomograms, look-up tables, artificial neural networks (ANN), classification and regression tree (CART) analyses and risk-group stratification (RGS) models, all of which represent valid alternatives. We present four direct comparisons, where a nomogram was compared to either an ANN, a look-up table, a CART model or a RGS model. In all comparisons we assessed the predictive accuracy and performance characteristics of both models. Nomograms have several advantages over ANN, look-up tables, CART and RGS models, the most fundamental being a higher predictive accuracy and better performance characteristics. These results suggest that nomograms are more accurate and have better performance characteristics than their alternatives. However, ANN, look-up tables, CART analyses and RGS models all rely on methodologically sound and valid alternatives, which should not be abandoned.

  1. Prevalence and Associated Factors of Peer Victimization (Bullying) among Grades 7 and 8 Middle School Students in Kuwait.

    Science.gov (United States)

    Abdulsalam, Ahmad J; Al Daihani, Abdullah E; Francis, Konstantinos

    2017-01-01

    Background. Peer victimization (bullying) is a universal phenomenon with detrimental effects. The aim of this study is to determine the prevalence and factors of bullying among grades 7 and 8 middle school students in Kuwait. Methods. The study is a cross-sectional study that includes a sample of 989 7th and 8th grade middle school students randomly selected from schools. The Revised Olweus Bully/Victim Questionnaire was used to measure different forms of bullying. After adjusting for confounding, logistic regression identified the significant associated factors related to bullying. Results. Prevalence of bullying was 30.2% (95% CI 27.4% to 33.2%), comprising 3.5% bullies, 18.9% victims, and 7.8% bully-victims. Children with physical disabilities and one or both non-Kuwaiti parents or children with divorced/widowed parents were more prone to be victims. Most victims and bullies were found to be current smokers. Bullies were mostly in the fail/fair final school grade category, whereas victims performed better. The logistic regression showed that male gender (adjusted odds ratio = 1.671, p = 0.004), grade 8 student (adjusted odds ratio = 1.650, p = 0.004), and student with physical disabilities (adjusted odds ratio = 1.675, p = 0.003) were independently associated with bullying behavior. Conclusions. There is a need for a school-wide professional intervention program and improvement in the students' adjustment to school environment to control bullying behavior.

  2. Prevalence and Associated Factors of Peer Victimization (Bullying) among Grades 7 and 8 Middle School Students in Kuwait

    Directory of Open Access Journals (Sweden)

    Ahmad J. Abdulsalam

    2017-01-01

    Background. Peer victimization (bullying) is a universal phenomenon with detrimental effects. The aim of this study is to determine the prevalence and factors of bullying among grades 7 and 8 middle school students in Kuwait. Methods. The study is a cross-sectional study that includes a sample of 989 7th and 8th grade middle school students randomly selected from schools. The Revised Olweus Bully/Victim Questionnaire was used to measure different forms of bullying. After adjusting for confounding, logistic regression identified the significant associated factors related to bullying. Results. Prevalence of bullying was 30.2% (95% CI 27.4% to 33.2%), comprising 3.5% bullies, 18.9% victims, and 7.8% bully-victims. Children with physical disabilities and one or both non-Kuwaiti parents or children with divorced/widowed parents were more prone to be victims. Most victims and bullies were found to be current smokers. Bullies were mostly in the fail/fair final school grade category, whereas victims performed better. The logistic regression showed that male gender (adjusted odds ratio = 1.671, p = 0.004), grade 8 student (adjusted odds ratio = 1.650, p = 0.004), and student with physical disabilities (adjusted odds ratio = 1.675, p = 0.003) were independently associated with bullying behavior. Conclusions. There is a need for a school-wide professional intervention program and improvement in the students’ adjustment to school environment to control bullying behavior.

  3. Alcohol consumption in adolescent homicide victims in the city of Johannesburg, South Africa.

    Science.gov (United States)

    Swart, Lu-Anne; Seedat, Mohamed; Nel, Juan

    2015-04-01

    To describe the blood alcohol concentration (BAC) of adolescent homicide victims in Johannesburg, South Africa and to identify the victim and event characteristics associated with a positive BAC at the time of death. Logistic regression of mortality data collected by the National Injury Mortality Surveillance System (NIMSS). Johannesburg, South Africa. A total of 323 adolescent (15-19 years) homicide victims for the period 2001-9 who had been tested for the presence of alcohol. Data on the victims' BAC level, demographics, weapon or method used, scene, day and time of death were drawn from NIMSS. Alcohol was present in 39.3% of the homicide victims. Of these, 88.2% had a BAC level equivalent to or in excess of the South African limit of 0.05 g/100 ml for intoxication. Multivariate logistic analysis showed that a positive BAC in homicide victims was associated significantly with the victim's sex [male: odds ratio (OR) = 2.127; 95% confidence interval (CI) = 1.012-4.471], victim's age (18-19 years: OR = 2.364; CI = 1.343-4.163); weapon used (sharp instruments: OR = 2.972; CI = 1.708-5.171); and time of death (weekend: OR = 3.149; CI = 1.842-5.383; night-time: OR = 2.175; CI = 1.243-3.804). Excessive alcohol consumption is associated with a substantial proportion of adolescent homicides in Johannesburg, South Africa, and is more prevalent among male and older adolescent victims and in victims killed with sharp instruments over the weekends and during the evenings. © 2015 Society for the Study of Addiction.
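
    The odds ratios and confidence intervals reported here relate to logistic regression coefficients through the exponential function. The sketch below shows that relationship under a stated assumption: that the intervals are Wald intervals of the form exp(β ± 1.96·SE), which the abstract does not confirm. The function names are hypothetical, and the back-calculated standard error is only an approximation under that assumption.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the coefficient and an approximate standard error.

    Assumes a Wald interval, i.e. CI = exp(beta - z*SE) to exp(beta + z*SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_with_ci(beta, se, z=1.96):
    """Go from a coefficient and standard error to an odds ratio and interval."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


# Male sex, as reported in the abstract: OR = 2.127, 95% CI = 1.012-4.471.
beta, se = log_odds_summary(2.127, 1.012, 4.471)
print(round(beta, 3), round(se, 3))
print([round(v, 3) for v in odds_ratio_with_ci(beta, se)])
```

    An interval whose lower bound sits just above 1, as it does for male sex here, corresponds to a coefficient roughly two standard errors from zero, so the association is only narrowly significant at the 5% level.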

  4. Reverse Logistics

    OpenAIRE

    Kulikova, Olga

    2016-01-01

    This thesis was focused on the analysis of the concept of reverse logistics and actual reverse processes which are implemented in mining industry and finding solutions for the optimization of reverse logistics in this sphere. The objective of this paper was the assessment of the development of reverse logistics in mining industry on the example of potash production. The theoretical part was based on reverse logistics and mining waste related literature and provided foundations for further...

  5. LOGISTIC MANAGEMENT

    OpenAIRE

    Florin Tudor Ph. D Student

    2011-01-01

    Logistics is the support function of an organization and it means having the right object, at the right place, at the right time. Logistic management involves planning, organizing, leading, coordinating and controlling the logistic activities of an organization. In military science, maintaining one's supply lines while disrupting those of the enemy is a crucial element of military strategy, since an armed force without resources and transportation is defenseless. Logistics is the management of...

  6. New solution for MSW management. A regression model for the estimation of logistic costs; Nuove soluzioni alla gestione dei RSU. Un modello di regressione per la valutazione dei costi logici

    Energy Technology Data Exchange (ETDEWEB)

    Saccardi, D.; Rapaccini, M.; Tucci, M. [Florence Univ., Florence (Italy). Dipt. di Energetica, Sez. Impianti e Tecnologie Industriali

    2000-12-01

    Starting from a set of real data on the electronic weighing of vehicles used in non-separate collection circuits, it was possible to build a multivariate linear regression model for the analytic representation of the logistic costs of the MSW management system in Florence. The data were supplied by the 'consorzio Quadrifoglio', which manages the main public hygiene and environmental services at issue. The regression model, which has undergone an accurate validation, enables the estimation of the total logistic cost associated with a specific collection area. A global logistic parameter established on the basis of this result makes it possible to express, on the whole, the provision of the service, in both the planning and executive phases.

  7. Social anxiety and alcohol-related sexual victimization: A longitudinal pilot study of college women.

    Science.gov (United States)

    Schry, Amie R; Maddox, Brenna B; White, Susan W

    2016-10-01

    We sought to examine social anxiety as a risk factor for alcohol-related sexual victimization among college women. Women (Time 1: n = 574; Time 2: n = 88) who reported consuming alcohol at least once during the assessment timeframe participated. Social anxiety, alcohol use, alcohol-related consequences, and sexual victimization were assessed twice, approximately two months apart. Logistic regressions were used to examine social anxiety as a risk factor for alcohol-related sexual victimization at both time points. Longitudinally, women high in social anxiety were approximately three times more likely to endorse unwanted alcohol-related sexual experiences compared to women with low to moderate social anxiety. This study suggests social anxiety, a modifiable construct, increases risk for alcohol-related sexual victimization among college women. Implications for clinicians and risk-reduction program developers are discussed. Published by Elsevier Ltd.

  8. The role of alcohol consumption in female victimization: findings from a French representative sample.

    Science.gov (United States)

    Bègue, Laurent; Pérez-Diaz, Claudine; Subra, Baptiste; Ceaux, Emmanuelle; Arvers, Philippe; Bricout, Véronique Aurélie; Roché, Sebastian; Swendsen, Joel; Zorman, Michel

    2012-01-01

    Alcohol is frequently related to interpersonal aggression, but information regarding the role of alcohol consumption by victims of severe aggression is lacking. In order to better understand the dynamic of victimization, we investigated contextual, facilitator, and psychological impact variables related to victimization in a French sample composed of 1,033 females aged 18-74 years. The participants were recruited using quota sampling methodology, and responses were measured using Computer-Assisted Self-Interviewer. A logistic regression was conducted using a backward elimination procedure to identify the significant predictors of blows and wounds suffered in the past 24 months. The results indicated that victims, relative to nonvictims, did binge drink significantly more often, had a higher aggression trait, and had experienced more social hardships in the past. The study's limitations are noted.

  9. Maternal depression and bullying victimization among adolescents: Results from the 2004 Pelotas cohort study.

    Science.gov (United States)

    Azeredo, Catarina Machado; Santos, Iná S; Barros, Aluísio J D; Barros, Fernando C; Matijasevich, Alicia

    2017-10-01

    Maternal depression impacts on several detrimental outcomes during a child's life course, and could increase their risk of victimization. This longitudinal study examined the association between antenatal maternal depression, postnatal trajectories, and current maternal depression and offspring bullying victimization at 11 years. We included 3,441 11-year-old adolescents from the 2004 Pelotas Cohort Study. Antenatal maternal depression, postnatal trajectories, and current maternal depression data were assessed during the follow-up waves. Bullying victimization was self-reported by the adolescents. We used ordinal logistic regression to estimate the odds ratio (OR) and 95% confidence intervals (CIs), for the association between maternal depression and offspring bullying victimization. The most prevalent type of bullying was verbal victimization (37.9%). We observed a positive association between antenatal maternal depression, postnatal trajectories, and current maternal depression and physical bullying victimization. Maternal mood symptoms during pregnancy were associated with physical (OR = 1.30, 95%CI = 1.11-1.53), verbal (OR = 1.29, 95%CI = 1.12-1.49), and any victimization (OR = 1.22, 95%CI = 1.05-1.41). Severe current maternal depression was associated with physical (OR = 1.34, 95%CI = 1.10-1.62), social manipulation (OR = 1.29, 95%CI = 1.08-1.53), attacks on property (OR = 1.30, 95%CI = 1.08-1.57) and any victimization (OR = 1.32, 95%CI = 1.12-1.56). Regarding maternal depression trajectories, the "chronic-high" group was associated with higher risk of social manipulation, attacks on property and any victimization, than the "low" group. Our results strengthen the evidence of association between maternal depression and offspring bullying victimization, and physical victimization appears to be the main component. Further studies are warranted to confirm our findings and to elucidate the theoretical pathways for this longitudinal association. © 2017 Wiley

  10. Peer victimization during adolescence and risk for anxiety disorders in adulthood: a prospective cohort study.

    Science.gov (United States)

    Stapinski, Lexine A; Bowes, Lucy; Wolke, Dieter; Pearson, Rebecca M; Mahedy, Liam; Button, Katherine S; Lewis, Glyn; Araya, Ricardo

    2014-07-01

    Peer victimization is ubiquitous across schools and cultures, and has been suggested as one developmental pathway to anxiety disorders. However, there is a dearth of prospective studies examining this relationship. The purpose of this cohort study was to examine the association between peer victimization during adolescence and subsequent anxiety diagnoses in adulthood. A secondary aim was to investigate whether victimization increases risk for severe anxiety presentations involving diagnostic comorbidity. The sample comprised 6,208 adolescents from the Avon Longitudinal Study of Parents and Children who were interviewed about experiences of peer victimization at age 13. Maternal report of her child's victimization was also assessed. Anxiety disorders at age 18 were assessed with the Clinical Interview Schedule-Revised. Multivariable logistic regression was used to examine the association between victimization and anxiety diagnoses adjusted for potentially confounding individual and family factors. Sensitivity analyses explored whether the association was independent of diagnostic comorbidity with depression. Frequently victimized adolescents were two to three times more likely to develop an anxiety disorder than nonvictimized adolescents (OR = 2.49, 95% CI: 1.62-3.85). The association remained after adjustment for potentially confounding individual and family factors, and was not attributable to diagnostic overlap with depression. Frequently victimized adolescents were also more likely to develop multiple internalizing diagnoses in adulthood. Victimized adolescents are at increased risk of anxiety disorders in later life. Interventions to reduce peer victimization and provide support for victims may be an effective strategy for reducing the burden associated with these disorders. © 2014 The Authors. Depression and Anxiety published by Wiley Periodicals, Inc.

  11. Demographic, Psychological, and School Environment Correlates of Bullying Victimization and School Hassles in Rural Youth

    Directory of Open Access Journals (Sweden)

    Paul R. Smokowski

    2013-01-01

    Little is known about bullying in rural areas. The participants in this study included 3,610 racially diverse youth (average age = 12.8) from 28 rural schools who completed the School Success Profile-Plus. Binary logistic regression models were created to predict bullying victimization in the past 12 months, and ordered logistic regression was used to predict school hassles in the past 12 months. Overall, 22.71% of the sample experienced bullying victimization and school victimization rates ranged from 11% to 38%. Risk factors for bullying victimization included younger students and students experiencing depression and anxiety. Being female, Hispanic/Latino or African American, was associated with lower bullying victimization. Thirty-nine percent of the sample reported a high level of school hassles. Younger students and students with higher levels of anxiety and depression were at increased risk for school hassles. Students from larger schools reported high levels of school hassles, while students from schools with more teachers with advanced degrees reported fewer school hassles.

  12. Computational Logistics

    DEFF Research Database (Denmark)

    Pacino, Dario; Voss, Stefan; Jensen, Rune Møller

    2013-01-01

    This book constitutes the refereed proceedings of the 4th International Conference on Computational Logistics, ICCL 2013, held in Copenhagen, Denmark, in September 2013. The 19 papers presented in this volume were carefully reviewed and selected for inclusion in the book. They are organized in topical sections named: maritime shipping, road transport, vehicle routing problems, aviation applications, and logistics and supply chain management.

  13. Computational Logistics

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 4th International Conference on Computational Logistics, ICCL 2013, held in Copenhagen, Denmark, in September 2013. The 19 papers presented in this volume were carefully reviewed and selected for inclusion in the book. They are organized in topical sections named: maritime shipping, road transport, vehicle routing problems, aviation applications, and logistics and supply chain management.

  14. Reverse logistics

    NARCIS (Netherlands)

    M.P. de Brito (Marisa); S.D.P. Flapper; R. Dekker (Rommert)

    2002-01-01

    This paper gives an overview of scientific literature that describes and discusses cases of reverse logistics activities in practice. Over sixty case studies are considered. Based on these studies we are able to indicate critical factors for the practice of reverse logistics. In

  15. Feminism, status inconsistency, and women's intimate partner victimization in heterosexual relationships.

    Science.gov (United States)

    Franklin, Cortney A; Menaker, Tasha A

    2014-07-01

    This study used a random community sample of 303 women in romantic relationships to investigate the role of educational and employment status inconsistency and patriarchal family ideology as risk factors for intimate partner violence (IPV) victimization, while considering demographic factors and relationship context variables. Sequential multivariate logistic regression models demonstrated a decrease in the odds of IPV victimization for Hispanic women and women who were older as compared with their counterparts. In addition, increased relationship distress, family-of-origin violence, and employment status inconsistency significantly increased the odds of IPV. Clinical intervention strategies and future research directions are discussed. © The Author(s) 2014.

  16. Childhood Victimization and Crime Victimization

    Science.gov (United States)

    McIntyre, Jared Kean; Widom, Cathy Spatz

    2011-01-01

    The purpose of this study is to determine whether abused and neglected children are at increased risk for subsequent crime victimization. We ask four basic questions: (a) Does a history of child abuse/neglect increase one's risk of physical, sexual, and property crime victimization? (b) Do lifestyle characteristics (prostitution, running away,…

  17. Utilização de estratificação e modelo de regressão logística na análise de dados de estudos caso-controle; Using of stratification and the logistic regression model in the analysis of data of case-control studies

    Directory of Open Access Journals (Sweden)

    Suely Godoy Agostinho Gimeno

    1995-08-01

    Data of a case-control study of esophageal cancer were used as an example of the use of multivariate analysis with stratification and logistic regression. Eighty-five cases and 292 controls were classified according to sex, age and smoking and drinking habits. The point estimates of the odds ratios were similar, and the techniques were considered complementary.

  18. For Whom Does Hate Crime Hurt More? A Comparison of Consequences of Victimization Across Motives and Crime Types.

    Science.gov (United States)

    Mellgren, Caroline; Andersson, Mika; Ivert, Anna-Karin

    2017-12-01

    Hate crimes have been found to have more severe consequences than other parallel crimes that were not motivated by the offenders' hostility toward someone because of their real or perceived difference. Many countries today have hate crime laws that make it possible to increase the penalties for such crimes. The main critique against hate crime laws is that they punish thoughts. Instead, proponents of hate crime laws argue that sentence enhancement is justified because hate crimes cause greater harm. This study compares consequences of victimization across groups of victims to test for whom hate crimes hurt more. We analyzed data that were collected through questionnaires distributed to almost 3,000 students at Malmö University, Sweden, during 2013. The survey focused on students' exposure to, and experiences of, hate crime. A series of separate logistic regression analyses were performed, which analyzed the likelihood for reporting consequences following a crime depending on crime type, perceived motive, repeat victimization, gender, and age. Analyzed as one victim group, victims of hate crime more often reported any of the consequences following a crime compared with victims of parallel non-hate-motivated crimes. And, overall victims of threat more often reported consequences compared with victims of sexual harassment and minor assault. However, all hate crime victim groups did not report more consequences than the non-hate crime victim group. The results provide grounds for questioning that hate crimes hurt the individual victim more. It seems that hate crimes do not hurt all more but hate crimes hurt some victims of some crimes more in some ways.

  19. Risk Factors for Social Networking Site Scam Victimization Among Malaysian Students.

    Science.gov (United States)

    Kirwan, Gráinne H; Fullwood, Chris; Rooney, Brendan

    2018-02-01

    Social networking sites (SNSs) can provide cybercriminals with various opportunities, including gathering of user data and login credentials to enable fraud, and directing of users toward online locations that may install malware onto their devices. The techniques employed by such cybercriminals can include clickbait (text or video), advertisement of nonexistent but potentially desirable products, and hoax competitions/giveaways. This study aimed to identify risk factors associated with falling victim to these malicious techniques. An online survey was completed by 295 Malaysian undergraduate students, finding that more than one-third had fallen victim to SNS scams. Logistic regression analysis identified several victimization risk factors including having higher scores in impulsivity (specifically cognitive complexity), using fewer devices for SNSs, and having been on an SNS for a longer duration. No reliable model was found for vulnerability to hoax valuable gift giveaways and "friend view application" advertising specifically, but vulnerability to video clickbait was predicted by lower extraversion scores, higher levels of openness to experience, using fewer devices, and being on an SNS for a longer duration. Other personality traits were not associated with either overall victimization susceptibility or increased risk of falling victim to the specific techniques. However, age approached significance within both the video clickbait and overall victimization models. These findings suggest that routine activity theory may be particularly beneficial in understanding and preventing SNSs scam victimization.

  20. Impact of bullying victimization on suicide and negative health behaviors among adolescents in Latin America.

    Science.gov (United States)

    Romo, Matthew L; Kelvin, Elizabeth A

    2016-11-01

    To compare the prevalence of bullying victimization, suicidal ideation, suicidal attempts, and negative health behaviors (current tobacco use, recent heavy alcohol use, truancy, involvement in physical fighting, and unprotected sexual intercourse) in five different Latin American countries and determine the association of bullying victimization with these outcomes, exploring both bullying type and frequency. Study data were from Global School-based Student Health Surveys from Bolivia, Costa Rica, Honduras, Peru, and Uruguay, which covered nationally representative samples of school-going adolescents. The surveys used a two-stage clustered sample design, sampling schools and then classrooms. Logistic regression models were run to determine the statistical significance of associations with bullying. Among the 14 560 school-going adolescents included in this study, the prevalence of any bullying victimization in the past 30 days was 37.8%. Bullying victimization was associated with greater odds of suicidal ideation with planning (adjusted odds ratio (AOR): 3.12; P [...] suicide attempt (AOR: 3.07; P [...] bullying victimization on suicide outcomes was also observed. Bullying victimization was associated with higher odds of current tobacco use (AOR: 2.14; P [...] bullying victimization varied by country, its association with suicidal ideation and behavior and negative health behaviors remained relatively consistent. Addressing bullying needs to be made a priority in Latin America, and an integrated approach that also includes mental and physical health promotion is needed.

  1. Increased risk of sadness and suicidality among victims of bullying experiencing additional threats to physical safety.

    Science.gov (United States)

    Pham, Tammy B; Adesman, Andrew

    2017-11-23

    Objective: To examine, in a nationally-representative sample of high school students, to what extent one or more additional threats to physical safety exacerbates the risk of sadness and suicidality among victims of school and/or cyber-bullying. Methods: National data from the 2015 Youth Risk Behavior Survey (YRBS) were analyzed for grades 9-12 (n = 15,624). Victimization groups were characterized by school-bullying and cyber-bullying, with and without additional threats to physical safety: fighting at school, being threatened/injured at school, and skipping school out of fear for one's safety. Outcomes included 2-week sadness and suicidality. Outcomes for victimization groups were compared to non-victims using logistic regression adjusting for sex, grade and race/ethnicity. Results: Overall, 20.2% of students were school-bullied, and 15.5% were cyber-bullied in the past year. Compared to non-victims, victims of school-bullying and victims of cyber-bullying (VoCBs) who did not experience additional threats to physical safety were 2.76 and 3.83 times more likely to report 2-week sadness, and 3.39 and 3.27 times more likely to exhibit suicidality, respectively. Conversely, victims of bullying who experienced one or more additional threats to physical safety were successively more likely to report these adverse outcomes. Notably, victims of school-bullying and VoCBs with all three additional risk factors were 13.13 and 17.75 times more likely to exhibit suicidality, respectively. Conclusion: Risk of depression symptoms and suicidality among victims of school-bullying and/or cyber-bullying is greatly increased among those who have experienced additional threats to physical safety: fighting at school, being threatened/injured at school and skipping school out of fear for their safety.

  2. Logistical Worlds

    Directory of Open Access Journals (Sweden)

    Ned Rossiter

    2014-03-01

    As the managerial art and science of coordinating the movement of people, finance and things, logistical operations are central to contemporary capital. Despite its materiality in the form of communications and transport infrastructure, logistics remains an abstract machine for many. This is largely due to the compartmental structure of global supply chains and the invisibility of code. In registering the mediating force of logistics, the essay considers parametric politics as an architecture of intervention for both game design and software development. There are implications here not only for gameplay, but also the invention of method and governance of labour. How, for instance, might game design facilitate the production of a political knowledge of logistics? This becomes a matter to address for labour power vis-à-vis collective research on infrastructure, software and global supply chains.

  3. Understanding victimization

    DEFF Research Database (Denmark)

    Barslund, Mikkel Christoffer; Rand, John; Tarp, Finn

    2007-01-01

    This paper analyzes how economic and non-economic characteristics at the individual, household, and community level affect the risk of victimization in Mozambique. We use a countrywide representative household survey from Mozambique with unique individual level information and show that the probability of being victimized is increasing in income, but at a diminishing rate. The effect of income is dependent on the type of crime, and poorer households are vulnerable. While less at risk of victimization, they suffer relatively greater losses when such shocks occur. Lower inequality and increased...

  4. Understanding Victimization

    DEFF Research Database (Denmark)

    Barslund, Mikkel; Rand, John; Tarp, Finn

    2007-01-01

    This paper analyzes how economic and non-economic characteristics at the individual, household, and community level affect the risk of victimization in Mozambique. We use a countrywide representative household survey from Mozambique with unique individual level information and show that the probability of being victimized is increasing in income, but at a diminishing rate. The effect of income is dependent on the type of crime, and poorer households are vulnerable. While less at risk of victimization, they suffer relatively greater losses when such shocks occur. Lower inequality and increased...

  5. The Influence of Witnessing Inter-parental Violence and Bullying Victimization in Involvement in Fighting among Adolescents: Evidence from a School-based Cross-sectional Survey in Peru

    OpenAIRE

    Sharma, Bimala; Nam, Eun Woo; Kim, Ha Yun; Kim, Jong Koo

    2016-01-01

    Background: Witnessing inter-parental violence and bullying victimization is common for many children and adolescents. This study examines the role of witnessing inter-parental violence and bullying victimization in involvement in physical fighting among Peruvian adolescents. Methods: A cross-sectional study was conducted among 1,368 randomly selected adolescents in 2015. We conducted logistic regression analyses to obtain crude and adjusted odds ratios with 95% confidence intervals for involve...

  6. The role of visual markers in police victimization among structurally vulnerable persons in Tijuana, Mexico.

    Science.gov (United States)

    Pinedo, Miguel; Burgos, Jose Luis; Ojeda, Adriana Vargas; FitzGerald, David; Ojeda, Victoria D

    2015-05-01

    Law enforcement can shape HIV risk behaviours and undermine strategies aimed at curbing HIV infection. Little is known about factors that increase vulnerability to police victimization in Mexico. This study identifies correlates of police or army victimization (i.e., harassment or assault) in the past 6 months among patients seeking care at a free clinic in Tijuana, Mexico. From January to May 2013, 601 patients attending a binational student-run free clinic completed an interviewer-administered questionnaire. Eligible participants were: (1) ≥18 years old; (2) seeking care at the clinic; and (3) spoke Spanish or English. Multivariate logistic regression analyses identified factors associated with police/army victimization in the past 6 months. More than one-third (38%) of participants reported victimization by police/army officials in the past 6 months in Tijuana. In multivariate logistic regression analyses, males (adjusted odds ratio (AOR): 3.68; 95% CI: 2.19-6.19), tattooed persons (AOR: 1.56; 95% CI: 1.04-2.33) and those who injected drugs in the past 6 months (AOR: 2.11; 95% CI: 1.29-3.43) were significantly more likely to report past 6-month police/army victimization. Recent feelings of rejection (AOR: 3.80; 95% CI: 2.47-5.85) and being denied employment (AOR: 2.23; 95% CI: 1.50-3.32) were also independently associated with police/army victimization. Structural interventions aimed at reducing stigma against vulnerable populations and increasing social incorporation may aid in reducing victimization events by police/army in Tijuana. Police education and training to reduce abusive policing practices may be warranted. Copyright © 2014 Elsevier B.V. All rights reserved.

  7. The Influence of Witnessing Inter-parental Violence and Bullying Victimization in Involvement in Fighting among Adolescents: Evidence from a School-based Cross-sectional Survey in Peru.

    Science.gov (United States)

    Sharma, Bimala; Nam, Eun Woo; Kim, Ha Yun; Kim, Jong Koo

    2016-03-01

    Witnessing inter-parental violence and bullying victimization is common for many children and adolescents. This study examines the role of witnessing inter-parental violence and bullying victimization in involvement in physical fighting among Peruvian adolescents. A cross-sectional study was conducted among 1,368 randomly selected adolescents in 2015. We conducted logistic regression analyses to obtain crude and adjusted odds ratios with 95% confidence intervals for involvement in fighting among male and female adolescents. Among all adolescents, 35.8% had been involved in fighting in the last 12 months, 32.9% had been the victim of verbal bullying and 37.9% had been the victim of physical bullying. Additionally, 39.2% and 27.8% of adolescents witnessed violence against their mother and father, respectively, at least once in their lives. Multivariate logistic regression analyses found that late adolescence, participation in economic activities, being the victim of verbal bullying, stress, and witnessing violence against the father among male adolescents, and self-rated academic performance and being the victim of physical or verbal bullying among female adolescents were associated with higher odds of being involved in fighting. Verbal bullying victimization and witnessing violence against the father in males and bullying victimization in females were associated with greater odds of adolescents being involved in fighting. Creating a non-violent environment at both home and school would be an effective strategy for reducing fighting among the adolescent population.

  8. Cyber and traditional bullying victimization as a risk factor for mental health problems and suicidal ideation in adolescents.

    Science.gov (United States)

    Bannink, Rienke; Broeren, Suzanne; van de Looij-Jansen, Petra M; de Waart, Frouwkje G; Raat, Hein

    2014-01-01

    To examine whether traditional and cyber bullying victimization were associated with adolescents' mental health problems and suicidal ideation at two-year follow-up. Gender differences were explored to determine whether bullying affects boys and girls differently. A two-year longitudinal study was conducted among first-year secondary school students (N = 3181). Traditional and cyber bullying victimization were assessed at baseline, whereas mental health status and suicidal ideation were assessed at baseline and follow-up by means of self-report questionnaires. Logistic regression analyses were conducted to assess associations between these variables while controlling for baseline problems. Additionally, we tested whether gender differences in mental health and suicidal ideation were present for the two types of bullying. There was a significant interaction between gender and traditional bullying victimization and between gender and cyber bullying victimization on mental health problems. Among boys, traditional and cyber bullying victimization were not related to mental health problems after controlling for baseline mental health. Among girls, both traditional and cyber bullying victimization were associated with mental health problems after controlling for baseline mental health. No significant interaction between gender and traditional or cyber bullying victimization on suicidal ideation was found. Traditional bullying victimization was associated with suicidal ideation, whereas cyber bullying victimization was not associated with suicidal ideation after controlling for baseline suicidal ideation. Traditional bullying victimization is associated with an increased risk of suicidal ideation, whereas traditional, as well as cyber bullying victimization is associated with an increased risk of mental health problems among girls. These findings stress the importance of programs aimed at reducing bullying behavior, especially because early-onset mental health problems

  9. Cyber and Traditional Bullying Victimization as a Risk Factor for Mental Health Problems and Suicidal Ideation in Adolescents

    Science.gov (United States)

    Bannink, Rienke; Broeren, Suzanne; van de Looij-Jansen, Petra M.; de Waart, Frouwkje G.; Raat, Hein

    2014-01-01

    Purpose To examine whether traditional and cyber bullying victimization were associated with adolescent's mental health problems and suicidal ideation at two-year follow-up. Gender differences were explored to determine whether bullying affects boys and girls differently. Methods A two-year longitudinal study was conducted among first-year secondary school students (N = 3181). Traditional and cyber bullying victimization were assessed at baseline, whereas mental health status and suicidal ideation were assessed at baseline and follow-up by means of self-report questionnaires. Logistic regression analyses were conducted to assess associations between these variables while controlling for baseline problems. Additionally, we tested whether gender differences in mental health and suicidal ideation were present for the two types of bullying. Results There was a significant interaction between gender and traditional bullying victimization and between gender and cyber bullying victimization on mental health problems. Among boys, traditional and cyber bullying victimization were not related to mental health problems after controlling for baseline mental health. Among girls, both traditional and cyber bullying victimization were associated with mental health problems after controlling for baseline mental health. No significant interaction between gender and traditional or cyber bullying victimization on suicidal ideation was found. Traditional bullying victimization was associated with suicidal ideation, whereas cyber bullying victimization was not associated with suicidal ideation after controlling for baseline suicidal ideation. Conclusions Traditional bullying victimization is associated with an increased risk of suicidal ideation, whereas traditional, as well as cyber bullying victimization is associated with an increased risk of mental health problems among girls. These findings stress the importance of programs aimed at reducing bullying behavior, especially

  10. Cyber and traditional bullying victimization as a risk factor for mental health problems and suicidal ideation in adolescents.

    Directory of Open Access Journals (Sweden)

    Rienke Bannink

    PURPOSE: To examine whether traditional and cyber bullying victimization were associated with adolescent's mental health problems and suicidal ideation at two-year follow-up. Gender differences were explored to determine whether bullying affects boys and girls differently. METHODS: A two-year longitudinal study was conducted among first-year secondary school students (N = 3181). Traditional and cyber bullying victimization were assessed at baseline, whereas mental health status and suicidal ideation were assessed at baseline and follow-up by means of self-report questionnaires. Logistic regression analyses were conducted to assess associations between these variables while controlling for baseline problems. Additionally, we tested whether gender differences in mental health and suicidal ideation were present for the two types of bullying. RESULTS: There was a significant interaction between gender and traditional bullying victimization and between gender and cyber bullying victimization on mental health problems. Among boys, traditional and cyber bullying victimization were not related to mental health problems after controlling for baseline mental health. Among girls, both traditional and cyber bullying victimization were associated with mental health problems after controlling for baseline mental health. No significant interaction between gender and traditional or cyber bullying victimization on suicidal ideation was found. Traditional bullying victimization was associated with suicidal ideation, whereas cyber bullying victimization was not associated with suicidal ideation after controlling for baseline suicidal ideation. CONCLUSIONS: Traditional bullying victimization is associated with an increased risk of suicidal ideation, whereas traditional, as well as cyber bullying victimization is associated with an increased risk of mental health problems among girls. These findings stress the importance of programs aimed at reducing bullying

  11. Adolescent predictors of young adult cyberbullying perpetration and victimization among Australian youth.

    Science.gov (United States)

    Hemphill, Sheryl A; Heerde, Jessica A

    2014-10-01

    The purpose of the current article was to examine the adolescent risk and protective factors (at the individual, peer group, and family level) for young adult cyberbullying perpetration and victimization. Data from 2006 (Grade 9) to 2010 (young adulthood) were analyzed from a community sample of 927 Victorian students originally recruited as a statewide representative sample in Grade 5 (age, 10-11 years) in 2002 and followed-up to age 18-19 years in 2010 (N = 809). Participants completed a self-report survey on adolescent risk and protective factors and traditional and cyberbullying perpetration and victimization and young adult cyberbullying perpetration and victimization. As young adults, 5.1% self-reported cyberbullying perpetration only, 5.0% reported cyberbullying victimization only, and 9.5% reported both cyberbullying perpetration and victimization. In fully adjusted logistic regression analyses, the adolescent predictors of cyberbullying perpetration only were traditional bullying perpetration, traditional bullying perpetration and victimization, and poor family management. For young adulthood cyberbullying victimization only, the adolescent predictor was emotion control. The adolescent predictors for young adult cyberbullying perpetration and victimization were traditional bullying perpetration and cyberbullying perpetration and victimization. Based on the results of this study, possible targets for prevention and early intervention are reducing adolescent involvement in (traditional or cyber) bullying through the development of social skills and conflict resolution skills. In addition, another important prevention target is to support families with adolescents to ensure that they set clear rules and monitor adolescents' behavior. Universal programs that assist adolescents to develop skills in emotion control are warranted. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.

  12. Adolescent predictors of young adult cyber-bullying perpetration and victimization among Australian youth

    Science.gov (United States)

    Hemphill, Sheryl A.; Heerde, Jessica A.

    2014-01-01

    Purpose The purpose of the current paper was to examine the adolescent risk and protective factors (at the individual, peer group, and family level) for young adult cyber-bullying perpetration and victimization. Methods Data from 2006 (Grade 9) to 2010 (young adulthood) were analyzed from a community sample of 927 Victorian students originally recruited as a state-wide representative sample in Grade 5 (age 10–11 years) in 2002 and followed up to age 18–19 years in 2010 (N = 809). Participants completed a self-report survey on adolescent risk and protective factors and traditional and cyber-bullying perpetration and victimization, and young adult cyber-bullying perpetration and victimization. Results As young adults, 5.1% self-reported cyber-bullying perpetration only, 5.0% cyber-bullying victimization only, and 9.5% reported both cyber-bullying perpetration and victimization. In fully adjusted logistic regression analyses, the adolescent predictors of cyber-bullying perpetration only were traditional bullying perpetration, traditional bullying perpetration and victimization, and poor family management. For young adulthood cyber-bullying victimization only, the adolescent predictor was emotion control. The adolescent predictors for young adult cyber-bullying perpetration and victimization were traditional bullying perpetration and cyber-bullying perpetration and victimization. Conclusions Based on the results of this study, possible targets for prevention and early intervention are reducing adolescent involvement in (traditional or cyber-) bullying through the development of social skills and conflict resolution skills. In addition, another important prevention target is to support families with adolescents to ensure they set clear rules and monitor adolescent’s behavior. Universal programs that assist adolescents to develop skills in emotion control are warranted. PMID:24939014

  13. Victimization and restricted participation among young people with disabilities in the US child welfare system.

    Science.gov (United States)

    Berg, Kristin L; Shiu, Cheng-Shi; Msall, Michael E; Acharya, Kruti

    2015-06-01

    The aim of this study was to assess the role of disability and victimization in young people's participation in developmentally salient activities by analyzing a nationally representative group of young people from the child welfare system (CWS). Data were obtained from interviews with young people and their parents, recorded by the second National Survey of Child and Adolescent Well-Being (NSCAW II). The sample group consisted of 405 females and 270 males, ranging in age from 11 to 17 years (mean age 13y 6mo), and residing with families throughout the USA. The relationships among disability status, victimization, and participation were explored using weighted logistic regression analysis. Controlling for demographical and family-related factors, the probability of young people with disabilities (YWD), involved with the CWS, reporting two or more victimizations was 120% higher (p<0.01) than that of young people without disabilities. YWD in the CWS were almost twice as likely as young people without disabilities to report participation in only one or no developmentally salient activities. Controlling for all other variables, the odds of restricted participation were 6.8-fold higher (p<0.05) for victimized YWD in the CWS. Young people with disabilities who report victimization are significantly less likely than their typically developing peers to participate in developmentally salient activities. Without coordinated efforts to prevent victimization of YWD in the CWS, there will be significant barriers to their participation, well-being, and independent living outcomes. © 2015 Mac Keith Press.
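    As a reading aid for odds-based results such as the 6.8-fold figure in the record above, the following is a minimal sketch of how an odds ratio translates into a probability. The function name `apply_odds_ratio` and the 20% baseline probability are assumptions chosen purely for illustration; the abstract does not report a baseline rate, so the printed value is not a result of the study.

```python
def apply_odds_ratio(baseline_prob: float, odds_ratio: float) -> float:
    """Return the probability implied by multiplying the baseline odds by odds_ratio."""
    if not 0.0 < baseline_prob < 1.0:
        raise ValueError("baseline_prob must be strictly between 0 and 1")
    if odds_ratio <= 0.0:
        raise ValueError("odds_ratio must be positive")
    # Convert probability to odds, scale the odds, then convert back.
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Hypothetical 20% baseline (an assumption); 6.8 is the odds ratio quoted in the abstract.
p = apply_odds_ratio(0.20, 6.8)
print(f"{p:.3f}")
```

    Under that assumed baseline, odds that are 6.8 times higher correspond to a probability of roughly 63%, not 6.8 times 20%. Odds ratios and probability ratios diverge as the baseline rate grows, which is why the two should not be read interchangeably.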

  14. Examining the contemporaneous occurrence of bullying and teen dating violence victimization.

    Science.gov (United States)

    Debnam, Katrina J; Waasdorp, Tracy E; Bradshaw, Catherine P

    2016-03-01

    Teen dating violence (TDV) is a preventable public health issue that has been linked to other forms of aggression and violence victimization. It is also a growing concern for school psychologists who may be working to prevent TDV and related behavioral problems, like bullying. The current study examined various forms of bullying victimization (verbal, physical, and relational) and their association with physical and emotional TDV. Self-report data from 17,780 adolescents (33% African American, 54% White) in Grades 9-12 across 58 high schools were analyzed using 3-level models with dichotomous outcomes. Multilevel logistic regressions indicated that adolescents who had experienced bullying (physical, relational, and verbal) were more likely to have also experienced physical and emotional dating violence. Perceived norms about students' and adults' bullying interventions were associated with reduced odds of physical (OR(adults) = .82, p bullying victimization to design and enhance prevention efforts that address both forms of violence. (c) 2016 APA, all rights reserved.

  15. Adult Victimization in Female Survivors of Childhood Violence and Abuse: The Contribution of Multiple Types of Violence.

    Science.gov (United States)

    Aakvaag, Helene Flood; Thoresen, Siri; Wentzel-Larsen, Tore; Dyb, Grete

    2016-08-30

    Child sexual abuse (CSA) is a well-established risk factor for adult victimization in women, but little is known about the importance of relationship to perpetrator and exposure to other violence types. This study interviewed 2,437 Norwegian women (response rate = 45.0%) about their experiences with violence. Logistic regression analyses were employed to estimate associations of multiple categories of childhood violence with adult victimization. Women exposed to CSA often experienced other childhood violence, and the total burden of violence was associated with adult rape and intimate partner violence (IPV). Researchers and clinicians need to take into account the full spectrum of violence exposure. © The Author(s) 2016.

  16. Logistic and linear regression model documentation for statistical relations between continuous real-time and discrete water-quality constituents in the Kansas River, Kansas, July 2012 through June 2015

    Science.gov (United States)

    Foster, Guy M.; Graham, Jennifer L.

    2016-04-06

    The Kansas River is a primary source of drinking water for about 800,000 people in northeastern Kansas. Source-water supplies are treated by a combination of chemical and physical processes to remove contaminants before distribution. Advanced notification of changing water-quality conditions and cyanobacteria and associated toxin and taste-and-odor compounds provides drinking-water treatment facilities time to develop and implement adequate treatment strategies. The U.S. Geological Survey (USGS), in cooperation with the Kansas Water Office (funded in part through the Kansas State Water Plan Fund), and the City of Lawrence, the City of Topeka, the City of Olathe, and Johnson County Water One, began a study in July 2012 to develop statistical models at two Kansas River sites located upstream from drinking-water intakes. Continuous water-quality monitors have been operated and discrete water-quality samples have been collected on the Kansas River at Wamego (USGS site number 06887500) and De Soto (USGS site number 06892350) since July 2012. Continuous and discrete water-quality data collected during July 2012 through June 2015 were used to develop statistical models for constituents of interest at the Wamego and De Soto sites. Logistic models to continuously estimate the probability of occurrence above selected thresholds were developed for cyanobacteria, microcystin, and geosmin. Linear regression models to continuously estimate constituent concentrations were developed for major ions, dissolved solids, alkalinity, nutrients (nitrogen and phosphorus species), suspended sediment, indicator bacteria (Escherichia coli, fecal coliform, and enterococci), and actinomycetes bacteria. These models will be used to provide real-time estimates of the probability that cyanobacteria and associated compounds exceed thresholds and of the concentrations of other water-quality constituents in the Kansas River. The models documented in this report are useful for characterizing changes

  17. What can dissaving tell us about catastrophic costs? Linear and logistic regression analysis of the relationship between patient costs and financial coping strategies adopted by tuberculosis patients in Bangladesh, Tanzania and Bangalore, India.

    Science.gov (United States)

    Madan, Jason; Lönnroth, Knut; Laokri, Samia; Squire, Stephen Bertel

    2015-10-22

    Tuberculosis (TB) is a major global public health problem which affects the poorest individuals worst. A high proportion of patients incur 'catastrophic costs' which have been shown to result in severe financial hardship and adverse health outcomes. Data on catastrophic cost incidence are not routinely collected, and current definitions of this indicator involve several practical and conceptual barriers to doing so. We analysed data from TB programmes in India (Bangalore), Bangladesh and Tanzania to determine whether dissaving (the sale of assets or uptake of loans) is a useful indicator of financial hardship. Data were obtained from prior studies of TB patient costs in Bangladesh (N = 96), Tanzania (N = 94) and Bangalore (N = 891). These data were analysed using logistic and linear multivariate regression to determine the association between costs (absolute and relative to income) and both the presence of dissaving and the amounts dissaved. After adjusting for covariates such as age, sex and rural/urban location, we found a significant positive association between the occurrence of dissaving and total costs incurred in Tanzania and Bangalore. We further found that, for patients in Bangalore, an increase in dissaving of US$10 was associated with an increase in the cost-income ratio of 0.10 (p financial hardship in its own right, and could therefore play an important role as an indicator to monitor and evaluate the impact of financial protection and service delivery interventions in reducing hardship and facilitating universal health coverage. Further research is required to understand the patterns and types of dissaving that have the strongest relationship with financial hardship and clinical outcomes in order to move toward evidence-based policy making.

  18. Multiple Logistic Regression Analysis of Risk Factors Associated with Denture Plaque and Staining in Chinese Removable Denture Wearers over 40 Years Old in Xi’an – a Cross-Sectional Study

    Science.gov (United States)

    Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Background Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi’an through multiple logistic regression analysis (MLRA). Methods Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Results Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, Pdenture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1–5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, PDenture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts. PMID:24498369

  19. Multiple logistic regression analysis of risk factors associated with denture plaque and staining in Chinese removable denture wearers over 40 years old in Xi'an--a cross-sectional study.

    Science.gov (United States)

    Yang, Yanwei; Zhang, Hongchen; Chai, Zhiguo; Chen, Jihua; Zhang, Shaofeng

    2014-01-01

    Removable dentures are subject to plaque and/or staining problems. Denture hygiene habits and risk factors differ among countries and regions. The aims of this study were to assess hygiene habits and denture plaque and staining risk factors in Chinese removable denture wearers aged >40 years in Xi'an through multiple logistic regression analysis (MLRA). Questionnaires were administered to 222 patients whose removable dentures were examined clinically to assess wear status and levels of plaque and staining. Univariate analyses were performed to identify potential risk factors for denture plaque/staining. MLRA was performed to identify significant risk factors. Brushing (77.93%) was the most prevalent cleaning method in the present study. Only 16.4% of patients regularly used commercial cleansers. Most (81.08%) patients removed their dentures overnight. MLRA indicated that potential risk factors for denture plaque were the duration of denture use (reference, ≤0.5 years; 2.1-5 years: OR = 4.155, P = 0.001; >5 years: OR = 7.238, Pdenture staining were female gender (OR = 0.377, P = 0.013), smoking (OR = 5.471, P = 0.031), tea consumption (OR = 3.957, P = 0.002), denture scratching (OR = 4.557, P = 0.036), duration of denture use (reference, ≤0.5 years; 2.1-5 years: OR = 7.899, P = 0.001; >5 years: OR = 27.226, PDenture hygiene habits need further improvement. An understanding of the risk factors for denture plaque and staining may provide the basis for preventive efforts.

  20. From crime scene actions in stranger rape to prediction of rapist type: single-victim or serial rapist?

    Science.gov (United States)

    Corovic, Jelena; Christianson, Sven Å; Bergman, Lars R

    2012-01-01

    The differences in crime scene actions in cases of stranger rape committed by convicted offenders were examined between 31 single-victim rapists and 35 serial rapists. Data were collected from police files, court verdicts, psychiatric evaluations, and criminal records. Findings indicate that the serial rapists were more criminally sophisticated than the single-victim rapists, during their first and second rapes. The single-victim rapists were significantly more likely to engage in the interpersonal involvement behavior of kissing the victim, and to engage in pre-assault alcohol use, than the serial rapists. There was, however, no significant difference in physically violent or sexual behaviors. To investigate the possibility of predicting rapist type, logistic regression analyses were performed. Results indicate that three behaviors in conjunction, kissed victim, controlled victim, and offender drank alcohol before the offense, predicted whether an unknown offender is a single-victim or serial rapist with a classification accuracy of 80.4%. The findings have implications for the classification of stranger rapists in offender profiling. Copyright © 2012 John Wiley & Sons, Ltd.

  1. Child and family-level correlates of direct and indirect peer victimization among children ages 6-9.

    Science.gov (United States)

    Boel-Studt, Shamra; Renner, Lynette M

    2014-06-01

    The purpose of this study was to examine the prevalence and child and family-level correlates of direct and indirect victimization by peers among children ages 6-9. Four hundred and twenty-five children were included in the final sample. Data for this study were drawn from the first wave of the Developmental Victimization Survey. Logistic regression models were used to examine associations between children's demographics, anxiety, depression, anger, parent-child relationship, and exposure to family violence and children's experience of direct or indirect victimization by peers. The results showed that increased depression scores and exposure to family violence were associated with increased risk for direct and indirect victimization by peers. Black children were more likely to experience direct victimization and less likely to experience indirect victimization compared to White children. Child's race significantly moderated the association between parental criticism and indirect victimization. Child's gender did not significantly moderate these associations. Implications for developmentally specific prevention and intervention approaches that are grounded in a social-ecological framework are discussed. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Trajectories of Intimate Partner Violence Victimization

    Directory of Open Access Journals (Sweden)

    Kevin M. Swartout

    2012-08-01

    Introduction: The purposes of this study were to assess the extent to which latent trajectories of female intimate partner violence (IPV) victimization exist and, if so, to use negative childhood experiences to predict trajectory membership. Methods: We collected data from 1,575 women at 5 time-points regarding experiences during adolescence and their 4 years of college. We used latent class growth analysis to fit a series of person-centered, longitudinal models ranging from 1 to 5 trajectories. Once the best-fitting model was selected, we used negative childhood experience variables (sexual abuse, physical abuse, and witnessing domestic violence) to predict most-likely trajectory membership via multinomial logistic regression. Results: A 5-trajectory model best fit the data both statistically and in terms of interpretability. The trajectories across time were interpreted as low or no IPV, low to moderate IPV, moderate to low IPV, high to moderate IPV, and high and increasing IPV, respectively. Negative childhood experiences differentiated trajectory membership somewhat, with childhood sexual abuse as a consistent predictor of membership in elevated IPV trajectories. Conclusion: Our analyses show how IPV risk changes over time and in different ways. These differential patterns of IPV suggest the need for prevention strategies tailored for women that consider victimization experiences in childhood and early adulthood. [West J Emerg Med. 2012;13(3):272–277.]

  3. Cyberstalking victimization

    Directory of Open Access Journals (Sweden)

    Vilić Vida

    2013-01-01

    Global social networks have contributed to the creation of a new, inconspicuous, technically sophisticated form of criminality which is hard to suppress because of its intangible characteristics. The most common forms of abuse of virtual communications are: cyberstalking and harassment, identity theft, online fraud, manipulation and misuse of personal information and personal photos, monitoring of e-mail accounts and spamming, and interception and recording of chat rooms. Cyberstalking is defined as persistent and targeted harassment of an individual by means of electronic communication. The victim becomes insecure, frightened and intimidated, and cannot work out the best response to end the harassment. The aim of this paper is to emphasize the importance and necessity of studying cyberstalking and to point out its forms in order to find the best ways to prevent this negative social phenomenon. The basic topics analyzed in this paper are the various definitions of cyberstalking, the forms of cyberstalking, and the most important characteristics of victims and perpetrators.

  4. Cyberbullying victimization and mental health in adolescents and the moderating role of family dinners.

    Science.gov (United States)

    Elgar, Frank J; Napoletano, Anthony; Saul, Grace; Dirks, Melanie A; Craig, Wendy; Poteat, V Paul; Holt, Melissa; Koenig, Brian W

    2014-11-01

    This study presents evidence that cyberbullying victimization relates to internalizing, externalizing, and substance use problems in adolescents and that the frequency of family dinners attenuates these associations. To examine the unique association between cyberbullying victimization and adolescen