WorldWideScience

Sample records for poisson regression model

  1. A Seemingly Unrelated Poisson Regression Model

    OpenAIRE

    King, Gary

    1989-01-01

    This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.

  2. Modified Regression Correlation Coefficient for Poisson Regression Model

    Science.gov (United States)

    Kaengthong, Nattacha; Domthong, Uthumporn

    2017-09-01

    This study focuses on indicators of the predictive power of the generalized linear model (GLM), which is widely used but often subject to restrictions. We are interested in the regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model, where the dependent variable is Poisson distributed. The purpose of this research was to modify the regression correlation coefficient for the Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables and in the presence of multicollinearity among the independent variables. The results show that the proposed regression correlation coefficient is better than the traditional one in terms of bias and root mean square error (RMSE).

  3. Poisson Mixture Regression Models for Heart Disease Prediction.

    Science.gov (United States)

    Mufudza, Chipo; Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, owing to its low Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using Poisson mixture regression models.
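As an illustrative aside, and not code from the paper: the clustering idea behind a two-component Poisson mixture can be sketched with a plain EM algorithm on hypothetical counts (no covariates and no concomitant variables — simplifying assumptions for the sketch only):

```python
import math

def pois(k, lam):
    """Poisson pmf P(Y = k) with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def em_two_poisson(y, lam1, lam2, w=0.5, iters=200):
    """EM for a two-component Poisson mixture: soft-clusters observations
    into a low-count and a high-count group and estimates both rates."""
    for _ in range(iters):
        # E-step: responsibility of component 1 for each observation.
        r = []
        for yi in y:
            a = w * pois(yi, lam1)
            b = (1 - w) * pois(yi, lam2)
            r.append(a / (a + b))
        # M-step: responsibility-weighted means update rates and weight.
        s = sum(r)
        w = s / len(y)
        lam1 = sum(ri * yi for ri, yi in zip(r, y)) / s
        lam2 = sum((1 - ri) * yi for ri, yi in zip(r, y)) / (len(y) - s)
    return w, lam1, lam2

y = [0, 1, 1, 2, 0, 1, 9, 10, 8, 11, 12, 9]   # hypothetical low/high-risk counts
w, lam1, lam2 = em_two_poisson(y, lam1=1.0, lam2=8.0)
```

With well-separated groups the fit recovers roughly equal mixing weights and the two group means; the regression version studied in the paper additionally lets each component's rate depend on covariates.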

  4. Poisson Mixture Regression Models for Heart Disease Prediction

    Science.gov (United States)

    Erol, Hamza

    2016-01-01

    Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models are addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary generalized linear Poisson regression model, owing to its low Bayesian Information Criterion value. Furthermore, a zero-inflated Poisson mixture regression model turned out to be the best model for heart disease prediction overall, as it both clusters individuals into high- or low-risk categories and predicts the rate of heart disease componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by identifying the major risks componentwise using Poisson mixture regression models. PMID:27999611

  5. A test of inflated zeros for Poisson regression models.

    Science.gov (United States)

    He, Hua; Zhang, Hui; Ye, Peng; Tang, Wan

    2017-01-01

    Excessive zeros are common in practice and may cause overdispersion and invalidate inference when fitting Poisson regression models. There is a large body of literature on zero-inflated Poisson models; however, methods for testing whether there are excessive zeros are less well developed. The Vuong test comparing a Poisson and a zero-inflated Poisson model is commonly applied in practice, but its type I error often deviates seriously from the nominal level, casting serious doubt on the validity of the test in such applications. In this paper, we develop a new approach for testing inflated zeros under the Poisson model. Unlike the Vuong test for inflated zeros, our method does not require fitting a zero-inflated Poisson model to perform the test. Simulation studies show that, compared with the Vuong test, our approach not only controls the type I error rate better but also yields more power.
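For intuition only (a descriptive check on hypothetical data, not the test developed in the paper): excess zeros show up as a gap between the observed number of zeros and the number a plain Poisson fit would imply:

```python
import math

# Hypothetical counts with many zeros.
y = [0] * 12 + [1, 1, 2, 2, 3, 4, 5, 7]
n = len(y)
lam = sum(y) / n                       # Poisson MLE of the mean
expected_zeros = n * math.exp(-lam)    # zeros implied by a Poisson(lam) fit
observed_zeros = sum(1 for yi in y if yi == 0)
```

Here 12 zeros are observed against roughly 6 expected, the kind of discrepancy a formal test of inflated zeros would quantify.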

  6. Understanding Poisson regression.

    Science.gov (United States)

    Hayat, Matthew J; Higgins, Melinda

    2014-04-01

    Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have approached count data either as if they were continuous and normally distributed or by dichotomizing the counts into the categories of occurred or did not occur. These outdated methods have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is well suited to analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Violations of the standard Poisson regression model's assumptions are addressed with alternative approaches, including the addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, including regression modeling of comorbidity data. Copyright 2014, SLACK Incorporated.
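The core computation behind a Poisson regression fit is compact enough to sketch. The following is a minimal Newton-Raphson fit of log(mu) = b0 + b1*x on hypothetical toy counts — an illustration of the technique, not code from the article or the ENSPIRE study:

```python
import math

# Hypothetical toy data: counts y observed at covariate values x.
x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
y = [1, 1, 2, 4, 5, 9, 14, 22]

def fit_poisson(x, y, iters=25):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson (canonical log link)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient of the log-likelihood) for b0 and b1.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Fisher information (equals the observed information under the log link).
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_poisson(x, y)
fitted = [math.exp(b0 + b1 * xi) for xi in x]
```

A standard sanity check: at the maximum likelihood estimate of a log-link Poisson model with an intercept, the fitted total equals the observed total.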

  7. Collision prediction models using multivariate Poisson-lognormal regression.

    Science.gov (United States)

    El-Basyouny, Karim; Sayed, Tarek

    2009-07-01

    This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots, and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform, which facilitates computation of posterior distributions and provides a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN, leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a superior fit compared with the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis were restricted to the univariate models.

  8. Modeling the number of car thefts using Poisson regression

    Science.gov (United States)

    Zulkifli, Malina; Ling, Agnes Beh Yen; Kasim, Maznah Mat; Ismail, Noriszura

    2016-10-01

    Regression analysis is among the most popular statistical methods used to express the relationship between a response variable and covariates. The aim of this paper is to evaluate the factors that influence the number of car thefts using a Poisson regression model. The paper focuses on the car thefts that occurred in districts of Peninsular Malaysia. Two groups of factors were considered, namely district descriptive factors and socio-demographic factors. The results of the study showed that Bumiputera composition, Chinese composition, other ethnic composition, foreign migration, number of residents aged 25 to 64, number of employed persons, and number of unemployed persons are the factors that most influence car theft cases. This information is very useful for law enforcement departments, insurance companies, and car owners in reducing and limiting car theft cases in Peninsular Malaysia.

  9. [Application of detecting and taking overdispersion into account in Poisson regression model].

    Science.gov (United States)

    Bouche, G; Lepage, B; Migeot, V; Ingrand, P

    2009-08-01

    Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of the variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact, with an illustration based on data from a study investigating the relationship between use of the Internet to seek health information and the number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were applied to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested for overdispersion in the Poisson regression model using the ratio of the sum of squared Pearson residuals over the number of degrees of freedom (chi(2)/df). We then fitted the three models and compared the parameter estimates with those given by the Poisson regression model. The variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93), and the chi(2)/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of the estimates for two variables (using the Internet to seek health information and single-parent family) would have changed according to the model retained, with significance levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (robust estimator), and 0.45 and 0.13 (negative binomial), respectively. Different methods exist to solve the problem of underestimated variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems particularly appropriate because of its theoretical distribution; in addition, this regression is easy to perform with ordinary statistical software packages.
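The chi²/df overdispersion diagnostic the authors describe is easy to reproduce. A minimal sketch on hypothetical counts with an intercept-only fit (so the fitted mean is just the sample mean):

```python
# Hypothetical counts whose variance is far above the mean.
y = [0, 0, 1, 1, 2, 3, 5, 8, 12, 20]
n = len(y)
mu = sum(y) / n                                  # intercept-only Poisson fit: mu = ybar
pearson = sum((yi - mu) ** 2 / mu for yi in y)   # Pearson chi-square statistic
ratio = pearson / (n - 1)                        # chi2/df: near 1 under equidispersion
```

A ratio well above 1, as here, signals overdispersion and motivates the overdispersed Poisson, robust, or negative binomial alternatives compared in the paper.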

  10. Parameter estimation and statistical test of geographically weighted bivariate Poisson inverse Gaussian regression models

    Science.gov (United States)

    Amalia, Junita; Purhadi; Otok, Bambang Widjanarko

    2017-11-01

    The Poisson distribution is a discrete distribution for count data, with a single parameter that defines both the mean and the variance. Poisson regression assumes that the mean and variance are equal (equidispersion). Nonetheless, some count data do not satisfy this assumption because the variance exceeds the mean (over-dispersion). Ignoring over-dispersion leads to underestimated standard errors and, in turn, to incorrect decisions in statistical tests. Paired count data are correlated and follow a bivariate Poisson distribution. When there is over-dispersion, simple bivariate Poisson regression is not sufficient for modeling paired count data. The Bivariate Poisson Inverse Gaussian Regression (BPIGR) model is a mixed Poisson regression model for paired count data with over-dispersion. The BPIGR model produces a single global model for all locations. On the other hand, each location has different geographic, social, cultural, and economic conditions, so Geographically Weighted Regression (GWR) is needed. The weighting function for each location in GWR generates a different local model. The Geographically Weighted Bivariate Poisson Inverse Gaussian Regression (GWBPIGR) model is used to handle over-dispersion and to generate local models. Parameter estimates of the GWBPIGR model are obtained by the Maximum Likelihood Estimation (MLE) method, while hypothesis testing of the GWBPIGR model is carried out by the Maximum Likelihood Ratio Test (MLRT) method.
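The "geographically weighted" part of GWR comes from a kernel that downweights distant observations when fitting each local model. A Gaussian kernel is one common choice (the abstract does not specify the kernel, so this is an assumption for illustration):

```python
import math

def gwr_weight(d, bandwidth):
    """Gaussian kernel weight for an observation at distance d from the
    focal location: near 1 close by, decaying toward 0 far away."""
    return math.exp(-0.5 * (d / bandwidth) ** 2)

# Weights for observations 0, 5, 10, and 30 km away with a 10 km bandwidth.
weights = [gwr_weight(d, bandwidth=10.0) for d in (0.0, 5.0, 10.0, 30.0)]
```

Refitting the locally weighted likelihood at every location is what produces a different local parameter estimate per site.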

  11. Maximum Entropy Discrimination Poisson Regression for Software Reliability Modeling.

    Science.gov (United States)

    Chatzis, Sotirios P; Andreou, Andreas S

    2015-11-01

    Reliably predicting software defects is one of the most significant tasks in software engineering. Two of the major components of modern software reliability modeling approaches are: 1) extraction of salient features for software system representation, based on appropriately designed software metrics and 2) development of intricate regression models for count data, to allow effective software reliability data modeling and prediction. Surprisingly, research in the latter frontier of count data regression modeling has been rather limited. More specifically, a lack of simple and efficient algorithms for posterior computation has made the Bayesian approaches appear unattractive, and thus underdeveloped in the context of software reliability modeling. In this paper, we try to address these issues by introducing a novel Bayesian regression model for count data, based on the concept of max-margin data modeling, effected in the context of a fully Bayesian model treatment with simple and efficient posterior distribution updates. Our novel approach yields a more discriminative learning technique, making more effective use of our training data during model inference. In addition, it allows better handling of uncertainty in the modeled data, which can be a significant problem when the training data are limited. We derive elegant inference algorithms for our model under the mean-field paradigm and exhibit its effectiveness using the publicly available benchmark data sets.

  12. A generalized right truncated bivariate Poisson regression model with applications to health data.

    Science.gov (United States)

    Islam, M Ataharul; Chowdhury, Rafiqul I

    2017-01-01

    A generalized right truncated bivariate Poisson regression model is proposed in this paper. Estimation and tests for goodness of fit and for over- or underdispersion are illustrated for both untruncated and right truncated bivariate Poisson regression models using a marginal-conditional approach. Estimation and test procedures are illustrated for bivariate Poisson regression models with applications to Health and Retirement Study data on the number of health conditions and the number of health care services utilized. The proposed test statistics are easy to compute, and it is evident from the results that the models fit the data very well. A comparison between the right truncated and untruncated bivariate Poisson regression models using the test for nonnested models clearly shows that the truncated model performs significantly better than the untruncated model.
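For intuition, the univariate analogue of the paper's truncation is simple: restrict the Poisson pmf to 0..r and renormalize. This sketch (hypothetical rate and truncation point) shows only the univariate building block, not the bivariate model itself:

```python
import math

def rt_poisson_pmf(k, lam, r):
    """Poisson(lam) right-truncated at r: support 0..r, renormalized so the
    probabilities again sum to one."""
    if not 0 <= k <= r:
        return 0.0
    pmf = lambda j: math.exp(-lam) * lam ** j / math.factorial(j)
    return pmf(k) / sum(pmf(j) for j in range(r + 1))

# Hypothetical rate lam = 3 with truncation at r = 5.
probs = [rt_poisson_pmf(k, 3.0, 5) for k in range(6)]
```

Truncation raises every retained probability relative to the untruncated Poisson, since the mass above r is redistributed over 0..r.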

  13. Modelling the infant mortality rate in Central Java, Indonesia, using the generalized Poisson regression method

    Science.gov (United States)

    Prahutama, Alan; Sudarno

    2018-05-01

    The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of that area during the same year. This problem needs to be addressed because it is an important element of a country's economic development: a high infant mortality rate will disrupt the stability of a country, as it relates to the sustainability of its population. One regression model that can be used to analyze the relationship between a discrete dependent variable Y and an independent variable X is the Poisson regression model. Regression models used for discrete dependent variables include, among others, Poisson regression, negative binomial regression, and generalized Poisson regression. In this research, generalized Poisson regression modeling gives a better AIC value than Poisson regression. The most significant variable is the number of health facilities (X1), while the variable with the greatest influence on the infant mortality rate is average breastfeeding (X9).

  14. Poisson regression for modeling count and frequency outcomes in trauma research.

    Science.gov (United States)

    Gagnon, David R; Doron-LaMarca, Susan; Bell, Margret; O'Farrell, Timothy J; Taft, Casey T

    2008-10-01

    The authors describe how the Poisson regression method for analyzing count or frequency outcome variables can be applied in trauma studies. The outcome of interest in trauma research may represent a count of the number of incidents of behavior occurring in a given time interval, such as acts of physical aggression or substance abuse. Traditional linear regression approaches assume a normally distributed outcome variable with equal variances over the range of predictor variables, and may not be optimal for modeling count outcomes. An application of Poisson regression is presented using data from a study of intimate partner aggression among male patients in an alcohol treatment program and their female partners. Results of Poisson regression and linear regression models are compared.

  15. Misspecified Poisson regression models for large-scale registry data

    DEFF Research Database (Denmark)

    Grøn, Randi; Gerds, Thomas A.; Andersen, Per K.

    2016-01-01

    working models that are then likely misspecified. To support and improve conclusions drawn from such models, we discuss methods for sensitivity analysis, for estimation of average exposure effects using aggregated data, and a semi-parametric bootstrap method to obtain robust standard errors. The methods...

  16. Modeling animal-vehicle collisions using diagonal inflated bivariate Poisson regression.

    Science.gov (United States)

    Lao, Yunteng; Wu, Yao-Jan; Corey, Jonathan; Wang, Yinhai

    2011-01-01

    Two types of animal-vehicle collision (AVC) data are commonly adopted for AVC-related risk analysis research: reported AVC data and carcass removal data. One issue with these two data sets is that they were found to have significant discrepancies by previous studies. In order to model these two types of data together and provide a better understanding of highway AVCs, this study adopts a diagonal inflated bivariate Poisson regression method, an inflated version of the bivariate Poisson regression model, to fit the reported AVC and carcass removal data sets collected in Washington State during 2002-2006. The diagonal inflated bivariate Poisson model not only can model paired data with correlation, but can also handle under- or over-dispersed data sets. Compared with three other types of models, double Poisson, bivariate Poisson, and zero-inflated double Poisson, the diagonal inflated bivariate Poisson model demonstrates its capability of fitting two data sets with remarkable overlapping portions resulting from the same stochastic process. Therefore, the diagonal inflated bivariate Poisson model provides researchers with a new approach to investigating AVCs from a different perspective involving the three distribution parameters (λ(1), λ(2) and λ(3)). The modeling results show the impacts of traffic elements, geometric design and geographic characteristics on the occurrences of both reported AVC and carcass removal data. It is found that the increase of some associated factors, such as speed limit, annual average daily traffic, and shoulder width, will increase the numbers of reported AVCs and carcass removals. Conversely, the presence of some geometric factors, such as rolling and mountainous terrain, will decrease the number of reported AVCs. Published by Elsevier Ltd.
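The three parameters (λ1, λ2, λ3) of the bivariate Poisson come from the trivariate reduction X = U1 + U3, Y = U2 + U3, where the Ui are independent Poissons and the shared U3 carries the correlation. A small simulation with hypothetical rates (not values fitted in the study) illustrates this:

```python
import math
import random

random.seed(42)

def poisson_draw(lam):
    """Knuth's inverse-transform Poisson sampler (fine for small lam)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p < L:
            return k
        k += 1

lam1, lam2, lam3 = 1.0, 1.5, 0.5   # hypothetical rates for the sketch
n = 20000
xs, ys = [], []
for _ in range(n):
    u1, u2, u3 = poisson_draw(lam1), poisson_draw(lam2), poisson_draw(lam3)
    xs.append(u1 + u3)   # X ~ Poisson(lam1 + lam3)
    ys.append(u2 + u3)   # Y ~ Poisson(lam2 + lam3); shared u3 gives Cov = lam3
mx, my = sum(xs) / n, sum(ys) / n
cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / n
```

The sample means approach λ1+λ3 and λ2+λ3, and the sample covariance approaches λ3, which is exactly the channel through which the model links reported AVCs and carcass removals.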

  17. A LATENT CLASS POISSON REGRESSION-MODEL FOR HETEROGENEOUS COUNT DATA

    NARCIS (Netherlands)

    WEDEL, M; DESARBO, WS; BULT; RAMASWAMY

    1993-01-01

    In this paper an approach is developed that accommodates heterogeneity in Poisson regression models for count data. The model developed assumes that heterogeneity arises from a distribution of both the intercept and the coefficients of the explanatory variables. We assume that the mixing

  18. Poisson regression approach for modeling fatal injury rates amongst Malaysian workers

    International Nuclear Information System (INIS)

    Kamarulzaman Ibrahim; Heng Khai Theng

    2005-01-01

    Many safety studies are based on analysis of injury surveillance data. The injury surveillance data gathered for the analysis include information on the number of employees at risk of injury in each of several strata, where the strata are defined in terms of a series of important predictor variables. Further insight into the relationship between fatal injury rates and predictor variables may be obtained by the Poisson regression approach, which is widely used in analyzing count data. In this study, Poisson regression is used to model the relationship between fatal injury rates and the predictor variables year (1995-2002), gender, recording system, and industry type. Data for the analysis were obtained from PERKESO and Jabatan Perangkaan Malaysia. It was found that the assumption that the data follow a Poisson distribution was violated. After correction for the problem of overdispersion, the predictor variables found to be significant in the model are gender, recording system, industry type, and two interaction effects (between recording system and industry type, and between year and industry type).
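Injury rates enter a Poisson model through an exposure offset: the model is log(mu_i) = log(t_i) + Xb, with t_i the person-time at risk in stratum i. A minimal sketch with hypothetical strata (not the PERKESO figures) shows the intercept-only case, where the MLE of the common rate is simply total events over total exposure:

```python
import math

# Hypothetical strata: (fatal injuries, thousands of worker-years of exposure).
strata = [(4, 1.2), (11, 3.4), (7, 2.1), (2, 0.8)]
events = sum(e for e, _ in strata)
exposure = sum(t for _, t in strata)
rate = events / exposure              # MLE of the common rate
intercept = math.log(rate)            # the log-link intercept when log(t_i) is the offset
fitted = [rate * t for _, t in strata]
```

As with any log-link Poisson fit with an intercept, the fitted event totals reproduce the observed total exactly.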

  19. Analysis of the Factors Influencing the Number of Motor Vehicle Thefts (Curanmor) Using the Geographically Weighted Poisson Regression (GWPR) Model

    OpenAIRE

    Haris, Muhammad; Yasin, Hasbi; Hoyyi, Abdul

    2015-01-01

    Theft is the act of taking someone else's property, partially or entirely, with the intention of possessing it illegally. Motor vehicle theft is one of the most prominent types of crime and is highly disturbing to communities. Regression analysis is a statistical method for modeling the relationship between a response variable and predictor variables. If the response variable follows a Poisson distribution, or is categorized as count data, the regression model used is Poisson regression. Geographically Weighted Poi...

  20. Zero inflated Poisson and negative binomial regression models: application in education.

    Science.gov (United States)

    Salehi, Masoud; Roudbari, Masoud

    2015-01-01

    The numbers of failed courses and semesters are indicators of students' performance, and these counts have zero-inflated (ZI) distributions. Using ZI Poisson and negative binomial distributions, we can model these count data to find the associated factors and estimate the parameters. This study aims to investigate the important factors related to the educational performance of students. This cross-sectional study was performed in 2008-2009 at Iran University of Medical Sciences (IUMS), with a population of almost 6000 students; 670 students were selected using stratified random sampling. The educational and demographic data were collected from the university records. The study design was approved at IUMS, and the students' data were kept confidential. Descriptive statistics and ZI Poisson and negative binomial regressions were used to analyze the data, using STATA. For the number of failed semesters, in both the ZI Poisson and ZI negative binomial models, the students' total average and the quota system played the largest roles. For the number of failed courses, total average and being at the undergraduate or master's level had the largest effects in both models. In all models, the total average has the largest effect on the number of failed courses or semesters; the next most important factors are the quota system for failed semesters and undergraduate or master's level for failed courses. The total average therefore has an important inverse effect on the numbers of failed courses and semesters.
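The zero-inflated Poisson distribution underlying these models is a two-part mixture: a structural zero with probability pi, otherwise a Poisson(lam) draw. A short sketch with hypothetical parameters verifies its pmf and mean:

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under zero-inflated Poisson: with probability pi the count is
    a structural zero, otherwise it comes from a Poisson(lam) component."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1 - pi) * pois

lam, pi = 2.0, 0.3   # hypothetical parameters for the sketch
total = sum(zip_pmf(k, lam, pi) for k in range(100))
mean = sum(k * zip_pmf(k, lam, pi) for k in range(100))   # equals (1 - pi) * lam
```

The regression version ties lam (and often pi) to covariates such as total average or quota system through link functions.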

  1. Use of Poisson spatiotemporal regression models for the Brazilian Amazon Forest: malaria count data

    Directory of Open Access Journals (Sweden)

    Jorge Alberto Achcar

    2011-12-01

    Full Text Available INTRODUCTION: Malaria is a serious problem in the Brazilian Amazon region, and the detection of possible risk factors could be of great interest for public health authorities. The objective of this article was to investigate the association between environmental variables and the yearly registers of malaria in the Amazon region using Bayesian spatiotemporal methods. METHODS: We used Poisson spatiotemporal regression models to analyze the Brazilian Amazon forest malaria count for the period from 1999 to 2008. In this study, we included some covariates that could be important in the yearly prediction of malaria, such as deforestation rate. We obtained the inferences using a Bayesian approach and Markov Chain Monte Carlo (MCMC) methods to simulate samples for the joint posterior distribution of interest. The discrimination of different models was also discussed. RESULTS: The model proposed here suggests that deforestation rate, the number of inhabitants per km², and the human development index (HDI) are important in the prediction of malaria cases. CONCLUSIONS: It is possible to conclude that human development, population growth, deforestation, and their associated ecological alterations are conducive to increasing malaria risk. We conclude that the use of Poisson regression models that capture the spatial and temporal effects under the Bayesian paradigm is a good strategy for modeling malaria counts.

  2. Use of Poisson spatiotemporal regression models for the Brazilian Amazon Forest: malaria count data.

    Science.gov (United States)

    Achcar, Jorge Alberto; Martinez, Edson Zangiacomi; Souza, Aparecida Doniseti Pires de; Tachibana, Vilma Mayumi; Flores, Edilson Ferreira

    2011-01-01

    Malaria is a serious problem in the Brazilian Amazon region, and the detection of possible risk factors could be of great interest for public health authorities. The objective of this article was to investigate the association between environmental variables and the yearly registers of malaria in the Amazon region using Bayesian spatiotemporal methods. We used Poisson spatiotemporal regression models to analyze the Brazilian Amazon forest malaria count for the period from 1999 to 2008. In this study, we included some covariates that could be important in the yearly prediction of malaria, such as deforestation rate. We obtained the inferences using a Bayesian approach and Markov Chain Monte Carlo (MCMC) methods to simulate samples for the joint posterior distribution of interest. The discrimination of different models was also discussed. The model proposed here suggests that deforestation rate, the number of inhabitants per km², and the human development index (HDI) are important in the prediction of malaria cases. It is possible to conclude that human development, population growth, deforestation, and their associated ecological alterations are conducive to increasing malaria risk. We conclude that the use of Poisson regression models that capture the spatial and temporal effects under the Bayesian paradigm is a good strategy for modeling malaria counts.

  3. Climate changes and their effects on public health: use of Poisson regression models

    Directory of Open Access Journals (Sweden)

    Jonas Bodini Alonso

    2010-08-01

    Full Text Available In this paper, we analyze the daily number of hospitalizations in São Paulo City, Brazil, in the period from January 01, 2002 to December 31, 2005. This data set relates to pneumonia, coronary ischemic diseases, diabetes, and chronic diseases in different age categories. In order to verify the effect of climate changes, the following covariates are considered: atmospheric pressure, air humidity, temperature, season of the year, and a covariate for the day of the week on which the hospitalization occurred. The possible effects of the assumed covariates on the number of hospitalizations are studied using a Poisson regression model, with or without a random effect that captures the possible correlation among the hospitalization counts for the different age categories on the same day and the extra-Poisson variability in the longitudinal data. The inferences of interest are obtained using the Bayesian paradigm and MCMC (Markov chain Monte Carlo) methods.

  4. A comparison between Poisson and zero-inflated Poisson regression models with an application to number of black spots in Corriedale sheep

    Directory of Open Access Journals (Sweden)

    Rodrigues-Motta Mariana

    2008-07-01

    Full Text Available Dark spots in the fleece area are often associated with dark fibres in wool, which limits its competitiveness with other textile fibres. Field data from a sheep experiment in Uruguay revealed an excess number of zeros for dark spots. We compared the performance of four Poisson and zero-inflated Poisson (ZIP) models under four simulation scenarios. All models performed reasonably well under the scenario for which the data were simulated. The deviance information criterion favoured a Poisson model with a residual, while the ZIP model with a residual gave estimates closer to their true values under all simulation scenarios. Both Poisson and ZIP models with an error term at the regression level performed better than their counterparts without such an error. Field data from Corriedale sheep were analysed with Poisson and ZIP models with residuals. Parameter estimates were similar for both models. Although the posterior distribution of the sire variance was skewed due to the small number of rams in the dataset, the median of this variance suggested scope for genetic selection. The main environmental factor was the age of the sheep at shearing. In summary, age-related processes seem to drive the number of dark spots in this breed of sheep.

  5. A coregionalization model can assist specification of Geographically Weighted Poisson Regression: Application to an ecological study.

    Science.gov (United States)

    Ribeiro, Manuel Castro; Sousa, António Jorge; Pereira, Maria João

    2016-05-01

    The geographical distribution of health outcomes is influenced by socio-economic and environmental factors operating on different spatial scales. Geographical variations in relationships can be revealed with semi-parametric Geographically Weighted Poisson Regression (sGWPR), a model that can combine both geographically varying and geographically constant parameters. To decide whether a parameter should vary geographically, two models are compared: one in which all parameters are allowed to vary geographically and one in which all except the parameter being evaluated are allowed to vary geographically. The model with the lower corrected Akaike Information Criterion (AICc) is selected. Basing model selection exclusively on the AICc might hide important details in spatial variations of associations. We propose assisting the decision by using a Linear Model of Coregionalization (LMC). Here we show how the LMC can refine sGWPR modelling of ecological associations between socio-economic and environmental variables and low birth weight outcomes in the west-north-central region of Portugal. Copyright © 2016 Elsevier Ltd. All rights reserved.
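    The AICc selection rule described above can be sketched as follows; the log-likelihoods and parameter counts are hypothetical, purely to illustrate the small-sample correction and the comparison.

```python
# Corrected AIC: AICc = AIC + 2k(k+1)/(n-k-1), where k is the number of
# parameters and n the sample size. The model with the lower AICc wins.
import math

def aicc(loglik: float, k: int, n: int) -> float:
    aic = -2.0 * loglik + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

# Hypothetical values: all parameters varying vs. one held constant.
m_all_varying  = aicc(loglik=-412.3, k=12, n=150)
m_one_constant = aicc(loglik=-413.1, k=11, n=150)
print(m_all_varying, m_one_constant)
```

    Here the slightly worse fit of the smaller model is more than offset by its lower parameter count, so the constant-parameter specification would be retained.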

  6. Modeling both of the number of pausibacillary and multibacillary leprosy patients by using bivariate poisson regression

    Science.gov (United States)

    Winahju, W. S.; Mukarromah, A.; Putri, S.

    2015-03-01

    Leprosy is a chronic infectious disease caused by the bacterium Mycobacterium leprae. Leprosy has become an important issue in Indonesia because its morbidity is quite high. Based on WHO data from 2014, in 2012 Indonesia had the highest number of new leprosy patients after India and Brazil, contributing 18,994 cases (8.7% of the world total). This makes Indonesia the country with the highest leprosy morbidity among ASEAN countries. The province that contributes most to the number of leprosy patients in Indonesia is East Java. There are two kinds of leprosy: paucibacillary and multibacillary. The morbidity of multibacillary leprosy is higher than that of paucibacillary leprosy. This paper discusses modeling the numbers of multibacillary and paucibacillary leprosy patients as response variables. These responses are count variables, so modeling is conducted using the bivariate Poisson regression method. The experimental units are in East Java, and the predictors involved are environment, demography, and poverty. The model uses data from 2012, and the result indicates that all predictors influence the responses significantly.

  7. Decoding and modelling of time series count data using Poisson hidden Markov model and Markov ordinal logistic regression models.

    Science.gov (United States)

    Sebastian, Tunny; Jeyaseelan, Visalakshi; Jeyaseelan, Lakshmanan; Anandan, Shalini; George, Sebastian; Bangdiwala, Shrikant I

    2018-01-01

    Hidden Markov models are stochastic models in which the observations are assumed to follow a mixture distribution, but the parameters of the components are governed by a Markov chain which is unobservable. The issues related to the estimation of Poisson-hidden Markov models, in which the observations come from a mixture of Poisson distributions and the parameters of the component Poisson distributions are governed by an m-state Markov chain with an unknown transition probability matrix, are explained here. These methods were applied to the data on Vibrio cholerae counts reported every month over an 11-year span at Christian Medical College, Vellore, India. Using the Viterbi algorithm, the best estimate of the state sequence was obtained and hence the transition probability matrix. The mean passage times between the states were estimated. The 95% confidence interval for the mean passage time was estimated via Monte Carlo simulation. The three hidden states of the estimated Markov chain are labelled as 'Low', 'Moderate' and 'High' with mean counts of 1.4, 6.6 and 20.2 and estimated average durations of stay of 3, 3 and 4 months, respectively. Environmental risk factors were studied using Markov ordinal logistic regression analysis. No significant association was found between disease severity levels and climate components.

  8. The Poisson-exponential regression model under different latent activation schemes

    OpenAIRE

    Louzada, Francisco; Cancho, Vicente G; Barriga, Gladys D.C

    2012-01-01

    In this paper, a new family of survival distributions is presented. It is derived by considering that the latent number of failure causes follows a Poisson distribution and the time for these causes to be activated follows an exponential distribution. Three different activation schemes are also considered. Moreover, we propose the inclusion of covariates in the model formulation in order to study their effect on the expected value of the number of causes and on the failure rate function. Infer...

  9. Background stratified Poisson regression analysis of cohort data.

    Science.gov (United States)

    Richardson, David B; Langholz, Bryan

    2012-03-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models.

  10. Background stratified Poisson regression analysis of cohort data

    International Nuclear Information System (INIS)

    Richardson, David B.; Langholz, Bryan

    2012-01-01

    Background stratified Poisson regression is an approach that has been used in the analysis of data derived from a variety of epidemiologically important studies of radiation-exposed populations, including uranium miners, nuclear industry workers, and atomic bomb survivors. We describe a novel approach to fit Poisson regression models that adjust for a set of covariates through background stratification while directly estimating the radiation-disease association of primary interest. The approach makes use of an expression for the Poisson likelihood that treats the coefficients for stratum-specific indicator variables as 'nuisance' variables and avoids the need to explicitly estimate the coefficients for these stratum-specific parameters. Log-linear models, as well as other general relative rate models, are accommodated. This approach is illustrated using data from the Life Span Study of Japanese atomic bomb survivors and data from a study of underground uranium miners. The point estimate and confidence interval obtained from this 'conditional' regression approach are identical to the values obtained using unconditional Poisson regression with model terms for each background stratum. Moreover, it is shown that the proposed approach allows estimation of background stratified Poisson regression models of non-standard form, such as models that parameterize latency effects, as well as regression models in which the number of strata is large, thereby overcoming the limitations of previously available statistical software for fitting background stratified Poisson regression models. (orig.)

  11. Predictors of the number of under-five malnourished children in Bangladesh: application of the generalized poisson regression model.

    Science.gov (United States)

    Islam, Mohammad Mafijul; Alam, Morshed; Tariquzaman, Md; Kabir, Mohammad Alamgir; Pervin, Rokhsona; Begum, Munni; Khan, Md Mobarak Hossain

    2013-01-08

    Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. To our knowledge, most of the available studies that addressed the issue of malnutrition among under-five children considered categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study the malnutrition variable (i.e. the outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative to other statistical methods and (ii) to find some predictors of this outcome variable. The data are extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample based on a two-stage stratified sample of households. A total of 4,460 under-five children are analysed using various statistical techniques, namely the Chi-square test and the GPR model. The GPR model (as compared to standard Poisson regression and negative binomial regression) is found to be justified for the above-mentioned outcome variable because of its under-dispersion (variance less than the mean). Significant predictors of the outcome variable include mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Consistency of our findings with many other studies suggests that the GPR model is an ideal alternative to other statistical models for analysing the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh.
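    The under-dispersion argument can be made concrete with the moments of the generalized Poisson distribution: Var(Y)/E(Y) = 1/(1-λ)², so a negative dispersion parameter λ gives variance below the mean. The parameter values below are hypothetical, not estimates from the BDHS data.

```python
# Generalized Poisson moments (Consul's parameterization):
# E(Y) = theta/(1-lam),  Var(Y) = theta/(1-lam)^3.
def gp_mean(theta, lam):
    return theta / (1.0 - lam)

def gp_var(theta, lam):
    return theta / (1.0 - lam) ** 3

theta, lam = 1.2, -0.3      # hypothetical; lam < 0 models under-dispersion
mean, var = gp_mean(theta, lam), gp_var(theta, lam)
print(mean, var, var / mean)  # dispersion ratio below 1
```

    With λ = -0.3 the dispersion ratio is 1/1.3² ≈ 0.59, i.e. the variance is roughly 40% below the mean, which neither standard Poisson (ratio fixed at 1) nor negative binomial (ratio above 1) can represent.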

  12. Analyzing hospitalization data: potential limitations of Poisson regression.

    Science.gov (United States)

    Weaver, Colin G; Ravani, Pietro; Oliver, Matthew J; Austin, Peter C; Quinn, Robert R

    2015-08-01

    Poisson regression is commonly used to analyze hospitalization data when outcomes are expressed as counts (e.g. number of days in hospital). However, data often violate the assumptions on which Poisson regression is based. More appropriate extensions of this model, while available, are rarely used. We compared hospitalization data between 206 patients treated with hemodialysis (HD) and 107 treated with peritoneal dialysis (PD) using Poisson regression and compared results from standard Poisson regression with those obtained using three other approaches for modeling count data: negative binomial (NB) regression, zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression. We examined the appropriateness of each model and compared the results obtained with each approach. During a mean 1.9 years of follow-up, 183 of 313 patients (58%) were never hospitalized (indicating an excess of 'zeros'). The data also displayed overdispersion (variance greater than mean), violating another assumption of the Poisson model. Using four criteria, we determined that the NB and ZINB models performed best. According to these two models, patients treated with HD experienced similar hospitalization rates as those receiving PD {NB rate ratio (RR): 1.04 [bootstrapped 95% confidence interval (CI): 0.49-2.20]; ZINB summary RR: 1.21 (bootstrapped 95% CI 0.60-2.46)}. Poisson and ZIP models fit the data poorly and had much larger point estimates than the NB and ZINB models [Poisson RR: 1.93 (bootstrapped 95% CI 0.88-4.23); ZIP summary RR: 1.84 (bootstrapped 95% CI 0.88-3.84)]. We found substantially different results when modeling hospitalization data, depending on the approach used. Our results argue strongly for a sound model selection process and improved reporting around statistical methods used for modeling count data. © The Author 2015. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
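    The overdispersion diagnostic the authors report is easy to reproduce: for a negative binomial count, viewed as a gamma-Poisson mixture, the variance exceeds the mean by μ²/size. The data below are synthetic, not the dialysis cohort.

```python
# Simulate negative binomial counts as a gamma mixture of Poissons and
# check that the sample variance greatly exceeds the sample mean.
import numpy as np

rng = np.random.default_rng(0)
mu, size = 3.0, 0.8                      # hypothetical rate and shape
lam = rng.gamma(shape=size, scale=mu / size, size=200_000)
y = rng.poisson(lam)
print(y.mean(), y.var())                 # variance well above the mean
```

    The theoretical variance here is μ + μ²/size = 3 + 11.25 ≈ 14.25, far above the mean of 3; applying plain Poisson regression to such data understates standard errors, which is precisely the failure mode the article warns against.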

  13. Modeling urban coastal flood severity from crowd-sourced flood reports using Poisson regression and Random Forest

    Science.gov (United States)

    Sadler, J. M.; Goodall, J. L.; Morsy, M. M.; Spencer, K.

    2018-04-01

    Sea level rise has already caused more frequent and severe coastal flooding, and this trend will likely continue. Flood prediction is an essential part of a coastal city's capacity to adapt to and mitigate this growing problem. Complex coastal urban hydrological systems, however, do not always lend themselves easily to physically-based flood prediction approaches. This paper presents a method for using a data-driven approach to estimate flood severity in an urban coastal setting using crowd-sourced data, a non-traditional but growing data source, along with environmental observation data. Two data-driven models, Poisson regression and Random Forest regression, are trained to predict the number of flood reports per storm event as a proxy for flood severity, given extensive environmental data (i.e., rainfall, tide, groundwater table level, and wind conditions) as input. The method is demonstrated using data from Norfolk, Virginia USA from September 2010 to October 2016. Quality-controlled, crowd-sourced street flooding reports ranging from 1 to 159 per storm event for 45 storm events are used to train and evaluate the models. Random Forest performed better than Poisson regression at predicting the number of flood reports and had a lower false negative rate. From the Random Forest model, total cumulative rainfall was by far the most dominant input variable in predicting flood severity, followed by low tide and lower low tide. These methods serve as a first step toward using data-driven methods for spatially and temporally detailed coastal urban flood prediction.

  14. Bayesian regression of piecewise homogeneous Poisson processes

    Directory of Open Access Journals (Sweden)

    Diego Sevilla

    2015-12-01

    In this paper, a Bayesian method for piecewise regression is adapted to handle count data from processes distributed as Poisson. A numerical code in Mathematica is developed and tested by analyzing simulated data. The resulting method is valuable for detecting breaking points in the count rate of time series for Poisson processes. Received: 2 November 2015, Accepted: 27 November 2015; Edited by: R. Dickman; Reviewed by: M. Hutter, Australian National University, Canberra, Australia; DOI: http://dx.doi.org/10.4279/PIP.070018. Cite as: D J R Sevilla, Papers in Physics 7, 070018 (2015).

  15. Misspecified poisson regression models for large-scale registry data: inference for 'large n and small p'.

    Science.gov (United States)

    Grøn, Randi; Gerds, Thomas A; Andersen, Per K

    2016-03-30

    Poisson regression is an important tool in register-based epidemiology where it is used to study the association between exposure variables and event rates. In this paper, we will discuss the situation with 'large n and small p', where n is the sample size and p is the number of available covariates. Specifically, we are concerned with modeling options when there are time-varying covariates that can have time-varying effects. One problem is that tests of the proportional hazards assumption, of no interactions between exposure and other observed variables, or of other modeling assumptions have large power due to the large sample size and will often indicate statistical significance even for numerically small deviations that are unimportant for the subject matter. Another problem is that information on important confounders may be unavailable. In practice, this situation may lead to simple working models that are then likely misspecified. To support and improve conclusions drawn from such models, we discuss methods for sensitivity analysis, for estimation of average exposure effects using aggregated data, and a semi-parametric bootstrap method to obtain robust standard errors. The methods are illustrated using data from the Danish national registries investigating the diabetes incidence for individuals treated with antipsychotics compared with the general unexposed population. Copyright © 2015 John Wiley & Sons, Ltd.
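    A toy version of bootstrap-based robust standard errors for a rate estimate. The paper's semi-parametric bootstrap is more involved; this is a plain nonparametric resampling sketch on synthetic event-count and person-time data.

```python
# Nonparametric bootstrap SE for an event rate (events per person-year).
import numpy as np

rng = np.random.default_rng(11)
events = rng.poisson(2.0, size=500)          # counts per subject (synthetic)
ptime = rng.uniform(0.5, 1.5, size=500)      # person-years at risk (synthetic)

def rate(ev, pt):
    return ev.sum() / pt.sum()

boots = []
for _ in range(2000):
    idx = rng.integers(0, 500, 500)          # resample subjects with replacement
    boots.append(rate(events[idx], ptime[idx]))

rate_hat = rate(events, ptime)
se_boot = np.std(boots, ddof=1)
print(rate_hat, se_boot)
```

    Resampling whole subjects preserves any within-subject dependence between counts and follow-up time, which is the basic reason bootstrap standard errors stay valid when the Poisson variance assumption does not.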

  16. Poisson Regression Analysis of Illness and Injury Surveillance Data

    Energy Technology Data Exchange (ETDEWEB)

    Frome E.L., Watkins J.P., Ellis E.D.

    2012-12-12

    The Department of Energy (DOE) uses illness and injury surveillance to monitor morbidity and assess the overall health of the work force. Data collected from each participating site include health events and a roster file with demographic information. The source data files are maintained in a relational data base, and are used to obtain stratified tables of health event counts and person time at risk that serve as the starting point for Poisson regression analysis. The explanatory variables that define these tables are age, gender, occupational group, and time. Typical response variables of interest are the number of absences due to illness or injury, i.e., the response variable is a count. Poisson regression methods are used to describe the effect of the explanatory variables on the health event rates using a log-linear main effects model. Results of fitting the main effects model are summarized in tabular and graphical form, and interpretation of model parameters is provided. An analysis of deviance table is used to evaluate the importance of each of the explanatory variables on the event rate of interest and to determine if interaction terms should be considered in the analysis. Although Poisson regression methods are widely used in the analysis of count data, there are situations in which over-dispersion occurs. This could be due to lack-of-fit of the regression model, extra-Poisson variation, or both. A score test statistic and regression diagnostics are used to identify over-dispersion. A quasi-likelihood method of moments procedure is used to evaluate and adjust for extra-Poisson variation when necessary. Two examples are presented using respiratory disease absence rates at two DOE sites to illustrate the methods and interpretation of the results. In the first example the Poisson main effects model is adequate. In the second example the score test indicates considerable over-dispersion, and a more detailed analysis attributes the over-dispersion to extra-Poisson variation.
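    The log-linear rate model described above can be sketched in a few lines of iteratively reweighted least squares (Fisher scoring), with log person-time as an offset. Data and parameter values below are synthetic, purely to show the fitting machinery.

```python
# Minimal IRLS fit of a log-linear Poisson rate model with an offset.
import numpy as np

def poisson_irls(X, y, offset, tol=1e-10, max_iter=50):
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)
        W = mu                               # Poisson working weights
        z = X @ beta + (y - mu) / mu         # working response (offset removed)
        beta_new = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

rng = np.random.default_rng(1)
n = 5000
X = np.column_stack([np.ones(n), rng.integers(0, 2, n)])   # intercept + group
persontime = rng.uniform(0.5, 2.0, n)                       # time at risk
true_beta = np.array([-1.0, 0.7])
y = rng.poisson(np.exp(X @ true_beta) * persontime)

beta_hat = poisson_irls(X, y, offset=np.log(persontime))
print(beta_hat)   # should recover approximately (-1.0, 0.7)
```

    The exponentiated group coefficient is the rate ratio between the two occupational groups, exactly the quantity a main-effects surveillance model reports.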

  17. A Poisson regression approach to model monthly hail occurrence in Northern Switzerland using large-scale environmental variables

    Science.gov (United States)

    Madonna, Erica; Ginsbourger, David; Martius, Olivia

    2018-05-01

    In Switzerland, hail regularly causes substantial damage to agriculture, cars and infrastructure; however, little is known about its long-term variability. To study the variability, the monthly number of days with hail in northern Switzerland is modeled in a regression framework using large-scale predictors derived from ERA-Interim reanalysis. The model is developed and verified using radar-based hail observations for the extended summer season (April-September) in the period 2002-2014. The seasonality of hail is explicitly modeled with a categorical predictor (month), and monthly anomalies of several large-scale predictors are used to capture the year-to-year variability. Several regression models are applied and their performance tested with respect to standard scores and cross-validation. The chosen model includes four predictors: the monthly anomaly of the two meter temperature, the monthly anomaly of the logarithm of the convective available potential energy (CAPE), the monthly anomaly of the wind shear, and the month. This model captures the intra-annual variability well and slightly underestimates the inter-annual variability. The regression model is applied to the reanalysis data back to 1980. The resulting hail day time series shows an increase in the number of hail days per month, which is (in the model) related to an increase in temperature and CAPE. The trend corresponds to approximately 0.5 days per month per decade. The results of the regression model have been compared to two independent data sets. All data sets agree on the sign of the trend, but the trend is weaker in the other data sets.
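    Monthly anomalies of the kind used as predictors above are simply departures from each calendar month's climatology; a quick sketch on a synthetic series:

```python
# Remove the mean seasonal cycle: subtract each calendar month's average.
import numpy as np

def monthly_anomaly(series):
    """series: 1-D array of monthly values starting in January."""
    series = np.asarray(series, dtype=float)
    anoms = np.empty_like(series)
    for m in range(12):
        idx = np.arange(m, len(series), 12)
        anoms[idx] = series[idx] - series[idx].mean()
    return anoms

x = np.arange(48) % 12 + 0.0          # a pure repeating seasonal cycle...
anoms = monthly_anomaly(x)
print(np.abs(anoms).max())            # ...has zero anomaly everywhere
```

    Because the seasonal cycle is absorbed by the month-specific means (or, in the paper, the categorical month predictor), the anomalies carry only the year-to-year signal that the regression is meant to explain.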

  18. Branes in Poisson sigma models

    International Nuclear Information System (INIS)

    Falceto, Fernando

    2010-01-01

    In this review we discuss possible boundary conditions (branes) for the Poisson sigma model. We show how to carry out the perturbative quantization in the presence of a general pre-Poisson brane and how this is related to the deformation quantization of Poisson structures. We conclude with an open problem: the perturbative quantization of the system when the boundary has several connected components and we use a different pre-Poisson brane in every component.

  19. [Application of negative binomial regression and modified Poisson regression in the research of risk factors for injury frequency].

    Science.gov (United States)

    Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan

    2011-11-01

    To explore the application of negative binomial regression and modified Poisson regression in analyzing the influential factors for injury frequency and the risk factors leading to an increase in injury frequency, 2917 primary and secondary school students were selected from Hefei by a cluster random sampling method and surveyed by questionnaire. The count data on injury events were used to fit modified Poisson regression and negative binomial regression models, so as to explore the risk factors increasing the frequency of unintentional injury among juvenile students and to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model showed over-dispersion, so the modified Poisson regression and negative binomial regression models fitted the data better. Both showed that male gender, younger age, a father working outside of the hometown, a guardian with education above junior high school, and smoking might be associated with higher injury frequencies. For clustered frequency data on injury events, both modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better, and this model could give a more accurate interpretation of the relevant factors affecting the frequency of injury.

  20. Performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data.

    Science.gov (United States)

    Yelland, Lisa N; Salter, Amy B; Ryan, Philip

    2011-10-15

    Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
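    A pure-numpy sketch of the modified Poisson idea: fit a log-link Poisson model to a binary outcome, then use the robust (sandwich) covariance, because the Poisson variance is misspecified for 0/1 data. The data are synthetic, and the clustering/GEE aspect that is the article's focus is not shown.

```python
# Modified Poisson regression for relative risk with a sandwich variance.
import numpy as np

rng = np.random.default_rng(7)
n = 20_000
x = rng.integers(0, 2, n)                       # binary exposure
p = np.exp(-1.5 + np.log(1.6) * x)              # true relative risk = 1.6
y = rng.binomial(1, p)                          # binary outcome
X = np.column_stack([np.ones(n), x])

# IRLS fit of the log-link Poisson model (fixed number of iterations).
beta = np.zeros(2)
for _ in range(50):
    mu = np.exp(X @ beta)
    z = X @ beta + (y - mu) / mu
    beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))

mu = np.exp(X @ beta)
A = X.T @ (mu[:, None] * X)                     # "bread": information matrix
Xe = X * (y - mu)[:, None]                      # score contributions
B = Xe.T @ Xe                                   # "meat": empirical outer product
robust_cov = np.linalg.inv(A) @ B @ np.linalg.inv(A)

rr = np.exp(beta[1])                            # estimated relative risk
se = np.sqrt(robust_cov[1, 1])                  # robust SE of log RR
print(rr, se)
```

    Exponentiating the exposure coefficient gives the relative risk directly, which is the main attraction of this approach over logistic regression's odds ratio.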

  1. A study of the dengue epidemic and meteorological factors in Guangzhou, China, by using a zero-inflated Poisson regression model.

    Science.gov (United States)

    Wang, Chenggang; Jiang, Baofa; Fan, Jingchun; Wang, Furong; Liu, Qiyong

    2014-01-01

    The aim of this study is to develop a model that correctly identifies and quantifies the relationship between dengue and meteorological factors in Guangzhou, China. By cross-correlation analysis, meteorological variables and their lag effects were determined. According to the epidemic characteristics of dengue in Guangzhou, those statistically significant variables were modeled by a zero-inflated Poisson regression model. The number of dengue cases and minimum temperature at 1-month lag, along with average relative humidity at 0- to 1-month lag were all positively correlated with the prevalence of dengue fever, whereas wind velocity and temperature in the same month along with rainfall at 2 months' lag showed negative association with dengue incidence. Minimum temperature at 1-month lag and wind velocity in the same month had a greater impact on the dengue epidemic than other variables in Guangzhou.
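    Lag selection by cross-correlation, as in the study above, can be sketched with a synthetic series whose effect operates at a 1-month lag:

```python
# Cross-correlate case counts with a covariate at several candidate lags.
import numpy as np

def lagged_corr(cases, covariate, lag):
    """Correlation of cases[t] with covariate[t - lag]."""
    if lag == 0:
        return np.corrcoef(cases, covariate)[0, 1]
    return np.corrcoef(cases[lag:], covariate[:-lag])[0, 1]

rng = np.random.default_rng(3)
temp = rng.normal(size=120)                    # 10 years of monthly anomalies
# Synthetic cases driven by last month's temperature:
cases = 5 + 2.0 * np.roll(temp, 1) + rng.normal(scale=0.5, size=120)

best = max(range(4), key=lambda L: lagged_corr(cases, temp, L))
print(best, lagged_corr(cases, temp, best))
```

    The 1-month lag dominates by construction; in practice the lags chosen this way become the lagged covariates entering the zero-inflated Poisson model.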

  2. Reprint of "Modelling the influence of temperature and rainfall on malaria incidence in four endemic provinces of Zambia using semiparametric Poisson regression".

    Science.gov (United States)

    Shimaponda-Mataa, Nzooma M; Tembo-Mwase, Enala; Gebreslasie, Michael; Achia, Thomas N O; Mukaratirwa, Samson

    2017-11-01

    Although malaria morbidity and mortality are greatly reduced globally owing to great control efforts, the disease remains a major public health burden. In Zambia, all provinces are malaria endemic. However, the transmission intensities vary, mainly depending on environmental factors as they interact with the vectors. Generally in Africa, possibly due to the varying perspectives and methods used, there is variation in the relative importance of malaria risk determinants. In Zambia, the role climatic factors play in malaria case rates has not been determined jointly over space and time using robust modelling methods. This is critical considering the reversal in malaria reduction after the year 2010 and the variation by transmission zones. Using a geoadditive or structured additive semiparametric Poisson regression model, we determined the influence of climatic factors on malaria incidence in four endemic provinces of Zambia. We demonstrate a strong positive association between malaria incidence and precipitation as well as minimum temperature. The risk of malaria was 95% lower in Lusaka (ARR=0.05, 95% CI=0.04-0.06) and 68% lower in the Western Province (ARR=0.31, 95% CI=0.25-0.41) compared to Luapula Province. North-western Province did not vary from Luapula Province. The effects of geographical region are clearly demonstrated by the unique behaviour and effects of minimum and maximum temperatures in the four provinces. Environmental factors such as landscape in urbanised places may also be playing a role. Copyright © 2017 Elsevier B.V. All rights reserved.

  3. Comparison of In Vitro Fertilization/Intracytoplasmic Sperm Injection Cycle Outcome in Patients with and without Polycystic Ovary Syndrome: A Modified Poisson Regression Model.

    Science.gov (United States)

    Almasi-Hashiani, Amir; Mansournia, Mohammad Ali; Sepidarkish, Mahdi; Vesali, Samira; Ghaheri, Azadeh; Esmailzadeh, Arezoo; Omani-Samani, Reza

    2018-01-01

    Polycystic ovary syndrome (PCOS) is a frequent condition in reproductive age women, with a prevalence rate of 5-10%. This study intends to determine the relationship between PCOS and the outcome of assisted reproductive treatment (ART) in Tehran, Iran. In this historical cohort study, we included 996 infertile women who were referred to Royan Institute (Tehran, Iran) between January 2012 and December 2013. PCOS, as the main variable, and other potential confounder variables were gathered. Modified Poisson regression was used for data analysis. Stata software, version 13, was used for all statistical analyses. Unadjusted analysis showed a significantly lower risk for failure in PCOS cases compared to cases without PCOS [risk ratio (RR): 0.79, 95% confidence intervals (CI): 0.66-0.95, P=0.014]. After adjusting for the confounder variables, there was no difference between risk of non-pregnancy in women with and without PCOS (RR: 0.87, 95% CI: 0.72-1.05, P=0.15). Significant predictors of the ART outcome included the treatment protocol type, numbers of embryos transferred (grades A and AB), numbers of injected ampules, and age. The results obtained from this model showed no difference between patients with and without PCOS according to the risk for non-pregnancy. Therefore, other factors might affect conception in PCOS patients. Copyright© by Royan Institute. All rights reserved.

  4. Zero-inflated Poisson regression models for QTL mapping applied to tick-resistance in a Gyr x Holstein F2 population

    Directory of Open Access Journals (Sweden)

    Fabyano Fonseca Silva

    2011-01-01

    Nowadays, an important and interesting alternative in the control of tick-infestation in cattle is to select resistant animals, and identify the respective quantitative trait loci (QTLs) and DNA markers, for posterior use in breeding programs. The number of ticks/animal is characterized as a discrete-counting trait, which could potentially follow Poisson distribution. However, in the case of an excess of zeros, due to the occurrence of several noninfected animals, zero-inflated Poisson and generalized zero-inflated distribution (GZIP) may provide a better description of the data. Thus, the objective here was to compare through simulation, Poisson and ZIP models (simple and generalized) with classical approaches, for QTL mapping with counting phenotypes under different scenarios, and to apply these approaches to a QTL study of tick resistance in an F2 cattle (Gyr x Holstein) population. It was concluded that, when working with zero-inflated data, it is recommendable to use the generalized and simple ZIP model for analysis. On the other hand, when working with data with zeros, but not zero-inflated, the Poisson model or a data-transformation approach, such as square-root or Box-Cox transformation, are applicable.

  5. Zero-inflated Poisson regression models for QTL mapping applied to tick-resistance in a Gyr × Holstein F2 population

    Science.gov (United States)

    Silva, Fabyano Fonseca; Tunin, Karen P.; Rosa, Guilherme J.M.; da Silva, Marcos V.B.; Azevedo, Ana Luisa Souza; da Silva Verneque, Rui; Machado, Marco Antonio; Packer, Irineu Umberto

    2011-01-01

    Nowadays, an important and interesting alternative in the control of tick-infestation in cattle is to select resistant animals, and identify the respective quantitative trait loci (QTLs) and DNA markers, for posterior use in breeding programs. The number of ticks/animal is characterized as a discrete-counting trait, which could potentially follow Poisson distribution. However, in the case of an excess of zeros, due to the occurrence of several noninfected animals, zero-inflated Poisson and generalized zero-inflated distribution (GZIP) may provide a better description of the data. Thus, the objective here was to compare through simulation, Poisson and ZIP models (simple and generalized) with classical approaches, for QTL mapping with counting phenotypes under different scenarios, and to apply these approaches to a QTL study of tick resistance in an F2 cattle (Gyr × Holstein) population. It was concluded that, when working with zero-inflated data, it is recommendable to use the generalized and simple ZIP model for analysis. On the other hand, when working with data with zeros, but not zero-inflated, the Poisson model or a data-transformation-approach, such as square-root or Box-Cox transformation, are applicable. PMID:22215960

  6. Detecting overdispersion in count data: A zero-inflated Poisson regression analysis

    Science.gov (United States)

    Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah

    2017-09-01

    This study focuses on analysing count data of butterfly communities in Jasin, Melaka. In the analysis of a count dependent variable, the Poisson regression model is the benchmark model for regression analysis. Continuing from previous literature that used Poisson regression analysis, this study applies zero-inflated Poisson (ZIP) regression analysis to gain better precision in analysing the count data of butterfly communities in Jasin, Melaka. When extra zeros are present, Poisson regression should be abandoned in favour of count data models that take the extra zeros into account explicitly; by far one of the most popular such models is the ZIP regression model. The data on butterfly communities, referred to here as the number of subjects, were collected in Jasin, Melaka and consisted of 131 subject visits. Since the researchers are considering the number of subjects, the data set covers five families of butterfly, which represent the five variables involved in the analysis, namely the types of subjects. The ZIP analysis used the SAS procedure for overdispersion in analysing zero values, and the main purpose of continuing the previous study was to compare which model performs better when zero values exist in the observed count data. The analysis used AIC, BIC and the Vuong test at the 5% significance level in order to achieve the objectives. The findings indicate the presence of overdispersion in analysing zero values, and that the ZIP regression model is better than the Poisson regression model when zero values exist.
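The excess-zeros comparison described above can be sketched numerically. The following is a minimal illustration with simulated counts, not the SAS-based analysis of the study: for an intercept-only ZIP model the likelihood equations reduce to two conditions that a short bisection can solve. The function name, simulation settings, and data are all hypothetical.

```python
import numpy as np

def fit_zip_intercept_only(y):
    """MLE for an intercept-only zero-inflated Poisson model.

    The score equations reduce to (a standard result):
      lambda / (1 - exp(-lambda))   = mean of the strictly positive counts
      (1 - pi) * (1 - exp(-lambda)) = observed proportion of positive counts
    The first is solved for lambda by bisection; the second then gives pi.
    """
    y = np.asarray(y)
    n, n0 = len(y), int(np.sum(y == 0))
    ybar_pos = y.sum() / (n - n0)      # mean of the positive counts (> 1)
    lo, hi = 1e-8, ybar_pos            # the root is bracketed in (0, ybar_pos)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid / (1 - np.exp(-mid)) < ybar_pos:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    pi = 1 - (1 - n0 / n) / (1 - np.exp(-lam))
    return lam, pi

rng = np.random.default_rng(0)
n, true_pi, true_lam = 5000, 0.4, 3.0
structural_zero = rng.random(n) < true_pi
y = np.where(structural_zero, 0, rng.poisson(true_lam, n))

# Excess-zeros diagnostic: zeros observed vs. zeros a plain Poisson fit expects
lam_pois = y.mean()
expected_zeros = n * np.exp(-lam_pois)
observed_zeros = int(np.sum(y == 0))   # far larger than expected_zeros here

lam_hat, pi_hat = fit_zip_intercept_only(y)
```

The large gap between observed and Poisson-expected zeros is exactly the situation in which the abstracts above recommend abandoning the plain Poisson model for a ZIP model.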

  7. Methods for estimating disease transmission rates: Evaluating the precision of Poisson regression and two novel methods

    DEFF Research Database (Denmark)

    Kirkeby, Carsten Thure; Hisham Beshara Halasa, Tariq; Gussmann, Maya Katrin

    2017-01-01

    the transmission rate. We use data from the two simulation models and vary the sampling intervals and the size of the population sampled. We devise two new methods to determine transmission rate, and compare these to the frequently used Poisson regression method in both epidemic and endemic situations. For most...... tested scenarios these new methods perform similarly to or better than Poisson regression, especially in the case of long sampling intervals. We conclude that transmission rate estimates are easily biased, which is important to take into account when using these rates in simulation models....

  8. Relaxed Poisson cure rate models.

    Science.gov (United States)

    Rodrigues, Josemar; Cordeiro, Gauss M; Cancho, Vicente G; Balakrishnan, N

    2016-03-01

    The purpose of this article is to make the standard promotion cure rate model (Yakovlev and Tsodikov, ) more flexible by assuming that the number of lesions or altered cells after a treatment follows a fractional Poisson distribution (Laskin, ). It is proved that the well-known Mittag-Leffler relaxation function (Berberan-Santos, ) is a simple way to obtain a new cure rate model that is a compromise between the promotion and geometric cure rate models allowing for superdispersion. So, the relaxed cure rate model developed here can be considered as a natural and less restrictive extension of the popular Poisson cure rate model at the cost of an additional parameter, but a competitor to negative-binomial cure rate models (Rodrigues et al., ). Some mathematical properties of a proper relaxed Poisson density are explored. A simulation study and an illustration of the proposed cure rate model from the Bayesian point of view are finally presented. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Topological Poisson Sigma models on Poisson-Lie groups

    International Nuclear Information System (INIS)

    Calvo, Ivan; Falceto, Fernando; Garcia-Alvarez, David

    2003-01-01

    We solve the topological Poisson Sigma model for a Poisson-Lie group G and its dual G*. We show that the gauge symmetry for each model is given by its dual group that acts by dressing transformations on the target. The resolution of both models in the open geometry reveals that there exists a map from the reduced phase space of each model (P and P*) to the main symplectic leaf of the Heisenberg double (D_0) such that the symplectic forms on P, P* are obtained as the pull-back by those maps of the symplectic structure on D_0. This uncovers a duality between P and P* under the exchange of bulk degrees of freedom of one model with boundary degrees of freedom of the other one. We finally solve the Poisson Sigma model for the Poisson structure on G given by a pair of r-matrices that generalizes the Poisson-Lie case. The Hamiltonian analysis of the theory requires the introduction of a deformation of the Heisenberg double. (author)

  10. Development of planning level transportation safety tools using Geographically Weighted Poisson Regression.

    Science.gov (United States)

    Hadayeghi, Alireza; Shalaby, Amer S; Persaud, Bhagwant N

    2010-03-01

    A common technique used for the calibration of collision prediction models is the Generalized Linear Modeling (GLM) procedure with the assumption of Negative Binomial or Poisson error distribution. In this technique, fixed coefficients that represent the average relationship between the dependent variable and each explanatory variable are estimated. However, the stationary relationship assumed may hide some important spatial factors of the number of collisions at a particular traffic analysis zone. Consequently, the accuracy of such models for explaining the relationship between the dependent variable and the explanatory variables may be suspected since collision frequency is likely influenced by many spatially defined factors such as land use, demographic characteristics, and traffic volume patterns. The primary objective of this study is to investigate the spatial variations in the relationship between the number of zonal collisions and potential transportation planning predictors, using the Geographically Weighted Poisson Regression modeling technique. The secondary objective is to build on knowledge comparing the accuracy of Geographically Weighted Poisson Regression models to that of Generalized Linear Models. The results show that the Geographically Weighted Poisson Regression models are useful for capturing spatially dependent relationships and generally perform better than the conventional Generalized Linear Models. Copyright 2009 Elsevier Ltd. All rights reserved.
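The core idea of Geographically Weighted Poisson Regression is to refit a Poisson GLM at each location with kernel weights that decay with distance, so coefficients can vary over space. Below is a hedged one-dimensional sketch on simulated data; the functions `poisson_irls` and `gwpr_at`, the Gaussian kernel, the bandwidth, and all data are illustrative assumptions, not the calibration procedure of the study.

```python
import numpy as np

def poisson_irls(X, y, w, iters=25):
    """Weighted Poisson maximum likelihood (log link) by Newton-Raphson."""
    Xw = X * w[:, None]
    beta = np.linalg.solve(X.T @ Xw, Xw.T @ np.log(y + 0.5))  # stable start
    for _ in range(iters):
        mu = np.exp(np.clip(X @ beta, -30, 30))
        grad = X.T @ (w * (y - mu))
        hess = X.T @ (X * (w * mu)[:, None])
        beta += np.linalg.solve(hess, grad)
    return beta

def gwpr_at(u0, coords, X, y, bandwidth):
    """Local Poisson fit at location u0 with Gaussian kernel weights (GWPR idea)."""
    w = np.exp(-0.5 * ((coords - u0) / bandwidth) ** 2)
    return poisson_irls(X, y, w)

rng = np.random.default_rng(42)
n = 400
u = rng.random(n)                    # spatial coordinate (one-dimensional for brevity)
x = rng.normal(size=n)               # a planning-level predictor
true_slope = 1.0 + u                 # the predictor's effect drifts across space
y = rng.poisson(np.exp(0.5 + true_slope * x)).astype(float)
X = np.column_stack([np.ones(n), x])

west = gwpr_at(0.1, u, X, y, bandwidth=0.15)
east = gwpr_at(0.9, u, X, y, bandwidth=0.15)
# The local slope estimates rise from west to east, tracking the drifting truth,
# which a single global (fixed-coefficient) Poisson GLM would average away.
```

A global fit would return one compromise slope; the two local fits recover the spatial variation, which is the advantage the study reports for GWPR over conventional GLMs.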

  11. Short-Term Effects of Climatic Variables on Hand, Foot, and Mouth Disease in Mainland China, 2008-2013: A Multilevel Spatial Poisson Regression Model Accounting for Overdispersion.

    Science.gov (United States)

    Liao, Jiaqiang; Yu, Shicheng; Yang, Fang; Yang, Min; Hu, Yuehua; Zhang, Juying

    2016-01-01

    Hand, Foot, and Mouth Disease (HFMD) is a worldwide infectious disease. In China, many provinces have reported HFMD cases, especially the southern and southwestern provinces. Many studies have found a strong association between the incidence of HFMD and climatic factors such as temperature, rainfall, and relative humidity. However, few studies have analyzed cluster effects between various geographical units. The nonlinear relationships and lag effects between weekly HFMD cases and climatic variables were estimated for the period 2008-2013 using a polynomial distributed lag model. An extra-Poisson multilevel spatial polynomial model was used to model the relationship between weekly HFMD incidence and climatic variables after accounting for cluster effects, the provincial correlation structure of HFMD incidence, and overdispersion. Smoothing spline methods were used to detect threshold effects between climatic factors and HFMD incidence. HFMD incidence was spatially heterogeneous among provinces, and the scale measurement of overdispersion was 548.077. After controlling for long-term trends, spatial heterogeneity and overdispersion, temperature was strongly associated with HFMD incidence: weekly average temperature and weekly temperature difference showed approximately inverse-"V"-shaped and "V"-shaped relationships with HFMD incidence, with lag effects of 3 weeks and 2 weeks, respectively. Highly spatially correlated HFMD incidence was detected in the northern, central and southern provinces. Temperature explained most of the variation in HFMD incidence in the southern and northeastern provinces; after adjustment for temperature, the eastern and northern provinces still showed high variation in HFMD incidence. We found a relatively strong association between weekly HFMD incidence and weekly average temperature. The association between HFMD incidence and climatic variables was spatially heterogeneous across

  12. Short-Term Effects of Climatic Variables on Hand, Foot, and Mouth Disease in Mainland China, 2008–2013: A Multilevel Spatial Poisson Regression Model Accounting for Overdispersion

    Science.gov (United States)

    Yang, Fang; Yang, Min; Hu, Yuehua; Zhang, Juying

    2016-01-01

    Background Hand, Foot, and Mouth Disease (HFMD) is a worldwide infectious disease. In China, many provinces have reported HFMD cases, especially the southern and southwestern provinces. Many studies have found a strong association between the incidence of HFMD and climatic factors such as temperature, rainfall, and relative humidity. However, few studies have analyzed cluster effects between various geographical units. Methods The nonlinear relationships and lag effects between weekly HFMD cases and climatic variables were estimated for the period 2008–2013 using a polynomial distributed lag model. An extra-Poisson multilevel spatial polynomial model was used to model the relationship between weekly HFMD incidence and climatic variables after accounting for cluster effects, the provincial correlation structure of HFMD incidence, and overdispersion. Smoothing spline methods were used to detect threshold effects between climatic factors and HFMD incidence. Results HFMD incidence was spatially heterogeneous among provinces, and the scale measurement of overdispersion was 548.077. After controlling for long-term trends, spatial heterogeneity and overdispersion, temperature was strongly associated with HFMD incidence: weekly average temperature and weekly temperature difference showed approximately inverse-“V”-shaped and “V”-shaped relationships with HFMD incidence, with lag effects of 3 weeks and 2 weeks, respectively. Highly spatially correlated HFMD incidence was detected in the northern, central and southern provinces. Temperature explained most of the variation in HFMD incidence in the southern and northeastern provinces; after adjustment for temperature, the eastern and northern provinces still showed high variation in HFMD incidence. Conclusion We found a relatively strong association between weekly HFMD incidence and weekly average temperature. The association between HFMD incidence and climatic

  13. Non-Poisson Processes: Regression to Equilibrium Versus Equilibrium Correlation Functions

    Science.gov (United States)

    2004-07-07

    Physica A 347 (2005) 268–288 (www.elsevier.com/locate/physa). PACS: 05.40.-a; 89.75.-k; 02.50.Ey. Keywords: Stochastic processes; Non-Poisson processes; Liouville and Liouville-like equations; Correlation function. ...which is not legitimate with renewal non-Poisson processes, is a correct property if the deviation from the exponential relaxation is obtained by time...

  14. A Spline-Based Lack-Of-Fit Test for Independent Variable Effect in Poisson Regression.

    Science.gov (United States)

    Li, Chin-Shang; Tu, Wanzhu

    2007-05-01

    In regression analysis of count data, independent variables are often modeled by their linear effects under the assumption of log-linearity. In reality, the validity of such an assumption is rarely tested, and its use is at times unjustifiable. A lack-of-fit test is proposed for the adequacy of a postulated functional form of an independent variable within the framework of semiparametric Poisson regression models based on penalized splines. It offers added flexibility in accommodating the potentially non-loglinear effect of the independent variable. A likelihood ratio test is constructed for the adequacy of the postulated parametric form, for example log-linearity, of the independent variable effect. Simulations indicate that the proposed model performs well, and that a misspecified parametric model has much reduced power. An example is given.

  15. A SAS-macro for estimation of the cumulative incidence using Poisson regression

    DEFF Research Database (Denmark)

    Waltoft, Berit Lindum

    2009-01-01

    the hazard rates, and the hazard rates are often estimated by the Cox regression. This procedure may not be suitable for large studies due to limited computer resources. Instead one uses Poisson regression, which approximates the Cox regression. Rosthøj et al. presented a SAS-macro for the estimation...... of the cumulative incidences based on the Cox regression. I present the functional form of the probabilities and variances when using piecewise constant hazard rates and a SAS-macro for the estimation using Poisson regression. The use of the macro is demonstrated through examples and compared to the macro presented...
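The quantity the macro estimates, the cumulative incidence built from piecewise constant hazards, has a closed form on each interval, which a short function can illustrate. This is a sketch of the underlying formula only, not the SAS macro; the cut points and hazard values below are made-up illustrative numbers.

```python
import math

# Piecewise constant cause-specific hazards on the intervals [cuts[i], cuts[i+1])
cuts = [0.0, 1.0, 2.0, 5.0]
h1 = [0.10, 0.05, 0.02]   # hazard for the event of interest (illustrative)
h2 = [0.03, 0.03, 0.01]   # hazard for the competing event (illustrative)

def cumulative_incidence(cause, t_end):
    """CIF_k(t) = integral of h_k(u) * S(u) du, with S(u) the overall survival.

    With piecewise constant hazards each interval contributes
    (h_k / h_tot) * S(t_prev) * (1 - exp(-h_tot * dt)) in closed form.
    """
    cif, log_s, t_prev = 0.0, 0.0, cuts[0]
    for i in range(len(h1)):
        t_next = min(cuts[i + 1], t_end)
        if t_next <= t_prev:
            break
        h_tot = h1[i] + h2[i]
        h_k = (h1 if cause == 1 else h2)[i]
        cif += (h_k / h_tot) * math.exp(log_s) * (1 - math.exp(-h_tot * (t_next - t_prev)))
        log_s -= h_tot * (t_next - t_prev)
        t_prev = t_next
    return cif

# Sanity check of the construction: the cause-specific cumulative incidences
# partition the overall failure probability, CIF_1(t) + CIF_2(t) = 1 - S(t).
total_5 = cumulative_incidence(1, 5.0) + cumulative_incidence(2, 5.0)
```

Fitting the hazards themselves by Poisson regression on event counts with log person-time offsets, as the abstract describes, would supply the `h1`/`h2` values that are hard-coded here.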

  16. An application of the Autoregressive Conditional Poisson (ACP) model

    CSIR Research Space (South Africa)

    Holloway, Jennifer P

    2010-11-01

    Full Text Available When modelling count data that comes in the form of a time series, the static Poisson regression and standard time series models are often not appropriate. A current study therefore involves the evaluation of several observation-driven and parameter...

  17. Evaluating the double Poisson generalized linear model.

    Science.gov (United States)

    Zou, Yaotian; Geedipally, Srinivas Reddy; Lord, Dominique

    2013-10-01

    The objectives of this study are to: (1) examine the applicability of the double Poisson (DP) generalized linear model (GLM) for analyzing motor vehicle crash data characterized by over- and under-dispersion and (2) compare the performance of the DP GLM with the Conway-Maxwell-Poisson (COM-Poisson) GLM in terms of goodness-of-fit and theoretical soundness. The DP distribution has seldom been investigated and applied since its first introduction two decades ago. The hurdle for applying the DP is related to its normalizing constant (or multiplicative constant) which is not available in closed form. This study proposed a new method to approximate the normalizing constant of the DP with high accuracy and reliability. The DP GLM and COM-Poisson GLM were developed using two observed over-dispersed datasets and one observed under-dispersed dataset. The modeling results indicate that the DP GLM with its normalizing constant approximated by the new method can handle crash data characterized by over- and under-dispersion. Its performance is comparable to the COM-Poisson GLM in terms of goodness-of-fit (GOF), although COM-Poisson GLM provides a slightly better fit. For the over-dispersed data, the DP GLM performs similar to the NB GLM. Considering the fact that the DP GLM can be easily estimated with inexpensive computation and that it is simpler to interpret coefficients, it offers a flexible and efficient alternative for researchers to model count data. Copyright © 2013 Elsevier Ltd. All rights reserved.

  18. Affine Poisson Groups and WZW Model

    Directory of Open Access Journals (Sweden)

    Ctirad Klimcík

    2008-01-01

    Full Text Available We give a detailed description of a dynamical system which enjoys a Poisson-Lie symmetry with two non-isomorphic dual groups. The system is obtained by taking the q → ∞ limit of the q-deformed WZW model and the understanding of its symmetry structure results in uncovering an interesting duality of its exchange relations.

  19. Regression modeling methods, theory, and computation with SAS

    CERN Document Server

    Panik, Michael

    2009-01-01

    Regression Modeling: Methods, Theory, and Computation with SAS provides an introduction to a diverse assortment of regression techniques using SAS to solve a wide variety of regression problems. The author fully documents the SAS programs and thoroughly explains the output produced by the programs.The text presents the popular ordinary least squares (OLS) approach before introducing many alternative regression methods. It covers nonparametric regression, logistic regression (including Poisson regression), Bayesian regression, robust regression, fuzzy regression, random coefficients regression,

  20. MODELING THE NUMBER OF SCHOOL DROPOUTS IN BALI PROVINCE WITH A SEMI-PARAMETRIC GEOGRAPHICALLY WEIGHTED POISSON REGRESSION APPROACH

    Directory of Open Access Journals (Sweden)

    GUSTI AYU RATIH ASTARI

    2013-11-01

    Full Text Available The dropout number is one of the important indicators for measuring human resource progress in the education sector. This research uses the Semi-parametric Geographically Weighted Poisson Regression approach to obtain the best model and to determine the factors influencing the primary-education dropout number in Bali. The analysis results show that there are no significant differences among the Poisson regression model, GWPR, and Semi-parametric GWPR. Factors which significantly influence the primary-education dropout number in Bali are the ratio of students to schools, the ratio of students to teachers, the number of families in which the father's highest education is elementary or junior high school, illiteracy rates, and the average number of family members.

  1. Analysis of Relationship Between Personality and Favorite Places with Poisson Regression Analysis

    Directory of Open Access Journals (Sweden)

    Yoon Song Ha

    2018-01-01

    Full Text Available A relationship between human personality and preferred locations has long been conjectured in human mobility research. In this paper, we analyzed the relationship between personality and visited places with Poisson regression, which can model the correlation between a count-valued dependent variable and independent variables. For this analysis, personality data from 33 volunteers and data on 49 location categories were used. The raw location data were preprocessed by normalizing them into rates of visit and pruning outliers. In the regression analysis, the independent variables are the personality data and the dependent variables are the preprocessed location data. Several meaningful results were found; for example, persons with a high tendency to visit university laboratories frequently have high conscientiousness and low openness. Other meaningful location categories are also presented in this paper.

  2. Exploring factors associated with traumatic dental injuries in preschool children: a Poisson regression analysis.

    Science.gov (United States)

    Feldens, Carlos Alberto; Kramer, Paulo Floriani; Ferreira, Simone Helena; Spiguel, Mônica Hermann; Marquezan, Marcela

    2010-04-01

    This cross-sectional study aimed to investigate the factors associated with dental trauma in preschool children using Poisson regression analysis with robust variance. The study population comprised 888 children aged 3 to 5 years attending public nurseries in Canoas, southern Brazil. Questionnaires assessing information related to the independent variables (age, gender, race, mother's educational level and family income) were completed by the parents. Clinical examinations were carried out by five trained examiners in order to assess traumatic dental injuries (TDI) according to Andreasen's classification; one of the five examiners was calibrated to assess orthodontic characteristics (open bite and overjet). Multivariable Poisson regression analysis with robust variance was used to determine the factors associated with dental trauma as well as the strengths of association. Traditional logistic regression was also performed in order to compare the estimates obtained by both methods of statistical analysis. Overall, 36.4% (323/888) of the children had suffered dental trauma, and there was no difference in prevalence rates from 3 to 5 years of age. Poisson regression analysis showed that the probability of the outcome was almost 30% higher for children whose mothers had more than 8 years of education (Prevalence Ratio = 1.28; 95% CI = 1.03-1.60) and 63% higher for children with an overjet greater than 2 mm (Prevalence Ratio = 1.63; 95% CI = 1.31-2.03). Odds ratios clearly overestimated the size of the effect when compared with prevalence ratios. These findings indicate the need for preventive orientation regarding TDI, in order to educate parents and caregivers about supervising infants, particularly those with increased overjet and whose mothers have a higher level of education. Poisson regression with robust variance represents a better alternative than logistic regression for estimating the risk of dental trauma in preschool children.
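The key methodological point above, Poisson regression with robust variance for estimating prevalence ratios from binary outcomes, can be sketched in a few lines. With a single binary covariate the Poisson MLE has a closed form, and the robust (sandwich) variance of the log prevalence ratio coincides with the delta-method formula used below. The exposure variable and all numbers are simulated, hypothetical stand-ins, not the study's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
exposed = rng.integers(0, 2, n)               # hypothetical binary exposure
p_true = np.where(exposed == 1, 0.48, 0.30)   # true prevalences, so PR = 1.6
outcome = (rng.random(n) < p_true).astype(float)

# With one binary covariate, the Poisson-GLM (log link) MLE is closed-form:
p1, p0 = outcome[exposed == 1].mean(), outcome[exposed == 0].mean()
pr = p1 / p0                                  # prevalence ratio = exp(beta_1)

# Robust (sandwich) SE of log(PR); the model-based Poisson variance would be
# too large for binary outcomes, since Var(Y) = p(1-p) < p = E(Y).
m1, m0 = (exposed == 1).sum(), (exposed == 0).sum()
se_log_pr = np.sqrt((1 - p1) / (m1 * p1) + (1 - p0) / (m0 * p0))
ci = np.exp(np.log(pr) + np.array([-1.96, 1.96]) * se_log_pr)
```

This is why the abstract reports prevalence ratios near the truth with valid intervals, while odds ratios (exp of logistic coefficients) overstate the effect when the outcome is common.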

  3. A Poisson Regression Examination of the Relationship between Website Traffic and Search Engine Queries

    OpenAIRE

    Tierney, Heather L.R.; Pan, Bing

    2010-01-01

    A new area of research involves the use of Google data, which has been normalized and scaled to predict economic activity. This new source of data offers many advantages as well as disadvantages, which are discussed through the use of daily and weekly data. Daily and weekly data are employed to show the effect of aggregation as it pertains to Google data, which can lead to contradictory findings. In this paper, Poisson regressions are used to explore the relationship between the online...

  4. Quasi-Poisson versus negative binomial regression models in identifying factors affecting initial CD4 cell count change due to antiretroviral therapy administered to HIV-positive adults in North-West Ethiopia (Amhara region).

    Science.gov (United States)

    Seyoum, Awoke; Ndlovu, Principal; Zewotir, Temesgen

    2016-01-01

    CD4 cells are a type of white blood cell that plays a significant role in protecting humans from infectious diseases. Lack of information on factors associated with CD4 cell count reduction is an obstacle to improvement of cell counts in HIV-positive adults. Therefore, the main objective of this study was to investigate baseline factors that could affect initial CD4 cell count change after highly active antiretroviral therapy had been given to adult patients in North West Ethiopia. A retrospective cross-sectional study was conducted among 792 HIV-positive adult patients who had already been on antiretroviral therapy for 1 month. A Chi-square test of association was used to assess the effect of predictor covariates on the variable of interest. The data were from a secondary source and were modeled using generalized linear models, especially Quasi-Poisson regression. The patients' CD4 cell count change within a month ranged from 0 to 109 cells/mm3, with a mean of 15.9 cells/mm3 and standard deviation of 18.44 cells/mm3. The first-month CD4 cell count change was significantly affected by poor adherence to highly active antiretroviral therapy (aRR = 0.506, P value = 2e-16), fair adherence (aRR = 0.592, P value = 0.0120), initial CD4 cell count (aRR = 1.0212, P value = 1.54e-15), low household income (aRR = 0.63, P value = 0.671e-14), middle income (aRR = 0.74, P value = 0.629e-12), patients without a cell phone (aRR = 0.67, P value = 0.615e-16), WHO stage 2 (aRR = 0.91, P value = 0.0078), WHO stage 3 (aRR = 0.91, P value = 0.0058), WHO stage 4 (aRR = 0.876, P value = 0.0214), age (aRR = 0.987, P value = 0.000) and weight (aRR = 1.0216, P value = 3.98e-14). Adherence to antiretroviral therapy, initial CD4 cell count, household income, WHO stage, age, weight and cell phone ownership played a major role in the variation of CD4 cell count in our data. Hence, we recommend a close follow-up of patients to ensure adherence to the prescribed medication for
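The quasi-Poisson approach named in the title keeps the Poisson point estimates but rescales their standard errors by an estimated dispersion. A minimal numpy sketch on simulated overdispersed counts follows; the covariate, the gamma mixing, and every setting are illustrative assumptions, not the study's clinical data.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000
x = rng.normal(size=n)
mu_true = np.exp(1.0 + 0.3 * x)
# Overdispersed counts via a Poisson-gamma mixture: Var(Y) = mu + mu^2 / k > mu
k = 2.0
y = rng.poisson(rng.gamma(k, mu_true / k)).astype(float)

X = np.column_stack([np.ones(n), x])
# Poisson GLM by Newton-Raphson (quasi-Poisson keeps these point estimates)
beta = np.linalg.solve(X.T @ X, X.T @ np.log(y + 0.5))  # stable starting values
for _ in range(30):
    mu = np.exp(np.clip(X @ beta, -30, 30))
    beta += np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
mu = np.exp(X @ beta)

# Quasi-Poisson: estimate the dispersion phi from the Pearson statistic and
# inflate the model-based standard errors by sqrt(phi)
phi = np.sum((y - mu) ** 2 / mu) / (n - 2)
cov_model = np.linalg.inv(X.T @ (X * mu[:, None]))
se_poisson = np.sqrt(np.diag(cov_model))
se_quasi = np.sqrt(phi) * se_poisson
```

With equidispersed data `phi` stays near 1 and the two sets of standard errors coincide; here the mixture makes `phi` clearly exceed 1, so naive Poisson standard errors would be too small, which is the motivation for the quasi-Poisson and negative binomial comparison in the abstract.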

  5. Conditional Poisson models: a flexible alternative to conditional logistic case cross-over analysis.

    Science.gov (United States)

    Armstrong, Ben G; Gasparrini, Antonio; Tobias, Aurelio

    2014-11-24

    The time stratified case cross-over approach is a popular alternative to conventional time series regression for analysing associations between time series of environmental exposures (air pollution, weather) and counts of health outcomes. These are almost always analyzed using conditional logistic regression on data expanded to case-control (case crossover) format, but this has some limitations. In particular, adjusting for overdispersion and auto-correlation in the counts is not possible. It has been established that a Poisson model for counts with stratum indicators gives identical estimates to those from conditional logistic regression and does not have these limitations, but it is little used, probably because of the overheads in estimating many stratum parameters. The conditional Poisson model avoids estimating stratum parameters by conditioning on the total event count in each stratum, thus simplifying the computing and increasing the number of strata for which fitting is feasible compared with the standard unconditional Poisson model. Unlike the conditional logistic model, the conditional Poisson model does not require expanding the data, and can adjust for overdispersion and auto-correlation. It is available in Stata, R, and other packages. By applying the models to real data and using simulations, we demonstrate that conditional Poisson models were simpler to code and faster to run than conditional logistic analyses and can be fitted to larger data sets than is possible with standard Poisson models. Allowing for overdispersion or autocorrelation was possible with the conditional Poisson model, but when not required this model gave identical estimates to those from conditional logistic regression. Conditional Poisson regression models provide an alternative to case crossover analysis of stratified time series data with some advantages. The conditional Poisson model can also be used in other contexts in which primary control for confounding is by fine
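The equivalence stated above, a Poisson model with one indicator per stratum versus the conditional Poisson likelihood obtained by conditioning on stratum totals, can be checked numerically. This is a hedged sketch with simulated stratified counts; the stratum structure, exposure, and all settings are hypothetical, and neither route is the Stata/R implementation the abstract mentions.

```python
import numpy as np

rng = np.random.default_rng(3)
S, J = 50, 4                          # 50 strata, 4 time points per stratum
x = rng.normal(size=(S, J))           # exposure series (standardized, illustrative)
alpha = rng.normal(0.5, 1.0, size=S)  # nuisance stratum intercepts
b_true = 0.4
y = rng.poisson(np.exp(alpha[:, None] + b_true * x)).astype(float)

# Route 1: unconditional Poisson GLM with one indicator column per stratum.
n = S * J
X = np.zeros((n, S + 1))
X[np.arange(n), np.repeat(np.arange(S), J)] = 1.0   # stratum indicators
X[:, S] = x.ravel()
yv = y.ravel()
beta = np.linalg.solve(X.T @ X, X.T @ np.log(yv + 0.5))  # stable start
for _ in range(50):
    mu = np.exp(np.clip(X @ beta, -30, 30))
    beta += np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (yv - mu))
b_uncond = beta[S]

# Route 2: conditional Poisson -- conditioning on each stratum's total count
# turns the likelihood into a multinomial; Newton on the scalar score.
tot = y.sum(axis=1)
b = 0.0
for _ in range(50):
    w = np.exp(b * x)
    p = w / w.sum(axis=1, keepdims=True)      # within-stratum cell probabilities
    score = float((y * x).sum() - (tot * (p * x).sum(axis=1)).sum())
    var_x = (p * x ** 2).sum(axis=1) - (p * x).sum(axis=1) ** 2
    b += score / float((tot * var_x).sum())
b_cond = b
# b_uncond and b_cond agree to numerical precision, but Route 2 never
# estimates the 50 stratum parameters.
```

Route 2 solves a one-parameter problem regardless of the number of strata, which is exactly the computational advantage the abstract describes.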

  6. Nonlocal Poisson-Fermi model for ionic solvent.

    Science.gov (United States)

    Xie, Dexuan; Liu, Jinn-Liang; Eisenberg, Bob

    2016-07-01

    We propose a nonlocal Poisson-Fermi model for ionic solvent that includes ion size effects and polarization correlations among water molecules in the calculation of electrostatic potential. It includes the previous Poisson-Fermi models as special cases, and its solution is the convolution of a solution of the corresponding nonlocal Poisson dielectric model with a Yukawa-like kernel function. The Fermi distribution is shown to be a set of optimal ionic concentration functions in the sense of minimizing an electrostatic potential free energy. Numerical results are reported to show the difference between a Poisson-Fermi solution and a corresponding Poisson solution.

  7. Poisson regression analysis of the mortality among a cohort of World War II nuclear industry workers

    International Nuclear Information System (INIS)

    Frome, E.L.; Cragle, D.L.; McLain, R.W.

    1990-01-01

    A historical cohort mortality study was conducted among 28,008 white male employees who had worked for at least 1 month in Oak Ridge, Tennessee, during World War II. The workers were employed at two plants that were producing enriched uranium and a research and development laboratory. Vital status was ascertained through 1980 for 98.1% of the cohort members and death certificates were obtained for 96.8% of the 11,671 decedents. A modified version of the traditional standardized mortality ratio (SMR) analysis was used to compare the cause-specific mortality experience of the World War II workers with the U.S. white male population. An SMR and a trend statistic were computed for each cause-of-death category for the 30-year interval from 1950 to 1980. The SMR for all causes was 1.11, and there was a significant upward trend of 0.74% per year. The excess mortality was primarily due to lung cancer and diseases of the respiratory system. Poisson regression methods were used to evaluate the influence of duration of employment, facility of employment, socioeconomic status, birth year, period of follow-up, and radiation exposure on cause-specific mortality. Maximum likelihood estimates of the parameters in a main-effects model were obtained to describe the joint effects of these six factors on cause-specific mortality of the World War II workers. We show that these multivariate regression techniques provide a useful extension of conventional SMR analysis and illustrate their effective use in a large occupational cohort study

  8. Characterizing the performance of the Conway-Maxwell Poisson generalized linear model.

    Science.gov (United States)

    Francis, Royce A; Geedipally, Srinivas Reddy; Guikema, Seth D; Dhavala, Soma Sekhar; Lord, Dominique; LaRocca, Sarah

    2012-01-01

    Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression. © 2011 Society for Risk Analysis.
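The normalizing constant that complicates COM-Poisson (and double Poisson) fitting can be made concrete: Z(lambda, nu) = sum over j of lambda^j / (j!)^nu has no closed form and is commonly approximated by truncating the series. A small stdlib-only sketch of that standard truncation follows, not the MLE implementation the article assesses; the truncation point and parameter values are arbitrary choices.

```python
import math

def com_poisson_pmf(y, lam, nu, terms=200):
    """COM-Poisson pmf with Z(lam, nu) approximated by series truncation."""
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))  # log-sum-exp
    return math.exp(y * math.log(lam) - nu * math.lgamma(y + 1) - log_z)

def moments(lam, nu, terms=200):
    ps = [com_poisson_pmf(k, lam, nu, terms) for k in range(terms)]
    mean = sum(k * p for k, p in enumerate(ps))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(ps))
    return mean, var

m_eq, v_eq = moments(3.0, 1.0)    # nu = 1 recovers Poisson: mean = variance
m_un, v_un = moments(3.0, 2.0)    # nu > 1: under-dispersed (variance < mean)
m_ov, v_ov = moments(3.0, 0.5)    # nu < 1: over-dispersed (variance > mean)
```

The single shape parameter `nu` sweeping through under-, equi-, and over-dispersion is what makes the COM-Poisson GLM attractive for the crash and failure count data discussed in these abstracts.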

  9. Poisson-Boltzmann-Nernst-Planck model

    International Nuclear Information System (INIS)

    Zheng Qiong; Wei Guowei

    2011-01-01

    The Poisson-Nernst-Planck (PNP) model is based on a mean-field approximation of ion interactions and continuum descriptions of concentration and electrostatic potential. It provides qualitative explanation and increasingly quantitative predictions of experimental measurements for the ion transport problems in many areas such as semiconductor devices, nanofluidic systems, and biological systems, despite many limitations. While the PNP model gives a good prediction of the ion transport phenomenon for chemical, physical, and biological systems, the number of equations to be solved and the number of diffusion coefficient profiles to be determined for the calculation directly depend on the number of ion species in the system, since each ion species corresponds to one Nernst-Planck equation and one position-dependent diffusion coefficient profile. In a complex system with multiple ion species, the PNP can be computationally expensive and parameter demanding, as experimental measurements of diffusion coefficient profiles are generally quite limited for most confined regions such as ion channels, nanostructures and nanopores. We propose an alternative model to reduce number of Nernst-Planck equations to be solved in complex chemical and biological systems with multiple ion species by substituting Nernst-Planck equations with Boltzmann distributions of ion concentrations. As such, we solve the coupled Poisson-Boltzmann and Nernst-Planck (PBNP) equations, instead of the PNP equations. The proposed PBNP equations are derived from a total energy functional by using the variational principle. We design a number of computational techniques, including the Dirichlet to Neumann mapping, the matched interface and boundary, and relaxation based iterative procedure, to ensure efficient solution of the proposed PBNP equations. 
Two protein molecules, cytochrome c551 and Gramicidin A, are employed to validate the proposed model under a wide range of bulk ion concentrations and external

  10. Poisson-Boltzmann-Nernst-Planck model.

    Science.gov (United States)

    Zheng, Qiong; Wei, Guo-Wei

    2011-05-21

    The Poisson-Nernst-Planck (PNP) model is based on a mean-field approximation of ion interactions and continuum descriptions of concentration and electrostatic potential. It provides qualitative explanation and increasingly quantitative predictions of experimental measurements for ion transport problems in many areas such as semiconductor devices, nanofluidic systems, and biological systems, despite many limitations. While the PNP model gives a good prediction of the ion transport phenomenon for chemical, physical, and biological systems, the number of equations to be solved and the number of diffusion coefficient profiles to be determined directly depend on the number of ion species in the system, since each ion species corresponds to one Nernst-Planck equation and one position-dependent diffusion coefficient profile. In a complex system with multiple ion species, the PNP model can be computationally expensive and parameter demanding, as experimental measurements of diffusion coefficient profiles are generally quite limited for most confined regions such as ion channels, nanostructures, and nanopores. We propose an alternative model that reduces the number of Nernst-Planck equations to be solved in complex chemical and biological systems with multiple ion species by substituting Nernst-Planck equations with Boltzmann distributions of ion concentrations. As such, we solve the coupled Poisson-Boltzmann and Nernst-Planck (PBNP) equations instead of the PNP equations. The proposed PBNP equations are derived from a total energy functional by using the variational principle. We design a number of computational techniques, including Dirichlet-to-Neumann mapping, the matched interface and boundary method, and a relaxation-based iterative procedure, to ensure efficient solution of the proposed PBNP equations. 
Two protein molecules, cytochrome c551 and Gramicidin A, are employed to validate the proposed model under a wide range of bulk ion concentrations and external

  11. Double generalized linear compound poisson models to insurance claims data

    DEFF Research Database (Denmark)

    Andersen, Daniel Arnfeldt; Bonat, Wagner Hugo

    2017-01-01

    This paper describes the specification, estimation and comparison of double generalized linear compound Poisson models based on the likelihood paradigm. The models are motivated by insurance applications, where the distribution of the response variable is composed by a degenerate distribution...... implementation and illustrate the application of double generalized linear compound Poisson models using a data set about car insurances....

  12. Application of zero-inflated poisson mixed models in prognostic factors of hepatitis C.

    Science.gov (United States)

    Akbarzadeh Baghban, Alireza; Pourhoseingholi, Asma; Zayeri, Farid; Jafari, Ali Akbar; Alavian, Seyed Moayed

    2013-01-01

    In recent years, hepatitis C virus (HCV) infection has become a major public health problem. Evaluation of risk factors is one of the approaches that help protect people from infection. This study aims to employ zero-inflated Poisson mixed models to evaluate prognostic factors of hepatitis C. The data were collected from a longitudinal study during 2005-2010. First, a mixed Poisson regression (PR) model was fitted to the data. Then, a mixed zero-inflated Poisson model was fitted with compound Poisson random effects. For evaluating the performance of the proposed mixed model, standard errors of the estimators were compared. The results obtained from the mixed PR model showed that genotype 3 and treatment protocol were statistically significant. Results of the zero-inflated Poisson mixed model showed that age, sex, genotypes 2 and 3, the treatment protocol, and having risk factors had significant effects on the viral load of HCV patients. Of these two models, the estimators of the zero-inflated Poisson mixed model had the smaller standard errors, and this model provided the better fit. The proposed model can capture serial dependence, additional overdispersion, and excess zeros in longitudinal count data.
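
A zero-inflated Poisson mixes a point mass at zero with an ordinary Poisson count, which is what lets it absorb the excess zeros mentioned above. A minimal sketch with illustrative parameter values (not estimates from the hepatitis C data):

```python
import math

def zip_pmf(y, pi_zero, lam):
    """ZIP: a structural zero with probability pi_zero,
    otherwise an ordinary Poisson(lam) count."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    return pi_zero * (y == 0) + (1.0 - pi_zero) * poisson

pi_zero, lam = 0.3, 2.5              # illustrative values
p0_zip = zip_pmf(0, pi_zero, lam)    # inflated probability of zero
p0_poisson = math.exp(-lam)          # plain Poisson probability of zero

probs = [zip_pmf(y, pi_zero, lam) for y in range(60)]
mean = sum(y * p for y, p in enumerate(probs))
var = sum((y - mean) ** 2 * p for y, p in enumerate(probs))
# the extra zeros also make the marginal distribution overdispersed
```

Note that the zero inflation produces both excess zeros (p0_zip > p0_poisson) and overdispersion (var > mean), the two features the abstract highlights.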

  13. The analysis of incontinence episodes and other count data in patients with overactive bladder by Poisson and negative binomial regression.

    Science.gov (United States)

    Martina, R; Kay, R; van Maanen, R; Ridder, A

    2015-01-01

    Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non-parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
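
The random-effect route to the negative binomial described above can be sketched directly: give each observation a gamma-distributed Poisson rate, and the marginal counts come out overdispersed. The rates, sample size, and sampler below are illustrative choices, not from the trials.

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    p, k, target = 1.0, 0, math.exp(-lam)
    while True:
        p *= rng.random()
        if p <= target:
            return k
        k += 1

rng = random.Random(42)
n, shape, scale = 20000, 2.0, 1.5     # gamma random effect with mean 3.0
ys = [poisson_sample(rng.gammavariate(shape, scale), rng) for _ in range(n)]

mean = sum(ys) / n
var = sum((y - mean) ** 2 for y in ys) / n
# marginally negative binomial: var ~= mean + mean**2 / shape (= 7.5 here)
```

A plain Poisson sample would have var roughly equal to mean; the gamma mixing inflates the variance well beyond it.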

  14. Systematic review of treatment modalities for gingival depigmentation: a random-effects poisson regression analysis.

    Science.gov (United States)

    Lin, Yi Hung; Tu, Yu Kang; Lu, Chun Tai; Chung, Wen Chen; Huang, Chiung Fang; Huang, Mao Suan; Lu, Hsein Kun

    2014-01-01

    Repigmentation variably occurs with different treatment methods in patients with gingival pigmentation. A systematic review was conducted of various treatment modalities for eliminating melanin pigmentation of the gingiva, comprising bur abrasion, scalpel surgery, cryosurgery, electrosurgery, gingival grafts, and laser techniques, to compare the recurrence rates (Rrs) of these treatment procedures. Electronic databases, including PubMed, Web of Science, Google, and Medline, were comprehensively searched, and manual searches were conducted for studies published from January 1951 to June 2013. After applying inclusion and exclusion criteria, the final list of articles was reviewed in depth to achieve the objectives of this review. A Poisson regression was used to analyze the outcome of depigmentation using the various treatment methods. The systematic review was based mainly on case reports. In total, 61 eligible publications met the defined criteria. The various therapeutic procedures showed variable clinical results with a wide range of Rrs. A random-effects Poisson regression showed that cryosurgery (Rr = 0.32%), electrosurgery (Rr = 0.74%), and laser depigmentation (Rr = 1.16%) yielded superior results, whereas bur abrasion yielded the highest Rr (8.89%). Within the limits of the sampling level, the present evidence-based results show that cryosurgery exhibits the optimal predictability for depigmentation of the gingiva among all procedures examined, followed by electrosurgery and laser techniques. It is possible to treat melanin pigmentation of the gingiva with various methods and prevent repigmentation. Among these treatment modalities, cryosurgery, electrosurgery, and laser surgery appear to be the best choices for treating gingival pigmentation. © 2014 Wiley Periodicals, Inc.

  15. Analysis of Blood Transfusion Data Using Bivariate Zero-Inflated Poisson Model: A Bayesian Approach.

    Science.gov (United States)

    Mohammadi, Tayeb; Kheiri, Soleiman; Sedehi, Morteza

    2016-01-01

    Recognizing the factors affecting the number of blood donations and blood deferrals has a major impact on blood transfusion. There is a positive correlation between the variables "number of blood donation" and "number of blood deferral": as the number of returns for donation increases, so does the number of blood deferrals. On the other hand, because many donors never return to donate, there is an excess zero frequency for both of the above-mentioned variables. In this study, in order to account for the correlation and to explain the excess zero frequency, the bivariate zero-inflated Poisson regression model was used for joint modeling of the number of blood donations and the number of blood deferrals. The data were analyzed using the Bayesian approach, applying noninformative priors in the presence and absence of covariates. Estimation of the model parameters, that is, the correlation, zero-inflation parameter, and regression coefficients, was done through MCMC simulation. Finally, the double-Poisson model, bivariate Poisson model, and bivariate zero-inflated Poisson model were fitted to the data and compared using the deviance information criterion (DIC). The results showed that the bivariate zero-inflated Poisson regression model fitted the data better than the other models.

  16. 2D Poisson sigma models with gauged vectorial supersymmetry

    Energy Technology Data Exchange (ETDEWEB)

    Bonezzi, Roberto [Dipartimento di Fisica ed Astronomia, Università di Bologna and INFN, Sezione di Bologna,via Irnerio 46, I-40126 Bologna (Italy); Departamento de Ciencias Físicas, Universidad Andres Bello,Republica 220, Santiago (Chile); Sundell, Per [Departamento de Ciencias Físicas, Universidad Andres Bello,Republica 220, Santiago (Chile); Torres-Gomez, Alexander [Departamento de Ciencias Físicas, Universidad Andres Bello,Republica 220, Santiago (Chile); Instituto de Ciencias Físicas y Matemáticas, Universidad Austral de Chile-UACh,Valdivia (Chile)

    2015-08-12

    In this note, we gauge the rigid vectorial supersymmetry of the two-dimensional Poisson sigma model presented in arXiv:1503.05625. We show that the consistency of the construction does not impose any further constraints on the differential Poisson algebra geometry than those required for the ungauged model. We conclude by proposing that the gauged model provides a first-quantized framework for higher spin gravity.

  17. Network Traffic Monitoring Using Poisson Dynamic Linear Models

    Energy Technology Data Exchange (ETDEWEB)

    Merl, D. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2011-05-09

    In this article, we discuss an approach for network forensics using a class of nonstationary Poisson processes with embedded dynamic linear models. As a modeling strategy, the Poisson DLM (PoDLM) provides a very flexible framework for specifying structured effects that may influence the evolution of the underlying Poisson rate parameter, including diurnal and weekly usage patterns. We develop a novel particle learning algorithm for online smoothing and prediction for the PoDLM, and demonstrate the suitability of the approach to real-time deployment settings via a new application to computer network traffic monitoring.

  18. Improved Denoising via Poisson Mixture Modeling of Image Sensor Noise.

    Science.gov (United States)

    Zhang, Jiachao; Hirakawa, Keigo

    2017-04-01

    This paper describes a study aimed at comparing the real image sensor noise distribution to the models of noise often assumed in image denoising designs. A quantile analysis in pixel, wavelet transform, and variance stabilization domains reveal that the tails of Poisson, signal-dependent Gaussian, and Poisson-Gaussian models are too short to capture real sensor noise behavior. A new Poisson mixture noise model is proposed to correct the mismatch of tail behavior. Based on the fact that noise model mismatch results in image denoising that undersmoothes real sensor data, we propose a mixture of Poisson denoising method to remove the denoising artifacts without affecting image details, such as edge and textures. Experiments with real sensor data verify that denoising for real image sensor data is indeed improved by this new technique.
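
The claim that a Poisson mixture has heavier tails than a single Poisson of the same mean is easy to verify numerically; the two-component weights and rates below are arbitrary illustrative values, not fitted sensor parameters.

```python
import math

def poisson_pmf(y, lam):
    return math.exp(-lam) * lam ** y / math.factorial(y)

def mixture_pmf(y, weights, lams):
    return sum(w * poisson_pmf(y, l) for w, l in zip(weights, lams))

weights, lams = (0.8, 0.2), (2.0, 10.0)                # illustrative components
mix_mean = sum(w * l for w, l in zip(weights, lams))   # 3.6

# tail mass beyond y = 12 under the mixture vs. one Poisson with the same mean
single_tail = sum(poisson_pmf(y, mix_mean) for y in range(12, 60))
mix_tail = sum(mixture_pmf(y, weights, lams) for y in range(12, 60))
```

The minority high-rate component dominates the upper tail, so mix_tail exceeds single_tail by orders of magnitude, which is the kind of mismatch a short-tailed noise model cannot capture.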

  19. The Use of a Poisson Regression to Evaluate Antihistamines and Fatal Aircraft Mishaps in Instrument Meteorological Conditions.

    Science.gov (United States)

    Gildea, Kevin M; Hileman, Christy R; Rogers, Paul; Salazar, Guillermo J; Paskoff, Lawrence N

    2018-04-01

    Research indicates that first-generation antihistamine usage may impair pilot performance by increasing the likelihood of vestibular illusions, spatial disorientation, and/or cognitive impairment. Second- and third-generation antihistamines generally have fewer impairing side effects and are approved for pilot use. We hypothesized that toxicological findings positive for second- and third-generation antihistamines are less likely to be associated with pilots involved in fatal mishaps than first-generation antihistamines. The evaluated population consisted of 1475 U.S. civil pilots fatally injured between September 30, 2008, and October 1, 2014. Mishap factors evaluated included year, weather conditions, airman rating, recent airman flight time, quarter of year, and time of day. Due to the low prevalence of positive antihistamine findings, a count-based model was selected, which can account for rare outcomes. The means and variances were close for both regression models, supporting the assumption that the data follow a Poisson distribution: first-generation antihistamine mishap airmen (N = 582, M = 0.17, S2 = 0.17) versus second- and third-generation antihistamine mishap airmen (N = 116, M = 0.20, S2 = 0.18). The data indicate fewer airmen with second- and third-generation antihistamines than first-generation antihistamines in their system are fatally injured while flying in instrument meteorological conditions (IMC). Whether the lower incidence is a factor of greater usage of first-generation antihistamines versus second- and third-generation antihistamines by the pilot population or fewer deleterious side effects with second- and third-generation antihistamines is unclear. These results engender cautious optimism, but additional research is necessary to determine why these differences exist. Gildea KM, Hileman CR, Rogers P, Salazar GJ, Paskoff LN. The use of a Poisson regression to evaluate antihistamines and fatal aircraft mishaps in instrument meteorological conditions. 
Aerosp Med Hum Perform

  20. Modeling laser velocimeter signals as triply stochastic Poisson processes

    Science.gov (United States)

    Mayo, W. T., Jr.

    1976-01-01

    Previous models of laser Doppler velocimeter (LDV) systems have not adequately described dual-scatter signals in a manner useful for analysis and simulation of low-level photon-limited signals. At low photon rates, an LDV signal at the output of a photomultiplier tube is a compound nonhomogeneous filtered Poisson process, whose intensity function is another (slower) Poisson process with the nonstationary rate and frequency parameters controlled by a random flow (slowest) process. In the present paper, generalized Poisson shot noise models are developed for low-level LDV signals. Theoretical results useful in detection error analysis and simulation are presented, along with measurements of burst amplitude statistics. Computer generated simulations illustrate the difference between Gaussian and Poisson models of low-level signals.

  1. Association between large strongyle genera in larval cultures--using rare-event poisson regression.

    Science.gov (United States)

    Cao, X; Vidyashankar, A N; Nielsen, M K

    2013-09-01

    Decades of intensive anthelmintic treatment have caused equine large strongyles to become quite rare, while the cyathostomins have developed resistance to several drug classes. The larval culture has been associated with low to moderate negative predictive values for detecting Strongylus vulgaris infection. It is unknown whether detection of other large strongyle species can be statistically associated with the presence of S. vulgaris. This remains a statistical challenge because of the rare occurrence of large strongyle species. This study used a modified Poisson regression to analyse a dataset for associations between S. vulgaris infection and simultaneous occurrence of Strongylus edentatus and Triodontophorus spp. In 663 horses on 42 Danish farms, the individual prevalences of S. vulgaris, S. edentatus and Triodontophorus spp. were 12%, 3% and 12%, respectively. Both S. edentatus and Triodontophorus spp. were significantly associated with S. vulgaris infection, with relative risks above 1. Further, S. edentatus was associated with the use of selective therapy on the farms, and negatively associated with anthelmintic treatment carried out within 6 months prior to the study. The findings illustrate that occurrence of S. vulgaris in larval cultures can be interpreted as indicating that other large strongyles are likely to be present.
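
As background for the relative risks reported above: with a log link and a single binary exposure, the exponentiated regression coefficient of a Poisson-type model reduces to the simple risk ratio. The 2x2 counts below are hypothetical, not the Danish farm data.

```python
import math

# hypothetical 2x2 table: exposure status vs. whether infection was found
pos_exposed, n_exposed = 30, 200
pos_unexposed, n_unexposed = 10, 400

risk_exposed = pos_exposed / n_exposed          # 0.15
risk_unexposed = pos_unexposed / n_unexposed    # 0.025
rr = risk_exposed / risk_unexposed              # relative risk

# for a log-link model E[y] = exp(b0 + b1 * exposed), the MLE satisfies
# exp(b0) = risk_unexposed and exp(b0 + b1) = risk_exposed, hence:
b0 = math.log(risk_unexposed)
b1 = math.log(rr)    # exposure coefficient = log relative risk
```

The full modified Poisson approach additionally uses robust (sandwich) standard errors, since the Poisson variance is misspecified for binary outcomes; that part is omitted here.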

  2. Simulation on Poisson and negative binomial models of count road accident modeling

    Science.gov (United States)

    Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.

    2016-11-01

    Accident count data have often been shown to exhibit overdispersion. On the other hand, the data might contain excess zero counts. A simulation study was conducted to create scenarios in which accidents happen at a T-junction, with the assumption that the dependent variable of the generated data follows a certain distribution, namely the Poisson or negative binomial distribution, with sample sizes ranging from n = 30 to n = 500. The study objective was accomplished by fitting Poisson regression, negative binomial regression, and hurdle negative binomial models to the simulated data. The fits of the models were compared, and the simulation results show that, for each sample size, not all models fit the data well, even when the data were generated from the model's own distribution, especially when the sample size is large. Furthermore, larger sample sizes tend to include more zero accident counts in the dataset.

  3. Poisson-Based Inference for Perturbation Models in Adaptive Spelling Training

    Science.gov (United States)

    Baschera, Gian-Marco; Gross, Markus

    2010-01-01

    We present an inference algorithm for perturbation models based on Poisson regression. The algorithm is designed to handle unclassified input with multiple errors described by independent mal-rules. This knowledge representation provides an intelligent tutoring system with local and global information about a student, such as error classification…

  4. Markov modulated Poisson process models incorporating covariates for rainfall intensity.

    Science.gov (United States)

    Thayakaran, R; Ramesh, N I

    2013-01-01

    Time series of rainfall bucket tip times at the Beaufort Park station, Bracknell, in the UK are modelled by a class of Markov modulated Poisson processes (MMPP) which may be thought of as a generalization of the Poisson process. Our main focus in this paper is to investigate the effects of including covariate information into the MMPP model framework on statistical properties. In particular, we look at three types of time-varying covariates namely temperature, sea level pressure, and relative humidity that are thought to be affecting the rainfall arrival process. Maximum likelihood estimation is used to obtain the parameter estimates, and likelihood ratio tests are employed in model comparison. Simulated data from the fitted model are used to make statistical inferences about the accumulated rainfall in the discrete time interval. Variability of the daily Poisson arrival rates is studied.

  5. Prediction of forest fires occurrences with area-level Poisson mixed models.

    Science.gov (United States)

    Boubeta, Miguel; Lombardía, María José; Marey-Pérez, Manuel Francisco; Morales, Domingo

    2015-05-01

    The number of fires in forest areas of Galicia (north-west of Spain) during the summer period is quite high. Local authorities are interested in analyzing the factors that explain this phenomenon. Poisson regression models are good tools for describing and predicting the number of fires per forest areas. This work employs area-level Poisson mixed models for treating real data about fires in forest areas. A parametric bootstrap method is applied for estimating the mean squared errors of fires predictors. The developed methodology and software are applied to a real data set of fires in forest areas of Galicia. Copyright © 2015 Elsevier Ltd. All rights reserved.
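
The parametric bootstrap used above to estimate mean squared errors can be sketched in the simplest setting, a single Poisson rate: simulate from the fitted model, re-estimate, and average the squared prediction errors. The rate, sample size, and number of replicates are illustrative assumptions.

```python
import math
import random

def poisson_sample(lam, rng):
    """Knuth's multiplication method for Poisson sampling (fine for small rates)."""
    p, k, target = 1.0, 0, math.exp(-lam)
    while True:
        p *= rng.random()
        if p <= target:
            return k
        k += 1

rng = random.Random(1)
lam_hat, n, B = 4.2, 50, 2000   # "fitted" rate, sample size, bootstrap replicates

sq_errs = []
for _ in range(B):
    boot = [poisson_sample(lam_hat, rng) for _ in range(n)]
    sq_errs.append((sum(boot) / n - lam_hat) ** 2)
mse_boot = sum(sq_errs) / B
# theory check: MSE of the plug-in mean predictor is lam_hat / n = 0.084
```

The bootstrap estimate lands close to the known theoretical value here; in the area-level mixed-model setting of the paper the same resampling logic is applied where no closed form exists.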

  6. Modeling corporate defaults: Poisson autoregressions with exogenous covariates (PARX)

    DEFF Research Database (Denmark)

    Agosto, Arianna; Cavaliere, Guiseppe; Kristensen, Dennis

    We develop a class of Poisson autoregressive models with additional covariates (PARX) that can be used to model and forecast time series of counts. We establish the time series properties of the models, including conditions for stationarity and existence of moments. These results are in turn used...

  7. Poisson-generalized gamma empirical Bayes model for disease ...

    African Journals Online (AJOL)

    In spatial disease mapping, the use of Bayesian models of estimation technique is becoming popular for smoothing relative risks estimates for disease mapping. The most common Bayesian conjugate model for disease mapping is the Poisson-Gamma Model (PG). To explore further the activity of smoothing of relative risk ...

  8. Dilaton gravity, Poisson sigma models and loop quantum gravity

    International Nuclear Information System (INIS)

    Bojowald, Martin; Reyes, Juan D

    2009-01-01

    Spherically symmetric gravity in Ashtekar variables coupled to Yang-Mills theory in two dimensions and its relation to dilaton gravity and Poisson sigma models are discussed. After introducing its loop quantization, quantum corrections for inverse triad components are shown to provide a consistent deformation without anomalies. The relation to Poisson sigma models provides a covariant action principle of the quantum-corrected theory with effective couplings. Results are also used to provide loop quantizations of spherically symmetric models in arbitrary D spacetime dimensions.

  9. The coupling of Poisson sigma models to topological backgrounds

    Energy Technology Data Exchange (ETDEWEB)

    Rosa, Dario [School of Physics, Korea Institute for Advanced Study,Seoul 02455 (Korea, Republic of)

    2016-12-13

    We extend the coupling to the topological backgrounds, recently worked out for the 2-dimensional BF-model, to the most general Poisson sigma models. The coupling involves the choice of a Casimir function on the target manifold and modifies the BRST transformations. This in turn induces a change in the BRST cohomology of the resulting theory. The observables of the coupled theory are analyzed and their geometrical interpretation is given. We finally couple the theory to 2-dimensional topological gravity: this is the first step to study a topological string theory in propagation on a Poisson manifold. As an application, we show that the gauge-fixed vectorial supersymmetry of the Poisson sigma models has a natural explanation in terms of the theory coupled to topological gravity.

  10. Logistic regression models

    CERN Document Server

    Hilbe, Joseph M

    2009-01-01

    This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...

  11. Introduction to the use of regression models in epidemiology.

    Science.gov (United States)

    Bender, Ralf

    2009-01-01

    Regression modeling is one of the most important statistical techniques used in analytical epidemiology. By means of regression models the effect of one or several explanatory variables (e.g., exposures, subject characteristics, risk factors) on a response variable such as mortality or cancer can be investigated. From multiple regression models, adjusted effect estimates can be obtained that take the effect of potential confounders into account. Regression methods can be applied in all epidemiologic study designs so that they represent a universal tool for data analysis in epidemiology. Different kinds of regression models have been developed in dependence on the measurement scale of the response variable and the study design. The most important methods are linear regression for continuous outcomes, logistic regression for binary outcomes, Cox regression for time-to-event data, and Poisson regression for frequencies and rates. This chapter provides a nontechnical introduction to these regression models with illustrating examples from cancer research.
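
For the Poisson case listed above, the model can be fitted by Newton-Raphson on the log-likelihood in a few lines; the two-parameter setup and the toy data are illustrative, not from the chapter.

```python
import math

def fit_poisson(xs, ys, iters=50):
    """Newton-Raphson for a log-link Poisson regression E[y] = exp(b0 + b1*x)."""
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0   # start at the null model
    for _ in range(iters):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        g0 = sum(y - mu for y, mu in zip(ys, mus))                # score for b0
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))   # score for b1
        h00 = sum(mus)                               # Fisher information matrix
        h01 = sum(mu * x for mu, x in zip(mus, xs))
        h11 = sum(mu * x * x for mu, x in zip(mus, xs))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det            # Newton step: b += H^-1 g
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# hypothetical rate data, growing roughly like exp(0.5 * x)
xs = [0, 1, 2, 3, 4, 5]
ys = [1, 2, 2, 4, 7, 12]
b0, b1 = fit_poisson(xs, ys)
```

At convergence the score equations are satisfied, i.e. the fitted means reproduce the observed totals, and exp(b1) is interpreted as the rate ratio per unit of x.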

  12. Poisson sigma model with branes and hyperelliptic Riemann surfaces

    International Nuclear Information System (INIS)

    Ferrario, Andrea

    2008-01-01

    We derive the explicit form of the superpropagators in the presence of general boundary conditions (coisotropic branes) for the Poisson sigma model. This generalizes the results presented by Cattaneo and Felder ["A path integral approach to the Kontsevich quantization formula," Commun. Math. Phys. 212, 591 (2000)] and Cattaneo and Felder ["Coisotropic submanifolds in Poisson geometry and branes in the Poisson sigma model," Lett. Math. Phys. 69, 157 (2004)] for Kontsevich's angle function [Kontsevich, M., "Deformation quantization of Poisson manifolds I," e-print arXiv:hep.th/0101170] used in the deformation quantization program of Poisson manifolds. The relevant superpropagators for n branes are defined as gauge-fixed homotopy operators of a complex of differential forms on n-sided polygons P_n with particular "alternating" boundary conditions. In the presence of more than three branes we use first order Riemann theta functions with odd singular characteristics on the Jacobian variety of a hyperelliptic Riemann surface (canonical setting). In genus g the superpropagators present g zero mode contributions

  13. Log-normal frailty models fitted as Poisson generalized linear mixed models.

    Science.gov (United States)

    Hirsch, Katharina; Wienke, Andreas; Kuss, Oliver

    2016-12-01

    The equivalence of a survival model with a piecewise constant baseline hazard function and a Poisson regression model has been known since decades. As shown in recent studies, this equivalence carries over to clustered survival data: A frailty model with a log-normal frailty term can be interpreted and estimated as a generalized linear mixed model with a binary response, a Poisson likelihood, and a specific offset. Proceeding this way, statistical theory and software for generalized linear mixed models are readily available for fitting frailty models. This gain in flexibility comes at the small price of (1) having to fix the number of pieces for the baseline hazard in advance and (2) having to "explode" the data set by the number of pieces. In this paper we extend the simulations of former studies by using a more realistic baseline hazard (Gompertz) and by comparing the model under consideration with competing models. Furthermore, the SAS macro %PCFrailty is introduced to apply the Poisson generalized linear mixed approach to frailty models. The simulations show good results for the shared frailty model. Our new %PCFrailty macro provides proper estimates, especially in case of 4 events per piece. The suggested Poisson generalized linear mixed approach for log-normal frailty models based on the %PCFrailty macro provides several advantages in the analysis of clustered survival data with respect to more flexible modelling of fixed and random effects, exact (in the sense of non-approximate) maximum likelihood estimation, and standard errors and different types of confidence intervals for all variance parameters. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
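
The "explode" step described above, splitting each subject's follow-up at the piece boundaries so that a Poisson likelihood with a log-exposure offset applies, can be sketched as follows; the function and the cut points are illustrative, not the %PCFrailty macro's interface.

```python
import math

def explode(time, event, cuts):
    """Split one subject's follow-up at the cut points. Each returned row is
    (piece_index, exposure, event_in_piece, log_exposure_offset)."""
    rows, start = [], 0.0
    for j, end in enumerate(cuts):
        if time <= start:
            break
        exposure = min(time, end) - start
        died_here = 1 if (event and time <= end) else 0
        rows.append((j, exposure, died_here, math.log(exposure)))
        start = end
    return rows

# a subject who dies at t = 3.5, with pieces ending at 2, 4 and 6
rows = explode(3.5, 1, [2.0, 4.0, 6.0])
# -> one censored row for piece 0 (exposure 2.0) and one event row
#    for piece 1 (exposure 1.5); piece 2 is never entered
```

Fitting a Poisson GLM to the exploded rows, with piece indicators as covariates and the log exposure as offset, then reproduces the piecewise-constant-hazard survival fit; this is the "explosion" cost the abstract mentions.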

  14. Impact of a New Law to Reduce the Legal Blood Alcohol Concentration Limit - A Poisson Regression Analysis and Descriptive Approach.

    Science.gov (United States)

    Nistal-Nuño, Beatriz

    2017-03-31

    In Chile, a new law introduced in March 2012 lowered the blood alcohol concentration (BAC) limit for impaired drivers from 0.1% to 0.08% and the BAC limit for driving under the influence of alcohol from 0.05% to 0.03%, but its effectiveness remains uncertain. The goal of this investigation was to evaluate the effects of this enactment on road traffic injuries and fatalities in Chile. This was a retrospective cohort study. Data were analyzed using a descriptive approach and generalized linear models, specifically Poisson regression, to analyze deaths and injuries in a series of additive log-linear models accounting for the effects of law implementation, month influence, a linear time trend, and population exposure. A review of national databases in Chile was conducted from 2003 to 2014 to evaluate the monthly rates of traffic fatalities and injuries associated with alcohol and in total. A decrease of 28.1 percent was observed in the monthly rate of traffic fatalities related to alcohol as compared to before the law implemented in 2012 in Chile. Chile experienced a significant reduction in alcohol-related traffic fatalities and injuries, making this a successful public health intervention.

  15. (Non) linear regression modelling

    NARCIS (Netherlands)

    Cizek, P.; Gentle, J.E.; Hardle, W.K.; Mori, Y.

    2012-01-01

    We will study causal relationships of a known form between random variables. Given a model, we distinguish one or more dependent (endogenous) variables Y = (Y1,…,Yl), l ∈ N, which are explained by a model, and independent (exogenous, explanatory) variables X = (X1,…,Xp),p ∈ N, which explain or

  16. Monitoring Poisson time series using multi-process models

    DEFF Research Database (Denmark)

    Engebjerg, Malene Dahl Skov; Lundbye-Christensen, Søren; Kjær, Birgitte B.

    aspects of health resource management may also be addressed. In this paper we center on the detection of outbreaks of infectious diseases. This is achieved by a multi-process Poisson state space model taking autocorrelation and overdispersion into account, which has been applied to a data set concerning...

  17. A physiologically based nonhomogeneous Poisson counter model of visual identification

    DEFF Research Database (Denmark)

    Christensen, Jeppe H; Markussen, Bo; Bundesen, Claus

    2018-01-01

    A physiologically based nonhomogeneous Poisson counter model of visual identification is presented. The model was developed in the framework of a Theory of Visual Attention (Bundesen, 1990; Kyllingsbæk, Markussen, & Bundesen, 2012) and meant for modeling visual identification of objects that are ...... that mimicked the dynamics of receptive field selectivity as found in neurophysiological studies. Furthermore, the initial sensory response yielded theoretical hazard rate functions that closely resembled empirically estimated ones. Finally, supplied with a Naka-Rushton type contrast gain control, the model...

  18. Tutorial on Using Regression Models with Count Outcomes Using R

    Directory of Open Access Journals (Sweden)

    A. Alexander Beaujean

    2016-02-01

Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least squares), either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used to run the example analyses is included in the Appendix.
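The count models surveyed in this tutorial all build on the same log-linear Poisson core, log E(Y|x) = β0 + β1·x, fit by maximum likelihood. A minimal sketch in Python (rather than the R the tutorial uses), with hypothetical toy counts chosen to lie exactly on the fitted curve:

```python
import math

def fit_poisson_regression(xs, ys, iters=50):
    """Fit log(mu_i) = a + b * x_i by Newton-Raphson on the Poisson log-likelihood."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        mus = [math.exp(a + b * x) for x in xs]
        # Score vector (gradient of the log-likelihood)
        ga = sum(y - m for y, m in zip(ys, mus))
        gb = sum((y - m) * x for x, y, m in zip(xs, ys, mus))
        # Observed information matrix (negative Hessian)
        haa = sum(mus)
        hab = sum(m * x for x, m in zip(xs, mus))
        hbb = sum(m * x * x for x, m in zip(xs, mus))
        det = haa * hbb - hab * hab
        # Newton update: (a, b) += I^{-1} * score
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b

# Toy counts lying exactly on mu = exp(0 + x * ln 2), so the MLE recovers (0, ln 2)
a_hat, b_hat = fit_poisson_regression([0, 1, 2], [1, 2, 4])
```

A real analysis would call R's glm(y ~ x, family = poisson) or an equivalent library routine; the Newton-Raphson loop above is the same computation in miniature.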

  19. Bayesian Poisson hierarchical models for crash data analysis: Investigating the impact of model choice on site-specific predictions.

    Science.gov (United States)

    Khazraee, S Hadi; Johnson, Valen; Lord, Dominique

    2018-08-01

The Poisson-gamma (PG) and Poisson-lognormal (PLN) regression models are among the most popular means for motor vehicle crash data analysis. Both models belong to the Poisson-hierarchical family of models. While numerous studies have compared the overall performance of alternative Bayesian Poisson-hierarchical models, little research has addressed the impact of model choice on the expected crash frequency prediction at individual sites. This paper sought to examine whether there are any trends among candidate models' predictions, e.g., that an alternative model's prediction for sites with certain conditions tends to be higher (or lower) than that from another model. In addition to the PG and PLN models, this research formulated a new member of the Poisson-hierarchical family of models: the Poisson-inverse gamma (PIGam). Three field datasets (from Texas, Michigan and Indiana) covering a wide range of over-dispersion characteristics were selected for analysis. This study demonstrated that the model choice can be critical when the calibrated models are used for prediction at new sites, especially when the data are highly over-dispersed. For all three datasets, the PIGam model would predict higher expected crash frequencies than would the PLN and PG models, in order, indicating a clear link between the models' predictions and the shape of their mixing distributions (i.e., gamma, lognormal, and inverse gamma, respectively). The thicker tail of the PIGam and PLN models (in order) may provide an advantage when the data are highly over-dispersed. The analysis results also illustrated a major deficiency of the Deviance Information Criterion (DIC) in comparing the goodness-of-fit of hierarchical models; models with drastically different sets of coefficients (and thus predictions for new sites) may yield similar DIC values, because the DIC only accounts for the parameters in the lowest (observation) level of the hierarchy and ignores the higher levels (regression coefficients
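The gamma and lognormal mixing distributions at the heart of the PG and PLN models are what produce over-dispersion: mixing the Poisson mean over a distribution inflates the variance above the mean. A small Monte Carlo sketch of the gamma case (a hypothetical Poisson-gamma with mean 2, not the paper's crash models):

```python
import math
import random

def poisson_draw(lam, rng):
    # Knuth's multiplication-of-uniforms sampler; fine for moderate rates
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def dispersion_index(samples):
    # Sample variance divided by sample mean; ~1 for a plain Poisson
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return var / mean

rng = random.Random(42)
n = 20000
# Plain Poisson(2): variance equals mean, so the index is ~1
plain = [poisson_draw(2.0, rng) for _ in range(n)]
# Poisson-gamma: lam ~ Gamma(shape=1, scale=2) gives Var = mu + mu^2 = 6, index ~3
mixed = [poisson_draw(rng.gammavariate(1.0, 2.0), rng) for _ in range(n)]
```

The same mechanism with a lognormal or inverse-gamma mixing distribution gives the PLN and PIGam variants; the heavier the mixing tail, the more extreme the high-count sites the model can accommodate.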

  20. Critical elements on fitting the Bayesian multivariate Poisson Lognormal model

    Science.gov (United States)

    Zamzuri, Zamira Hasanah binti

    2015-10-01

Motivated by a problem on fitting multivariate models to traffic accident data, a detailed discussion of the Multivariate Poisson Lognormal (MPL) model is presented. This paper reveals three critical elements in fitting the MPL model: the setting of initial estimates, hyperparameters, and tuning parameters. These issues have not been highlighted in the literature. Based on the simulation studies conducted, we have shown that when the Univariate Poisson Model (UPM) estimates are used as starting values, at least 20,000 iterations are needed to obtain reliable final estimates. We also illustrated the sensitivity of a specific hyperparameter which, if not given extra attention, may affect the final estimates. The last issue concerns the tuning parameters, which depend on the acceptance rate. Finally, a heuristic algorithm to fit the MPL model is presented. This acts as a guide to ensure that the model works satisfactorily given any data set.
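The tuning-parameter issue raised here is the usual Metropolis-Hastings one: the random-walk proposal scale must be adjusted until the acceptance rate lands in a workable band. A generic sketch of burn-in scale adaptation, using a standard normal target purely for illustration (nothing here is the MPL model itself):

```python
import math
import random

def adaptive_metropolis(logpdf, steps=6000, burn=3000, seed=7):
    """Random-walk Metropolis with the proposal scale tuned during burn-in."""
    rng = random.Random(seed)
    x, scale = 0.0, 5.0          # deliberately poor initial scale
    win_acc, total_acc, kept = 0, 0, []
    for i in range(steps):
        prop = x + rng.gauss(0.0, scale)
        if math.log(rng.random()) < logpdf(prop) - logpdf(x):
            x = prop
            win_acc += 1
            if i >= burn:
                total_acc += 1
        # During burn-in, nudge the scale toward a 30-50% acceptance window
        if i < burn and (i + 1) % 100 == 0:
            rate = win_acc / 100.0
            scale *= 1.1 if rate > 0.5 else (0.9 if rate < 0.3 else 1.0)
            win_acc = 0
        if i >= burn:
            kept.append(x)
    return kept, total_acc / (steps - burn)

# Standard normal target, used only to exercise the tuner
samples, acc = adaptive_metropolis(lambda v: -0.5 * v * v)
```

The heuristic algorithm the paper proposes addresses the same problem for the MPL posterior; the window-based adjustment above is one common generic recipe.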

  1. 2D sigma models and differential Poisson algebras

    International Nuclear Information System (INIS)

    Arias, Cesar; Boulanger, Nicolas; Sundell, Per; Torres-Gomez, Alexander

    2015-01-01

    We construct a two-dimensional topological sigma model whose target space is endowed with a Poisson algebra for differential forms. The model consists of an equal number of bosonic and fermionic fields of worldsheet form degrees zero and one. The action is built using exterior products and derivatives, without any reference to a worldsheet metric, and is of the covariant Hamiltonian form. The equations of motion define a universally Cartan integrable system. In addition to gauge symmetries, the model has one rigid nilpotent supersymmetry corresponding to the target space de Rham operator. The rigid and local symmetries of the action, respectively, are equivalent to the Poisson bracket being compatible with the de Rham operator and obeying graded Jacobi identities. We propose that perturbative quantization of the model yields a covariantized differential star product algebra of Kontsevich type. We comment on the resemblance to the topological A model.

  2. Application of Poisson random effect models for highway network screening.

    Science.gov (United States)

    Jiang, Ximiao; Abdel-Aty, Mohamed; Alamili, Samer

    2014-02-01

In recent years, Bayesian random effect models that account for the temporal and spatial correlations of crash data became popular in traffic safety research. This study employs random effect Poisson Log-Normal models for crash risk hotspot identification. Both the temporal and spatial correlations of crash data were considered. Potential for Safety Improvement (PSI) was adopted as a measure of the crash risk. Using the fatal and injury crashes that occurred on urban 4-lane divided arterials from 2006 to 2009 in the Central Florida area, the random effect approaches were compared to the traditional Empirical Bayesian (EB) method and the conventional Bayesian Poisson Log-Normal model. A series of method examination tests were conducted to evaluate the performance of different approaches. These tests include the previously developed site consistence test, method consistence test, total rank difference test, and the modified total score test, as well as the newly proposed total safety performance measure difference test. Results show that the Bayesian Poisson model accounting for both temporal and spatial random effects (PTSRE) outperforms the model with only a temporal random effect, and both are superior to the conventional Poisson Log-Normal model (PLN) and the EB model in the fitting of crash data. Additionally, the method evaluation tests indicate that the PTSRE model is significantly superior to the PLN model and the EB model in consistently identifying hotspots during successive time periods. The results suggest that the PTSRE model is a superior alternative for road site crash risk hotspot identification. Copyright © 2013 Elsevier Ltd. All rights reserved.

  3. Panel Smooth Transition Regression Models

    DEFF Research Database (Denmark)

    González, Andrés; Terasvirta, Timo; Dijk, Dick van

    We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...

  4. Numerical solution of dynamic equilibrium models under Poisson uncertainty

    DEFF Research Database (Denmark)

    Posch, Olaf; Trimborn, Timo

    2013-01-01

We propose a simple and powerful numerical algorithm to compute the transition process in continuous-time dynamic equilibrium models with rare events. In this paper we transform the dynamic system of stochastic differential equations into a system of functional differential equations of the retarded type ... solution to Lucas' endogenous growth model under Poisson uncertainty are used to compute the exact numerical error. We show how (potential) catastrophic events such as rare natural disasters substantially affect the economic decisions of households.

  5. Analyzing Seasonal Variations in Suicide With Fourier Poisson Time-Series Regression: A Registry-Based Study From Norway, 1969-2007.

    Science.gov (United States)

    Bramness, Jørgen G; Walby, Fredrik A; Morken, Gunnar; Røislien, Jo

    2015-08-01

Seasonal variation in the number of suicides has long been acknowledged. It has been suggested that this seasonality has declined in recent years, but studies have generally used statistical methods incapable of confirming this. We examined all suicides occurring in Norway during 1969-2007 (more than 20,000 suicides in total) to establish whether seasonality decreased over time. Fitting of additive Fourier Poisson time-series regression models allowed for formal testing of a possible linear decrease in seasonality, or a reduction at a specific point in time, while adjusting for a possible smooth nonlinear long-term change without having to categorize time into discrete yearly units. The models were compared using Akaike's Information Criterion and analysis of variance. A model with a seasonal pattern was significantly superior to a model without one. There was a reduction in seasonality during the period. The model assuming a linear decrease in seasonality and the model assuming a change at a specific point in time were both superior to a model assuming constant seasonality, thus confirming by formal statistical testing that the magnitude of the seasonality in suicides has diminished. The additive Fourier Poisson time-series regression model would also be useful for studying other temporal phenomena with seasonal components. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
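The Fourier approach enters the Poisson regression through sine-cosine pairs in the log rate, e.g. log λ(t) = β0 + β1 sin(2πt/12) + β2 cos(2πt/12) for monthly counts, and the fitted pair implies a seasonal amplitude. A small sketch (coefficient values are hypothetical; the paper's models also carry smooth long-term trend terms):

```python
import math

def fourier_pair(t, period=12.0):
    """One harmonic pair of seasonal covariates for time t (e.g. month index)."""
    w = 2.0 * math.pi * t / period
    return math.sin(w), math.cos(w)

def peak_to_trough_ratio(b_sin, b_cos):
    """Seasonal effect on the rate scale: exp(A)/exp(-A), A = sqrt(b1^2 + b2^2)."""
    amp = math.hypot(b_sin, b_cos)
    return math.exp(2.0 * amp)
```

With hypothetical β1 = 0.3 and β2 = 0.4 the amplitude is 0.5, so the peak rate is exp(1) ≈ 2.7 times the trough rate; a decline in seasonality over time appears as this ratio shrinking toward 1.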

  6. A physiologically based nonhomogeneous Poisson counter model of visual identification.

    Science.gov (United States)

    Christensen, Jeppe H; Markussen, Bo; Bundesen, Claus; Kyllingsbæk, Søren

    2018-04-30

    A physiologically based nonhomogeneous Poisson counter model of visual identification is presented. The model was developed in the framework of a Theory of Visual Attention (Bundesen, 1990; Kyllingsbæk, Markussen, & Bundesen, 2012) and meant for modeling visual identification of objects that are mutually confusable and hard to see. The model assumes that the visual system's initial sensory response consists in tentative visual categorizations, which are accumulated by leaky integration of both transient and sustained components comparable with those found in spike density patterns of early sensory neurons. The sensory response (tentative categorizations) feeds independent Poisson counters, each of which accumulates tentative object categorizations of a particular type to guide overt identification performance. We tested the model's ability to predict the effect of stimulus duration on observed distributions of responses in a nonspeeded (pure accuracy) identification task with eight response alternatives. The time courses of correct and erroneous categorizations were well accounted for when the event-rates of competing Poisson counters were allowed to vary independently over time in a way that mimicked the dynamics of receptive field selectivity as found in neurophysiological studies. Furthermore, the initial sensory response yielded theoretical hazard rate functions that closely resembled empirically estimated ones. Finally, supplied with a Naka-Rushton type contrast gain control, the model provided an explanation for Bloch's law. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  7. Modeling environmental noise exceedances using non-homogeneous Poisson processes.

    Science.gov (United States)

    Guarnaccia, Claudio; Quartieri, Joseph; Barrios, Juan M; Rodrigues, Eliane R

    2014-10-01

    In this work a non-homogeneous Poisson model is considered to study noise exposure. The Poisson process, counting the number of times that a sound level surpasses a threshold, is used to estimate the probability that a population is exposed to high levels of noise a certain number of times in a given time interval. The rate function of the Poisson process is assumed to be of a Weibull type. The presented model is applied to community noise data from Messina, Sicily (Italy). Four sets of data are used to estimate the parameters involved in the model. After the estimation and tuning are made, a way of estimating the probability that an environmental noise threshold is exceeded a certain number of times in a given time interval is presented. This estimation can be very useful in the study of noise exposure of a population and also to predict, given the current behavior of the data, the probability of occurrence of high levels of noise in the near future. One of the most important features of the model is that it implicitly takes into account different noise sources, which need to be treated separately when using usual models.

  8. Differential expression analysis for RNAseq using Poisson mixed models.

    Science.gov (United States)

    Sun, Shiquan; Hood, Michelle; Scott, Laura; Peng, Qinke; Mukherjee, Sayan; Tung, Jenny; Zhou, Xiang

    2017-06-20

Identifying differentially expressed (DE) genes from RNA sequencing (RNAseq) studies is among the most common analyses in genomics. However, RNAseq DE analysis presents several statistical and computational challenges, including over-dispersed read counts and, in some settings, sample non-independence. Previous count-based methods rely on simple hierarchical Poisson models (e.g. negative binomial) to model independent over-dispersion, but do not account for sample non-independence due to relatedness, population structure and/or hidden confounders. Here, we present a Poisson mixed model with two random effects terms that account for both independent over-dispersion and sample non-independence. We also develop a scalable sampling-based inference algorithm using a latent variable representation of the Poisson distribution. With simulations, we show that our method properly controls for type I error and is generally more powerful than other widely used approaches, except in small samples (n < 15) with other unfavorable properties (e.g. small effect sizes). We also apply our method to three real datasets that contain related individuals, population stratification or hidden confounders. Our results show that our method increases power in all three datasets compared to other approaches, though the power gain is smallest in the smallest sample (n = 6). Our method is implemented in MACAU, freely available at www.xzlab.org/software.html. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  9. Forecasting with Dynamic Regression Models

    CERN Document Server

    Pankratz, Alan

    2012-01-01

    One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.

  10. A Generalized QMRA Beta-Poisson Dose-Response Model.

    Science.gov (United States)

    Xie, Gang; Roiko, Anne; Stratton, Helen; Lemckert, Charles; Dunn, Peter K; Mengersen, Kerrie

    2016-10-01

Quantitative microbial risk assessment (QMRA) is widely accepted for characterizing the microbial risks associated with food, water, and wastewater. Single-hit dose-response models are the most commonly used dose-response models in QMRA. Denoting PI(d) as the probability of infection at a given mean dose d, a three-parameter generalized QMRA beta-Poisson dose-response model, PI(d|α,β,r*), is proposed in which the minimum number of organisms required for causing infection, Kmin, is not fixed, but a random variable following a geometric distribution with parameter 0 < r* ≤ 1. The conventional beta-Poisson model, PI(d|α,β), is a special case of the generalized model with Kmin = 1 (which implies r* = 1). The generalized beta-Poisson model is based on a conceptual model with greater detail in the dose-response mechanism. Since a maximum likelihood solution is not easily available, a likelihood-free approximate Bayesian computation (ABC) algorithm is employed for parameter estimation. By fitting the generalized model to four experimental data sets from the literature, this study reveals that the posterior median r* estimates produced fall short of meeting the required condition of r* = 1 for the single-hit assumption. However, three out of four data sets fitted by the generalized models could not achieve an improvement in goodness of fit. These combined results imply that, at least in some cases, a single-hit assumption for characterizing the dose-response process may not be appropriate, but that the more complex models may be difficult to support especially if the sample size is small. The three-parameter generalized model provides a possibility to investigate the mechanism of a dose-response process in greater detail than is possible under a single-hit model. © 2016 Society for Risk Analysis.
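For reference, the conventional two-parameter beta-Poisson model named in this abstract is usually applied in its approximate form PI(d) ≈ 1 − (1 + d/β)^(−α). A sketch of that curve and its implied median infectious dose (the three-parameter generalized model requires ABC fitting and is not reproduced here; parameter values below are hypothetical):

```python
def beta_poisson(dose, alpha, beta):
    """Approximate beta-Poisson probability of infection at mean dose `dose`."""
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

def n50(alpha, beta):
    """Median infectious dose implied by the fitted parameters."""
    return beta * (2.0 ** (1.0 / alpha) - 1.0)
```

By construction, evaluating the curve at n50(alpha, beta) returns an infection probability of exactly 0.5, which is how fitted (alpha, beta) pairs are commonly reported and compared.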

  11. Current and Predicted Fertility using Poisson Regression Model ...

    African Journals Online (AJOL)

    AJRH Managing Editor

    We used descriptive statistics and analysis of variance (ANOVA) to ... women and estimated the probabilities of a woman ... observed events, the probability of finding more than one ...... Fahrmeir L & Lang S. Bayesian inference for generalized.

  12. Study of non-Hodgkin's lymphoma mortality associated with industrial pollution in Spain, using Poisson models

    Directory of Open Access Journals (Sweden)

    Lope Virginia

    2009-01-01

Full Text Available Abstract Background Non-Hodgkin's lymphomas (NHLs have been linked to proximity to industrial areas, but evidence regarding the health risk posed by residence near pollutant industries is very limited. The European Pollutant Emission Register (EPER is a public register that furnishes valuable information on industries that release pollutants to air and water, along with their geographical location. This study sought to explore the relationship between NHL mortality in small areas in Spain and environmental exposure to pollutant emissions from EPER-registered industries, using three Poisson-regression-based mathematical models. Methods Observed cases were drawn from mortality registries in Spain for the period 1994–2003. Industries were grouped into the following sectors: energy; metal; mineral; organic chemicals; waste; paper; food; and use of solvents. Populations having an industry within a radius of 1, 1.5, or 2 kilometres from the municipal centroid were deemed to be exposed. Municipalities outside those radii were considered as reference populations. The relative risks (RRs associated with proximity to pollutant industries were estimated using the following methods: Poisson Regression; mixed Poisson model with random provincial effect; and spatial autoregressive modelling (BYM model. Results Only proximity of paper industries to population centres (>2 km) could be associated with a greater risk of NHL mortality (mixed model: RR:1.24, 95% CI:1.09–1.42; BYM model: RR:1.21, 95% CI:1.01–1.45; Poisson model: RR:1.16, 95% CI:1.06–1.27. Spatial models yielded higher estimates. Conclusion The reported association between exposure to air pollution from the paper, pulp and board industry and NHL mortality is independent of the model used. Inclusion of spatial random effects terms in the risk estimate improves the study of associations between environmental exposures and mortality. The EPER could be of great utility when studying the effects of

  13. Comparing the cancer in Ninawa during three periods (1980-1990, 1991-2000, 2001-2010 using Poisson regression

    Directory of Open Access Journals (Sweden)

    Muzahem Mohammed Yahya AL-Hashimi

    2013-01-01

Full Text Available Background: Iraq fought three wars in three consecutive decades, the Iran-Iraq war (1980-1988, the Persian Gulf War in 1991, and the Iraq war in 2003. In the nineties of the last century and up to the present time, there have been anecdotal reports of increase in cancer in Ninawa as in all provinces of Iraq, possibly as a result of exposure to depleted uranium used by American troops in the last two wars. This paper deals with cancer incidence in Ninawa, the most important province in Iraq, where many of her sons were soldiers in the Iraqi army, and they have participated in the wars. Materials and Methods: The data was derived from the Directorate of Health in Ninawa. The data was divided into three sub periods: 1980-1990, 1991-2000, and 2001-2010. The analyses are performed using Poisson regressions. The response variable is the cancer incidence number. Cancer cases, age, sex, and years were considered as the explanatory variables. The logarithm of the population of Ninawa is used as an offset. The aim of this paper is to model the cancer incidence data and estimate the cancer incidence rate ratio (IRR to illustrate the changes that have occurred of incidence cancer in Ninawa in these three periods. Results: There is evidence of a reduction in the cancer IRR in Ninawa in the third period as well as in the second period. Our analyses found that breast cancer remained the most common cancer, while cancers of the lung, trachea, and bronchus remained the second despite decreasing dramatically. There were modest increases in the incidence of cancers of the prostate, penis, and other male genitals over the study period, and stability in the incidence of colon cancer in the second and third periods. There were modest increases in the incidence of placental and metastatic tumors, while the highest increase was in leukemia in the third period relative to the second period but not to the first period. The cancer IRR in men decreased from more than 33% above that of females in the first period, more than 39

  14. Comparing the cancer in Ninawa during three periods (1980-1990, 1991-2000, 2001-2010) using Poisson regression.

    Science.gov (United States)

    Al-Hashimi, Muzahem Mohammed Yahya; Wang, Xiangjun

    2013-12-01

Iraq fought three wars in three consecutive decades, the Iran-Iraq war (1980-1988), the Persian Gulf War in 1991, and the Iraq war in 2003. In the nineties of the last century and up to the present time, there have been anecdotal reports of increase in cancer in Ninawa as in all provinces of Iraq, possibly as a result of exposure to depleted uranium used by American troops in the last two wars. This paper deals with cancer incidence in Ninawa, the most important province in Iraq, where many of her sons were soldiers in the Iraqi army, and they have participated in the wars. The data was derived from the Directorate of Health in Ninawa. The data was divided into three sub periods: 1980-1990, 1991-2000, and 2001-2010. The analyses are performed using Poisson regressions. The response variable is the cancer incidence number. Cancer cases, age, sex, and years were considered as the explanatory variables. The logarithm of the population of Ninawa is used as an offset. The aim of this paper is to model the cancer incidence data and estimate the cancer incidence rate ratio (IRR) to illustrate the changes that have occurred of incidence cancer in Ninawa in these three periods. There is evidence of a reduction in the cancer IRR in Ninawa in the third period as well as in the second period. Our analyses found that breast cancer remained the most common cancer, while cancers of the lung, trachea, and bronchus remained the second despite decreasing dramatically. There were modest increases in the incidence of cancers of the prostate, penis, and other male genitals over the study period, and stability in the incidence of colon cancer in the second and third periods. There were modest increases in the incidence of placental and metastatic tumors, while the highest increase was in leukemia in the third period relative to the second period but not to the first period. The cancer IRR in men decreased from more than 33% above that of females in the first period, more than 39% in the second period, and regressed to 9.56% in the third
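With the log of population as an offset, exponentiated Poisson regression coefficients are incidence rate ratios. For a single two-group comparison the model's IRR reduces to a ratio of incidence rates, with a Wald interval on the log scale; a sketch with hypothetical counts (not the Ninawa data):

```python
import math

def irr_with_ci(cases1, pop1, cases0, pop0, z=1.96):
    """Incidence rate ratio of group 1 vs group 0 with an approximate 95% Wald CI."""
    irr = (cases1 / pop1) / (cases0 / pop0)
    # Standard error of log(IRR) for Poisson counts
    se_log = math.sqrt(1.0 / cases1 + 1.0 / cases0)
    lo = math.exp(math.log(irr) - z * se_log)
    hi = math.exp(math.log(irr) + z * se_log)
    return irr, lo, hi

# Hypothetical example: 30 cases in 100,000 person-years vs 20 in 100,000
irr, lo, hi = irr_with_ci(30, 1e5, 20, 1e5)
```

Here the point estimate is 1.5, but the interval crosses 1, so with counts this small the rate difference would not be statistically significant; a fitted Poisson model with covariates and the log-population offset generalizes exactly this calculation.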

  15. Nonparametric Mixture of Regression Models.

    Science.gov (United States)

    Huang, Mian; Li, Runze; Wang, Shaoli

    2013-07-01

    Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.

  16. Identifying traffic accident black spots with Poisson-Tweedie models

    DEFF Research Database (Denmark)

    Debrabant, Birgit; Halekoh, Ulrich; Bonat, Wagner Hugo

    2018-01-01

    This paper aims at the identification of black spots for traffic accidents, i.e. locations with accident counts beyond what is usual for similar locations, using spatially and temporally aggregated hospital records from Funen, Denmark. Specifically, we apply an autoregressive Poisson-Tweedie model...... considered calendar years and calculated by simulations a probability of p=0.03 for these to be chance findings. Altogether, our results recommend these sites for further investigation and suggest that our simple approach could play a role in future area based traffic accident prevention planning....

  17. On population size estimators in the Poisson mixture model.

    Science.gov (United States)

    Mao, Chang Xuan; Yang, Nan; Zhong, Jinhua

    2013-09-01

    Estimating population sizes via capture-recapture experiments has enormous applications. The Poisson mixture model can be adopted for those applications with a single list in which individuals appear one or more times. We compare several nonparametric estimators, including the Chao estimator, the Zelterman estimator, two jackknife estimators and the bootstrap estimator. The target parameter of the Chao estimator is a lower bound of the population size. Those of the other four estimators are not lower bounds, and they may produce lower confidence limits for the population size with poor coverage probabilities. A simulation study is reported and two examples are investigated. © 2013, The International Biometric Society.
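The Chao estimator discussed here needs only the counts of individuals observed exactly once (f1) and exactly twice (f2): N_hat = S + f1^2 / (2 f2), where S is the number of distinct individuals observed; its target is a lower bound on the population size. A sketch with hypothetical capture counts:

```python
from collections import Counter

def chao_lower_bound(counts):
    """Chao1-type lower bound on population size from per-individual capture counts."""
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)               # distinct individuals observed
    freq = Counter(counts)
    f1, f2 = freq.get(1, 0), freq.get(2, 0)
    if f2 == 0:
        # Bias-corrected variant commonly used when no doubletons are observed
        return s_obs + f1 * (f1 - 1) / 2.0
    return s_obs + f1 * f1 / (2.0 * f2)

# Hypothetical list: 6 individuals seen once, 3 seen twice, 2 seen three times
estimate = chao_lower_bound([1] * 6 + [2] * 3 + [3] * 2)
```

With S = 11, f1 = 6, f2 = 3 the bound is 11 + 36/6 = 17, illustrating the point the abstract makes: this targets a lower bound, whereas the Zelterman, jackknife and bootstrap estimators target the size itself.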

  18. A Local Poisson Graphical Model for inferring networks from sequencing data.

    Science.gov (United States)

    Allen, Genevera I; Liu, Zhandong

    2013-09-01

    Gaussian graphical models, a class of undirected graphs or Markov Networks, are often used to infer gene networks based on microarray expression data. Many scientists, however, have begun using high-throughput sequencing technologies such as RNA-sequencing or next generation sequencing to measure gene expression. As the resulting data consists of counts of sequencing reads for each gene, Gaussian graphical models are not optimal for this discrete data. In this paper, we propose a novel method for inferring gene networks from sequencing data: the Local Poisson Graphical Model. Our model assumes a Local Markov property where each variable conditional on all other variables is Poisson distributed. We develop a neighborhood selection algorithm to fit our model locally by performing a series of l1 penalized Poisson, or log-linear, regressions. This yields a fast parallel algorithm for estimating networks from next generation sequencing data. In simulations, we illustrate the effectiveness of our methods for recovering network structure from count data. A case study on breast cancer microRNAs (miRNAs), a novel application of graphical models, finds known regulators of breast cancer genes and discovers novel miRNA clusters and hubs that are targets for future research.

  19. Regression Models for Repairable Systems

    Czech Academy of Sciences Publication Activity Database

    Novák, Petr

    2015-01-01

    Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf

  20. Using poisson regression to compare rates of unsatisfactory pap smears among gynecologists and to evaluate a performance improvement plan.

    Science.gov (United States)

    Wachtel, Mitchell S; Hatley, Warren G; de Riese, Cornelia

    2009-01-01

To evaluate the impact of a performance improvement (PI) plan implemented after an initial analysis comparing 7 gynecologists working in 2 clinics. From January to October 2005, unsatisfactory rates for gynecologists and clinics were calculated. A PI plan was instituted at the end of the first quarter of 2006. Unsatisfactory rates for each quarter of 2006 and the first quarter of 2007 were calculated. Poisson regression was used to analyze the results. A total of 100 ThinPrep Pap smears initially deemed unsatisfactory underwent reprocessing and reevaluation. The study's first part evaluated 2890 smears. Clinic unsatisfactory rates, 2.7% and 2.6%, were similar (p > 0.05). Gynecologists' unsatisfactory rates were 4.8-0.6%; differences between each of the highest 2 rates and the lowest rate were significant (p < 0.05) ... improvement. Reprocessing ThinPrep smears is an important means of reducing unsatisfactory rates but should not be a substitute for attention to quality in sampling.

  1. Lindley frailty model for a class of compound Poisson processes

    Science.gov (United States)

    Kadilar, Gamze Özel; Ata, Nihal

    2013-10-01

The Lindley distribution gains importance in survival analysis for its similarity to the exponential distribution and its allowance for different shapes of the hazard function. Frailty models provide an alternative to the proportional hazards model, in which misspecified or omitted covariates are described by an unobservable random variable. Although the frailty distribution is generally assumed to be continuous, it is appropriate to consider discrete frailty distributions in some circumstances. In this paper, frailty models with a discrete compound Poisson process for Lindley distributed failure times are introduced. Survival functions are derived and maximum likelihood estimation procedures for the parameters are studied. Then, the fit of the models to an earthquake data set from Turkey is examined.

  2. Poisson Autoregression

    DEFF Research Database (Denmark)

    Fokianos, Konstantinos; Rahbek, Anders Christian; Tjøstheim, Dag

    This paper considers geometric ergodicity and likelihood based inference for linear and nonlinear Poisson autoregressions. In the linear case the conditional mean is linked linearly to its past values as well as the observed values of the Poisson process. This also applies to the conditional variance, implying an interpretation as an integer valued GARCH process. In a nonlinear conditional Poisson model, the conditional mean is a nonlinear function of its past values and a nonlinear function of past observations. As a particular example an exponential autoregressive Poisson model for time...

  3. Poisson Autoregression

    DEFF Research Database (Denmark)

    Fokianos, Konstantinos; Rahbæk, Anders; Tjøstheim, Dag

    This paper considers geometric ergodicity and likelihood based inference for linear and nonlinear Poisson autoregressions. In the linear case the conditional mean is linked linearly to its past values as well as the observed values of the Poisson process. This also applies to the conditional variance, making an interpretation as an integer valued GARCH process possible. In a nonlinear conditional Poisson model, the conditional mean is a nonlinear function of its past values and a nonlinear function of past observations. As a particular example an exponential autoregressive Poisson model...

  4. Perbandingan Regresi Binomial Negatif dan Regresi Conway-Maxwell-Poisson dalam Mengatasi Overdispersi pada Regresi Poisson

    Directory of Open Access Journals (Sweden)

    Lusi Eka Afri

    2017-03-01

    Full Text Available Negative binomial regression and Conway-Maxwell-Poisson regression are solutions for overcoming overdispersion in Poisson regression; both models are extensions of the Poisson regression model. According to Hinde and Demetrio (2007), there are several possible sources of overdispersion in Poisson regression: variability of the observations, individual variability as a component not explained by the model, correlation among individual responses, clustering in the population, and omitted observed variables. The consequence is that standard errors are estimated too low, producing parameter estimates that are biased downward (underestimated). This study aims to compare the negative binomial regression model and the Conway-Maxwell-Poisson (COM-Poisson) regression model in overcoming overdispersion in Poisson-distributed data, based on the deviance test statistic. The data used in this study come from two sources: simulated data and an applied case. The simulated data were obtained by generating overdispersed Poisson-distributed data in the R programming language, based on data characteristics such as , the probability of a zero value (p), and the sample size (n). The generated data were used to obtain estimates of the parameter coefficients of the negative binomial and COM-Poisson regressions. Keywords: overdispersion, negative binomial regression, Conway-Maxwell-Poisson regression
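    The gamma-Poisson mixture is the standard mechanism behind the overdispersion discussed here: mixing Poisson means over a gamma distribution yields a negative binomial whose variance exceeds its mean. A small numpy check with illustrative values (not the study's simulation design):

```python
import numpy as np

rng = np.random.default_rng(42)

# Gamma-Poisson mixture: lambda_i ~ Gamma(shape=r, scale=mu/r), y_i ~ Poisson(lambda_i).
# Marginally y_i is negative binomial with mean mu and variance mu + mu^2/r > mu.
mu, r, n = 4.0, 2.0, 100_000
lam = rng.gamma(shape=r, scale=mu / r, size=n)
y = rng.poisson(lam)

print(y.mean())   # close to mu = 4
print(y.var())    # close to mu + mu^2/r = 12, i.e. strongly overdispersed
```

    A Poisson fit to such data would assume variance 4 where the true variance is 12, which is exactly how the underestimated standard errors arise.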

  5. The Jackson Queueing Network Model Built Using Poisson Measures. Application To A Bank Model

    Directory of Open Access Journals (Sweden)

    Ciuiu Daniel

    2014-07-01

    In this paper we build a bank model using Poisson measures and Jackson queueing networks. We take into account the relationship between the Poisson and exponential distributions, and for each credit/deposit type we consider a node where shocks are modeled as compound Poisson processes. The transmission of shocks is modeled as movement between nodes in a Jackson queueing network, external shocks are modeled as external arrivals, and the absorption of shocks as departures from the network.

  6. Estimation of a Non-homogeneous Poisson Model: An Empirical ...

    African Journals Online (AJOL)

    This article aims at applying the Nonhomogeneous Poisson process to trends of economic development. For this purpose, a modified Nonhomogeneous Poisson process is derived when the intensity rate is considered as a solution of stochastic differential equation which satisfies the geometric Brownian motion. The mean ...

  7. Electroneutral models for dynamic Poisson-Nernst-Planck systems

    Science.gov (United States)

    Song, Zilong; Cao, Xiulei; Huang, Huaxiong

    2018-01-01

    The Poisson-Nernst-Planck (PNP) system is a standard model for describing ion transport. In many applications, e.g., ions in biological tissues, the presence of thin boundary layers poses both modeling and computational challenges. In this paper, we derive simplified electroneutral (EN) models where the thin boundary layers are replaced by effective boundary conditions. There are two major advantages of EN models. First, they are much cheaper to solve numerically. Second, EN models are easier to work with than the original PNP system; therefore, it is also easier to derive macroscopic models for cellular structures using EN models. Even though the approach used here is applicable to higher-dimensional cases, this paper mainly focuses on the one-dimensional system, including the general multi-ion case. Using systematic asymptotic analysis, we derive a variety of effective boundary conditions directly applicable to the EN system for the bulk region. This EN system can be solved directly and efficiently without computing the solution in the boundary layer. The derivation is based on matched asymptotics, and the key idea is to bring higher-order contributions back into the effective boundary conditions. For Dirichlet boundary conditions, the higher-order terms can be neglected and the classical results (continuity of the electrochemical potential) are recovered. For flux boundary conditions, the higher-order terms account for the accumulation of ions in the boundary layer, and neglecting them leads to physically incorrect solutions. To validate the EN model, numerical computations are carried out for several examples. Our results show that solving the EN model is much more efficient than solving the original PNP system. Implemented with the Hodgkin-Huxley model, the computational time for solving the EN model is significantly reduced without sacrificing accuracy, because the model allows for relatively large mesh and time-step sizes.

  8. Interpretation of commonly used statistical regression models.

    Science.gov (United States)

    Kasza, Jessica; Wolfe, Rory

    2014-01-01

    A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.

  9. Comparison of robustness to outliers between robust poisson models and log-binomial models when estimating relative risks for common binary outcomes: a simulation study.

    Science.gov (United States)

    Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P

    2014-06-26

    To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson regression and the log-binomial regression. Of the two methods, the log-binomial regression is believed to yield more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence supporting the robustness of robust Poisson models relative to log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher-order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination was low. The robust Poisson models are more robust (or less sensitive) to outliers than the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
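    The "modified Poisson" idea can be sketched in a few lines of numpy: fit a log-link Poisson model to the binary outcome (so exp(b1) estimates the relative risk) and repair the misspecified Poisson variance with a sandwich estimator. This is an illustrative simulation, not the authors' code, and all parameter values are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

# Binary outcome whose risk is log-linear in x, so exp(b1) is a relative risk.
n = 20_000
x = rng.binomial(1, 0.5, size=n)
b0, b1 = np.log(0.10), np.log(2.0)          # baseline risk 10%, true RR = 2
y = rng.binomial(1, np.exp(b0 + b1 * x))

X = np.column_stack([np.ones(n), x])

# Fit the Poisson score equations by Newton-Raphson.
beta = np.zeros(2)
for _ in range(25):
    mu = np.exp(X @ beta)
    beta = beta + np.linalg.solve((X * mu[:, None]).T @ X, X.T @ (y - mu))

# Robust (sandwich) covariance A^{-1} B A^{-1}: binary data are not Poisson,
# so the naive Poisson variance is wrong and must be corrected.
mu = np.exp(X @ beta)
A = (X * mu[:, None]).T @ X
B = (X * ((y - mu) ** 2)[:, None]).T @ X
cov = np.linalg.inv(A) @ B @ np.linalg.inv(A)

rr = np.exp(beta[1])          # estimated relative risk, close to 2
se = np.sqrt(cov[1, 1])       # robust standard error of log RR
print(rr, se)
```

    Unlike the log-binomial model, this fit cannot fail to converge for risks near 1, which is one practical reason the modified Poisson approach is popular.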

  10. Extension of the application of conway-maxwell-poisson models: analyzing traffic crash data exhibiting underdispersion.

    Science.gov (United States)

    Lord, Dominique; Geedipally, Srinivas Reddy; Guikema, Seth D

    2010-08-01

    The objective of this article is to evaluate the performance of the COM-Poisson GLM for analyzing crash data exhibiting underdispersion (when conditional on the mean). The COM-Poisson distribution, originally developed in 1962, has recently been reintroduced by statisticians for analyzing count data subject to either over- or underdispersion. Over the last year, the COM-Poisson GLM has been evaluated in the context of crash data analysis, and it has been shown that the model performs as well as the Poisson-gamma model for crash data exhibiting overdispersion. To accomplish the objective of this study, several COM-Poisson models were estimated using crash data collected at 162 railway-highway crossings in South Korea between 1998 and 2002. This data set has been shown to exhibit underdispersion when models linking crash data to various explanatory variables are estimated. The modeling results were compared to those produced from the Poisson and gamma probability models documented in a previously published study. The results of this research show that the COM-Poisson GLM can handle crash data when the modeling output shows signs of underdispersion. Finally, they also show that the model proposed in this study provides better statistical performance than the gamma probability and traditional Poisson models, at least for this data set.
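    The COM-Poisson distribution these models rely on has pmf P(Y=j) ∝ λ^j/(j!)^ν, where ν > 1 produces underdispersion and ν < 1 overdispersion. A minimal numpy/scipy illustration with a truncated normalizing sum (illustrative, not the authors' code; the truncation point is an arbitrary choice):

```python
import numpy as np
from scipy.special import gammaln

def com_poisson_pmf(lam, nu, trunc=200):
    """COM-Poisson pmf P(Y=j) proportional to lam^j / (j!)^nu over j = 0..trunc-1,
    normalised by the truncated sum (trunc chosen large enough for the tail)."""
    j = np.arange(trunc)
    logw = j * np.log(lam) - nu * gammaln(j + 1)
    w = np.exp(logw - logw.max())      # stabilised exponentiation
    return w / w.sum()

for nu in (0.5, 1.0, 2.0):
    p = com_poisson_pmf(4.0, nu)
    j = np.arange(p.size)
    m = (j * p).sum()
    v = ((j - m) ** 2 * p).sum()
    print(nu, m, v / m)   # dispersion index: >1 for nu<1, exactly 1 at nu=1, <1 for nu>1
```

    At ν = 1 the pmf reduces to the ordinary Poisson, which is why the COM-Poisson GLM nests the Poisson model.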

  11. The Rasch Poisson counts model for incomplete data : An application of the EM algorithm

    NARCIS (Netherlands)

    Jansen, G.G.H.

    Rasch's Poisson counts model is a latent trait model for the situation in which K tests are administered to N examinees and the test score is a count [e.g., the repeated occurrence of some event, such as the number of items completed or the number of items answered (in)correctly]. The Rasch Poisson

  12. The Lie-Poisson structure of integrable classical non-linear sigma models

    International Nuclear Information System (INIS)

    Bordemann, M.; Forger, M.; Schaeper, U.; Laartz, J.

    1993-01-01

    The canonical structure of classical non-linear sigma models on Riemannian symmetric spaces, which constitute the most general class of classical non-linear sigma models known to be integrable, is shown to be governed by a fundamental Poisson bracket relation that fits into the r-s-matrix formalism for non-ultralocal integrable models first discussed by Maillet. The matrices r and s are computed explicitly and, being field dependent, satisfy fundamental Poisson bracket relations of their own, which can be expressed in terms of a new numerical matrix c. It is proposed that all these Poisson brackets taken together are representation conditions for a new kind of algebra which, for this class of models, replaces the classical Yang-Baxter algebra governing the canonical structure of ultralocal models. The Poisson brackets for the transition matrices are also computed, and the notorious regularization problem associated with the definition of the Poisson brackets for the monodromy matrices is discussed. (orig.)

  13. Poisson Autoregression

    DEFF Research Database (Denmark)

    Fokianos, Konstantinos; Rahbek, Anders Christian; Tjøstheim, Dag

    2009-01-01

    In this article we consider geometric ergodicity and likelihood-based inference for linear and nonlinear Poisson autoregression. In the linear case, the conditional mean is linked linearly to its past values, as well as to the observed values of the Poisson process. This also applies to the conditional variance, making possible interpretation as an integer-valued generalized autoregressive conditional heteroscedasticity process. In a nonlinear conditional Poisson model, the conditional mean is a nonlinear function of its past values and past observations. As a particular example, we consider an exponential autoregressive Poisson model for time series. Under geometric ergodicity, the maximum likelihood estimators are shown to be asymptotically Gaussian in the linear model. In addition, we provide a consistent estimator of their asymptotic covariance matrix. Our approach to verifying geometric...
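    The linear Poisson autoregression (integer-valued GARCH) described in these records is easy to simulate. A minimal sketch with made-up parameters, showing the stationary mean d/(1-a-b) and the overdispersion the feedback induces:

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear Poisson autoregression, INGARCH(1,1):
#   y_t | past ~ Poisson(lam_t),  lam_t = d + a*lam_{t-1} + b*y_{t-1},
# stationary when a + b < 1, with mean d / (1 - a - b).
d, a, b, T = 0.5, 0.3, 0.4, 200_000
lam = np.empty(T)
y = np.empty(T, dtype=int)
lam[0] = d / (1 - a - b)                 # start at the stationary mean
y[0] = rng.poisson(lam[0])
for t in range(1, T):
    lam[t] = d + a * lam[t - 1] + b * y[t - 1]
    y[t] = rng.poisson(lam[t])

print(y.mean())            # close to d / (1 - a - b) = 1.667
print(y.var() / y.mean())  # > 1: the feedback makes counts overdispersed
```

    The conditional distribution is Poisson (variance equals the conditional mean λ_t), yet marginally the series is overdispersed, which is the GARCH-like behaviour the abstract refers to.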

  14. Bayesian approach to errors-in-variables in regression models

    Science.gov (United States)

    Rozliman, Nur Aainaa; Ibrahim, Adriana Irawati Nur; Yunus, Rossita Mohammad

    2017-05-01

    In many applications and experiments, data sets are contaminated with error or mismeasured covariates. When at least one of the covariates in a model is measured with error, an errors-in-variables (EIV) model can be used. Measurement error, when not corrected, causes misleading statistical inference and analysis. Our goal is therefore to examine the relationship between the outcome variable and the unobserved exposure variable, given the observed mismeasured surrogate, by applying the Bayesian formulation to the EIV model. We extend the flexible parametric method proposed by Hossain and Gustafson (2009) to another nonlinear regression model, the Poisson regression model. We then illustrate the application of this approach via a simulation study using Markov chain Monte Carlo sampling methods.

  15. Application of the Hyper-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes.

    Science.gov (United States)

    Khazraee, S Hadi; Sáez-Castillo, Antonio Jose; Geedipally, Srinivas Reddy; Lord, Dominique

    2015-05-01

    The hyper-Poisson distribution can handle both over- and underdispersion, and its generalized linear model formulation allows the dispersion of the distribution to be observation-specific and dependent on model covariates. This study's objective is to examine the potential applicability of a newly proposed generalized linear model framework for the hyper-Poisson distribution in analyzing motor vehicle crash count data. The hyper-Poisson generalized linear model was first fitted to intersection crash data from Toronto, characterized by overdispersion, and then to crash data from railway-highway crossings in Korea, characterized by underdispersion. The results of this study are promising. When fitted to the Toronto data set, the goodness-of-fit measures indicated that the hyper-Poisson model with a variable dispersion parameter provided a statistical fit as good as the traditional negative binomial model. The hyper-Poisson model was also successful in handling the underdispersed data from Korea; the model performed as well as the gamma probability model and the Conway-Maxwell-Poisson model previously developed for the same data set. The advantages of the hyper-Poisson model studied in this article are noteworthy. Unlike the negative binomial model, which has difficulties in handling underdispersed data, the hyper-Poisson model can handle both over- and underdispersed crash data. Although not a major issue for the Conway-Maxwell-Poisson model, the effect of each variable on the expected mean of crashes is easily interpretable in the case of this new model. © 2014 Society for Risk Analysis.
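    The hyper-Poisson pmf can be written as P(Y=y) = λ^y / ((γ)_y · ₁F₁(1; γ; λ)), where (γ)_y is the Pochhammer symbol and ₁F₁ the confluent hypergeometric function; γ = 1 recovers the ordinary Poisson. A scipy sketch (parameter names follow common usage, not necessarily the article's notation) showing how γ moves the dispersion either side of 1:

```python
import numpy as np
from scipy.special import hyp1f1, poch

def hyper_poisson_pmf(y, lam, gam):
    """Hyper-Poisson pmf: lam^y / ((gam)_y * 1F1(1; gam; lam)).
    gam = 1 gives the Poisson distribution."""
    return lam ** y / (poch(gam, y) * hyp1f1(1.0, gam, lam))

y = np.arange(60.0)
for gam in (0.5, 1.0, 3.0):
    p = hyper_poisson_pmf(y, 3.0, gam)
    m = (y * p).sum()
    v = ((y - m) ** 2 * p).sum()
    print(gam, m, v / m)   # dispersion index: <1 for gam<1, 1 at gam=1, >1 for gam>1
```

    This single extra parameter is what lets the GLM formulation handle both the overdispersed Toronto data and the underdispersed Korean data in the abstract.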

  16. The Dependent Poisson Race Model and Modeling Dependence in Conjoint Choice Experiments

    Science.gov (United States)

    Ruan, Shiling; MacEachern, Steven N.; Otter, Thomas; Dean, Angela M.

    2008-01-01

    Conjoint choice experiments are used widely in marketing to study consumer preferences amongst alternative products. We develop a class of choice models, belonging to the class of Poisson race models, that describe a "random utility" which lends itself to a process-based description of choice. The models incorporate a dependence structure which…

  17. PENERAPAN REGRESI BINOMIAL NEGATIF UNTUK MENGATASI OVERDISPERSI PADA REGRESI POISSON

    Directory of Open Access Journals (Sweden)

    PUTU SUSAN PRADAWATI

    2013-09-01

    Poisson regression is used to analyze count data that are Poisson distributed. Poisson regression analysis requires equidispersion, in which the mean of the response variable is equal to its variance. However, deviations occur in which the variance of the response variable is greater than the mean; this is called overdispersion. If overdispersion is present and Poisson regression analysis is still used, underestimated standard errors will be obtained. Negative binomial regression can handle overdispersion because it contains a dispersion parameter. For simulated data exhibiting overdispersion under the Poisson regression model, the negative binomial regression model was found to perform better than the Poisson regression model.

  18. Poisson statistics application in modelling of neutron detection

    International Nuclear Information System (INIS)

    Avdic, S.; Marinkovic, P.

    1996-01-01

    The main purpose of this study is the statistical analysis of experimental data measured with a 3 He neutron spectrometer. The unfolding method, based on the principle of maximum likelihood, incorporates the Poisson approximation of the counting statistics. (author)

  19. Zero-Inflated Poisson Modeling of Fall Risk Factors in Community-Dwelling Older Adults.

    Science.gov (United States)

    Jung, Dukyoo; Kang, Younhee; Kim, Mi Young; Ma, Rye-Won; Bhandari, Pratibha

    2016-02-01

    The aim of this study was to identify risk factors for falls among community-dwelling older adults. The study used a cross-sectional descriptive design. Self-report questionnaires were used to collect data from 658 community-dwelling older adults and were analyzed using logistic and zero-inflated Poisson (ZIP) regression. Perceived health status was a significant factor in the count model, and fall efficacy emerged as a significant predictor in the logistic models. The findings suggest that fall efficacy is important for predicting not only faller and nonfaller status but also fall counts in older adults who may or may not have experienced a previous fall. The fall predictors identified in this study--perceived health status and fall efficacy--indicate the need for fall-prevention programs tailored to address both the physical and psychological issues unique to older adults. © The Author(s) 2014.
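    The zero-inflated Poisson structure used in such analyses mixes structural zeros with ordinary Poisson counts; a short simulation shows why a plain Poisson model understates the number of zeros (illustrative parameter values, not the study's fall data):

```python
import numpy as np

rng = np.random.default_rng(7)

# Zero-inflated Poisson: with probability pi the count is a structural zero
# (e.g. a person who never falls), otherwise it is drawn from Poisson(lam).
pi, lam, n = 0.3, 2.5, 100_000
structural = rng.random(n) < pi
y = np.where(structural, 0, rng.poisson(lam, size=n))

mean = y.mean()                # (1 - pi) * lam = 1.75
p0_obs = (y == 0).mean()       # pi + (1 - pi)*exp(-lam), about 0.357
p0_pois = np.exp(-mean)        # a Poisson with the same mean predicts about 0.174
print(mean, p0_obs, p0_pois)
```

    The observed zero fraction is roughly double what a Poisson with the same mean predicts, which is exactly the gap the logistic ("zero") component of a ZIP regression is there to absorb.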

  20. Gaussian Process Regression Model in Spatial Logistic Regression

    Science.gov (United States)

    Sofro, A.; Oktaviarina, A.

    2018-01-01

    Spatial analysis has developed very quickly in the last decade. One popular approach is based on the neighbourhood of the region; unfortunately, it has some limitations, such as difficulty in prediction. We therefore offer Gaussian process regression (GPR) to address this issue. In this paper, we focus on spatial modeling with GPR for binomial data with a logit link function. The performance of the model is investigated. We discuss inference, namely how to estimate the parameters and hyper-parameters and how to predict, and simulation studies are presented in the last section.

  1. On terminating Poisson processes in some shock models

    Energy Technology Data Exchange (ETDEWEB)

    Finkelstein, Maxim, E-mail: FinkelMI@ufs.ac.z [Department of Mathematical Statistics, University of the Free State, Bloemfontein (South Africa); Max Planck Institute for Demographic Research, Rostock (Germany); Marais, Francois, E-mail: fmarais@csc.co [CSC, Cape Town (South Africa)

    2010-08-15

    A system subject to a point process of shocks is considered. Shocks occur in accordance with the homogeneous Poisson process. Different criteria of system failure (termination) are discussed and the corresponding probabilities of failure (accident)-free performance are derived. The described analytical approach is based on deriving integral equations for each setting and solving these equations through the Laplace transform. Some approximations are analyzed and further generalizations and applications are discussed.
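    For the simplest termination criterion (each shock is fatal with probability p, independently), thinning of the homogeneous Poisson process gives P(no failure by time t) = exp(-p·λ·t). A quick Monte Carlo check of that identity (a sketch of the setting, not the paper's integral-equation approach):

```python
import numpy as np

rng = np.random.default_rng(3)

# Shocks arrive as a homogeneous Poisson process with rate lam; each shock,
# independently, terminates the system with probability p. By Poisson
# thinning, fatal shocks form a Poisson process of rate p*lam, so
# P(failure-free up to t) = exp(-p * lam * t).
lam, p, t, n = 2.0, 0.25, 3.0, 200_000
n_shocks = rng.poisson(lam * t, size=n)          # shocks experienced by time t
surv = rng.random(n) < (1.0 - p) ** n_shocks     # survive iff every shock is non-fatal
print(surv.mean(), np.exp(-p * lam * t))         # both close to 0.223
```

    The more elaborate criteria in the paper (e.g. history-dependent termination) no longer thin to a Poisson process, which is why the authors resort to integral equations and Laplace transforms.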

  2. On terminating Poisson processes in some shock models

    International Nuclear Information System (INIS)

    Finkelstein, Maxim; Marais, Francois

    2010-01-01

    A system subject to a point process of shocks is considered. Shocks occur in accordance with the homogeneous Poisson process. Different criteria of system failure (termination) are discussed and the corresponding probabilities of failure (accident)-free performance are derived. The described analytical approach is based on deriving integral equations for each setting and solving these equations through the Laplace transform. Some approximations are analyzed and further generalizations and applications are discussed.

  3. The Spatial Distribution of Hepatitis C Virus Infections and Associated Determinants--An Application of a Geographically Weighted Poisson Regression for Evidence-Based Screening Interventions in Hotspots.

    Science.gov (United States)

    Kauhl, Boris; Heil, Jeanne; Hoebe, Christian J P A; Schweikart, Jürgen; Krafft, Thomas; Dukers-Muijrers, Nicole H T M

    2015-01-01

    Hepatitis C Virus (HCV) infections are a major cause for liver diseases. A large proportion of these infections remain hidden to care due to its mostly asymptomatic nature. Population-based screening and screening targeted on behavioural risk groups had not proven to be effective in revealing these hidden infections. Therefore, more practically applicable approaches to target screenings are necessary. Geographic Information Systems (GIS) and spatial epidemiological methods may provide a more feasible basis for screening interventions through the identification of hotspots as well as demographic and socio-economic determinants. Analysed data included all HCV tests (n = 23,800) performed in the southern area of the Netherlands between 2002-2008. HCV positivity was defined as a positive immunoblot or polymerase chain reaction test. Population data were matched to the geocoded HCV test data. The spatial scan statistic was applied to detect areas with elevated HCV risk. We applied global regression models to determine associations between population-based determinants and HCV risk. Geographically weighted Poisson regression models were then constructed to determine local differences of the association between HCV risk and population-based determinants. HCV prevalence varied geographically and clustered in urban areas. The main population at risk were middle-aged males, non-western immigrants and divorced persons. Socio-economic determinants consisted of one-person households, persons with low income and mean property value. However, the association between HCV risk and demographic as well as socio-economic determinants displayed strong regional and intra-urban differences. The detection of local hotspots in our study may serve as a basis for prioritization of areas for future targeted interventions. Demographic and socio-economic determinants associated with HCV risk show regional differences underlining that a one-size-fits-all approach even within small geographic
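    A geographically weighted Poisson regression of the kind used here amounts to a set of kernel-weighted local GLM fits, one per location. The numpy toy below (one spatial coordinate, made-up bandwidth and effect sizes; not the study's data or software) recovers a covariate effect that varies over space:

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy GWPR: the covariate effect beta1 varies smoothly with location s, and
# each local fit weights observations by a Gaussian kernel centred at s0.
n = 4000
s = rng.uniform(0, 1, n)                   # spatial coordinate
x = rng.normal(size=n)
beta1 = 0.2 + 0.6 * s                      # spatially varying coefficient
y = rng.poisson(np.exp(0.5 + beta1 * x))

X = np.column_stack([np.ones(n), x])

def local_poisson_fit(s0, bandwidth=0.15):
    w = np.exp(-0.5 * ((s - s0) / bandwidth) ** 2)   # kernel weights
    beta = np.zeros(2)
    for _ in range(30):                              # weighted Newton-Raphson
        mu = np.exp(X @ beta)
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu)[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta

b_left = local_poisson_fit(0.1)[1]    # near 0.2 + 0.6*0.1 = 0.26
b_right = local_poisson_fit(0.9)[1]   # near 0.74
print(b_left, b_right)
```

    A global Poisson fit would return a single slope near the spatial average, hiding exactly the regional differences the abstract emphasises.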

  4. Fractional poisson--a simple dose-response model for human norovirus.

    Science.gov (United States)

    Messner, Michael J; Berger, Philip; Nappier, Sharon P

    2014-10-01

    This study utilizes old and new Norovirus (NoV) human challenge data to model the dose-response relationship for human NoV infection. The combined data set is used to update estimates from a previously published beta-Poisson dose-response model that includes parameters for virus aggregation and for a beta-distribution that describes variable susceptibility among hosts. The quality of the beta-Poisson model is examined and a simpler model is proposed. The new model (fractional Poisson) characterizes hosts as either perfectly susceptible or perfectly immune, requiring a single parameter (the fraction of perfectly susceptible hosts) in place of the two-parameter beta-distribution. A second parameter is included to account for virus aggregation in the same fashion as it is added to the beta-Poisson model. Infection probability is simply the product of the probability of nonzero exposure (at least one virus or aggregate is ingested) and the fraction of susceptible hosts. The model is computationally simple and appears to be well suited to the data from the NoV human challenge studies. The model's deviance is similar to that of the beta-Poisson, but with one parameter, rather than two. As a result, the Akaike information criterion favors the fractional Poisson over the beta-Poisson model. At low, environmentally relevant exposure levels (Poisson model; however, caution is advised because no subjects were challenged at such a low dose. New low-dose data would be of great value to further clarify the NoV dose-response relationship and to support improved risk assessment for environmentally relevant exposures. © 2014 Society for Risk Analysis Published 2014. This article is a U.S. Government work and is in the public domain for the U.S.A.
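    The fractional Poisson structure described above (infection probability = fraction of susceptible hosts × probability of nonzero exposure) can be written as a one-line function. The parameter names below (f for the susceptible fraction, a for the mean aggregate size) are illustrative, not the paper's notation:

```python
import numpy as np

def fractional_poisson_prob(dose, f, a=1.0):
    """P(infection) = f * P(at least one virus/aggregate ingested), with the
    number of ingested aggregates Poisson-distributed with mean dose/a.
    f: fraction of perfectly susceptible hosts; a: mean aggregate size."""
    return f * (1.0 - np.exp(-np.asarray(dose, float) / a))

doses = np.array([0.1, 1.0, 10.0, 1e6])
print(fractional_poisson_prob(doses, f=0.7, a=1.0))   # saturates at f = 0.7
```

    The plateau at f for large doses is the model's key behaviour: even unbounded exposure cannot infect the perfectly immune fraction 1-f of hosts.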

  5. Analysis of dental caries using generalized linear and count regression models

    Directory of Open Access Journals (Sweden)

    Javali M. Phil

    2013-11-01

    Generalized linear models (GLMs) are generalizations of linear regression models that allow regression models to be fitted to response data in all the sciences, especially the medical and dental sciences, that follow a general exponential family. They are a flexible and widely used class of models. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero-inflated count regression models such as the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework for the suitability of applying the GLM, Poisson, NB, ZIP and ZINB models to a dental caries data set where the count data may exhibit evidence of many zeros and overdispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness-of-fit measure for the dental caries data, the NB and ZINB regression models perform better than the other count regression models.
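    The kind of model comparison described here can be sketched with intercept-only fits: simulate overdispersed (negative binomial) counts, fit both a Poisson and a negative binomial by maximum likelihood, and compare AICs. A scipy-based illustration with made-up values, not the article's dental data:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import nbinom, poisson

rng = np.random.default_rng(17)

# Overdispersed counts: negative binomial with mean 2 and variance 2 + 4/1.5.
r_true, mu_true, n = 1.5, 2.0, 5_000
y = rng.negative_binomial(r_true, r_true / (r_true + mu_true), n)

# Intercept-only Poisson fit: the MLE of the rate is the sample mean.
ll_pois = poisson.logpmf(y, y.mean()).sum()
aic_pois = 2 * 1 - 2 * ll_pois

# Intercept-only negative binomial fit by direct likelihood maximisation.
def nll(theta):
    r, mu = np.exp(theta)                  # log-parameterisation keeps r, mu > 0
    return -nbinom.logpmf(y, r, r / (r + mu)).sum()

res = minimize(nll, x0=np.log([1.0, 1.0]), method="Nelder-Mead")
aic_nb = 2 * 2 + 2 * res.fun

print(aic_pois, aic_nb)    # AIC favours the NB model on overdispersed data
```

    The same pattern drives the article's conclusion: once overdispersion and excess zeros are present, the NB-family models dominate the plain Poisson on information criteria.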

  6. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia

    2018-04-10

    Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework, quantile regression has typically been carried out by exploiting the asymmetric Laplace distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is affected by an unidentifiable parameter, hence any inferential procedure besides point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We also extend it to the case of discrete responses, where there is no 1-to-1 relationship between quantiles and the distribution's parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.

  7. Stochastic Interest Model Based on Compound Poisson Process and Applications in Actuarial Science

    OpenAIRE

    Li, Shilong; Yin, Chuancun; Zhao, Xia; Dai, Hongshuai

    2017-01-01

    Considering the stochastic behavior of interest rates in financial markets, we construct a new class of interest models based on the compound Poisson process. Unlike the references, this paper describes the randomness of interest rates by modeling the force of interest with Poisson random jumps directly. To solve the problem in calculating the accumulated interest force function, an important integral technique is employed. A concept called the critical value is introduced to investigat...

  8. Bayesian inference on multiscale models for poisson intensity estimation: applications to photon-limited image denoising.

    Science.gov (United States)

    Lefkimmiatis, Stamatios; Maragos, Petros; Papandreou, George

    2009-08-01

    We present an improved statistical model for analyzing Poisson processes, with applications to photon-limited imaging. We build on previous work, adopting a multiscale representation of the Poisson process in which the ratios of the underlying Poisson intensities (rates) in adjacent scales are modeled as mixtures of conjugate parametric distributions. Our main contributions include: 1) a rigorous and robust regularized expectation-maximization (EM) algorithm for maximum-likelihood estimation of the rate-ratio density parameters directly from the noisy observed Poisson data (counts); 2) extension of the method to work under a multiscale hidden Markov tree model (HMT) which couples the mixture label assignments in consecutive scales, thus modeling interscale coefficient dependencies in the vicinity of image edges; 3) exploration of a 2-D recursive quad-tree image representation, involving Dirichlet-mixture rate-ratio densities, instead of the conventional separable binary-tree image representation involving beta-mixture rate-ratio densities; and 4) a novel multiscale image representation, which we term Poisson-Haar decomposition, that better models the image edge structure, thus yielding improved performance. Experimental results on standard images with artificially simulated Poisson noise and on real photon-limited images demonstrate the effectiveness of the proposed techniques.
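    The multiscale representation in this line of work rests on a classical Poisson fact: conditional on a parent count, the child counts are binomial with success probability given by the intensity ratio, so each scale of the tree only needs to model rate ratios. A quick Monte Carlo check of that fact (illustrative, not the paper's algorithm):

```python
import numpy as np

rng = np.random.default_rng(11)

# If y1 ~ Poisson(l1) and y2 ~ Poisson(l2) are independent, then given the
# parent count y1 + y2, y1 ~ Binomial(y1 + y2, l1/(l1 + l2)); the intensity
# ratio l1/(l1 + l2) is the quantity each scale of the tree models.
l1, l2, n = 3.0, 1.0, 200_000
y1 = rng.poisson(l1, n)
y2 = rng.poisson(l2, n)
parent = y1 + y2

mask = parent == 4                     # condition on one parent value
frac = y1[mask].mean() / 4.0
print(frac)                            # close to l1 / (l1 + l2) = 0.75
```

    Placing mixture priors on these ratios, scale by scale, is what turns Poisson intensity estimation into the conjugate multiscale problem the abstract describes.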

  9. The Kramers-Kronig relations for usual and anomalous Poisson-Nernst-Planck models

    OpenAIRE

    Evangelista, Luiz Roberto; Lenzi, Ervin Kaminski; Barbero, Giovanni

    2013-01-01

    The consistency of the frequency response predicted by a class of electrochemical impedance expressions is analytically checked by invoking the Kramers-Kronig (KK) relations. These expressions are obtained in the context of Poisson-Nernst-Planck usual (PNP) or anomalous (PNPA) diffusional models that satisfy Poisson's equation in a finite-length situation. The theoretical results, besides being successful in interpreting experimental data, are also shown to obey the KK relations when these re...

  10. The Kramers-Kronig relations for usual and anomalous Poisson-Nernst-Planck models.

    Science.gov (United States)

    Evangelista, Luiz Roberto; Lenzi, Ervin Kaminski; Barbero, Giovanni

    2013-11-20

    The consistency of the frequency response predicted by a class of electrochemical impedance expressions is analytically checked by invoking the Kramers-Kronig (KK) relations. These expressions are obtained in the context of Poisson-Nernst-Planck usual or anomalous diffusional models that satisfy Poisson's equation in a finite length situation. The theoretical results, besides being successful in interpreting experimental data, are also shown to obey the KK relations when these relations are modified accordingly.

  11. Regression models of reactor diagnostic signals

    International Nuclear Information System (INIS)

    Vavrin, J.

    1989-01-01

    The application of an autoregression model, as the simplest regression model of diagnostic signals, is described for the experimental analysis of diagnostic systems and for in-service monitoring of normal and anomalous conditions and their diagnostics. The method of diagnostics using a regression-type diagnostic data base and regression spectral diagnostics is described, as is the diagnostics of neutron noise signals from anomalous modes in the experimental fuel assembly of a reactor. (author)

  12. Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.

    Science.gov (United States)

    Huang, Lan; Zheng, Dan; Zalkikar, Jyoti; Tiwari, Ram

    2017-02-01

    In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration's (FDA's) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero counts; some of these are "true zeros", indicating that the drug-adverse event pair cannot occur, and are distinguished from the remaining zero counts, which simply indicate that the drug-adverse event pair has not occurred, or has not been reported, yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
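The core comparison in this record — a zero-inflated Poisson fit against a plain Poisson fit, judged by a likelihood ratio — can be illustrated on a single vector of counts. A toy sketch (our own simplified single-cell EM, not the paper's stratified signal-detection procedure):

```python
import math

def poisson_loglik(lam, xs):
    """Poisson log-likelihood of counts xs at rate lam."""
    return sum(-lam + x * math.log(lam) - math.lgamma(x + 1) for x in xs)

def zip_loglik(pi, lam, xs):
    """Zero-inflated Poisson log-likelihood: structural zeros with prob. pi."""
    ll = 0.0
    for x in xs:
        if x == 0:
            ll += math.log(pi + (1 - pi) * math.exp(-lam))
        else:
            ll += math.log(1 - pi) - lam + x * math.log(lam) - math.lgamma(x + 1)
    return ll

def fit_zip_em(xs, iters=200):
    """Crude EM for the ZIP model: the E-step assigns each observed zero a
    posterior probability of being a structural ("true") zero."""
    n = len(xs)
    n0 = sum(1 for x in xs if x == 0)
    lam = max(sum(xs) / n, 1e-6)
    pi = 0.5 * n0 / n
    for _ in range(iters):
        z = pi / (pi + (1 - pi) * math.exp(-lam))  # E-step responsibility
        w = n0 * z                                  # expected structural zeros
        pi = w / n                                  # M-step updates
        lam = sum(xs) / (n - w)
    return pi, lam

# 60% zeros but a nonzero mean near 1.9: a single Poisson rate fits poorly
xs = [0] * 60 + [1] * 15 + [2] * 15 + [3] * 10
pi_hat, lam_zip = fit_zip_em(xs)
lam_pois = sum(xs) / len(xs)
lrt = 2 * (zip_loglik(pi_hat, lam_zip, xs) - poisson_loglik(lam_pois, xs))
```

A large positive `lrt` flags zero inflation; the paper's method additionally handles the full drug-by-event matrix and covariate strata.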

  13. Variable importance in latent variable regression models

    NARCIS (Netherlands)

    Kvalheim, O.M.; Arneberg, R.; Bleie, O.; Rajalahti, T.; Smilde, A.K.; Westerhuis, J.A.

    2014-01-01

    The quality and practical usefulness of a regression model are a function of both interpretability and prediction performance. This work presents some new graphical tools for improved interpretation of latent variable regression models that can also assist in improved algorithms for variable selection.

  14. Doubly stochastic Poisson process models for precipitation at fine time-scales

    Science.gov (United States)

    Ramesh, Nadarajah I.; Onof, Christian; Xie, Dichao

    2012-09-01

    This paper considers a class of stochastic point process models, based on doubly stochastic Poisson processes, in the modelling of rainfall. We examine the application of this class of models, a neglected alternative to the widely-known Poisson cluster models, in the analysis of fine time-scale rainfall intensity. These models are mainly used to analyse tipping-bucket raingauge data from a single site but an extension to multiple sites is illustrated which reveals the potential of this class of models to study the temporal and spatial variability of precipitation at fine time-scales.

  15. Regression modeling of ground-water flow

    Science.gov (United States)

    Cooley, R.L.; Naff, R.L.

    1985-01-01

    Nonlinear multiple regression methods are developed to model and analyze groundwater flow systems. Complete descriptions of regression methodology as applied to groundwater flow models allow scientists and engineers engaged in flow modeling to apply the methods to a wide range of problems. Organization of the text proceeds from an introduction that discusses the general topic of groundwater flow modeling, to a review of basic statistics necessary to properly apply regression techniques, and then to the main topic: exposition and use of linear and nonlinear regression to model groundwater flow. Statistical procedures are given to analyze and use the regression models. A number of exercises and answers are included to exercise the student on nearly all the methods that are presented for modeling and statistical analysis. Three computer programs implement the more complex methods. These three are a general two-dimensional, steady-state regression model for flow in an anisotropic, heterogeneous porous medium, a program to calculate a measure of model nonlinearity with respect to the regression parameters, and a program to analyze model errors in computed dependent variables such as hydraulic head. (USGS)

  16. Bayesian spatial modeling of HIV mortality via zero-inflated Poisson models.

    Science.gov (United States)

    Musal, Muzaffer; Aktekin, Tevfik

    2013-01-30

    In this paper, we investigate the effects of poverty and inequality on the number of HIV-related deaths in 62 New York counties via Bayesian zero-inflated Poisson models that exhibit spatial dependence. We quantify inequality via the Theil index and poverty via the ratios of two Census 2000 variables, the number of people under the poverty line and the number of people for whom poverty status is determined, in each Zip Code Tabulation Area. The purpose of this study was to investigate the effects of inequality and poverty in addition to spatial dependence between neighboring regions on HIV mortality rate, which can lead to improved health resource allocation decisions. In modeling county-specific HIV counts, we propose Bayesian zero-inflated Poisson models whose rates are functions of both covariate and spatial/random effects. To show how the proposed models work, we used three different publicly available data sets: TIGER Shapefiles, Census 2000, and mortality index files. In addition, we introduce parameter estimation issues of Bayesian zero-inflated Poisson models and discuss MCMC method implications. Copyright © 2012 John Wiley & Sons, Ltd.

  17. [From clinical judgment to linear regression model].

    Science.gov (United States)

    Palacios-Cruz, Lino; Pérez, Marcela; Rivas-Ruiz, Rodolfo; Talavera, Juan O

    2013-01-01

    When we think about mathematical models, such as the linear regression model, we think that these terms are only used by those engaged in research, a notion that is far from the truth. Legendre described the first mathematical model in 1805, and Galton introduced the formal term in 1886. Linear regression is one of the most commonly used regression models in clinical practice. It is useful to predict or show the relationship between two or more variables as long as the dependent variable is quantitative and has a normal distribution. Stated another way, regression is used to predict a measure based on the knowledge of at least one other variable. Linear regression has as its first objective to determine the slope or inclination of the regression line: Y = a + bx, where "a" is the intercept or regression constant, equivalent to the value of "Y" when "X" equals 0, and "b" (also called the slope) indicates the increase or decrease in "Y" that occurs when the variable "x" increases or decreases by one unit. In the regression line, "b" is called the regression coefficient. The coefficient of determination (R²) indicates the importance of the independent variables in the outcome.
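The Y = a + bx fit and the coefficient of determination described in the record can be computed in a few lines; a minimal least-squares sketch (function name is ours):

```python
def linreg(xs, ys):
    """Least-squares fit of Y = a + b*x; returns (a, b, r2).

    b is the regression coefficient (slope), a the intercept (value of Y
    at x = 0), and r2 the coefficient of determination.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return a, b, r2
```

For points lying exactly on Y = 1 + 2x, the fit returns a = 1, b = 2, and R² = 1.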

  18. Regression Models for Market-Shares

    DEFF Research Database (Denmark)

    Birch, Kristina; Olsen, Jørgen Kai; Tjur, Tue

    2005-01-01

    On the background of a data set of weekly sales and prices for three brands of coffee, this paper discusses various regression models and their relation to the multiplicative competitive-interaction model (the MCI model, see Cooper 1988, 1993) for market-shares. Emphasis is put on the interpretation of the parameters in relation to models for the total sales based on discrete choice models. Key words and phrases: MCI model, discrete choice model, market-shares, price elasticity, regression model.

  19. Numerical solution of continuous-time DSGE models under Poisson uncertainty

    DEFF Research Database (Denmark)

    Posch, Olaf; Trimborn, Timo

    We propose a simple and powerful method for determining the transition process in continuous-time DSGE models under Poisson uncertainty numerically. The idea is to transform the system of stochastic differential equations into a system of functional differential equations of the retarded type. We [...] classes of models. We illustrate the algorithm simulating both the stochastic neoclassical growth model and the Lucas model under Poisson uncertainty, which is motivated by the Barro-Rietz rare disaster hypothesis. We find that, even for non-linear policy functions, the maximum (absolute) error is very [...]

  20. Categorical regression dose-response modeling

    Science.gov (United States)

    The goal of this training is to provide participants with training on the use of the U.S. EPA's Categorical Regression software (CatReg) and its application to risk assessment. Categorical regression fits mathematical models to toxicity data that have been assigned ord...

  1. [Prediction and spatial distribution of recruitment trees of natural secondary forest based on geographically weighted Poisson model].

    Science.gov (United States)

    Zhang, Ling Yu; Liu, Zhao Gang

    2017-12-01

    Based on the data collected from 108 permanent plots of the forest resources survey in Maoershan Experimental Forest Farm during 2004-2016, this study investigated the spatial distribution of recruitment trees in natural secondary forest by global Poisson regression and geographically weighted Poisson regression (GWPR) with four bandwidths of 2.5, 5, 10 and 15 km. The fits of the five regressions and the factors influencing the recruitment trees in stands were analyzed, and the spatial autocorrelation of the regression residuals was described on the global and local levels using Moran's I. The results showed that the spatial distribution of the number of natural secondary forest recruitment trees was significantly influenced by stand and topographic factors, especially average DBH. The GWPR model at the small scale (2.5 km) had high model-fitting accuracy, generated a large range of model parameter estimates, and captured the localized spatial distribution of the model parameters. The GWPR models at small scales (2.5 and 5 km) produced a small range of model residuals, and the stability of the model was improved. The global spatial autocorrelation of the GWPR model residuals at the smallest scale (2.5 km) was the lowest, and the local spatial autocorrelation was significantly reduced, forming an ideal spatial distribution pattern of small clusters with different observations. The local model at the small scale (2.5 km) was much better than the global model in simulating the spatial distribution of recruitment tree numbers.
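The bandwidth's role in GWPR can be illustrated with the simplest, intercept-only case: with no covariates, the local Poisson MLE of the rate at a target location reduces to a kernel-weighted mean of the counts. A hypothetical sketch (a full GWPR would fit covariate coefficients locally by weighted Poisson likelihood; the Gaussian kernel and function name are our choices):

```python
import math

def gwpr_intercept(coords, counts, target, bandwidth):
    """Intercept-only geographically weighted Poisson regression: the local
    Poisson MLE of the rate at `target` is the Gaussian-kernel-weighted
    mean of the observed counts."""
    w = [math.exp(-((x - target[0]) ** 2 + (y - target[1]) ** 2)
                  / (2.0 * bandwidth ** 2))
         for x, y in coords]
    return sum(wi * c for wi, c in zip(w, counts)) / sum(w)
```

A small bandwidth yields a highly local estimate; a very large bandwidth recovers the global Poisson mean, mirroring the record's comparison of 2.5 km versus larger bandwidths.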

  2. Time series regression model for infectious disease and weather.

    Science.gov (United States)

    Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro

    2015-10-01

    Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
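Two of the modifications discussed in this record — checking for overdispersion (which motivates a quasi-Poisson or negative binomial model) and using logged lagged case counts to absorb autocorrelation from contagion — are easy to sketch. The function names and the crude degrees-of-freedom choice are ours:

```python
import math

def dispersion(observed, fitted):
    """Pearson dispersion statistic: values well above 1 suggest
    overdispersion, pointing to quasi-Poisson or negative binomial."""
    chi2 = sum((o - f) ** 2 / f for o, f in zip(observed, fitted))
    return chi2 / (len(observed) - 1)  # crude df: n minus one fitted mean

def lagged_log_counts(cases, lag=1):
    """log(lagged counts + 1) covariate to control autocorrelation due to
    true contagion, as motivated by SIR dynamics in the record."""
    return [math.log(cases[t - lag] + 1) for t in range(lag, len(cases))]
```

The dispersion statistic would be computed from a fitted Poisson model's predictions; the lagged-log feature is then entered as a covariate in the next fit.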

  3. Two-part zero-inflated negative binomial regression model for quantitative trait loci mapping with count trait.

    Science.gov (United States)

    Moghimbeigi, Abbas

    2015-05-07

    Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regressions may be useful for QTL mapping of count traits. Adding genetic variables to the negative binomial part of the model may also affect the extra-zero data. In this study, to overcome these challenges, I apply a two-part ZINB model. An EM algorithm, with the Newton-Raphson method in the M-step, is used for estimating the parameters. An application of the two-part ZINB model for QTL mapping is considered, to detect associations between the formation of gallstones and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. The Hitchin model, Poisson-quasi-Nijenhuis geometry and symmetry reduction

    International Nuclear Information System (INIS)

    Zucchini, Roberto

    2007-01-01

    We revisit our earlier work on the AKSZ-like formulation of the topological sigma model on generalized complex manifolds, or Hitchin model, [20]. We show that the target space geometry implied by the BV master equations is the Poisson-quasi-Nijenhuis geometry recently introduced and studied by Stienon and Xu (in the untwisted case) in [44]. Poisson-quasi-Nijenhuis geometry is more general than generalized complex geometry and comprises it as a particular case. Next, we show how gauging and reduction can be implemented in the Hitchin model. We find that the geometry resulting from the BV master equation is closely related to, but more general than, that recently described by Lin and Tolman in [40, 41], suggesting a natural framework for the study of reduction of Poisson-quasi-Nijenhuis manifolds

  5. A dynamic bivariate Poisson model for analysing and forecasting match results in the English Premier League

    NARCIS (Netherlands)

    Koopman, S.J.; Lit, R.

    2015-01-01

    Summary: We develop a statistical model for the analysis and forecasting of football match results which assumes a bivariate Poisson distribution with intensity coefficients that change stochastically over time. The dynamic model is a novelty in the statistical time series analysis of match results
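A static version of the bivariate Poisson distribution underlying this record can be sampled by trivariate reduction: X = Z1 + Z0 and Y = Z2 + Z0 with independent Poisson components share Z0, giving Poisson marginals and covariance equal to Z0's rate. A sketch (our construction; the paper's novelty is letting the intensities evolve stochastically over time, which is not shown here):

```python
import math
import random

def poisson_draw(lam, rng):
    """Knuth's multiplicative Poisson sampler; fine for small rates
    such as goal counts."""
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def bivariate_poisson_draw(l1, l2, l0, rng):
    """Trivariate reduction: X = Z1 + Z0, Y = Z2 + Z0 with independent
    Poisson Z's.  Marginal means are l1 + l0 and l2 + l0; the shared
    component Z0 induces covariance l0 between the two scores."""
    z0 = poisson_draw(l0, rng)
    return poisson_draw(l1, rng) + z0, poisson_draw(l2, rng) + z0
```

Simulating many matches with, say, l1 = 1.0, l2 = 0.8, l0 = 0.5 gives sample means near 1.5 and 1.3 and a clearly positive score covariance.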

  6. On the Modeling and Analysis of Heterogeneous Radio Access Networks using a Poisson Cluster Process

    DEFF Research Database (Denmark)

    Suryaprakash, Vinay; Møller, Jesper; Fettweis, Gerhard P.

    [...] processes, some of which are alluded to (later) in this paper. We model a heterogeneous network consisting of two types of base stations by using a particular Poisson cluster process model. The main contributions are two-fold. First, a complete description of the interference in heterogeneous networks...

  7. A Poisson-lognormal conditional-autoregressive model for multivariate spatial analysis of pedestrian crash counts across neighborhoods.

    Science.gov (United States)

    Wang, Yiyi; Kockelman, Kara M

    2013-11-01

    This work examines the relationship between 3-year pedestrian crash counts across Census tracts in Austin, Texas, and various land use, network, and demographic attributes, such as land use balance, residents' access to commercial land uses, sidewalk density, lane-mile densities (by roadway class), and population and employment densities (by type). The model specification allows for region-specific heterogeneity, correlation across response types, and spatial autocorrelation via a Poisson-based multivariate conditional auto-regressive (CAR) framework and is estimated using Bayesian Markov chain Monte Carlo methods. Least-squares regression estimates of walk-miles traveled per zone serve as the exposure measure. Here, the Poisson-lognormal multivariate CAR model outperforms an aspatial Poisson-lognormal multivariate model and a spatial model (without cross-severity correlation), both in terms of fit and inference. Positive spatial autocorrelation emerges across neighborhoods, as expected (due to latent heterogeneity or missing variables that trend in space, resulting in spatial clustering of crash counts). In comparison, the positive aspatial, bivariate cross correlation of severe (fatal or incapacitating) and non-severe crash rates reflects latent covariates that have impacts across severity levels but are more local in nature (such as lighting conditions and local sight obstructions), along with spatially lagged cross correlation. Results also suggest greater mixing of residences and commercial land uses is associated with higher pedestrian crash risk across different severity levels, ceteris paribus, presumably since such access produces more potential conflicts between pedestrian and vehicle movements. Interestingly, network densities show variable effects, and sidewalk provision is associated with lower severe-crash rates. Copyright © 2013 Elsevier Ltd. All rights reserved.

  8. A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data

    KAUST Repository

    Sepúlveda, Nuno; Campino, Susana G; Assefa, Samuel A; Sutherland, Colin J; Pain, Arnab; Clark, Taane G

    2013-01-01

    Background: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model.Results: Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates.Conclusions: In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data. © 2013 Sepúlveda et al.; licensee BioMed Central Ltd.

  9. A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data.

    Science.gov (United States)

    Sepúlveda, Nuno; Campino, Susana G; Assefa, Samuel A; Sutherland, Colin J; Pain, Arnab; Clark, Taane G

    2013-02-26

    The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model. Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates. In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data.
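The Poisson-Gamma hierarchy in this record is exactly the mechanism that produces the overdispersion the authors observe: a Gamma-distributed local rate inflates the Poisson variance, giving a negative binomial coverage distribution. A sketch of a single coverage draw under assumed parameter names (not the authors' fitting code):

```python
import math
import random

def poisson_gamma_draw(mean, overdisp, rng):
    """One coverage count from the Poisson-Gamma hierarchy: the window's
    rate is Gamma-distributed (mean `mean`, shape 1/overdisp), and the
    count is Poisson(rate).  Marginally this is negative binomial with
    Var = mean * (1 + overdisp * mean) > mean."""
    rate = rng.gammavariate(1.0 / overdisp, mean * overdisp)
    L, k, p = math.exp(-rate), 0, 1.0   # Knuth Poisson sampler
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1
```

Sampling many windows at mean coverage 30 with overdispersion 0.2 produces a variance several times the mean, the pattern the record reports for empirical coverage; a pure Poisson model would force variance ≈ mean.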

  10. A Poisson hierarchical modelling approach to detecting copy number variation in sequence coverage data

    KAUST Repository

    Sepúlveda, Nuno

    2013-02-26

    Background: The advent of next generation sequencing technology has accelerated efforts to map and catalogue copy number variation (CNV) in genomes of important micro-organisms for public health. A typical analysis of the sequence data involves mapping reads onto a reference genome, calculating the respective coverage, and detecting regions with too-low or too-high coverage (deletions and amplifications, respectively). Current CNV detection methods rely on statistical assumptions (e.g., a Poisson model) that may not hold in general, or require fine-tuning the underlying algorithms to detect known hits. We propose a new CNV detection methodology based on two Poisson hierarchical models, the Poisson-Gamma and Poisson-Lognormal, with the advantage of being sufficiently flexible to describe different data patterns, whilst robust against deviations from the often assumed Poisson model.Results: Using sequence coverage data of 7 Plasmodium falciparum malaria genomes (3D7 reference strain, HB3, DD2, 7G8, GB4, OX005, and OX006), we showed that empirical coverage distributions are intrinsically asymmetric and overdispersed in relation to the Poisson model. We also demonstrated a low baseline false positive rate for the proposed methodology using 3D7 resequencing data and simulation. When applied to the non-reference isolate data, our approach detected known CNV hits, including an amplification of the PfMDR1 locus in DD2 and a large deletion in the CLAG3.2 gene in GB4, and putative novel CNV regions. When compared to the recently available FREEC and cn.MOPS approaches, our findings were more concordant with putative hits from the highest quality array data for the 7G8 and GB4 isolates.Conclusions: In summary, the proposed methodology brings an increase in flexibility, robustness, accuracy and statistical rigour to CNV detection using sequence coverage data. © 2013 Sepúlveda et al.; licensee BioMed Central Ltd.

  11. Applied Regression Modeling A Business Approach

    CERN Document Server

    Pardoe, Iain

    2012-01-01

    An applied and concise treatment of statistical regression techniques for business students and professionals who have little or no background in calculus. Regression analysis is an invaluable statistical methodology in business settings and is vital to model the relationship between a response variable and one or more predictor variables, as well as the prediction of a response value given values of the predictors. In view of the inherent uncertainty of business processes, such as the volatility of consumer spending and the presence of market uncertainty, business professionals use regression a

  12. Guidelines for Use of the Approximate Beta-Poisson Dose-Response Model.

    Science.gov (United States)

    Xie, Gang; Roiko, Anne; Stratton, Helen; Lemckert, Charles; Dunn, Peter K; Mengersen, Kerrie

    2017-07-01

    For dose-response analysis in quantitative microbial risk assessment (QMRA), the exact beta-Poisson model is a two-parameter mechanistic dose-response model with parameters α>0 and β>0, which involves the Kummer confluent hypergeometric function. Evaluation of a hypergeometric function is a computational challenge. Denoting PI(d) as the probability of infection at a given mean dose d, the widely used dose-response model PI(d) = 1 - (1 + d/β)^(-α) is an approximate formula for the exact beta-Poisson model. Notwithstanding the required conditions (α much smaller than β, with β large), issues related to the validity and approximation accuracy of this approximate formula have remained largely ignored in practice, partly because these conditions are too general to provide clear guidance. Consequently, this study proposes a probability measure of the approximation error, together with a rule of thumb of the form β̂ > (22α̂)^0.50 over a stated range of α̂, for assessing the validity of the approximate formula. This validity measure and rule of thumb were validated by application to all the completed beta-Poisson models (related to 85 data sets) from the QMRA community portal (QMRA Wiki). The results showed that the higher the validity measure, the more closely the approximate formula reproduces the exact beta-Poisson model dose-response curve. © 2016 Society for Risk Analysis.
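The approximate beta-Poisson formula itself is one line; a sketch in its standard form, PI(d) = 1 - (1 + d/β)^(-α), with the usual caveat that it approximates the exact hypergeometric model well only when α is much smaller than β and β is large (parameter values below are illustrative, not from the paper):

```python
def beta_poisson_approx(d, alpha, beta):
    """Approximate beta-Poisson dose-response: probability of infection
    at mean dose d, PI(d) = 1 - (1 + d/beta) ** (-alpha)."""
    return 1.0 - (1.0 + d / beta) ** (-alpha)
```

The curve starts at 0 for zero dose and increases monotonically toward 1 with dose, as a dose-response model must.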

  13. A new multivariate zero-adjusted Poisson model with applications to biomedicine.

    Science.gov (United States)

    Liu, Yin; Tian, Guo-Liang; Tang, Man-Lai; Yuen, Kam Chuen

    2018-05-25

    Recently, although advances were made on modeling multivariate count data, existing models have several limitations: (i) the multivariate Poisson log-normal model (Aitchison and Ho) cannot be used to fit multivariate count data with excess zero-vectors; (ii) the multivariate zero-inflated Poisson (ZIP) distribution (Li et al., 1999) cannot be used to model zero-truncated/deflated count data, and it is difficult to apply to high-dimensional cases; (iii) the Type I multivariate zero-adjusted Poisson (ZAP) distribution (Tian et al., 2017) can only model multivariate count data with a special correlation structure, in which the random components are all positively or all negatively correlated. In this paper, we first introduce a new multivariate ZAP distribution, based on a multivariate Poisson distribution, which allows a more flexible dependency structure between components; that is, some of the correlation coefficients can be positive while others are negative. We then develop its important distributional properties, and provide efficient statistical inference methods for the multivariate ZAP model with or without covariates. Two real data examples in biomedicine are used to illustrate the proposed methods. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  14. A Conway-Maxwell-Poisson (CMP) model to address data dispersion on positron emission tomography.

    Science.gov (United States)

    Santarelli, Maria Filomena; Della Latta, Daniele; Scipioni, Michele; Positano, Vincenzo; Landini, Luigi

    2016-10-01

    Positron emission tomography (PET) in medicine exploits the properties of positron-emitting unstable nuclei. The pairs of γ-rays emitted after annihilation are revealed by coincidence detectors and stored as projections in a sinogram. It is well known that radioactive decay follows a Poisson distribution; however, deviation from Poisson statistics occurs on PET projection data prior to reconstruction due to physical effects, measurement errors, correction of deadtime, scatter, and random coincidences. A model that describes the statistical behavior of measured and corrected PET data can aid in understanding the statistical nature of the data: it is a prerequisite to develop efficient reconstruction and processing methods and to reduce noise. The deviation from Poisson statistics in PET data could be described by the Conway-Maxwell-Poisson (CMP) distribution model, which is characterized by the centring parameter λ and the dispersion parameter ν, the latter quantifying the deviation from a Poisson distribution model. In particular, the parameter ν allows quantifying over-dispersion (ν < 1) or under-dispersion (ν > 1) of data. A simple and efficient method for λ and ν parameter estimation is introduced and assessed using Monte Carlo simulation for a wide range of activity values. The application of the method to simulated and experimental PET phantom data demonstrated that the CMP distribution parameters could detect deviation from the Poisson distribution both in raw and corrected PET data. It may be usefully implemented in image reconstruction algorithms and quantitative PET data analysis, especially in low counting emission data, as in dynamic PET data, where the method demonstrated the best accuracy. Copyright © 2016 Elsevier Ltd. All rights reserved.
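The CMP pmf, P(X = x) ∝ λ^x / (x!)^ν, can be evaluated by truncating its normalizing series; a sketch in log-space for numerical stability (the truncation length is our choice, and this is not the paper's estimation method):

```python
import math

def cmp_pmf(x, lam, nu, terms=100):
    """Conway-Maxwell-Poisson pmf via a truncated normalizing series,
    computed in log-space.  nu = 1 recovers the Poisson distribution;
    nu < 1 is over-dispersed, nu > 1 under-dispersed."""
    logw = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    mx = max(logw)
    logz = mx + math.log(sum(math.exp(w - mx) for w in logw))
    return math.exp(x * math.log(lam) - nu * math.lgamma(x + 1) - logz)

def cmp_mean_var(lam, nu, terms=100):
    """Mean and variance of the truncated CMP distribution."""
    ps = [cmp_pmf(x, lam, nu, terms) for x in range(terms)]
    m = sum(x * p for x, p in enumerate(ps))
    v = sum((x - m) ** 2 * p for x, p in enumerate(ps))
    return m, v
```

With ν = 1 the pmf matches the Poisson exactly; with ν < 1 the variance exceeds the mean (over-dispersion), and with ν > 1 it falls below it, which is what the record's parameter estimates are designed to detect.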

  15. INVESTIGATION OF E-MAIL TRAFFIC BY USING ZERO-INFLATED REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Yılmaz KAYA

    2012-06-01

Full Text Available In count data, the number of zeros observed may be greater than anticipated; such data sets should be analyzed by regression methods that take the excess zeros into account. Zero-Inflated Poisson (ZIP), Zero-Inflated Negative Binomial (ZINB), Poisson Hurdle (PH), and Negative Binomial Hurdle (NBH) regression are the most common approaches for modeling dependent variables with more zero values than expected. In the present study, the e-mail traffic of Yüzüncü Yıl University in the 2009 spring semester was investigated. The ZIP, ZINB, PH, and NBH regression methods were applied to the data set because more zero counts (78.9%) were found than expected. The ZINB and NBH regressions, which account for both zero inflation and overdispersion, gave more accurate results, since the e-mail counts exhibited both. ZINB was determined to be the best model according to the Vuong statistic and information criteria.
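The ZIP likelihood used in such analyses is easy to hand-roll: zeros arise either structurally (with probability π) or from the Poisson component. A sketch on simulated data (the π and λ values are assumptions, not the e-mail figures; ZINB and the hurdle variants follow the same pattern with a different count component):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

# Simulate a zero-inflated sample: structural zero with prob. pi, else Poisson(lam).
rng = np.random.default_rng(1)
n, pi_true, lam_true = 3000, 0.6, 2.5                  # hypothetical values
y = np.where(rng.random(n) < pi_true, 0, rng.poisson(lam_true, n))

def zip_nll(theta):
    pi = 1.0 / (1.0 + np.exp(-theta[0]))               # logit-parametrized weight
    lam = np.exp(theta[1])
    log_p0 = np.log(pi + (1.0 - pi) * np.exp(-lam))    # P(Y = 0)
    log_pk = np.log(1.0 - pi) + poisson.logpmf(y, lam) # P(Y = k), k > 0
    return -np.where(y == 0, log_p0, log_pk).sum()

res = minimize(zip_nll, x0=[0.0, 0.0], method="Nelder-Mead")
pi_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
lam_hat = np.exp(res.x[1])

# Plain Poisson fit for comparison: ZIP should win on AIC for this data.
aic_zip = 2 * 2 + 2 * res.fun
aic_pois = 2 * 1 - 2 * poisson.logpmf(y, y.mean()).sum()
```

Model comparison by AIC (or the Vuong test, not shown) then proceeds exactly as in the abstract.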

  16. Regression Models For Multivariate Count Data.

    Science.gov (United States)

    Zhang, Yiwen; Zhou, Hua; Zhou, Jin; Sun, Wei

    2017-01-01

    Data with multivariate count responses frequently occur in modern applications. The commonly used multinomial-logit model is limiting due to its restrictive mean-variance structure. For instance, analyzing count data from the recent RNA-seq technology by the multinomial-logit model leads to serious errors in hypothesis testing. The ubiquity of over-dispersion and complicated correlation structures among multivariate counts calls for more flexible regression models. In this article, we study some generalized linear models that incorporate various correlation structures among the counts. Current literature lacks a treatment of these models, partly due to the fact that they do not belong to the natural exponential family. We study the estimation, testing, and variable selection for these models in a unifying framework. The regression models are compared on both synthetic and real RNA-seq data.

  17. Modeling of Electrokinetic Processes Using the Nernst-Plank-Poisson System

    DEFF Research Database (Denmark)

    Paz-Garcia, Juan Manuel; Johannesson, Björn; Ottosen, Lisbeth M.

    2010-01-01

    Electrokinetic processes are known as the mobilization of species within the pore solution of porous materials under the effect of an external electric field. A finite elements model was implemented and used for the integration of the coupled Nernst-Plank-Poisson system of equations in order...

  18. Poisson Growth Mixture Modeling of Intensive Longitudinal Data: An Application to Smoking Cessation Behavior

    Science.gov (United States)

    Shiyko, Mariya P.; Li, Yuelin; Rindskopf, David

    2012-01-01

    Intensive longitudinal data (ILD) have become increasingly common in the social and behavioral sciences; count variables, such as the number of daily smoked cigarettes, are frequently used outcomes in many ILD studies. We demonstrate a generalized extension of growth mixture modeling (GMM) to Poisson-distributed ILD for identifying qualitatively…

  19. The Fixed-Effects Zero-Inflated Poisson Model with an Application to Health Care Utilization

    NARCIS (Netherlands)

    Majo, M.C.; van Soest, A.H.O.

    2011-01-01

Response variables that are scored as counts and that present a large number of zeros often arise in quantitative health care analysis. We define a zero-inflated Poisson model with fixed-effects in both of its equations to identify respondent and health-related characteristics associated with

  20. Testing homogeneity in Weibull-regression models.

    Science.gov (United States)

    Bolfarine, Heleno; Valença, Dione M

    2005-10-01

In survival studies with families or geographical units it may be of interest to test whether such groups are homogeneous for given explanatory variables. In this paper we consider score-type tests for group homogeneity based on a mixing model in which the group effect is modelled as a random variable. As opposed to hazard-based frailty models, this model presents survival times that, conditioned on the random effect, have an accelerated failure time representation. The test statistic requires only estimation of the conventional regression model without the random effect and does not require specifying the distribution of the random effect. The tests are derived for a Weibull regression model, and in the uncensored situation a closed form is obtained for the test statistic. A simulation study is used for comparing the power of the tests. The proposed tests are applied to real data sets with censored data.

  1. The Poisson model limits in NBA basketball: Complexity in team sports

    Science.gov (United States)

    Martín-González, Juan Manuel; de Saá Guerra, Yves; García-Manso, Juan Manuel; Arriaza, Enrique; Valverde-Estévez, Teresa

    2016-12-01

Team sports are frequently studied by researchers. There is a presumption that scoring in basketball is a random process that can be described by the Poisson model. Basketball is a collaboration-opposition sport, where the non-linear local interactions among players are reflected in the evolution of the score that ultimately determines the winner. In the NBA, the outcomes of close games are often decided in the last minute, where fouls play a major role. We examined 6130 NBA games in order to analyze the time intervals between baskets and the scoring dynamics. The number of baskets (n) in most time intervals (ΔT) follows a Poisson distribution, but some cases (e.g., ΔT = 10 s, n > 3) behave as a power law. The Poisson distribution covers most baskets in any game, in most game situations, but in close games in the last minute the numbers of events are distributed following a power law. The number of events can be adjusted by a mixture of two distributions. In close games, both teams try to maintain their advantage solely in order to reach the last minute: a completely different game. For this reason, we propose to use the Poisson model as a reference. The complex dynamics will emerge from the limits of this model.
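The gap between the two regimes can be seen with a back-of-the-envelope tail computation; the scoring rate below is a hypothetical figure, not one estimated from the 6130 games:

```python
from scipy.stats import poisson

rate_per_min = 2.2                  # hypothetical combined baskets per minute
lam_10s = rate_per_min * 10 / 60    # expected baskets in a 10 s window

# Under the Poisson model, bursts of n > 3 baskets in 10 s are essentially
# impossible; observing them at power-law frequency signals a regime change.
p_tail = poisson.sf(3, lam_10s)     # P(N > 3) in a 10 s window
```

With this rate the Poisson tail probability is on the order of 10⁻⁴, so even a handful of such bursts per season already lies outside the model's reach.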

  2. Mixed-effects regression models in linguistics

    CERN Document Server

    Heylen, Kris; Geeraerts, Dirk

    2018-01-01

    When data consist of grouped observations or clusters, and there is a risk that measurements within the same group are not independent, group-specific random effects can be added to a regression model in order to account for such within-group associations. Regression models that contain such group-specific random effects are called mixed-effects regression models, or simply mixed models. Mixed models are a versatile tool that can handle both balanced and unbalanced datasets and that can also be applied when several layers of grouping are present in the data; these layers can either be nested or crossed.  In linguistics, as in many other fields, the use of mixed models has gained ground rapidly over the last decade. This methodological evolution enables us to build more sophisticated and arguably more realistic models, but, due to its technical complexity, also introduces new challenges. This volume brings together a number of promising new evolutions in the use of mixed models in linguistics, but also addres...

  3. Hidden Markov models for zero-inflated Poisson counts with an application to substance use.

    Science.gov (United States)

    DeSantis, Stacia M; Bandyopadhyay, Dipankar

    2011-06-30

Paradigms for substance abuse cue-reactivity research involve pharmacological or stressful stimulation designed to elicit stress and craving responses in cocaine-dependent subjects. It is unclear as to whether stress induced from participation in such studies increases drug-seeking behavior. We propose a 2-state hidden Markov model for the number of cocaine abuses per week before and after participation in a stress and cue-reactivity study. The hypothesized latent state corresponds to 'high' or 'low' use. To account for a preponderance of zeros, we assume a zero-inflated Poisson model for the count data. Transition probabilities depend on the prior week's state, fixed demographic variables, and time-varying covariates. We adopt a Bayesian approach to model fitting, and use the conditional predictive ordinate statistic to demonstrate that the zero-inflated Poisson hidden Markov model outperforms other models for longitudinal count data. Copyright © 2011 John Wiley & Sons, Ltd.
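The evaluation step of such a model can be sketched as a log-space forward recursion over two latent states with zero-inflated Poisson emissions. All parameter values below are hypothetical, and the fitting step (done by MCMC in the paper) is omitted:

```python
import numpy as np
from scipy.stats import poisson

def zip_logpmf(y, pi, lam):
    # Zero-inflated Poisson: extra mass pi at zero, otherwise Poisson(lam).
    log_p0 = np.log(pi + (1 - pi) * np.exp(-lam))
    return np.where(y == 0, log_p0, np.log(1 - pi) + poisson.logpmf(y, lam))

def hmm_loglik(y, A, init, emis):
    # Forward algorithm in log space; emis lists (pi, lam) per latent state.
    logB = np.stack([zip_logpmf(y, pi, lam) for pi, lam in emis])
    logA = np.log(A)
    alpha = np.log(init) + logB[:, 0]
    for t in range(1, len(y)):
        alpha = logB[:, t] + np.logaddexp.reduce(alpha[:, None] + logA, axis=0)
    return np.logaddexp.reduce(alpha)

# Simulate weekly counts from a 'low use' / 'high use' chain.
rng = np.random.default_rng(2)
A = np.array([[0.9, 0.1], [0.2, 0.8]])
emis = [(0.8, 1.0), (0.1, 5.0)]               # (pi, lam) for low and high states
states, s = [], 0
for _ in range(300):
    states.append(s)
    s = rng.choice(2, p=A[s])
y = np.array([0 if rng.random() < emis[s][0] else rng.poisson(emis[s][1])
              for s in states])

ll_true = hmm_loglik(y, A, [0.5, 0.5], emis)
ll_flat = hmm_loglik(y, A, [0.5, 0.5], [(0.5, 2.0), (0.5, 2.0)])
```

The true two-state emission model scores a higher log-likelihood on its own data than a single undifferentiated ZIP, which is the kind of comparison the conditional predictive ordinate formalizes.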

  4. Stochastic Interest Model Based on Compound Poisson Process and Applications in Actuarial Science

    Directory of Open Access Journals (Sweden)

    Shilong Li

    2017-01-01

Considering the stochastic behavior of interest rates in the financial market, we construct a new class of interest models based on the compound Poisson process. In contrast to existing models, this paper describes the randomness of interest rates by modeling the force of interest with Poisson random jumps directly. To solve the problem of calculating the accumulated interest force function, an integral technique is employed, and a concept called the critical value is introduced to investigate the validity condition of this new model. We also discuss actuarial present values of several life annuities under this new interest model. Simulations are done to illustrate the theoretical results, and the effect of the interest-model parameters on actuarial present values is also analyzed.
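A Monte Carlo sketch of the idea: the force of interest is a base rate plus compound-Poisson jumps, and the present value of an n-year annuity-certain is the average of the resulting discount factors over paths. The base rate, jump intensity, and exponential jump-size law below are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

def annuity_pv(r, lam, jump_mean, n_years, n_paths=5000):
    """PV of 1 paid at t = 1..n_years when the force of interest is a base
    rate r plus permanent jumps of Exp(jump_mean) size at Poisson(lam) times."""
    times = np.arange(1, n_years + 1)
    pv = 0.0
    for _ in range(n_paths):
        n_jumps = rng.poisson(lam * n_years)
        t_jumps = rng.uniform(0, n_years, n_jumps)        # jump arrival times
        sizes = rng.exponential(jump_mean, n_jumps)       # jump magnitudes
        # integrated force of interest up to each payment time
        integ = r * times + (sizes[:, None] *
                             np.clip(times - t_jumps[:, None], 0, None)).sum(axis=0)
        pv += np.exp(-integ).sum()
    return pv / n_paths

pv_no_jumps = annuity_pv(0.03, 0.0, 0.02, 10)   # reduces to e^{-0.03 t} discounting
pv_jumps = annuity_pv(0.03, 0.5, 0.02, 10)      # positive jumps lower the PV
```

With the jump intensity set to zero the simulation reproduces deterministic discounting exactly, which is a useful sanity check on the integral of the jump process.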

  5. Influence diagnostics in meta-regression model.

    Science.gov (United States)

    Shi, Lei; Zuo, ShanShan; Yu, Dalei; Zhou, Xiaohua

    2017-09-01

    This paper studies the influence diagnostics in meta-regression model including case deletion diagnostic and local influence analysis. We derive the subset deletion formulae for the estimation of regression coefficient and heterogeneity variance and obtain the corresponding influence measures. The DerSimonian and Laird estimation and maximum likelihood estimation methods in meta-regression are considered, respectively, to derive the results. Internal and external residual and leverage measure are defined. The local influence analysis based on case-weights perturbation scheme, responses perturbation scheme, covariate perturbation scheme, and within-variance perturbation scheme are explored. We introduce a method by simultaneous perturbing responses, covariate, and within-variance to obtain the local influence measure, which has an advantage of capable to compare the influence magnitude of influential studies from different perturbations. An example is used to illustrate the proposed methodology. Copyright © 2017 John Wiley & Sons, Ltd.
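The case-deletion idea is easy to sketch for the DerSimonian-Laird flavor: estimate the between-study variance by the moment method, fit weighted least squares, and record how the coefficient moves when each study is dropped. This uses synthetic data and brute-force refits; the paper's analytic subset-deletion formulae avoid the refitting done here:

```python
import numpy as np

def dl_meta_regression(y, X, v):
    # Fixed-effect WLS fit, then DerSimonian-Laird moment estimator of tau^2.
    W = np.diag(1.0 / v)
    beta_fe = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    Q = ((y - X @ beta_fe) ** 2 / v).sum()
    P = W - W @ X @ np.linalg.solve(X.T @ W @ X, X.T @ W)
    tau2 = max(0.0, (Q - (len(y) - X.shape[1])) / np.trace(P))
    Wr = np.diag(1.0 / (v + tau2))                 # random-effects weights
    beta = np.linalg.solve(X.T @ Wr @ X, X.T @ Wr @ y)
    return beta, tau2

# Ten studies with one aberrant, high-leverage study at the end.
x = np.arange(10.0)
X = np.column_stack([np.ones(10), x])
v = np.full(10, 0.04)                              # within-study variances
rng = np.random.default_rng(4)
y = 1.0 + 0.5 * x + rng.normal(0, 0.2, 10)
y[9] += 3.0                                        # outlying study

beta_full, _ = dl_meta_regression(y, X, v)
# Case deletion: change in the slope when study i is removed.
influence = np.array([
    abs(dl_meta_regression(np.delete(y, i), np.delete(X, i, 0), np.delete(v, i))[0][1]
        - beta_full[1])
    for i in range(10)])
```

The outlying study dominates the influence measure, which is exactly what a deletion diagnostic is meant to surface before a local influence analysis refines the picture.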

  6. AIRLINE ACTIVITY FORECASTING BY REGRESSION MODELS

    Directory of Open Access Journals (Sweden)

    Н. Білак

    2012-04-01

Full Text Available Linear and nonlinear regression models are proposed that take into account the trend equation and seasonality indices in order to analyze and reconstruct the volume of passenger traffic over past periods and to predict it for future years, together with an algorithm for forming these models based on statistical analysis over the years. The resulting model is the first step toward the synthesis of more complex models, which will enable forecasting of airline passenger (and income) levels with greater accuracy and timeliness.

  7. Modeling oil production based on symbolic regression

    International Nuclear Information System (INIS)

    Yang, Guangfei; Li, Xianneng; Wang, Jianliang; Lian, Lian; Ma, Tieju

    2015-01-01

    Numerous models have been proposed to forecast the future trends of oil production and almost all of them are based on some predefined assumptions with various uncertainties. In this study, we propose a novel data-driven approach that uses symbolic regression to model oil production. We validate our approach on both synthetic and real data, and the results prove that symbolic regression could effectively identify the true models beneath the oil production data and also make reliable predictions. Symbolic regression indicates that world oil production will peak in 2021, which broadly agrees with other techniques used by researchers. Our results also show that the rate of decline after the peak is almost half the rate of increase before the peak, and it takes nearly 12 years to drop 4% from the peak. These predictions are more optimistic than those in several other reports, and the smoother decline will provide the world, especially the developing countries, with more time to orchestrate mitigation plans. -- Highlights: •A data-driven approach has been shown to be effective at modeling the oil production. •The Hubbert model could be discovered automatically from data. •The peak of world oil production is predicted to appear in 2021. •The decline rate after peak is half of the increase rate before peak. •Oil production projected to decline 4% post-peak
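Not the paper's symbolic-regression machinery (which searches over expression trees), but a quick sketch of the Hubbert-type bell curve it rediscovers, fitted here with scipy's curve_fit to synthetic production data; every number below is made up:

```python
import numpy as np
from scipy.optimize import curve_fit

def hubbert(t, peak_rate, t_peak, width):
    # Logistic-derivative (sech^2) production curve used in Hubbert-style models.
    return peak_rate / np.cosh((t - t_peak) / width) ** 2

rng = np.random.default_rng(5)
years = np.arange(1950, 2016, dtype=float)
true = hubbert(years, 30.0, 2021.0, 30.0)      # hypothetical production profile
obs = true + rng.normal(0, 0.3, years.size)    # noisy pre-peak observations

popt, _ = curve_fit(hubbert, years, obs, p0=(25.0, 2000.0, 20.0),
                    bounds=([0.0, 1900.0, 5.0], [100.0, 2100.0, 100.0]))
peak_rate_hat, t_peak_hat, width_hat = popt    # peak year recovered near 2021
```

The point of the paper is that symbolic regression finds this functional form automatically rather than assuming it, as done here.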

  8. Stochastic modeling for neural spiking events based on fractional superstatistical Poisson process

    Directory of Open Access Journals (Sweden)

    Hidetoshi Konno

    2018-01-01

Full Text Available In neural spike counting experiments, it is known that there are two main features: (i) the counting number has a fractional power-law growth with time and (ii) the waiting time (i.e., the inter-spike-interval) distribution has a heavy tail. The method of superstatistical Poisson processes (SSPPs) is examined to determine whether these main features are properly modeled. Although various mixed/compound Poisson processes are generated by selecting a suitable distribution of the birth-rate of spiking neurons, only the second feature (ii) can be modeled by the method of SSPPs. Namely, the first one (i), associated with the effect of long memory, cannot be modeled properly. Then, it is shown that the two main features can be modeled successfully by a class of fractional SSPP (FSSPP).

  9. Stochastic modeling for neural spiking events based on fractional superstatistical Poisson process

    Science.gov (United States)

    Konno, Hidetoshi; Tamura, Yoshiyasu

    2018-01-01

In neural spike counting experiments, it is known that there are two main features: (i) the counting number has a fractional power-law growth with time and (ii) the waiting time (i.e., the inter-spike-interval) distribution has a heavy tail. The method of superstatistical Poisson processes (SSPPs) is examined to determine whether these main features are properly modeled. Although various mixed/compound Poisson processes are generated by selecting a suitable distribution of the birth-rate of spiking neurons, only the second feature (ii) can be modeled by the method of SSPPs. Namely, the first one (i), associated with the effect of long memory, cannot be modeled properly. Then, it is shown that the two main features can be modeled successfully by a class of fractional SSPP (FSSPP).

  10. Beta-Poisson model for single-cell RNA-seq data analyses.

    Science.gov (United States)

    Vu, Trung Nghia; Wills, Quin F; Kalari, Krishna R; Niu, Nifang; Wang, Liewei; Rantalainen, Mattias; Pawitan, Yudi

    2016-07-15

Single-cell RNA-sequencing technology allows detection of gene expression at the single-cell level. One typical feature of the data is a bimodality in the cellular distribution even for highly expressed genes, primarily caused by a proportion of non-expressing cells. The standard and the over-dispersed gamma-Poisson models that are commonly used in bulk-cell RNA-sequencing are not able to capture this property. We introduce a beta-Poisson mixture model that can capture the bimodality of the single-cell gene expression distribution. We further integrate the model into the generalized linear model framework in order to perform differential expression analyses. The whole analytical procedure is called BPSC. The results from several real single-cell RNA-seq datasets indicate that ∼90% of the transcripts are well characterized by the beta-Poisson model; the model-fit from BPSC is better than the fit of the standard gamma-Poisson model in > 80% of the transcripts. Moreover, in differential expression analyses of simulated and real datasets, BPSC performs well against edgeR, a conventional method widely used in bulk-cell RNA-sequencing data, and against scde and MAST, two recent methods specifically designed for single-cell RNA-seq data. An R package BPSC for model fitting and differential expression analyses of single-cell RNA-seq data is available under the GPL-3 license at https://github.com/nghiavtr/BPSC. Contact: yudi.pawitan@ki.se or mattias.rantalainen@ki.se. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
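The core of the beta-Poisson idea fits in a few lines: draw the rate as a scaled Beta variable, then a Poisson count. A U-shaped Beta yields many exact zeros alongside a high-expression mode, something a single Poisson with the same mean cannot do. The parameter values are illustrative, not taken from BPSC:

```python
import numpy as np

rng = np.random.default_rng(6)
n, alpha, beta, scale = 20000, 0.5, 0.5, 20.0   # U-shaped Beta -> bimodal rates

rate = scale * rng.beta(alpha, beta, n)
x_bp = rng.poisson(rate)                        # beta-Poisson counts
x_pois = rng.poisson(rate.mean(), n)            # Poisson matched on the mean

zero_frac_bp = (x_bp == 0).mean()               # substantial zero mode
zero_frac_pois = (x_pois == 0).mean()           # ~ e^{-10}: essentially none
```

Both samples have mean near 10, yet only the beta-Poisson one reproduces the dropout-like zero mass that motivates the model.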

  11. On Partial Defaults in Portfolio Credit Risk : A Poisson Mixture Model Approach

    OpenAIRE

    Weißbach, Rafael; von Lieres und Wilkau, Carsten

    2005-01-01

    Most credit portfolio models exclusively calculate the loss distribution for a portfolio of performing counterparts. Conservative default definitions cause considerable insecurity about the loss for a long time after the default. We present three approaches to account for defaulted counterparts in the calculation of the economic capital. Two of the approaches are based on the Poisson mixture model CreditRisk+ and derive a loss distribution for an integrated portfolio. The third method treats ...

  12. Geographically weighted regression model on poverty indicator

    Science.gov (United States)

    Slamet, I.; Nugroho, N. F. T. A.; Muslich

    2017-12-01

In this research, we applied geographically weighted regression (GWR) to analyze poverty in Central Java. We consider the Gaussian kernel as the weighting function. GWR uses the diagonal matrix resulting from evaluating the Gaussian kernel function as the weight matrix in the regression model. The kernel weights are used to handle spatial effects in the data so that a model can be obtained for each location. The purpose of this paper is to model the poverty percentage data in Central Java province using GWR with a Gaussian kernel weighting function and to determine the influencing factors in each regency/city of Central Java province. Based on the research, we obtained a geographically weighted regression model with a Gaussian kernel weighting function for the poverty percentage data in Central Java province. We found that the percentage of the population working as farmers, the population growth rate, the percentage of households with regular sanitation, and the number of BPJS beneficiaries are the variables that affect the percentage of poverty in Central Java province. The coefficient of determination R2 is 68.64%. The regencies/cities fall into two categories, each influenced by a different set of significant factors.
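The mechanics reduce to one weighted least-squares fit per location, with Gaussian-kernel weights shrinking the contribution of distant observations. A sketch on synthetic data whose slope drifts across space (the bandwidth and coefficients are made up, not the Central Java values):

```python
import numpy as np

def gwr_fit(u, coords, X, y, bandwidth):
    # Weighted least squares at focal point u with Gaussian kernel weights.
    d = np.linalg.norm(coords - u, axis=1)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)     # solves X'WX b = X'Wy

rng = np.random.default_rng(7)
coords = rng.uniform(0, 1, (400, 2))
x = rng.normal(0, 1, 400)
slope = 1.0 + 2.0 * coords[:, 0]                   # coefficient varies west -> east
y = 0.5 + slope * x + rng.normal(0, 0.1, 400)
X = np.column_stack([np.ones(400), x])

b_west = gwr_fit(np.array([0.1, 0.5]), coords, X, y, 0.15)
b_east = gwr_fit(np.array([0.9, 0.5]), coords, X, y, 0.15)
# local slopes track the spatial gradient: roughly 1.2 west, 2.8 east
```

A global regression would report a single averaged slope, hiding exactly the location-specific factor pattern the study is after.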

  13. A Poisson-Fault Model for Testing Power Transformers in Service

    Directory of Open Access Journals (Sweden)

    Dengfu Zhao

    2014-01-01

Full Text Available This paper presents a method for assessing the instant failure rate of a power transformer under different working conditions. The method can be applied to a dataset from a power transformer under periodic inspections and maintenance. We use a Poisson-fault model to describe failures of a power transformer. When investigating a Bayes estimate of the instant failure rate under the model, we find that the complexity of a classical method and of Monte Carlo simulation is unacceptable. By establishing a new filtered estimate of Poisson process observations, we propose a fast algorithm for the Bayes estimate of the instant failure rate. The proposed algorithm is tested on simulated datasets of a power transformer. For these datasets, the proposed estimators of the model parameters perform better than other estimators. The simulation results reveal that the suggested algorithm is the quickest among the three candidates.
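The conjugate core of such a Bayes estimate is compact: if failures in inspection windows are Poisson with rate λ per year of exposure and λ carries a Gamma(a, b) prior, the posterior is again Gamma and the instant-failure-rate estimate is its mean. This is a simplified sketch with a hypothetical prior and simulated data, not the paper's filtered estimator:

```python
import numpy as np

rng = np.random.default_rng(8)
true_rate = 0.3                               # failures per transformer-year
exposure = rng.uniform(0.5, 2.0, 500)         # years covered by each inspection
failures = rng.poisson(true_rate * exposure)  # failures found per inspection

a, b = 1.0, 1.0                               # Gamma(a, b) prior on the rate
a_post = a + failures.sum()                   # conjugate update: add counts
b_post = b + exposure.sum()                   #                   add exposure
rate_hat = a_post / b_post                    # posterior mean of the rate
```

The conjugacy is what makes a fast recursive (filtered) update possible as new inspection windows arrive.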

  14. Pareto genealogies arising from a Poisson branching evolution model with selection.

    Science.gov (United States)

    Huillet, Thierry E

    2014-02-01

We study a class of coalescents derived from a sampling procedure out of N i.i.d. Pareto(α) random variables, normalized by their sum, including β-size-biasing on total length effects (β < α). Depending on the range of α, the large-N limit leads either to a discrete-time Poisson-Dirichlet (α, −β) Ξ-coalescent (α ∈ [0, 1)), or to a family of continuous-time Beta(2 − α, α − β) Λ-coalescents (α ∈ [1, 2)), or to the Kingman coalescent (α ≥ 2). We indicate that this class of coalescent processes (and their scaling limits) may be viewed as the genealogical processes of some forward-in-time evolving branching population models including selection effects. In such constant-size population models, the reproduction step, which is based on a fitness-dependent Poisson point process with scaling power-law(α) intensity, is coupled to a selection step consisting of sorting out the N fittest individuals issued from the reproduction step.

  15. Adaptive regression for modeling nonlinear relationships

    CERN Document Server

    Knafl, George J

    2016-01-01

    This book presents methods for investigating whether relationships are linear or nonlinear and for adaptively fitting appropriate models when they are nonlinear. Data analysts will learn how to incorporate nonlinearity in one or more predictor variables into regression models for different types of outcome variables. Such nonlinear dependence is often not considered in applied research, yet nonlinear relationships are common and so need to be addressed. A standard linear analysis can produce misleading conclusions, while a nonlinear analysis can provide novel insights into data, not otherwise possible. A variety of examples of the benefits of modeling nonlinear relationships are presented throughout the book. Methods are covered using what are called fractional polynomials based on real-valued power transformations of primary predictor variables combined with model selection based on likelihood cross-validation. The book covers how to formulate and conduct such adaptive fractional polynomial modeling in the s...

  16. Bayesian Inference of a Multivariate Regression Model

    Directory of Open Access Journals (Sweden)

    Marick S. Sinay

    2014-01-01

We explore Bayesian inference of a multivariate linear regression model using a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior: a multivariate normal distribution for the regression coefficients and an inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such a structure allows for a richer class of prior distributions for the covariance, with respect to the strength of beliefs in prior location hyperparameters, as well as the added ability to model potential correlation among the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.

  17. General regression and representation model for classification.

    Directory of Open Access Journals (Sweden)

    Jianjun Qian

Full Text Available Recently, regularized coding-based classification methods (e.g., SRC and CRC) have shown great potential for pattern classification. However, most existing coding methods assume that the representation residuals are uncorrelated. In real-world applications, this assumption does not hold. In this paper, we take account of the correlations of the representation residuals and develop a general regression and representation model (GRR) for classification. GRR not only has the advantages of CRC, but also makes full use of the prior information (e.g., the correlations between representation residuals and representation coefficients) and the specific information (the weight matrix of image pixels) to enhance the classification performance. GRR uses generalized Tikhonov regularization and K nearest neighbors to learn the prior information from the training data. Meanwhile, the specific information is obtained by using an iterative algorithm to update the feature (or image pixel) weights of the test sample. With the proposed model as a platform, we design two classifiers: the basic general regression and representation classifier (B-GRR) and the robust general regression and representation classifier (R-GRR). The experimental results demonstrate the performance advantages of the proposed methods over state-of-the-art algorithms.

  18. Existence theory for a Poisson-Nernst-Planck model of electrophoresis

    OpenAIRE

    Bedin, Luciano; Thompson, Mark

    2011-01-01

A system modeling the electrophoretic motion of a charged rigid macromolecule immersed in an incompressible ionized fluid is considered. The ionic concentration is governed by the Nernst-Planck equation, coupled with the Poisson equation for the electrostatic potential and with the Navier-Stokes and Newtonian equations for the fluid and the macromolecule dynamics, respectively. A local-in-time existence result for suitable weak solutions is established, following the approach of Desjardins and Esteban [Co...

  19. Studies on a Double Poisson-Geometric Insurance Risk Model with Interference

    Directory of Open Access Journals (Sweden)

    Yujuan Huang

    2013-01-01

Full Text Available This paper mainly studies a generalized double Poisson-Geometric insurance risk model. By a martingale and stopping-time approach, we obtain the adjustment-coefficient equation, the Lundberg inequality, and the formula for the ruin probability. The Laplace transform of the first time the surplus reaches a given level is also discussed, and its expectation and variance are obtained. Finally, we give numerical examples.
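A Monte Carlo check of a stripped-down version of the model (classical compound-Poisson claim arrivals with geometric claim sizes, no diffusion interference, hypothetical parameters) shows the expected behavior of the ruin probability as the initial surplus grows:

```python
import numpy as np

rng = np.random.default_rng(9)

def ruin_prob(u, c=3.0, lam=1.0, p=0.5, horizon=50.0, n_paths=4000):
    """Estimate P(ruin before `horizon`) for U(t) = u + c*t - total claims,
    with Poisson(lam) claim arrivals and Geometric(p) claim sizes (mean 1/p)."""
    ruined = 0
    for _ in range(n_paths):
        t, claims = 0.0, 0.0
        while True:
            t += rng.exponential(1.0 / lam)       # exponential inter-claim times
            if t > horizon:
                break
            claims += rng.geometric(p)            # next claim size
            if u + c * t - claims < 0:            # ruin can only occur at claims
                ruined += 1
                break
    return ruined / n_paths

r0 = ruin_prob(0.0)      # no initial surplus: ruin is likely
r20 = ruin_prob(20.0)    # comfortable surplus: ruin is rare
```

The sharp decay with u mirrors the exponential Lundberg bound exp(-R·u) that the adjustment-coefficient equation delivers analytically.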

  20. Quasistationary model of high-current relativistic electron beam. 1. Exact solution of Poisson equations

    International Nuclear Information System (INIS)

    Brenner, S.E.; Gandyl', E.M.; Podkopaev, A.P.

    1995-01-01

The dynamics of a high-current relativistic electron beam moving through a cylindrical drift space has been modelled with large particles whose shape allows the Poisson equations to be solved exactly, thereby avoiding the linearization usually used in such problems. Expressions for the components of the self-field of an electron beam passing through a cylindrical drift space have been obtained. (author). 11 refs., 1 fig

  1. Confidence bands for inverse regression models

    International Nuclear Information System (INIS)

    Birke, Melanie; Bissantz, Nicolai; Holzmann, Hajo

    2010-01-01

    We construct uniform confidence bands for the regression function in inverse, homoscedastic regression models with convolution-type operators. Here, the convolution is between two non-periodic functions on the whole real line rather than between two periodic functions on a compact interval, since the former situation arguably arises more often in applications. First, following Bickel and Rosenblatt (1973 Ann. Stat. 1 1071–95) we construct asymptotic confidence bands which are based on strong approximations and on a limit theorem for the supremum of a stationary Gaussian process. Further, we propose bootstrap confidence bands based on the residual bootstrap and prove consistency of the bootstrap procedure. A simulation study shows that the bootstrap confidence bands perform reasonably well for moderate sample sizes. Finally, we apply our method to data from a gel electrophoresis experiment with genetically engineered neuronal receptor subunits incubated with rat brain extract

  2. Modeling Tetanus Neonatorum case using the regression of negative binomial and zero-inflated negative binomial

    Science.gov (United States)

    Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni

    2017-12-01

Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province has been the highest in Indonesia up to 2015. The Tetanus Neonatorum data exhibit overdispersion and a large proportion of zeros. Negative Binomial (NB) regression is an alternative method when overdispersion occurs in Poisson regression. However, data exhibiting both overdispersion and zero-inflation are more appropriately analyzed with Zero-Inflated Negative Binomial (ZINB) regression. The purposes of this study are: (1) to model Tetanus Neonatorum cases in East Java Province, with a 71.05 percent proportion of zeros, using NB and ZINB regression, and (2) to obtain the best model. The results indicate that ZINB is better than NB regression, with a smaller AIC.

  3. Modeling Repeated Count Data : Some Extensions of the Rasch Poisson Counts Model

    NARCIS (Netherlands)

    van Duijn, M.A.J.; Jansen, Margo

    1995-01-01

We consider data that can be summarized as an N × K table of counts; for example, test data obtained by administering K tests to N subjects. The cell entries y(ij) are assumed to be conditionally independent Poisson-distributed random variables, given the NK Poisson intensity parameters mu(ij). The

  4. Multitask Quantile Regression under the Transnormal Model.

    Science.gov (United States)

    Fan, Jianqing; Xue, Lingzhou; Zou, Hui

    2016-01-01

We consider estimating multi-task quantile regression under the transnormal model, with a focus on the high-dimensional setting. We derive a surprisingly simple closed-form solution through rank-based covariance regularization. In particular, we propose rank-based ℓ1 penalization with positive-definite constraints for estimating sparse covariance matrices, and rank-based banded Cholesky decomposition regularization for estimating banded precision matrices. By taking advantage of the alternating direction method of multipliers, a nearest correlation matrix projection is introduced that inherits the sampling properties of the unprojected estimator. Our work combines the strengths of quantile regression and rank-based covariance regularization to simultaneously deal with nonlinearity and nonnormality in high-dimensional regression. Furthermore, the proposed method strikes a good balance between robustness and efficiency, achieves the "oracle"-like convergence rate, and provides a provable prediction interval in the high-dimensional setting. The finite-sample performance of the proposed method is also examined. The performance of our proposed rank-based method is demonstrated in a real application to analyze protein mass spectroscopy data.

  5. Poisson versus threshold models for genetic analysis of clinical mastitis in US Holsteins.

    Science.gov (United States)

    Vazquez, A I; Weigel, K A; Gianola, D; Bates, D M; Perez-Cabal, M A; Rosa, G J M; Chang, Y M

    2009-10-01

Typically, clinical mastitis is coded as the presence or absence of disease in a given lactation, and records are analyzed with either linear models or binary threshold models. Because the presence of mastitis may include cows with multiple episodes, there is a loss of information when counts are treated as binary responses. Poisson models are appropriate for random variables measured as the number of events, and although these models are used extensively in studying the epidemiology of mastitis, they have rarely been used for studying the genetic aspects of mastitis. Ordinal threshold models are pertinent for ordered categorical responses; although one can hypothesize that the number of clinical mastitis episodes per animal reflects a continuous underlying increase in mastitis susceptibility, these models have rarely been used in genetic analysis of mastitis. The objective of this study was to compare probit, Poisson, and ordinal threshold models for the genetic evaluation of US Holstein sires for clinical mastitis. Mastitis was measured as a binary trait or as the number of mastitis cases. Data from 44,908 first-parity cows recorded in on-farm herd management software were gathered, edited, and processed for the present study. The cows were daughters of 1,861 sires, distributed over 94 herds. Predictive ability was assessed via a 5-fold cross-validation using 2 loss functions: mean squared error of prediction (MSEP) as the end point and a cost difference function. The heritability estimates were 0.061 for mastitis measured as a binary trait in the probit model and 0.085 and 0.132 for the number of mastitis cases in the ordinal threshold and Poisson models, respectively; because of scale differences, only the probit and ordinal threshold models are directly comparable. Among healthy animals, MSEP was smallest for the probit model, and the cost function was smallest for the ordinal threshold model. Among diseased animals, MSEP and the cost function were smallest
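The count-versus-binary point above lends itself to a small numerical sketch. The following is not the paper's sire evaluation model; it is a minimal log-link Poisson GLM fit by iteratively reweighted least squares (IRLS) on synthetic data, with an assumed covariate, sample size, and coefficients, to show how count responses are modeled directly rather than collapsed to presence/absence.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=n)                       # assumed single covariate
true_beta = np.array([0.2, 0.5])             # assumed intercept and slope
y = rng.poisson(np.exp(true_beta[0] + true_beta[1] * x))   # count response

X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):                          # IRLS for the log-link Poisson GLM
    eta = X @ beta
    mu = np.exp(eta)
    z = eta + (y - mu) / mu                  # working response
    W = mu                                   # Poisson variance function
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))

print(np.round(beta, 2))                     # close to the generating [0.2, 0.5]
```

With enough data the IRLS estimates recover the generating coefficients, which is exactly the episode-count information discarded when the same data are binarized.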

  6. Bayesian Estimation Of Shift Point In Poisson Model Under Asymmetric Loss Functions

    Directory of Open Access Journals (Sweden)

Srivastava, Uma

    2012-01-01

Full Text Available The paper deals with estimating the shift point which occurs in a sequence of independent observations from a Poisson model in statistical process control, i.e. the point at which m life data have been observed. The Bayes estimators of the shift point 'm' and of the before- and after-shift process means are derived for symmetric and asymmetric loss functions under informative and non-informative priors. A sensitivity analysis of the Bayes estimators is carried out by simulation and numerical comparisons with R programming. The results show the effectiveness of the shift in the sequence of the Poisson distribution.
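A minimal version of the shift-point posterior can be written in closed form when conjugate Gamma priors are placed on the before- and after-shift Poisson means and integrated out. This is a hedged sketch, not the paper's estimator (which covers several loss functions and priors); the rates, prior hyperparameters, and sequence length below are assumed.

```python
import math
import numpy as np

rng = np.random.default_rng(1)
n, true_m = 60, 25
y = np.concatenate([rng.poisson(2.0, true_m), rng.poisson(6.0, n - true_m)])

a, b = 1.0, 1.0    # assumed Gamma(a, b) prior on each Poisson mean

def log_marg(s, k):
    # log of the Gamma-Poisson marginal likelihood for k observations summing
    # to s, with the y!-terms dropped (they cancel across candidate shifts)
    return math.lgamma(a + s) - (a + s) * math.log(b + k)

logpost = np.array([log_marg(y[:m].sum(), m) + log_marg(y[m:].sum(), n - m)
                    for m in range(1, n)])
post = np.exp(logpost - logpost.max())
post /= post.sum()
m_hat = 1 + int(np.argmax(post))   # MAP estimate of the shift point
print(m_hat)
```

The full discrete posterior `post` also supports the Bayes estimators under different loss functions (e.g. posterior mean for squared-error loss).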

  7. Receiver design for SPAD-based VLC systems under Poisson-Gaussian mixed noise model.

    Science.gov (United States)

    Mao, Tianqi; Wang, Zhaocheng; Wang, Qi

    2017-01-23

Single-photon avalanche diode (SPAD) is a promising photosensor because of its high sensitivity to optical signals in weak-illuminance environments. Recently, it has drawn much attention from researchers in visible light communications (VLC). However, existing literature only deals with a simplified channel model, which considers the effects of Poisson noise introduced by the SPAD but neglects other noise sources. Specifically, when an analog SPAD detector is applied, there exists Gaussian thermal noise generated by the transimpedance amplifier (TIA) and the digital-to-analog converter (D/A). Therefore, in this paper, we propose an SPAD-based VLC system with pulse-amplitude modulation (PAM) under a Poisson-Gaussian mixed noise model, where Gaussian-distributed thermal noise at the receiver is also investigated. The closed-form conditional likelihood of received signals is derived using the Laplace transform and the saddle-point approximation method, and the corresponding quasi-maximum-likelihood (quasi-ML) detector is proposed. Furthermore, the Poisson-Gaussian-distributed signals are converted to Gaussian variables with the aid of the generalized Anscombe transform (GAT), leading to an equivalent additive white Gaussian noise (AWGN) channel, and a hard-decision-based detector is invoked. Simulation results demonstrate that the proposed GAT-based detector can reduce the computational complexity with marginal performance loss compared with the proposed quasi-ML detector, and both detectors are capable of accurately demodulating the SPAD-based PAM signals.
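The GAT step of the hard-decision detector can be illustrated numerically. This sketch is not the paper's receiver; it only checks the variance-stabilizing property of the generalized Anscombe transform on Poisson-Gaussian mixed samples, with assumed photon-count levels and an assumed thermal-noise standard deviation.

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.5                        # assumed thermal-noise standard deviation
variances = {}
for lam in (10.0, 40.0, 160.0):    # assumed mean photon counts
    x = rng.poisson(lam, 200_000) + rng.normal(0.0, sigma, 200_000)
    z = 2.0 * np.sqrt(np.maximum(x + 0.375 + sigma**2, 0.0))   # GAT
    variances[lam] = z.var()
print({k: round(v, 2) for k, v in variances.items()})   # all near 1
```

After the transform the noise variance is approximately 1 regardless of signal level, which is what makes the equivalent-AWGN treatment and a simple hard-decision detector workable.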

  8. Crime Modeling using Spatial Regression Approach

    Science.gov (United States)

    Saleh Ahmar, Ansari; Adiatma; Kasim Aidid, M.

    2018-01-01

Acts of criminality in Indonesia increase in both variety and quantity every year: murder, rape, assault, vandalism, theft, fraud, fencing, and other cases make people feel unsafe. The risk of society being exposed to crime is measured by the number of cases reported to the police; the more reports made to the police, the higher the level of crime in the region. In this research, criminality in South Sulawesi, Indonesia is modeled with society's exposure to crime risk as the dependent variable. Modeling follows an areal approach using the Spatial Autoregressive (SAR) and Spatial Error Model (SEM) methods. The independent variables are population density, the number of poor inhabitants, GDP per capita, unemployment, and the human development index (HDI). The spatial regression analysis shows that there are no spatial dependencies, in either lag or error, in South Sulawesi.

  9. Non-chiral, molecular model of negative Poisson ratio in two dimensions

    International Nuclear Information System (INIS)

    Wojciechowski, K W

    2003-01-01

A two-dimensional model of tri-atomic molecules (in which 'atoms' are distributed on vertices of equilateral triangles, and which are further referred to as cyclic trimers) is solved exactly in the static (zero-temperature) limit for the nearest-neighbour site-site interactions. It is shown that the cyclic trimers form a mechanically stable and elastically isotropic non-chiral phase of negative Poisson ratio. The properties of the system are illustrated by three examples of atom-atom interaction potentials: (i) the purely repulsive (n-inverse-power) potential, (ii) the purely attractive (n-power) potential and (iii) the Lennard-Jones potential, which has both a repulsive and an attractive part. The analytic form of the dependence of the Poisson ratio on the interatomic potential is obtained. It is shown that the Poisson ratio depends, in a universal way, only on the trimer anisotropy parameter, both (1) in the limit n → ∞ for cases (i) and (ii), and (2) at zero external pressure for any potential with a doubly differentiable minimum; case (iii) is an example of the latter.

  10. Poisson distribution

    NARCIS (Netherlands)

    Hallin, M.; Piegorsch, W.; El Shaarawi, A.

    2012-01-01

The random variable X taking values 0, 1, 2, …, x, … with probabilities pλ(x) = e^(−λ)λ^x/x!, where λ ∈ R0+, is called a Poisson variable, and its distribution a Poisson distribution, with parameter λ. The Poisson distribution with parameter λ can be obtained as the limit, as n → ∞ and p → 0 in such a way that np → λ, of the binomial distribution Bin(n, p).
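The limit statement in this entry is the classical one: Binomial(n, p) with np held at λ converges to Poisson(λ) as n → ∞ and p → 0. A quick numerical check at a single point of the pmf:

```python
import math

# Binomial(n, p) with np = lam fixed converges to Poisson(lam) as n grows.
# Check the pmf at a single point x = 4 for lam = 3.
lam, x = 3.0, 4
poisson_pmf = math.exp(-lam) * lam**x / math.factorial(x)
errs = []
for n in (10, 100, 10_000):
    p = lam / n
    binom_pmf = math.comb(n, x) * p**x * (1 - p)**(n - x)
    errs.append(abs(binom_pmf - poisson_pmf))
print([round(e, 6) for e in errs])   # gap shrinks toward zero
```

The approximation error decays roughly like 1/n, which is why the Poisson distribution is the standard model for counts of rare events among many trials.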

  11. Fractional Poisson-Nernst-Planck Model for Ion Channels I: Basic Formulations and Algorithms.

    Science.gov (United States)

    Chen, Duan

    2017-11-01

    In this work, we propose a fractional Poisson-Nernst-Planck model to describe ion permeation in gated ion channels. Due to the intrinsic conformational changes, crowdedness in narrow channel pores, binding and trapping introduced by functioning units of channel proteins, ionic transport in the channel exhibits a power-law-like anomalous diffusion dynamics. We start from continuous-time random walk model for a single ion and use a long-tailed density distribution function for the particle jump waiting time, to derive the fractional Fokker-Planck equation. Then, it is generalized to the macroscopic fractional Poisson-Nernst-Planck model for ionic concentrations. Necessary computational algorithms are designed to implement numerical simulations for the proposed model, and the dynamics of gating current is investigated. Numerical simulations show that the fractional PNP model provides a more qualitatively reasonable match to the profile of gating currents from experimental observations. Meanwhile, the proposed model motivates new challenges in terms of mathematical modeling and computations.

  12. A Hierarchical Poisson Log-Normal Model for Network Inference from RNA Sequencing Data

    Science.gov (United States)

    Gallopin, Mélina; Rau, Andrea; Jaffrézic, Florence

    2013-01-01

    Gene network inference from transcriptomic data is an important methodological challenge and a key aspect of systems biology. Although several methods have been proposed to infer networks from microarray data, there is a need for inference methods able to model RNA-seq data, which are count-based and highly variable. In this work we propose a hierarchical Poisson log-normal model with a Lasso penalty to infer gene networks from RNA-seq data; this model has the advantage of directly modelling discrete data and accounting for inter-sample variance larger than the sample mean. Using real microRNA-seq data from breast cancer tumors and simulations, we compare this method to a regularized Gaussian graphical model on log-transformed data, and a Poisson log-linear graphical model with a Lasso penalty on power-transformed data. For data simulated with large inter-sample dispersion, the proposed model performs better than the other methods in terms of sensitivity, specificity and area under the ROC curve. These results show the necessity of methods specifically designed for gene network inference from RNA-seq data. PMID:24147011

  13. AN APPLICATION OF FUNCTIONAL MULTIVARIATE REGRESSION MODEL TO MULTICLASS CLASSIFICATION

    OpenAIRE

    Krzyśko, Mirosław; Smaga, Łukasz

    2017-01-01

    In this paper, the scale response functional multivariate regression model is considered. By using the basis functions representation of functional predictors and regression coefficients, this model is rewritten as a multivariate regression model. This representation of the functional multivariate regression model is used for multiclass classification for multivariate functional data. Computational experiments performed on real labelled data sets demonstrate the effectiveness of the proposed ...

  14. Entrepreneurial intention modeling using hierarchical multiple regression

    Directory of Open Access Journals (Sweden)

    Marina Jeger

    2014-12-01

    Full Text Available The goal of this study is to identify the contribution of effectuation dimensions to the predictive power of the entrepreneurial intention model over and above that which can be accounted for by other predictors selected and confirmed in previous studies. As is often the case in social and behavioral studies, some variables are likely to be highly correlated with each other. Therefore, the relative amount of variance in the criterion variable explained by each of the predictors depends on several factors such as the order of variable entry and sample specifics. The results show the modest predictive power of two dimensions of effectuation prior to the introduction of the theory of planned behavior elements. The article highlights the main advantages of applying hierarchical regression in social sciences as well as in the specific context of entrepreneurial intention formation, and addresses some of the potential pitfalls that this type of analysis entails.

  15. An Additive-Multiplicative Cox-Aalen Regression Model

    DEFF Research Database (Denmark)

    Scheike, Thomas H.; Zhang, Mei-Jie

    2002-01-01

Aalen model; additive risk model; counting processes; Cox regression; survival analysis; time-varying effects

  16. SU-E-T-144: Bayesian Inference of Local Relapse Data Using a Poisson-Based Tumour Control Probability Model

    Energy Technology Data Exchange (ETDEWEB)

    La Russa, D [The Ottawa Hospital Cancer Centre, Ottawa, ON (Canada)

    2015-06-15

Purpose: The purpose of this project is to develop a robust method of parameter estimation for a Poisson-based TCP model using Bayesian inference. Methods: Bayesian inference was performed using the PyMC3 probabilistic programming framework written in Python. A Poisson-based TCP regression model that accounts for clonogen proliferation was fit to observed rates of local relapse as a function of equivalent dose in 2 Gy fractions for a population of 623 stage-I non-small-cell lung cancer patients. The Slice Markov Chain Monte Carlo sampling algorithm was used to sample the posterior distributions, and was initiated using the maximum of the posterior distributions found by optimization. The calculation of TCP with each sample step required integration over the free parameter α, which was performed using an adaptive 24-point Gauss-Legendre quadrature. Convergence was verified via inspection of the trace plot and posterior distribution for each of the fit parameters, as well as with comparisons of the most probable parameter values with their respective maximum likelihood estimates. Results: Posterior distributions for α, the standard deviation of α (σ), the average tumour cell-doubling time (Td), and the repopulation delay time (Tk), were generated assuming α/β = 10 Gy, and a fixed clonogen density of 10^7 cm^−3. Posterior predictive plots generated from samples from these posterior distributions are in excellent agreement with the observed rates of local relapse used in the Bayesian inference. The most probable values of the model parameters also agree well with maximum likelihood estimates. Conclusion: A robust method of performing Bayesian inference of TCP data using a complex TCP model has been established.
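A stripped-down version of the Poisson TCP dose-response (without the proliferation terms and the population averaging over α described above) can be written in a few lines; the α value and clonogen number below are illustrative assumptions, not the fitted posteriors.

```python
import math

ALPHA = 0.3    # Gy^-1, assumed radiosensitivity
N0 = 1e7       # assumed total clonogen number

def tcp(dose_gy):
    # Poisson TCP: probability that zero clonogens survive a total dose D,
    # with exponential cell kill exp(-ALPHA * D)
    return math.exp(-N0 * math.exp(-ALPHA * dose_gy))

for d in (40, 60, 80):
    print(d, round(tcp(d), 3))   # rises steeply from ~0 toward 1 with dose
```

The characteristic sigmoid shape (near 0 control at low dose, saturating near 1 at high dose) is what the full model regresses against observed relapse rates.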

  17. Poisson processes

    NARCIS (Netherlands)

    Boxma, O.J.; Yechiali, U.; Ruggeri, F.; Kenett, R.S.; Faltin, F.W.

    2007-01-01

    The Poisson process is a stochastic counting process that arises naturally in a large variety of daily life situations. We present a few definitions of the Poisson process and discuss several properties as well as relations to some well-known probability distributions. We further briefly discuss the

  18. Information transfer with rate-modulated Poisson processes: a simple model for nonstationary stochastic resonance.

    Science.gov (United States)

    Goychuk, I

    2001-08-01

    Stochastic resonance in a simple model of information transfer is studied for sensory neurons and ensembles of ion channels. An exact expression for the information gain is obtained for the Poisson process with the signal-modulated spiking rate. This result allows one to generalize the conventional stochastic resonance (SR) problem (with periodic input signal) to the arbitrary signals of finite duration (nonstationary SR). Moreover, in the case of a periodic signal, the rate of information gain is compared with the conventional signal-to-noise ratio. The paper establishes the general nonequivalence between both measures notwithstanding their apparent similarity in the limit of weak signals.
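A rate-modulated (inhomogeneous) Poisson spike train of the kind used in this model can be simulated by thinning (the Lewis-Shedler method): candidate events are drawn at the maximum rate and accepted with probability rate(t)/RMAX. The rate parameters below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
R0, R1, F = 20.0, 10.0, 2.0    # assumed base rate, modulation depth, signal freq.
T = 200.0                      # total simulated time
RMAX = R0 + R1                 # upper bound on the instantaneous rate

def rate(t):
    # spiking rate modulated by a periodic input signal
    return R0 + R1 * np.sin(2 * np.pi * F * t)

t, spikes = 0.0, []
while True:
    t += rng.exponential(1.0 / RMAX)       # candidate event at the maximum rate
    if t > T:
        break
    if rng.uniform() < rate(t) / RMAX:     # accept with probability rate(t)/RMAX
        spikes.append(t)

mean_rate = len(spikes) / T
print(round(mean_rate, 1))                 # ≈ R0, since the modulation averages out
```

Thinning is exact for any bounded rate function, so the same sketch accepts arbitrary signals of finite duration, which is the nonstationary setting the abstract generalizes to.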

  19. Variable selection and model choice in geoadditive regression models.

    Science.gov (United States)

    Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard

    2009-06-01

    Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.

  20. Classifying next-generation sequencing data using a zero-inflated Poisson model.

    Science.gov (United States)

    Zhou, Yan; Wan, Xiang; Zhang, Baoxue; Tong, Tiejun

    2018-04-15

With the development of high-throughput techniques, RNA-sequencing (RNA-seq) is becoming increasingly popular as an alternative for gene expression analysis, such as RNA profiling and classification. Identifying which type of disease a new patient has from RNA-seq data has been recognized as a vital problem in medical research. As RNA-seq data are discrete, statistical methods developed for classifying microarray data cannot be readily applied to RNA-seq data classification. In 2011, Witten proposed Poisson linear discriminant analysis (PLDA) to classify RNA-seq data. Note, however, that count datasets are frequently characterized by excess zeros in real RNA-seq or microRNA sequencing data (i.e., when the sequencing depth is insufficient, or for small RNAs 18-30 nucleotides in length). Therefore, it is desirable to develop a new model to analyze RNA-seq data with an excess of zeros. In this paper, we propose a Zero-Inflated Poisson Logistic Discriminant Analysis (ZIPLDA) for RNA-seq data with an excess of zeros. The new method assumes that the data are from a mixture of two distributions: one is a point mass at zero, and the other follows a Poisson distribution. We then consider a logistic relation between the probability of observing zeros and the mean of the genes and the sequencing depth in the model. Simulation studies show that the proposed method performs better than, or at least as well as, the existing methods in a wide range of settings. Two real datasets, a breast cancer RNA-seq dataset and a microRNA-seq dataset, are also analyzed, and they coincide with the simulation results that our proposed method outperforms the existing competitors. The software is available at http://www.math.hkbu.edu.hk/∼tongt. xwan@comp.hkbu.edu.hk or tongt@hkbu.edu.hk. Supplementary data are available at Bioinformatics online.
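The zero-inflation structure assumed by the method, a point mass at zero mixed with a Poisson component, is easy to simulate. This shows the data model only, not the ZIPLDA classifier; the mixture parameters are illustrative assumptions.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
pi, lam, n = 0.3, 2.5, 100_000   # assumed mixture weight, Poisson mean, sample size
# with probability pi a count is a structural zero; otherwise it is Poisson(lam)
is_structural_zero = rng.uniform(size=n) < pi
counts = np.where(is_structural_zero, 0, rng.poisson(lam, n))

obs_zero = (counts == 0).mean()
zip_zero = pi + (1 - pi) * math.exp(-lam)   # model-implied zero probability
print(round(obs_zero, 3), round(zip_zero, 3))
```

The observed zero fraction matches the mixture prediction π + (1 − π)e^(−λ) and far exceeds the e^(−λ) a plain Poisson model would allow, which is the mismatch ZIPLDA is designed to handle.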

1. Estimating the Infant Mortality Rate Using a Two-Level Hierarchical Bayes Poisson Model

    Directory of Open Access Journals (Sweden)

    Nusar Hajarisman

    2013-06-01

Full Text Available Official national data providers such as BPS-Statistics Indonesia are required to produce and present statistical information as part of their contribution to regional development policy and planning. The estimation techniques available for the surveys conducted by BPS are still limited, because the resulting direct estimators cannot be assumed reliable for small areas. In this article we propose hierarchical Bayesian models, especially for count data that are Poisson distributed, for the small area estimation problem. The model was developed by combining concepts from the generalized linear model and the Fay-Herriot model. The developed model is applied to estimate the infant mortality rate in Bojonegoro district, East Java Province.

  2. Estimating effectiveness in HIV prevention trials with a Bayesian hierarchical compound Poisson frailty model

    Science.gov (United States)

Coley, Rebecca Yates; Brown, Elizabeth R.

    2016-01-01

    Inconsistent results in recent HIV prevention trials of pre-exposure prophylactic interventions may be due to heterogeneity in risk among study participants. Intervention effectiveness is most commonly estimated with the Cox model, which compares event times between populations. When heterogeneity is present, this population-level measure underestimates intervention effectiveness for individuals who are at risk. We propose a likelihood-based Bayesian hierarchical model that estimates the individual-level effectiveness of candidate interventions by accounting for heterogeneity in risk with a compound Poisson-distributed frailty term. This model reflects the mechanisms of HIV risk and allows that some participants are not exposed to HIV and, therefore, have no risk of seroconversion during the study. We assess model performance via simulation and apply the model to data from an HIV prevention trial. PMID:26869051

  3. Hierarchical regression analysis in structural Equation Modeling

    NARCIS (Netherlands)

    de Jong, P.F.

    1999-01-01

    In a hierarchical or fixed-order regression analysis, the independent variables are entered into the regression equation in a prespecified order. Such an analysis is often performed when the extra amount of variance accounted for in a dependent variable by a specific independent variable is the main

  4. Non-Poisson counting statistics of a hybrid G-M counter dead time model

    International Nuclear Information System (INIS)

    Lee, Sang Hoon; Jae, Moosung; Gardner, Robin P.

    2007-01-01

The counting statistics of a G-M counter with a considerable dead time event rate deviate from Poisson statistics. Important characteristics such as observed counting rates as a function of true counting rates, variances, and interval distributions were analyzed for three dead time models (non-paralyzable, paralyzable, and hybrid) with the help of GMSIM, a Monte Carlo dead time effect simulator. The simulation results showed good agreement with the models in observed counting rates and variances. It was found through GMSIM simulations that the interval distribution for the hybrid model shows three distinctive regions: a complete cutoff region for the duration of the total dead time, a degraded exponential region, and an enhanced exponential region. By measuring the cutoff and the duration of the degraded exponential region from the pulse interval distribution, it is possible to evaluate the two dead times in the hybrid model.

  5. Modeling maximum daily temperature using a varying coefficient regression model

    Science.gov (United States)

    Han Li; Xinwei Deng; Dong-Yum Kim; Eric P. Smith

    2014-01-01

    Relationships between stream water and air temperatures are often modeled using linear or nonlinear regression methods. Despite a strong relationship between water and air temperatures and a variety of models that are effective for data summarized on a weekly basis, such models did not yield consistently good predictions for summaries such as daily maximum temperature...

  6. Integrate-and-fire vs Poisson models of LGN input to V1 cortex: noisier inputs reduce orientation selectivity.

    Science.gov (United States)

    Lin, I-Chun; Xing, Dajun; Shapley, Robert

    2012-12-01

    One of the reasons the visual cortex has attracted the interest of computational neuroscience is that it has well-defined inputs. The lateral geniculate nucleus (LGN) of the thalamus is the source of visual signals to the primary visual cortex (V1). Most large-scale cortical network models approximate the spike trains of LGN neurons as simple Poisson point processes. However, many studies have shown that neurons in the early visual pathway are capable of spiking with high temporal precision and their discharges are not Poisson-like. To gain an understanding of how response variability in the LGN influences the behavior of V1, we study response properties of model V1 neurons that receive purely feedforward inputs from LGN cells modeled either as noisy leaky integrate-and-fire (NLIF) neurons or as inhomogeneous Poisson processes. We first demonstrate that the NLIF model is capable of reproducing many experimentally observed statistical properties of LGN neurons. Then we show that a V1 model in which the LGN input to a V1 neuron is modeled as a group of NLIF neurons produces higher orientation selectivity than the one with Poisson LGN input. The second result implies that statistical characteristics of LGN spike trains are important for V1's function. We conclude that physiologically motivated models of V1 need to include more realistic LGN spike trains that are less noisy than inhomogeneous Poisson processes.
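The abstract's point that realistic spike trains are less noisy than Poisson can be illustrated with a toy renewal-process comparison (this is not the NLIF model itself): adding an absolute refractory period, an assumed 10 ms here, lowers the Fano factor of spike counts well below the Poisson value of 1.

```python
import numpy as np

rng = np.random.default_rng(5)
RATE, REFRAC, WINDOW, TRIALS = 50.0, 0.01, 1.0, 2000   # assumed values

def spike_counts(isi_sampler):
    # count spikes of a renewal process falling in [0, WINDOW], per trial
    counts = np.empty(TRIALS)
    for i in range(TRIALS):
        t, c = 0.0, 0
        while True:
            t += isi_sampler()
            if t > WINDOW:
                break
            c += 1
        counts[i] = c
    return counts

poisson_counts = spike_counts(lambda: rng.exponential(1.0 / RATE))
# same mean rate, but with an absolute refractory period after each spike
adj = 1.0 / RATE - REFRAC
refractory_counts = spike_counts(lambda: REFRAC + rng.exponential(adj))

ff_p = poisson_counts.var() / poisson_counts.mean()   # Fano factor ≈ 1
ff_r = refractory_counts.var() / refractory_counts.mean()
print(round(ff_p, 2), round(ff_r, 2))
```

The regularized train has sub-Poisson count variability, the statistical property that, per the abstract, sharpens orientation selectivity when such trains drive a model V1 neuron.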

  7. Fluctuations and pseudo long range dependence in network flows: A non-stationary Poisson process model

    International Nuclear Information System (INIS)

    Yu-Dong, Chen; Li, Li; Yi, Zhang; Jian-Ming, Hu

    2009-01-01

In the study of complex networks (systems), the scaling phenomenon of flow fluctuations refers to a certain power law between the mean flux (activity) ⟨F_i⟩ of the i-th node and its variance σ_i, as σ_i ∝ ⟨F_i⟩^α. Such scaling laws are found to be prevalent both in natural and man-made network systems, but the understanding of their origins still remains limited. This paper proposes a non-stationary Poisson process model to give an analytical explanation of the non-universal scaling phenomenon: the exponent α varies between 1/2 and 1 depending on the size of the sampling time window and the relative strength of the external/internal driving forces of the systems. The crossover behaviour and the relation of fluctuation scaling with pseudo long range dependence are also accounted for by the model. Numerical experiments show that the proposed model can recover the multi-scaling phenomenon. (general)
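The two limits of the exponent α can be reproduced with a toy simulation: purely internal, independent Poisson fluctuations give α ≈ 1/2, while a strong common external driving force shared by all nodes pushes α toward 1. The driving distribution and flux range below are illustrative assumptions, not the paper's model specification.

```python
import numpy as np

rng = np.random.default_rng(6)
means = np.logspace(1, 4, 20)      # assumed node mean fluxes
T = 4000                           # number of sampling windows

def fitted_alpha(samples):
    # slope of log(std) vs log(mean) across nodes
    m, s = samples.mean(axis=1), samples.std(axis=1)
    return np.polyfit(np.log(m), np.log(s), 1)[0]

# internal noise only: independent Poisson fluctuations at each node
internal = rng.poisson(means[:, None], (means.size, T))
# strong common external driving: every node shares one multiplicative force
drive = rng.uniform(0.5, 1.5, T)
external = rng.poisson(means[:, None] * drive[None, :])

a_int, a_ext = fitted_alpha(internal), fitted_alpha(external)
print(round(a_int, 2), round(a_ext, 2))
```

Intermediate driving strengths interpolate between the two exponents, consistent with the non-universal α ∈ [1/2, 1] the model explains.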

  8. Model performance analysis and model validation in logistic regression

    Directory of Open Access Journals (Sweden)

    Rosa Arboretti Giancristofaro

    2007-10-01

Full Text Available In this paper a new model validation procedure for a logistic regression model is presented. First, we give a brief review of different techniques of model validation. Next, we define a number of properties required for a model to be considered "good", and a number of quantitative performance measures. Lastly, we describe a methodology for assessing the performance of a given model, using an example taken from a management study.

  9. WAITING TIME DISTRIBUTION OF SOLAR ENERGETIC PARTICLE EVENTS MODELED WITH A NON-STATIONARY POISSON PROCESS

    International Nuclear Information System (INIS)

    Li, C.; Su, W.; Fang, C.; Zhong, S. J.; Wang, L.

    2014-01-01

We present a study of the waiting time distributions (WTDs) of solar energetic particle (SEP) events observed with the spacecraft WIND and GOES. The WTDs of both solar electron events (SEEs) and solar proton events (SPEs) display a power-law tail of ∼Δt^−γ. The SEEs display a broken power-law WTD. The power-law index is γ₁ = 0.99 for short waiting times (<70 hr) and γ₂ = 1.92 for large waiting times (>100 hr). The break of the WTD of SEEs is probably due to the modulation of the corotating interaction regions. The power-law index γ ∼ 1.82 is derived for the WTD of the SPEs, which is consistent with the WTD of type II radio bursts, indicating a close relationship between the shock wave and the production of energetic protons. The WTDs of SEP events can be modeled with a non-stationary Poisson process, which was proposed to understand the waiting time statistics of solar flares. We generalize the method and find that, if the SEP event rate λ = 1/Δt varies as the time distribution of event rate f(λ) = Aλ^−α exp(−βλ), the time-dependent Poisson distribution can produce a power-law tail WTD of ∼Δt^(α−3), where 0 ≤ α < 2
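The claimed tail exponent can be checked numerically. For a non-stationary Poisson process, the waiting-time density is proportional to ∫ λ² f(λ) e^(−λΔt) dλ; with the stated f(λ) = Aλ^(−α) exp(−βλ) this integral has a ∼Δt^(α−3) tail, so for α = 1 the local log-log slope should approach −2. The grid and parameter values below are assumptions made for this check only.

```python
import numpy as np

alpha_, beta_ = 1.0, 1.0                      # assumed rate-distribution parameters
lam = np.linspace(1e-7, 5.0, 1_000_000)       # integration grid over event rates
h = lam[1] - lam[0]

def wtd(dt):
    # p(dt) ∝ ∫ lam^2 f(lam) exp(-lam*dt) dlam, f(lam) ∝ lam^-alpha exp(-beta*lam)
    integrand = lam**2 * lam**(-alpha_) * np.exp(-beta_ * lam) * np.exp(-lam * dt)
    return integrand.sum() * h

dts = np.array([50.0, 100.0, 200.0, 400.0])
vals = np.array([wtd(t) for t in dts])
slopes = np.diff(np.log(vals)) / np.diff(np.log(dts))
print(np.round(slopes, 2))                    # each ≈ alpha - 3 = -2
```

For α = 1, β = 1 the integral is analytic, 1/(1 + Δt)², so the measured slopes converge on −2 exactly as the formula predicts.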

  10. Poisson Coordinates.

    Science.gov (United States)

    Li, Xian-Ying; Hu, Shi-Min

    2013-02-01

    Harmonic functions are the critical points of a Dirichlet energy functional, the linear projections of conformal maps. They play an important role in computer graphics, particularly for gradient-domain image processing and shape-preserving geometric computation. We propose Poisson coordinates, a novel transfinite interpolation scheme based on the Poisson integral formula, as a rapid way to estimate a harmonic function on a certain domain with desired boundary values. Poisson coordinates are an extension of the Mean Value coordinates (MVCs) which inherit their linear precision, smoothness, and kernel positivity. We give explicit formulas for Poisson coordinates in both continuous and 2D discrete forms. Superior to MVCs, Poisson coordinates are proved to be pseudoharmonic (i.e., they reproduce harmonic functions on n-dimensional balls). Our experimental results show that Poisson coordinates have lower Dirichlet energies than MVCs on a number of typical 2D domains (particularly convex domains). As well as presenting a formula, our approach provides useful insights for further studies on coordinates-based interpolation and fast estimation of harmonic functions.

  11. Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits

    National Research Council Canada - National Science Library

    Gravier, Michael

    1999-01-01

    .... The research identified logistic regression as a powerful tool for analysis of DMSMS and further developed twenty models attempting to identify the "best" way to model and predict DMSMS using logistic regression...

  12. Double-observer line transect surveys with Markov-modulated Poisson process models for animal availability.

    Science.gov (United States)

    Borchers, D L; Langrock, R

    2015-12-01

    We develop maximum likelihood methods for line transect surveys in which animals go undetected at distance zero, either because they are stochastically unavailable while within view or because they are missed when they are available. These incorporate a Markov-modulated Poisson process model for animal availability, allowing more clustered availability events than is possible with Poisson availability models. They include a mark-recapture component arising from the independent-observer survey, leading to more accurate estimation of detection probability given availability. We develop models for situations in which (a) multiple detections of the same individual are possible and (b) some or all of the availability process parameters are estimated from the line transect survey itself, rather than from independent data. We investigate estimator performance by simulation, and compare the multiple-detection estimators with estimators that use only initial detections of individuals, and with a single-observer estimator. Simultaneous estimation of detection function parameters and availability model parameters is shown to be feasible from the line transect survey alone with multiple detections and double-observer data but not with single-observer data. Recording multiple detections of individuals improves estimator precision substantially when estimating the availability model parameters from survey data, and we recommend that these data be gathered. We apply the methods to estimate detection probability from a double-observer survey of North Atlantic minke whales, and find that double-observer data greatly improve estimator precision here too. © 2015 The Authors Biometrics published by Wiley Periodicals, Inc. on behalf of International Biometric Society.
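The clustering property that motivates the MMPP availability model can be sketched with a two-state simulation. This is an illustrative toy, not the authors' survey estimator; the switching and event rates are assumed values.

```python
import numpy as np

rng = np.random.default_rng(7)
q01, q10 = 0.2, 0.2           # assumed switching rates between the hidden states
r = (0.0, 4.0)                # assumed event rate while unavailable / available
T, BIN = 20_000.0, 10.0

t, state, events = 0.0, 0, []
while t < T:
    dwell = rng.exponential(1.0 / (q01 if state == 0 else q10))
    n = rng.poisson(r[state] * dwell)            # events within this dwell
    events.extend(t + rng.uniform(0.0, dwell, n))
    t += dwell
    state = 1 - state                            # alternate between the two states

counts = np.histogram(events, bins=np.arange(0.0, T, BIN))[0]
iod = counts.var() / counts.mean()   # index of dispersion; Poisson would give ≈ 1
print(round(iod, 1))
```

The index of dispersion is well above 1: availability events arrive in bursts while the animal is in the "available" state, the over-dispersed behaviour a plain Poisson availability model cannot capture.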

  13. Repairable-conditionally repairable damage model based on dual Poisson processes.

    Science.gov (United States)

    Lind, B K; Persson, L M; Edgren, M R; Hedlöf, I; Brahme, A

    2003-09-01

The advent of intensity-modulated radiation therapy makes it increasingly important to model the response accurately when large volumes of normal tissues are irradiated by controlled graded dose distributions aimed at maximizing tumor cure and minimizing normal tissue toxicity. The cell survival model proposed here is very useful and flexible for accurate description of the response of healthy tissues as well as tumors in classical and truly radiobiologically optimized radiation therapy. The repairable-conditionally repairable (RCR) model distinguishes between two different types of damage, namely the potentially repairable, which may also be lethal, i.e. if unrepaired or misrepaired, and the conditionally repairable, which may be repaired or may lead to apoptosis if it has not been repaired correctly. When potentially repairable damage is being repaired, for example by nonhomologous end joining, conditionally repairable damage may require in addition a high-fidelity correction by homologous repair. The induction of both types of damage is assumed to be described by Poisson statistics. The resultant cell survival expression has the unique ability to fit most experimental data well at low doses (the initial hypersensitive range), intermediate doses (on the shoulder of the survival curve), and high doses (on the quasi-exponential region of the survival curve). The complete Poisson expression can be approximated well by a simple bi-exponential cell survival expression, S(D) = e^(−aD) + bD·e^(−cD), where the first term describes the survival of undamaged cells and the last term represents survival after complete repair of sublethal damage. The bi-exponential expression makes it easy to derive D_0, D_q, n and α, β values to facilitate comparison with classical cell survival models.
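The bi-exponential approximation quoted above can be evaluated directly. The parameter values here are illustrative assumptions chosen only to show the shape of the curve, not fitted constants from the paper.

```python
import math

a, b, c = 1.0, 0.3, 0.5    # Gy^-1; assumed illustrative parameters

def S(D):
    # first term: survival of undamaged cells; second term: survival after
    # complete repair of sublethal damage
    return math.exp(-a * D) + b * D * math.exp(-c * D)

for d in (0.0, 1.0, 2.0, 6.0):
    print(d, round(S(d), 3))
```

At D = 0 the expression correctly gives S = 1, the first term dominates at very low doses (the hypersensitive range), and the slower-decaying repair term shapes the shoulder at intermediate doses.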

  14. Nambu–Poisson gauge theory

    Energy Technology Data Exchange (ETDEWEB)

    Jurčo, Branislav, E-mail: jurco@karlin.mff.cuni.cz [Charles University in Prague, Faculty of Mathematics and Physics, Mathematical Institute, Prague 186 75 (Czech Republic); Schupp, Peter, E-mail: p.schupp@jacobs-university.de [Jacobs University Bremen, 28759 Bremen (Germany); Vysoký, Jan, E-mail: vysokjan@fjfi.cvut.cz [Jacobs University Bremen, 28759 Bremen (Germany); Czech Technical University in Prague, Faculty of Nuclear Sciences and Physical Engineering, Prague 115 19 (Czech Republic)

    2014-06-02

    We generalize noncommutative gauge theory using Nambu–Poisson structures to obtain a new type of gauge theory with higher brackets and gauge fields. The approach is based on covariant coordinates and higher versions of the Seiberg–Witten map. We construct a covariant Nambu–Poisson gauge theory action, give its first order expansion in the Nambu–Poisson tensor and relate it to a Nambu–Poisson matrix model.

  15. Nambu–Poisson gauge theory

    International Nuclear Information System (INIS)

    Jurčo, Branislav; Schupp, Peter; Vysoký, Jan

    2014-01-01

    We generalize noncommutative gauge theory using Nambu–Poisson structures to obtain a new type of gauge theory with higher brackets and gauge fields. The approach is based on covariant coordinates and higher versions of the Seiberg–Witten map. We construct a covariant Nambu–Poisson gauge theory action, give its first order expansion in the Nambu–Poisson tensor and relate it to a Nambu–Poisson matrix model.

  16. A simple approach to power and sample size calculations in logistic regression and Cox regression models.

    Science.gov (United States)

    Vaeth, Michael; Skovlund, Eva

    2004-06-15

    For a given regression problem it is possible to identify a suitably defined equivalent two-sample problem such that the power or sample size obtained for the two-sample problem also applies to the regression problem. For a standard linear regression model the equivalent two-sample problem is easily identified, but for generalized linear models and for Cox regression models the situation is more complicated. An approximately equivalent two-sample problem may, however, also be identified here. In particular, we show that for logistic regression and Cox regression models the equivalent two-sample problem is obtained by selecting two equally sized samples for which the parameters differ by a value equal to the slope times twice the standard deviation of the independent variable and further requiring that the overall expected number of events is unchanged. In a simulation study we examine the validity of this approach to power calculations in logistic regression and Cox regression models. Several different covariate distributions are considered for selected values of the overall response probability and a range of alternatives. For the Cox regression model we consider both constant and non-constant hazard rates. The results show that in general the approach is remarkably accurate even in relatively small samples. Some discrepancies are, however, found in small samples with few events and a highly skewed covariate distribution. Comparison with results based on alternative methods for logistic regression models with a single continuous covariate indicates that the proposed method is at least as good as its competitors. The method is easy to implement and therefore provides a simple way to extend the range of problems that can be covered by the usual formulas for power and sample size determination. Copyright 2004 John Wiley & Sons, Ltd.
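
The equivalence described above can be turned into a quick calculation. The following sketch, with hypothetical inputs, maps a logistic-regression slope onto a two-sample comparison whose log-odds differ by the slope times twice the covariate's standard deviation, then applies the standard two-proportion sample-size formula; it illustrates the idea rather than reproducing the paper's exact procedure:

```python
import math
from statistics import NormalDist

def logistic_n_per_group(beta, sd_x, p_overall, alpha=0.05, power=0.80):
    """Sample size per group for the two-sample problem approximately
    equivalent to a logistic regression with slope `beta` for a covariate
    with standard deviation `sd_x`: the two groups' log-odds differ by
    beta * 2 * sd_x, centered on the overall response probability."""
    delta = beta * 2.0 * sd_x                       # log-odds difference
    logit = math.log(p_overall / (1.0 - p_overall))
    p1 = 1.0 / (1.0 + math.exp(-(logit - delta / 2.0)))
    p2 = 1.0 / (1.0 + math.exp(-(logit + delta / 2.0)))
    z_a = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z_b = NormalDist().inv_cdf(power)
    n = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical inputs: slope 0.5 per unit of a covariate with SD 1,
# overall response probability 30%
print(logistic_n_per_group(beta=0.5, sd_x=1.0, p_overall=0.30))
```

As the abstract notes, larger slopes translate into larger two-sample effects and hence smaller required samples.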

  17. The MIDAS Touch: Mixed Data Sampling Regression Models

    OpenAIRE

    Ghysels, Eric; Santa-Clara, Pedro; Valkanov, Rossen

    2004-01-01

    We introduce Mixed Data Sampling (henceforth MIDAS) regression models. The regressions involve time series data sampled at different frequencies. Technically speaking, MIDAS models specify conditional expectations as a distributed lag of regressors recorded at some higher sampling frequencies. We examine the asymptotic properties of MIDAS regression estimation and compare it with traditional distributed lag models. MIDAS regressions have wide applicability in macroeconomics and finance.

  18. Model selection in kernel ridge regression

    DEFF Research Database (Denmark)

    Exterkate, Peter

    2013-01-01

    Kernel ridge regression is a technique to perform ridge regression with a potentially infinite number of nonlinear transformations of the independent variables as regressors. This method is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts....... The influence of the choice of kernel and the setting of tuning parameters on forecast accuracy is investigated. Several popular kernels are reviewed, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. The latter two kernels are interpreted in terms of their smoothing properties......, and the tuning parameters associated to all these kernels are related to smoothness measures of the prediction function and to the signal-to-noise ratio. Based on these interpretations, guidelines are provided for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study...
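
Kernel ridge regression itself fits in a few lines: the dual coefficients solve (K + λI)α = y and predictions are kernel-weighted sums of the α's. A self-contained sketch with the Gaussian kernel, using hand-picked σ and λ rather than the cross-validated grids discussed above:

```python
import math

def gaussian_kernel(x, z, sigma):
    return math.exp(-(x - z) ** 2 / (2 * sigma ** 2))

def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting for an n x n system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def kernel_ridge_fit(xs, ys, sigma, lam):
    """Dual coefficients alpha = (K + lam*I)^(-1) y."""
    n = len(xs)
    K = [[gaussian_kernel(xs[i], xs[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, ys)

def kernel_ridge_predict(xs, alpha, sigma, x):
    return sum(a * gaussian_kernel(xi, x, sigma) for a, xi in zip(alpha, xs))

# Fit a noiseless sine on a few points; sigma and lam chosen by hand here
xs = [i * 0.5 for i in range(10)]
ys = [math.sin(x) for x in xs]
alpha = kernel_ridge_fit(xs, ys, sigma=1.0, lam=1e-6)
pred = kernel_ridge_predict(xs, alpha, 1.0, 1.2)
print(pred)   # should be close to sin(1.2)
```

In practice σ and λ are the tuning parameters that the paper relates to smoothness and the signal-to-noise ratio.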

  19. Analytical estimation of effective charges at saturation in Poisson-Boltzmann cell models

    International Nuclear Information System (INIS)

    Trizac, Emmanuel; Aubouy, Miguel; Bocquet, Lyderic

    2003-01-01

    We propose a simple approximation scheme for computing the effective charges of highly charged colloids (spherical or cylindrical with infinite length). Within non-linear Poisson-Boltzmann theory, we start from an expression for the effective charge in the infinite-dilution limit which is asymptotically valid for large salt concentrations; this result is then extended to finite colloidal concentration, approximating the salt partitioning effect which relates the salt content in the suspension to that of a dialysing reservoir. This leads to an analytical expression for the effective charge as a function of colloid volume fraction and salt concentration. These results compare favourably with the effective charges at saturation (i.e. in the limit of large bare charge) computed numerically following the standard prescription proposed by Alexander et al within the cell model

  20. Zero-truncated panel Poisson mixture models: Estimating the impact on tourism benefits in Fukushima Prefecture.

    Science.gov (United States)

    Narukawa, Masaki; Nohara, Katsuhito

    2018-04-01

    This study proposes an estimation approach to panel count data, truncated at zero, in order to apply a contingent behavior travel cost method to revealed and stated preference data collected via a web-based survey. We develop zero-truncated panel Poisson mixture models by focusing on respondents who visited a site. In addition, we introduce an inverse Gaussian distribution to unobserved individual heterogeneity as an alternative to a popular gamma distribution, making it possible to capture effectively the long tail typically observed in trip data. We apply the proposed method to estimate the impact on tourism benefits in Fukushima Prefecture as a result of the Fukushima Nuclear Power Plant No. 1 accident. Copyright © 2018 Elsevier Ltd. All rights reserved.
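
The zero-truncated Poisson building block used above is just the Poisson pmf renormalized over positive counts. A minimal sketch (the paper's panel models add mixing over individual heterogeneity, which is omitted here):

```python
import math

def ztp_pmf(k, lam):
    """Zero-truncated Poisson: Poisson(lam) conditioned on k >= 1,
    matching data restricted to respondents who visited the site
    at least once."""
    if k < 1:
        return 0.0
    return math.exp(-lam) * lam ** k / (math.factorial(k) * (1 - math.exp(-lam)))

def ztp_mean(lam):
    """Mean of the zero-truncated Poisson: lam / (1 - exp(-lam))."""
    return lam / (1 - math.exp(-lam))

lam = 1.3
total = sum(ztp_pmf(k, lam) for k in range(1, 50))
print(total, ztp_mean(lam))   # pmf sums to 1; mean exceeds the untruncated lam
```

Truncation pulls the mean above the untruncated rate λ, which is why fitting an ordinary Poisson to visitors-only data would be biased.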

  1. Identification of temporal patterns in the seismicity of Sumatra using Poisson Hidden Markov models

    Directory of Open Access Journals (Sweden)

    Katerina Orfanogiannaki

    2014-05-01

    Full Text Available On 26 December 2004 and 28 March 2005 two large earthquakes occurred between the Indo-Australian and the southeastern Eurasian plates, with moment magnitudes Mw=9.1 and Mw=8.6, respectively. Complete data (mb ≥ 4.2) of the post-1993 time interval have been used to apply Poisson Hidden Markov models (PHMMs) for identifying temporal patterns in the time series of the two earthquake sequences. Each time series consists of earthquake counts, in given and constant time units, in the regions determined by the aftershock zones of the two mainshocks. In PHMMs each count is generated by one of m different Poisson processes, called states. The series of states is unobserved and is in fact a Markov chain. The model incorporates a varying seismicity rate: it assigns a different rate to each state and detects changes in the rate over time. In PHMMs, unobserved factors related to the local properties of the region are considered to affect the earthquake occurrence rate. Estimation and interpretation of the unobserved sequence of states that underlie the data contribute to a better understanding of the geophysical processes that take place in the region. We applied PHMMs to the time series of the two mainshocks and estimated the unobserved sequences of states that underlie the data. The results obtained showed that the region of the 26 December 2004 earthquake was in a state of low seismicity during almost the entire observation period. On the contrary, in the region of the 28 March 2005 earthquake the seismic activity is attributed to triggered seismicity, due to stress transfer from the region of the 2004 mainshock.
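
The likelihood of a Poisson HMM can be computed with the standard scaled forward algorithm. A toy sketch (two states, invented rates and transition matrix, not the estimates from the study) showing that a two-state model explains a bursty count series far better than a single Poisson rate:

```python
import math

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

def phmm_loglik(counts, trans, lams, init):
    """Scaled forward algorithm for a Poisson hidden Markov model: each
    count is emitted by one of m hidden states with Poisson rate
    lams[state]; states evolve as a Markov chain with matrix `trans`
    (rows sum to 1). Toy parameters below are illustrative only."""
    m = len(lams)
    alpha = [init[s] * poisson_pmf(counts[0], lams[s]) for s in range(m)]
    ll = 0.0
    for t in range(1, len(counts)):
        c = sum(alpha)                      # scale to avoid underflow
        ll += math.log(c)
        alpha = [a / c for a in alpha]
        alpha = [sum(alpha[r] * trans[r][s] for r in range(m)) *
                 poisson_pmf(counts[t], lams[s]) for s in range(m)]
    ll += math.log(sum(alpha))
    return ll

counts = [1, 0, 2, 1, 7, 9, 8, 1, 0, 1]    # quiet spell, burst, quiet again
trans = [[0.9, 0.1], [0.2, 0.8]]
print(phmm_loglik(counts, trans, lams=[1.0, 8.0], init=[0.5, 0.5]))
```

Comparing this against a one-state model (an ordinary homogeneous Poisson process) mimics the paper's use of states to capture rate changes over time.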

  2. Partitioning the aggregation of parasites on hosts into intrinsic and extrinsic components via an extended Poisson-gamma mixture model.

    Directory of Open Access Journals (Sweden)

    Justin M Calabrese

    Full Text Available It is well known that parasites are often highly aggregated on their hosts, such that relatively few individuals host the large majority of parasites. When the parasites are vectors of infectious disease, a key consequence of this aggregation can be increased disease transmission rates. The cause of this aggregation, however, is much less clear, especially for parasites such as arthropod vectors, which generally spend only a short time on their hosts. Regression-based analyses of ticks on various hosts have focused almost exclusively on identifying the intrinsic host characteristics associated with large burdens, but these efforts have had mixed results; most host traits examined have some small influence, but none are key. An alternative approach, the Poisson-gamma mixture distribution, has often been used to describe aggregated parasite distributions in a range of host/macroparasite systems, but lacks a clear mechanistic basis. Here, we extend this framework by linking it to a general model of parasite accumulation. Then, focusing on blacklegged ticks (Ixodes scapularis) on mice (Peromyscus leucopus), we fit the extended model to the best currently available larval tick burden datasets via hierarchical Bayesian methods, and use it to explore the relative contributions of intrinsic and extrinsic factors on observed tick burdens. Our results suggest that simple bad luck (inhabiting a home range with high vector density) may play a much larger role in determining parasite burdens than is currently appreciated.
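
The Poisson-gamma mixture's aggregation is easy to demonstrate by simulation: give each host its own gamma-distributed encounter rate (the extrinsic "luck" of its home range), then draw a Poisson burden at that rate; marginally this is negative binomial, so the variance far exceeds the mean. A sketch with invented parameters:

```python
import random, math

def poisson_sample(lam, rng):
    # Knuth's method; adequate for the modest rates used here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def simulate_burdens(n_hosts, shape, mean_rate, rng):
    """Poisson-gamma mixture: each host draws its own rate from a gamma
    distribution, then its tick burden is Poisson at that rate.
    Parameters are illustrative, not estimates from the paper."""
    scale = mean_rate / shape
    return [poisson_sample(rng.gammavariate(shape, scale), rng)
            for _ in range(n_hosts)]

rng = random.Random(42)
burdens = simulate_burdens(5000, shape=0.5, mean_rate=10.0, rng=rng)
mean = sum(burdens) / len(burdens)
var = sum((b - mean) ** 2 for b in burdens) / len(burdens)
print(mean, var)   # variance far exceeds the mean: aggregation
```

With shape 0.5 the theoretical variance is mean + mean²/shape, i.e. roughly 210 against a mean of 10, reproducing the "few hosts carry most parasites" pattern.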

  3. Semiparametric bivariate zero-inflated Poisson models with application to studies of abundance for multiple species

    Science.gov (United States)

    Arab, Ali; Holan, Scott H.; Wikle, Christopher K.; Wildhaber, Mark L.

    2012-01-01

    Ecological studies involving counts of abundance, presence–absence or occupancy rates often produce data having a substantial proportion of zeros. Furthermore, these types of processes are typically multivariate and only adequately described by complex nonlinear relationships involving externally measured covariates. Ignoring these aspects of the data and implementing standard approaches can lead to models that fail to provide adequate scientific understanding of the underlying ecological processes, possibly resulting in a loss of inferential power. One method of dealing with data having excess zeros is to consider the class of univariate zero-inflated generalized linear models. However, this class of models fails to address the multivariate and nonlinear aspects associated with the data usually encountered in practice. Therefore, we propose a semiparametric bivariate zero-inflated Poisson model that takes into account both of these data attributes. The general modeling framework is hierarchical Bayes and is suitable for a broad range of applications. We demonstrate the effectiveness of our model through a motivating example on modeling catch per unit area for multiple species using data from the Missouri River Benthic Fishes Study, implemented by the United States Geological Survey.
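
The univariate zero-inflated Poisson that the proposed bivariate model generalizes is a two-part mixture: a structural-zero mass plus a Poisson count. A minimal sketch of its pmf and moments:

```python
import math

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: with probability pi the count is a
    structural zero (e.g. the species is absent from the site);
    otherwise it is Poisson(lam)."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1 - pi) * pois

pi, lam = 0.4, 2.5
total = sum(zip_pmf(k, pi, lam) for k in range(60))
mean = sum(k * zip_pmf(k, pi, lam) for k in range(60))
print(zip_pmf(0, pi, lam), total, mean)   # inflated zero mass; mean = (1-pi)*lam
```

The bivariate, semiparametric model in the paper couples two such components hierarchically; this block only shows the excess-zeros mechanism itself.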

  4. Mixture of Regression Models with Single-Index

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2016-01-01

    In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the newly proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...

  5. Testing a Poisson Counter Model for Visual Identification of Briefly Presented, Mutually Confusable Single Stimuli in Pure Accuracy Tasks

    Science.gov (United States)

    Kyllingsbaek, Soren; Markussen, Bo; Bundesen, Claus

    2012-01-01

    The authors propose and test a simple model of the time course of visual identification of briefly presented, mutually confusable single stimuli in pure accuracy tasks. The model implies that during stimulus analysis, tentative categorizations that stimulus i belongs to category j are made at a constant Poisson rate, v(i, j). The analysis is…

  6. On Poisson functions

    OpenAIRE

    Terashima, Yuji

    2008-01-01

    In this paper, defining Poisson functions on super manifolds, we show that the graphs of Poisson functions are Dirac structures, and find Poisson functions which include as special cases both quasi-Poisson structures and twisted Poisson structures.

  7. Linear regression crash prediction models : issues and proposed solutions.

    Science.gov (United States)

    2010-05-01

    The paper develops a linear regression model approach that can be applied to crash data to predict vehicle crashes. The proposed approach involves novel data aggregation to satisfy linear regression assumptions, namely error structure normality ...

  8. Model-based Quantile Regression for Discrete Data

    KAUST Repository

    Padellini, Tullia; Rue, Haavard

    2018-01-01

    Quantile regression is a class of methods devoted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite

  9. Spatio-energetic cross talk in photon counting detectors: Detector model and correlated Poisson data generator.

    Science.gov (United States)

    Taguchi, Katsuyuki; Polster, Christoph; Lee, Okkyun; Stierstorfer, Karl; Kappler, Steffen

    2016-12-01

    An x-ray photon interacts with photon counting detectors (PCDs) and generates an electron charge cloud or multiple clouds. The clouds (thus, the photon energy) may be split between two adjacent PCD pixels when the interaction occurs near pixel boundaries, producing a count at both of the pixels. This is called double-counting with charge sharing. (A photoelectric effect with K-shell fluorescence x-ray emission would result in double-counting as well). As a result, PCD data are spatially and energetically correlated, although the output of individual PCD pixels is Poisson distributed. Major problems include the lack of a detector noise model for the spatio-energetic cross talk and the lack of a computationally efficient simulation tool for generating correlated Poisson data. A Monte Carlo (MC) simulation can accurately simulate these phenomena and produce noisy data; however, it is not computationally efficient. In this study, the authors developed a new detector model and implemented it in an efficient software simulator that uses a Poisson random number generator to produce correlated noisy integer counts. The detector model takes the following effects into account: (1) detection efficiency; (2) incomplete charge collection and ballistic effect; (3) interaction with PCDs via photoelectric effect (with or without K-shell fluorescence x-ray emission, which may escape from the PCDs or be reabsorbed); and (4) electronic noise. The correlation was modeled by using two simplifying assumptions: energy conservation and mutual exclusiveness. Mutual exclusiveness means that no more than two pixels measure energy from one photon. The effect of model parameters has been studied and results were compared with MC simulations. The agreement, with respect to the spectrum, was evaluated using the reduced χ² statistic, a weighted sum of squared errors χ²_red (≥ 1), where χ²_red = 1 indicates a perfect fit. The model produced spectra with flat field irradiation that
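
The key statistical feature (individually Poisson pixel outputs that are nevertheless correlated) can be illustrated with a common-shock construction: photons absorbed wholly in one pixel plus boundary photons counted by both. This is a simplified illustration of the correlation structure only, not the authors' energy-resolved detector model:

```python
import random, math

def poisson_sample(lam, rng):
    # Knuth's method; fine for the modest rates used here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def correlated_pixel_counts(lam_a, lam_b, lam_shared, rng):
    """Common-shock sketch of charge sharing: photons absorbed wholly in
    pixel 1 (rate lam_a) or pixel 2 (rate lam_b), plus boundary photons
    counted by BOTH pixels (rate lam_shared). Each pixel's total is
    Poisson, yet the two counts are positively correlated."""
    shared = poisson_sample(lam_shared, rng)
    return (poisson_sample(lam_a, rng) + shared,
            poisson_sample(lam_b, rng) + shared)

rng = random.Random(7)
pairs = [correlated_pixel_counts(20, 20, 5, rng) for _ in range(4000)]
mx = sum(x for x, _ in pairs) / len(pairs)
my = sum(y for _, y in pairs) / len(pairs)
cov = sum((x - mx) * (y - my) for x, y in pairs) / len(pairs)
print(mx, my, cov)   # means near 25; covariance near lam_shared = 5
```

The covariance equals the rate of double-counted photons, which is exactly why ignoring charge sharing understates the noise correlation between neighbors.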

  10. Poisson-Boltzmann theory of charged colloids: limits of the cell model for salty suspensions

    International Nuclear Information System (INIS)

    Denton, A R

    2010-01-01

    Thermodynamic properties of charge-stabilized colloidal suspensions and polyelectrolyte solutions are commonly modelled by implementing the mean-field Poisson-Boltzmann (PB) theory within a cell model. This approach models a bulk system by a single macroion, together with counterions and salt ions, confined to a symmetrically shaped, electroneutral cell. While easing numerical solution of the nonlinear PB equation, the cell model neglects microion-induced interactions and correlations between macroions, precluding modelling of macroion ordering phenomena. An alternative approach, which avoids the artificial constraints of cell geometry, exploits the mapping of a macroion-microion mixture onto a one-component model of pseudo-macroions governed by effective interparticle interactions. In practice, effective-interaction models are usually based on linear-screening approximations, which can accurately describe strong nonlinear screening only by incorporating an effective (renormalized) macroion charge. Combining charge renormalization and linearized PB theories, in both the cell model and an effective-interaction (cell-free) model, we compute osmotic pressures of highly charged colloids and monovalent microions, in Donnan equilibrium with a salt reservoir, over a range of concentrations. By comparing predictions with primitive model simulation data for salt-free suspensions, and with predictions from nonlinear PB theory for salty suspensions, we chart the limits of both the cell model and linear-screening approximations in modelling bulk thermodynamic properties. Up to moderately strong electrostatic couplings, the cell model proves accurate for predicting osmotic pressures of deionized (counterion-dominated) suspensions. With increasing salt concentration, however, the relative contribution of macroion interactions to the osmotic pressure grows, leading predictions from the cell and effective-interaction models to deviate. No evidence is found for a liquid

  11. Exact protein distributions for stochastic models of gene expression using partitioning of Poisson processes.

    Science.gov (United States)

    Pendar, Hodjat; Platini, Thierry; Kulkarni, Rahul V

    2013-04-01

    Stochasticity in gene expression gives rise to fluctuations in protein levels across a population of genetically identical cells. Such fluctuations can lead to phenotypic variation in clonal populations; hence, there is considerable interest in quantifying noise in gene expression using stochastic models. However, obtaining exact analytical results for protein distributions has been an intractable task for all but the simplest models. Here, we invoke the partitioning property of Poisson processes to develop a mapping that significantly simplifies the analysis of stochastic models of gene expression. The mapping leads to exact protein distributions using results for mRNA distributions in models with promoter-based regulation. Using this approach, we derive exact analytical results for steady-state and time-dependent distributions for the basic two-stage model of gene expression. Furthermore, we show how the mapping leads to exact protein distributions for extensions of the basic model that include the effects of posttranscriptional and posttranslational regulation. The approach developed in this work is widely applicable and can contribute to a quantitative understanding of stochasticity in gene expression and its regulation.
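
The partitioning (thinning) property invoked above says that independently marking each event of a Poisson(λ) process with probability p yields two independent Poisson streams with means pλ and (1-p)λ. A quick empirical sketch of that fact (a demonstration of the probabilistic property, not of the paper's gene-expression mapping itself):

```python
import random, math

def poisson_sample(lam, rng):
    # Knuth's method; adequate for the modest rates used here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def partition(total, p, rng):
    """Mark each of `total` events independently with probability p."""
    kept = sum(1 for _ in range(total) if rng.random() < p)
    return kept, total - kept

rng = random.Random(1)
lam, p = 10.0, 0.3
pairs = [partition(poisson_sample(lam, rng), p, rng) for _ in range(20000)]
m1 = sum(a for a, _ in pairs) / len(pairs)
m2 = sum(b for _, b in pairs) / len(pairs)
cov = sum((a - m1) * (b - m2) for a, b in pairs) / len(pairs)
print(m1, m2, cov)   # means near p*lam and (1-p)*lam; covariance near 0
```

The near-zero covariance is the surprising part: the two thinned streams are not merely uncorrelated but fully independent, which is what makes the mapping in the paper tractable.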

  12. Exact protein distributions for stochastic models of gene expression using partitioning of Poisson processes

    Science.gov (United States)

    Pendar, Hodjat; Platini, Thierry; Kulkarni, Rahul V.

    2013-04-01

    Stochasticity in gene expression gives rise to fluctuations in protein levels across a population of genetically identical cells. Such fluctuations can lead to phenotypic variation in clonal populations; hence, there is considerable interest in quantifying noise in gene expression using stochastic models. However, obtaining exact analytical results for protein distributions has been an intractable task for all but the simplest models. Here, we invoke the partitioning property of Poisson processes to develop a mapping that significantly simplifies the analysis of stochastic models of gene expression. The mapping leads to exact protein distributions using results for mRNA distributions in models with promoter-based regulation. Using this approach, we derive exact analytical results for steady-state and time-dependent distributions for the basic two-stage model of gene expression. Furthermore, we show how the mapping leads to exact protein distributions for extensions of the basic model that include the effects of posttranscriptional and posttranslational regulation. The approach developed in this work is widely applicable and can contribute to a quantitative understanding of stochasticity in gene expression and its regulation.

  13. Forecasting Ebola with a regression transmission model

    Directory of Open Access Journals (Sweden)

    Jason Asher

    2018-03-01

    Full Text Available We describe a relatively simple stochastic model of Ebola transmission that was used to produce forecasts with the lowest mean absolute error among Ebola Forecasting Challenge participants. The model enabled prediction of peak incidence, the timing of this peak, and the final size of the outbreak. The underlying discrete-time compartmental model used a time-varying reproductive rate modeled as a multiplicative random walk driven by the number of infectious individuals. This structure generalizes traditional Susceptible-Infected-Recovered (SIR) disease modeling approaches and allows for the flexible consideration of outbreaks with complex trajectories of disease dynamics. Keywords: Ebola, Forecasting, Mathematical modeling, Bayesian inference
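
A stripped-down sketch of the core mechanism, a discrete-time SIR recursion whose transmission rate follows a multiplicative random walk, is shown below. All parameters are invented, the compartments here are deterministic given the rate path, and the paper's model is fully stochastic and fit by Bayesian inference:

```python
import random, math

def simulate_outbreak(n_pop, beta0, gamma, sigma, steps, rng):
    """Discrete-time SIR with a time-varying transmission rate that
    follows a multiplicative random walk: beta_t = beta_{t-1} * exp(eps),
    eps ~ Normal(0, sigma). Returns the incidence series."""
    S, I, R = n_pop - 1.0, 1.0, 0.0
    beta = beta0
    incidence = []
    for _ in range(steps):
        beta *= math.exp(rng.gauss(0.0, sigma))   # multiplicative random walk
        new_inf = min(S, beta * S * I / n_pop)
        new_rec = gamma * I
        S, I, R = S - new_inf, I + new_inf - new_rec, R + new_rec
        incidence.append(new_inf)
    return incidence

rng = random.Random(0)
inc = simulate_outbreak(100000, beta0=0.4, gamma=0.2, sigma=0.03,
                        steps=200, rng=rng)
peak = max(inc)
print(inc.index(peak), peak, sum(inc))   # peak timing, peak height, final size
```

Peak incidence, peak timing, and final size, the three forecast targets named in the abstract, all fall out of the simulated incidence series.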

  14. Forecasting Ebola with a regression transmission model

    OpenAIRE

    Asher, Jason

    2017-01-01

    We describe a relatively simple stochastic model of Ebola transmission that was used to produce forecasts with the lowest mean absolute error among Ebola Forecasting Challenge participants. The model enabled prediction of peak incidence, the timing of this peak, and final size of the outbreak. The underlying discrete-time compartmental model used a time-varying reproductive rate modeled as a multiplicative random walk driven by the number of infectious individuals. This structure generalizes ...

  15. Markov model of fatigue of a composite material with the poisson process of defect initiation

    Science.gov (United States)

    Paramonov, Yu.; Chatys, R.; Andersons, J.; Kleinhofs, M.

    2012-05-01

    As a development of the model where only one weak microvolume (WMV) and only a pulsating cyclic loading are considered, in the current version of the model, we take into account the presence of several weak sites where fatigue damage can accumulate and a loading with an arbitrary (but positive) stress ratio. The Poisson process of initiation of WMVs is considered, whose rate depends on the size of a specimen. The cumulative distribution function (cdf) of the fatigue life of every individual WMV is calculated using the Markov model of fatigue. For the case where this function is approximated by a lognormal distribution, a formula for calculating the cdf of fatigue life of the specimen (modeled as a chain of WMVs) is obtained. Only a pulsating cyclic loading was considered in the previous version of the model. Now, using the modified energy method, a loading cycle with an arbitrary stress ratio is "transformed" into an equivalent cycle with some other stress ratio. In such a way, the entire probabilistic fatigue diagram for any stress ratio with a positive cycle stress can be obtained. Numerical examples are presented.

  16. Elementary Statistical Models for Vector Collision-Sequence Interference Effects with Poisson-Distributed Collision Times

    International Nuclear Information System (INIS)

    Lewis, J.C.

    2011-01-01

    In a recent paper (Lewis, 2008) a class of models suitable for application to collision-sequence interference was introduced. In these models velocities are assumed to be completely randomized in each collision. The distribution of velocities was assumed to be Gaussian. The integrated induced dipole moment μk, for vector interference, or the scalar modulation μk, for scalar interference, was assumed to be a function of the impulse (integrated force) fk, or its magnitude fk, experienced by the molecule in a collision. For most of (Lewis, 2008) it was assumed that μk ∝ fk for vector interference and μk ∝ fk for scalar interference, but it proved to be possible to extend the models so that the magnitude of the induced dipole moment is equal to an arbitrary power, or sum of powers, of the intermolecular force. This allows estimates of the infilling of the interference dip caused by the disproportionality of the induced dipole moment and force. One particular such model, using data from (Herman and Lewis, 2006), leads to the most realistic estimate for the infilling of the vector interference dip yet obtained. In (Lewis, 2008) the drastic assumption was made that collision times occurred at equal intervals. In the present paper that assumption is removed: the collision times are taken to form a Poisson process. This is much more realistic than the equal-intervals assumption. The interference dip is found to be a Lorentzian in this model.

  17. The analysis of the radiation induced cancer risks of workers of the nuclear industry - Liquidators of the accident on Chernobyl Atomic Station on the basis of modified poisson regressions

    International Nuclear Information System (INIS)

    Shafransky, I.L.; Tukov, A.R.

    2008-01-01

    Full text: The purpose of this work was to obtain adequate estimates of the excess relative risk (ERR) per 1 Sv in two dose ranges, up to 200 mSv and up to 500 mSv, based on data on the prevalence of malignant disease among nuclear industry workers who served as liquidators of the Chernobyl accident. Cohort analysis methods were used, implemented via Poisson regression. Estimates of ERR per 1 Sv were calculated both with the traditional scheme, using the AMFIT module of the EPICURE software, and with the modified formula proposed by Paretzke. The results showed that in some cases the risk estimates obtained with the modified formula are more realistic, while in other cases the two estimates have close values. The analysis also showed that transferring a risk estimate obtained for one dose interval to another dose interval is not a correct procedure, given the nonlinear dependence of the risk function on dose. On the whole, for the interval up to 200 mSv, risk estimation requires models more complex than regression; in the dose range up to 500 mSv, and even up to 1000 mSv, the estimate obtained with the modified formula is more adequate. In the range of small doses, application of the traditional approach based on the linear no-threshold concept cannot be statistically justified. (author)
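
The basic estimation task, maximizing a Poisson likelihood for a linear excess-relative-risk model over grouped cohort data, can be sketched in a few lines. The grouped data below are invented for illustration (they are not the liquidator cohort), and tools such as AMFIT perform the same maximization with full covariate stratification:

```python
import math

def loglik(err_per_sv, groups, baseline):
    """Poisson log-likelihood for the linear excess-relative-risk model
    lambda = person_years * baseline * (1 + ERR * dose), where `groups`
    is a list of (dose_Sv, person_years, observed_cases) tuples."""
    ll = 0.0
    for dose, py, cases in groups:
        mu = py * baseline * (1.0 + err_per_sv * dose)
        ll += cases * math.log(mu) - mu   # constant log(cases!) dropped
    return ll

# Invented grouped cohort data: dose (Sv), person-years, observed cases
groups = [(0.00, 50000, 100), (0.05, 30000, 63),
          (0.15, 20000, 46), (0.35, 10000, 27)]
baseline = 100 / 50000.0   # anchor the baseline rate to the zero-dose group

# 1-D grid search for the maximum-likelihood ERR per Sv
best = max((loglik(b / 100.0, groups, baseline), b / 100.0)
           for b in range(0, 501))
print(best[1])   # ERR per Sv that maximizes the likelihood
```

Because the log-likelihood is concave in the ERR parameter for this linear model, a coarse grid or Newton step finds the unique maximum; here the invented counts were chosen to be exactly consistent with ERR = 1.0 per Sv.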

  18. Model Selection in Kernel Ridge Regression

    DEFF Research Database (Denmark)

    Exterkate, Peter

    Kernel ridge regression is gaining popularity as a data-rich nonlinear forecasting tool, which is applicable in many different contexts. This paper investigates the influence of the choice of kernel and the setting of tuning parameters on forecast accuracy. We review several popular kernels......, including polynomial kernels, the Gaussian kernel, and the Sinc kernel. We interpret the latter two kernels in terms of their smoothing properties, and we relate the tuning parameters associated to all these kernels to smoothness measures of the prediction function and to the signal-to-noise ratio. Based...... on these interpretations, we provide guidelines for selecting the tuning parameters from small grids using cross-validation. A Monte Carlo study confirms the practical usefulness of these rules of thumb. Finally, the flexible and smooth functional forms provided by the Gaussian and Sinc kernels makes them widely...

  19. Multivariate poisson lognormal modeling of crashes by type and severity on rural two lane highways.

    Science.gov (United States)

    Wang, Kai; Ivan, John N; Ravishanker, Nalini; Jackson, Eric

    2017-02-01

    In an effort to improve traffic safety, there has been considerable interest in estimating crash prediction models and identifying factors contributing to crashes. To account for crash frequency variations among crash types and severities, crash prediction models have been estimated by type and severity. Univariate crash count models have been used by researchers to estimate crashes by crash type or severity, in which the crash counts by type or severity are assumed to be independent of one another and modelled separately. When considering crash types and severities simultaneously, this may neglect the potential correlations between crash counts due to the presence of shared unobserved factors across crash types or severities for a specific roadway intersection or segment, and might lead to biased parameter estimation and reduced model accuracy. The focus of this study is to estimate crashes by both crash type and crash severity using the Integrated Nested Laplace Approximation (INLA) Multivariate Poisson Lognormal (MVPLN) model, and to identify the different effects of contributing factors on different crash type and severity counts on rural two-lane highways. The INLA MVPLN model can simultaneously model crash counts by crash type and crash severity by accounting for the potential correlations among them, and it significantly decreases the computational time compared with a fully Bayesian fitting of the MVPLN model using the Markov Chain Monte Carlo (MCMC) method. This paper describes estimation of MVPLN models for three-way stop controlled (3ST) intersections, four-way stop controlled (4ST) intersections, four-way signalized (4SG) intersections, and roadway segments on rural two-lane highways. Annual Average Daily Traffic (AADT) and variables describing roadway conditions (including presence of lighting, presence of left-turn/right-turn lane, lane width and shoulder width) were used as predictors. A Univariate Poisson Lognormal (UPLN) was estimated by crash type and
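
The correlation mechanism in a multivariate Poisson-lognormal model is that the count equations share correlated normal errors on the log-rate scale. A bivariate sampling sketch with invented parameters (illustrating the distributional structure only, not the INLA estimation):

```python
import random, math

def poisson_sample(lam, rng):
    # Knuth's method; adequate for the modest rates used here
    L, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= L:
            return k
        k += 1

def mvpln_pair(mu1, mu2, s1, s2, rho, rng):
    """Bivariate Poisson-lognormal draw: correlated normal errors on the
    log-rates of two crash-count equations induce correlation between the
    observed counts. Parameters are illustrative only."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    lam1 = math.exp(mu1 + s1 * z1)
    lam2 = math.exp(mu2 + s2 * z2)
    return poisson_sample(lam1, rng), poisson_sample(lam2, rng)

rng = random.Random(3)
pairs = [mvpln_pair(1.0, 0.5, 0.6, 0.6, 0.8, rng) for _ in range(5000)]
m1 = sum(a for a, _ in pairs) / len(pairs)
m2 = sum(b for _, b in pairs) / len(pairs)
cov = sum((a - m1) * (b - m2) for a, b in pairs) / len(pairs)
print(m1, m2, cov)   # positive covariance from the shared lognormal errors
```

A univariate Poisson-lognormal model for each count would reproduce the margins but discard exactly this cross-count covariance, which is the bias the paper warns about.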

  20. Analytical solutions of nonlocal Poisson dielectric models with multiple point charges inside a dielectric sphere

    Science.gov (United States)

    Xie, Dexuan; Volkmer, Hans W.; Ying, Jinyong

    2016-04-01

    The nonlocal dielectric approach has led to new models and solvers for predicting the electrostatics of proteins (or other biomolecules), but how to validate and compare them remains a challenge. To promote such a study, two typical nonlocal dielectric models are revisited in this paper. Their analytical solutions are then found, in the form of simple series, for a dielectric sphere containing any number of point charges. As a special case, the analytical solution of the corresponding Poisson dielectric model is also derived in simple series, significantly improving Kirkwood's well-known double-series expansion. Furthermore, a convolution of one nonlocal dielectric solution with a commonly used nonlocal kernel function is obtained, along with the reaction parts of these local and nonlocal solutions. To turn these new series solutions into a valuable research tool, they are programmed as a free Fortran software package, which can input point charge data directly from a Protein Data Bank file. Consequently, different validation tests can be quickly done on different proteins. Finally, a test example for a protein with 488 atomic charges is reported to demonstrate the differences between the local and nonlocal models as well as the importance of using the reaction parts to develop local and nonlocal dielectric solvers.

  1. Advanced diffusion model in compacted bentonite based on modified Poisson-Boltzmann equations

    International Nuclear Information System (INIS)

    Yotsuji, K.; Tachi, Y.; Nishimaki, Y.

    2012-01-01

    Document available in extended abstract form only. Diffusion and sorption of radionuclides in compacted bentonite are key processes in the safe geological disposal of radioactive waste. JAEA has developed the integrated sorption and diffusion (ISD) model for compacted bentonite by coupling the pore water chemistry, sorption and diffusion processes in a consistent way. The diffusion model accounts consistently for cation excess and anion exclusion in narrow pores in compacted bentonite through electric double layer (EDL) theory. The first ISD model could predict the diffusivity of monovalent cations/anions in compacted bentonite as a function of dry density. This ISD model was modified by considering the visco-electric effect, and applied to diffusion data for various radionuclides measured under a wide range of conditions (salinity, density, etc.). The modified ISD model gives better quantitative agreement with diffusion data for monovalent cations/anions; however, the model predictions still disagree with experimental data for multivalent cations and complex species. In this study we extract the additional key factors influencing diffusion in narrow charged pores, and the effects of these factors were investigated to reach a better understanding of diffusion processes in compacted bentonite. We incorporated the dielectric saturation effect and the excluded volume effect into the present ISD model and numerically solved the resulting modified Poisson-Boltzmann equations. In the vicinity of the negatively charged clay surfaces, it is necessary to evaluate the concentration distribution of electrolytes considering dielectric saturation effects. The Poisson-Boltzmann (P-B) equation coupled with the dielectric saturation effects was solved numerically by using Runge-Kutta and shooting methods. Figure 1(a) shows the concentration distributions of Na⁺ as numerical solutions of the modified and original P-B equations for 0.01 M pore water, 800 kg m⁻³

  2. Regression and regression analysis time series prediction modeling on climate data of quetta, pakistan

    International Nuclear Information System (INIS)

    Jafri, Y.Z.; Kamal, L.

    2007-01-01

    Various statistical techniques were used on five-year (1998-2002) data of average humidity, rainfall, and maximum and minimum temperatures. Relationships for regression analysis time series (RATS) were developed for determining the overall trend of these climate parameters, on the basis of which forecast models can be corrected and modified. We computed the coefficient of determination as a measure of goodness of fit for our polynomial regression analysis time series (PRATS). Correlations for multiple linear regression (MLR) and multiple linear regression analysis time series (MLRATS) were also developed for deciphering the interdependence of weather parameters. Spearman's rank correlation and the Goldfeld-Quandt test were used to check the uniformity or non-uniformity of variances in our fit to polynomial regression (PR). The Breusch-Pagan test was applied to MLR and MLRATS, respectively, which yielded homoscedasticity. We also employed Bartlett's test for homogeneity of variances on five-year data of rainfall and humidity, which showed that the variances in the rainfall data were not homogeneous, while those in the humidity data were. Our results on regression and regression analysis time series show the best fit for prediction modelling on climate data of Quetta, Pakistan. (author)

  3. Corporate prediction models, ratios or regression analysis?

    NARCIS (Netherlands)

    Bijnen, E.J.; Wijn, M.F.C.M.

    1994-01-01

    The models developed in the literature with respect to the prediction of a company's failure are based on ratios. It has been shown before that these models should be rejected on theoretical grounds. Our study of industrial companies in the Netherlands shows that the ratios which are used in

  4. STREAMFLOW AND WATER QUALITY REGRESSION MODELING ...

    African Journals Online (AJOL)

    ... downstream Obigbo station show: consistent time-trends in degree of contamination; linear and non-linear relationships for water quality models against total dissolved solids (TDS), total suspended sediment (TSS), chloride, pH and sulphate; and non-linear relationship for streamflow and water quality transport models.

  5. Parameters Estimation of Geographically Weighted Ordinal Logistic Regression (GWOLR) Model

    Science.gov (United States)

    Zuhdi, Shaifudin; Retno Sari Saputro, Dewi; Widyaningsih, Purnami

    2017-06-01

    A regression model represents the relationship between independent and dependent variables. In a logistic regression model the dependent variable is categorical and is used to calculate odds; when the categories of the dependent variable are ordered, the model is an ordinal logistic regression. The GWOLR model is an ordinal logistic regression model influenced by the geographical location of the observation site. Parameter estimation in the model is needed to determine the value of a population based on a sample. The purpose of this research is parameter estimation of the GWOLR model using R software. Parameter estimation uses data on the number of dengue fever patients in Semarang City. The observation units are 144 villages in Semarang City. The results give a local GWOLR model for each village and the probability of each category of the number of dengue fever patients.

  6. Multiattribute shopping models and ridge regression analysis

    NARCIS (Netherlands)

    Timmermans, H.J.P.

    1981-01-01

    Policy decisions regarding retailing facilities essentially involve multiple attributes of shopping centres. If mathematical shopping models are to contribute to these decision processes, their structure should reflect the multiattribute character of retailing planning. Examination of existing

  7. Linear Regression Models for Estimating True Subsurface ...

    Indian Academy of Sciences (India)


    The objective is to minimize the processing time and computer memory required to carry out inversion .... to the mainland by two long bridges. .... term. In this approach, the model converges when the squared sum of the differences.

  8. Nonparametric estimation of the heterogeneity of a random medium using compound Poisson process modeling of wave multiple scattering.

    Science.gov (United States)

    Le Bihan, Nicolas; Margerin, Ludovic

    2009-07-01

    In this paper, we present a nonparametric method to estimate the heterogeneity of a random medium from the angular distribution of intensity of waves transmitted through a slab of random material. Our approach is based on the modeling of forward multiple scattering using compound Poisson processes on compact Lie groups. The estimation technique is validated through numerical simulations based on radiative transfer theory.

  9. Relative risk estimation of Chikungunya disease in Malaysia: An analysis based on Poisson-gamma model

    Science.gov (United States)

    Samat, N. A.; Ma'arof, S. H. Mohd Imam

    2015-05-01

    Disease mapping is a method to display the geographical distribution of disease occurrence, which generally involves the usage and interpretation of a map to show the incidence of certain diseases. Relative risk (RR) estimation is one of the most important issues in disease mapping. This paper begins by providing a brief overview of Chikungunya disease. This is followed by a review of the classical model used in disease mapping, based on the standardized morbidity ratio (SMR), which we then apply to our Chikungunya data. We then fit an extension of the classical model, which we refer to as a Poisson-Gamma model, in which prior distributions for the relative risks are assumed known. Both results are displayed and compared using maps, revealing a smoother map with fewer extreme values of estimated relative risk. Extensions of this paper will consider other methods that are relevant to overcome the drawbacks of the existing methods, in order to inform and direct government strategy for monitoring and controlling Chikungunya disease.
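    As an illustrative sketch (not from the paper), the SMR and Poisson-Gamma smoothing contrasted above can be computed directly; all counts and gamma prior parameters below are hypothetical:

    ```python
    # Hypothetical observed (O) and expected (E) disease counts for five regions.
    observed = [12, 0, 3, 45, 7]
    expected = [10.0, 2.5, 4.0, 40.0, 6.0]

    def smr(obs, exp):
        """Classical standardized morbidity ratio: O_i / E_i."""
        return [o / e for o, e in zip(obs, exp)]

    def poisson_gamma_rr(obs, exp, shape=2.0, rate=2.0):
        """Posterior mean relative risk under O_i ~ Poisson(E_i * theta_i) with
        theta_i ~ Gamma(shape, rate): (O_i + shape) / (E_i + rate).
        The gamma prior shrinks extreme SMRs toward shape/rate (here 1.0)."""
        return [(o + shape) / (e + rate) for o, e in zip(obs, exp)]

    print([round(r, 2) for r in smr(observed, expected)])
    print([round(r, 2) for r in poisson_gamma_rr(observed, expected)])
    ```

    Note how the region with zero observed cases gets SMR = 0 but a positive smoothed risk, which is the "smoother map with fewer extreme values" effect described in the abstract.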

  10. Moderation analysis using a two-level regression model.

    Science.gov (United States)

    Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott

    2014-10-01

    Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
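    A minimal sketch of the MMR setup described above (product term included, least-squares fit), using only the standard library; the data-generating coefficients are invented for illustration and the fit is plain LS, not the paper's two-level NML method:

    ```python
    import random

    random.seed(1)

    def ols(X, y):
        """Least squares via normal equations (X'X)b = X'y, solved by
        Gaussian elimination with partial pivoting."""
        n, k = len(X), len(X[0])
        A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
        b = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
        for p in range(k):
            piv = max(range(p, k), key=lambda r: abs(A[r][p]))
            A[p], A[piv] = A[piv], A[p]
            b[p], b[piv] = b[piv], b[p]
            for r in range(p + 1, k):
                f = A[r][p] / A[p][p]
                for c in range(p, k):
                    A[r][c] -= f * A[p][c]
                b[r] -= f * b[p]
        beta = [0.0] * k
        for p in reversed(range(k)):
            beta[p] = (b[p] - sum(A[p][c] * beta[c] for c in range(p + 1, k))) / A[p][p]
        return beta

    # Moderated data-generating process: the slope of y on x depends on the
    # moderator m, beta(m) = 0.5 + 1.5*m; MMR recovers this via the x*m term.
    data = []
    for _ in range(500):
        x, m = random.gauss(0, 1), random.gauss(0, 1)
        y = 1.0 + (0.5 + 1.5 * m) * x + 0.8 * m + random.gauss(0, 0.5)
        data.append((x, m, y))

    X = [[1.0, x, m, x * m] for x, m, _ in data]
    yv = [row[2] for row in data]
    b0, bx, bm, bxm = ols(X, yv)
    print(round(bx, 2), round(bxm, 2))  # roughly 0.5 and 1.5
    ```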

  11. Alternative regression models to assess increase in childhood BMI

    OpenAIRE

    Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M

    2008-01-01

    Abstract Background Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 childre...

  12. Support Vector Regression Model Based on Empirical Mode Decomposition and Auto Regression for Electric Load Forecasting

    Directory of Open Access Journals (Sweden)

    Hong-Juan Li

    2013-04-01

    Full Text Available Electric load forecasting is an important issue for a power utility, associated with the management of daily operations such as energy transfer scheduling, unit commitment, and load dispatch. Inspired by strong non-linear learning capability of support vector regression (SVR, this paper presents a SVR model hybridized with the empirical mode decomposition (EMD method and auto regression (AR for electric load forecasting. The electric load data of the New South Wales (Australia market are employed for comparing the forecasting performances of different forecasting models. The results confirm the validity of the idea that the proposed model can simultaneously provide forecasting with good accuracy and interpretability.

  13. Ship-Track Models Based on Poisson-Distributed Port-Departure Times

    National Research Council Canada - National Science Library

    Heitmeyer, Richard

    2006-01-01

    ... of those ships, and their nominal speeds. The probability law assumes that the ship departure times are Poisson-distributed with a time-varying departure rate and that the ship speeds and ship routes are statistically independent...
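    A possible sketch of the record's core assumption (Poisson-distributed departure times with a time-varying rate), simulated by thinning a homogeneous process; the diurnal rate function is entirely hypothetical:

    ```python
    import math
    import random

    random.seed(7)

    def nonhomogeneous_poisson(rate_fn, rate_max, horizon):
        """Simulate event times on [0, horizon] for a Poisson process with
        time-varying rate rate_fn(t) <= rate_max, by thinning a homogeneous
        process of rate rate_max (Lewis-Shedler method)."""
        t, times = 0.0, []
        while True:
            t += random.expovariate(rate_max)        # candidate inter-arrival
            if t > horizon:
                return times
            if random.random() < rate_fn(t) / rate_max:
                times.append(t)                      # accept w.p. rate(t)/rate_max

    # Hypothetical departure rate (ships/hour) peaking near midday, t in hours.
    rate = lambda t: 2.0 + 1.5 * math.sin(math.pi * t / 12.0)
    departures = nonhomogeneous_poisson(rate, rate_max=3.5, horizon=24.0)
    print(len(departures), "departures over 24 h")
    ```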

  14. Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models

    Directory of Open Access Journals (Sweden)

    Emel Çankaya

    2017-06-01

    Full Text Available In many ecological applications, the absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in high numbers of zero counts or abundance. The usual practice for modelling such data is regression modelling of log(abundance+1), and it is well known that the resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated Poisson and negative binomial regression (ZIP and ZINB), Hurdle-Poisson (HP) and Hurdle-Negative Binomial (HNB) models amongst others, are widely preferred to the classical regression models. Because mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is to examine the performances of these four models in determining the significant biotic and abiotic factors on the occurrences of the Nematopsis legeri parasite harming the existence of Mediterranean mussels (Mytilus galloprovincialis L.). The data collected from three coastal regions of Sinop city in Turkey showed that more than 50% of parasite counts on average are zero-valued, and model comparisons were based on information criteria. The results showed that the probability of the occurrence of this parasite is here best formulated by the ZINB or HNB models, and the influential factors of the models were found to correspond with ecological differences between the regions.
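    A small illustration (not from the paper) of the zero-inflation idea behind ZIP models: a mixture that places extra probability mass at zero on top of an ordinary Poisson. The parameter values are hypothetical:

    ```python
    import math

    def poisson_pmf(k, lam):
        return math.exp(-lam) * lam ** k / math.factorial(k)

    def zip_pmf(k, lam, pi):
        """Zero-inflated Poisson: with probability pi the count is a structural
        zero; otherwise it is drawn from Poisson(lam)."""
        p = (1.0 - pi) * poisson_pmf(k, lam)
        return pi + p if k == 0 else p

    # With pi = 0.5, roughly half of all counts are excess zeros, mimicking the
    # >50% zero parasite counts reported for the Sinop data.
    lam, pi = 2.0, 0.5
    print(round(zip_pmf(0, lam, pi), 3), "vs plain Poisson:", round(poisson_pmf(0, lam), 3))
    ```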

  15. Likelihood inference for COM-Poisson cure rate model with interval-censored data and Weibull lifetimes.

    Science.gov (United States)

    Pal, Suvra; Balakrishnan, N

    2017-10-01

    In this paper, we consider a competing cause scenario and assume the number of competing causes to follow a Conway-Maxwell Poisson distribution, which can capture both over- and under-dispersion that is usually encountered in discrete data. Assuming that the population of interest has a cure component and that the data are interval-censored, as opposed to the usually considered right-censored data, the main contribution is in developing the steps of the expectation maximization algorithm for the determination of the maximum likelihood estimates of the model parameters of the flexible Conway-Maxwell Poisson cure rate model with Weibull lifetimes. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination within the Conway-Maxwell Poisson distribution is addressed using the likelihood ratio test and information-based criteria to select a suitable competing cause distribution that provides the best fit to the data. A simulation study is also carried out to demonstrate the loss in efficiency when selecting an improper competing cause distribution, which justifies the use of a flexible family of distributions for the number of competing causes. Finally, the proposed methodology and the flexibility of the Conway-Maxwell Poisson distribution are illustrated with two known data sets from the literature: smoking cessation data and breast cosmesis data.
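    To see the over-/under-dispersion property mentioned above, here is a hedged sketch of the Conway-Maxwell Poisson pmf, P(Y=k) ∝ λ^k/(k!)^ν, evaluated on a truncated support (λ and ν values are illustrative only):

    ```python
    import math

    def cmp_pmf(lam, nu, kmax=100):
        """Conway-Maxwell Poisson pmf on 0..kmax (truncated normalizing sum),
        computed in log space for stability: P(Y=k) proportional to
        lam**k / (k!)**nu.  nu = 1 recovers the Poisson distribution;
        nu < 1 gives over-dispersion, nu > 1 under-dispersion."""
        logs = [k * math.log(lam) - nu * math.lgamma(k + 1) for k in range(kmax + 1)]
        mx = max(logs)
        weights = [math.exp(v - mx) for v in logs]
        z = sum(weights)
        return [w / z for w in weights]

    def mean_var(pmf):
        mean = sum(k * p for k, p in enumerate(pmf))
        var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
        return mean, var

    for nu in (0.5, 1.0, 2.0):
        m, v = mean_var(cmp_pmf(2.0, nu))
        print(f"nu={nu}: mean={m:.2f} variance={v:.2f} (variance/mean={v / m:.2f})")
    ```

    The variance-to-mean ratio exceeds 1 for ν < 1 and drops below 1 for ν > 1, which is exactly the flexibility the cure rate model exploits.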

  16. A new approach for handling longitudinal count data with zero-inflation and overdispersion: poisson geometric process model.

    Science.gov (United States)

    Wan, Wai-Yin; Chan, Jennifer S K

    2009-08-01

    For time series of count data, correlated measurements, clustering as well as excessive zeros occur simultaneously in biomedical applications. Ignoring such effects might contribute to misleading treatment outcomes. A generalized mixture Poisson geometric process (GMPGP) model and a zero-altered mixture Poisson geometric process (ZMPGP) model are developed from the geometric process model, which was originally developed for modelling positive continuous data and was extended to handle count data. These models are motivated by evaluating the trend development of new tumour counts for bladder cancer patients as well as by identifying useful covariates which affect the count level. The models are implemented using Bayesian methods with Markov chain Monte Carlo (MCMC) algorithms and are assessed using the deviance information criterion (DIC).

  17. Survival analysis of clinical mastitis data using a nested frailty Cox model fit as a mixed-effects Poisson model.

    Science.gov (United States)

    Elghafghuf, Adel; Dufour, Simon; Reyher, Kristen; Dohoo, Ian; Stryhn, Henrik

    2014-12-01

    Mastitis is a complex disease affecting dairy cows and is considered to be the most costly disease of dairy herds. The hazard of mastitis is a function of many factors, both managerial and environmental, making its control a difficult issue for milk producers. Observational studies of clinical mastitis (CM) often generate datasets with a number of characteristics which influence the analysis of those data: the outcome of interest may be the time to occurrence of a case of mastitis, predictors may change over time (time-dependent predictors), the effects of factors may change over time (time-dependent effects), there are usually multiple hierarchical levels, and datasets may be very large. Analysis of such data often requires expansion of the data into the counting-process format - leading to larger datasets - thus complicating the analysis and requiring excessive computing time. In this study, a nested frailty Cox model with time-dependent predictors and effects was applied to Canadian Bovine Mastitis Research Network data in which 10,831 lactations of 8035 cows from 69 herds were followed through lactation until the first occurrence of CM. The model was fit to the data as a Poisson model with nested normally distributed random effects at the cow and herd levels. Risk factors associated with the hazard of CM during the lactation were identified, such as parity, calving season, herd somatic cell score, pasture access, fore-stripping, and proportion of treated cases of CM in a herd. The analysis showed that most of the predictors had a strong effect early in lactation and also demonstrated substantial variation in the baseline hazard among cows and between herds. A small simulation study for a setting similar to the real data was conducted to evaluate the Poisson maximum likelihood estimation approach with both the Gaussian quadrature method and the Laplace approximation. Further, the performance of the two methods was compared with the performance of a widely used estimation

  18. Assessment of Poisson, logit, and linear models for genetic analysis of clinical mastitis in Norwegian Red cows.

    Science.gov (United States)

    Vazquez, A I; Gianola, D; Bates, D; Weigel, K A; Heringstad, B

    2009-02-01

    Clinical mastitis is typically coded as presence/absence during some period of exposure, and records are analyzed with linear or binary data models. Because presence includes cows with multiple episodes, there is loss of information when a count is treated as a binary response. The Poisson model is designed for counting random variables, and although it is used extensively in the epidemiology of mastitis, it has rarely been used for studying the genetics of mastitis. Many models have been proposed for genetic analysis of mastitis, but they have not been formally compared. The main goal of this study was to compare linear (Gaussian), Bernoulli (with logit link), and Poisson models for the purpose of genetic evaluation of sires for mastitis in dairy cattle. The response variables were clinical mastitis (CM; 0, 1) and number of CM cases (NCM; 0, 1, 2, ...). Data consisted of records on 36,178 first-lactation daughters of 245 Norwegian Red sires distributed over 5,286 herds. Predictive ability of the models was assessed via a 3-fold cross-validation using mean squared error of prediction (MSEP) as the end-point. Between-sire variance estimates for NCM were 0.065 in the Poisson and 0.007 in the linear model. For CM the between-sire variance was 0.093 in the logit and 0.003 in the linear model. The ratio between herd and sire variances for the models with NCM response was 4.6 and 3.5 for Poisson and linear, respectively, and for the CM models it was 3.7 in both the logit and linear models. The MSEP for all cows was similar. However, within healthy animals, MSEP was 0.085 (Poisson), 0.090 (linear for NCM), 0.053 (logit), and 0.056 (linear for CM). For mastitic animals the MSEP values were 1.206 (Poisson), 1.185 (linear for NCM response), 1.333 (logit), and 1.319 (linear for CM response). The models for count variables had a better performance when predicting diseased animals and also had a similar performance between them. Logit and linear models for CM had better predictive ability for healthy

  19. A Poisson Log-Normal Model for Constructing Gene Covariation Network Using RNA-seq Data.

    Science.gov (United States)

    Choi, Yoonha; Coram, Marc; Peng, Jie; Tang, Hua

    2017-07-01

    Constructing expression networks using transcriptomic data is an effective approach for studying gene regulation. A popular approach for constructing such a network is based on the Gaussian graphical model (GGM), in which an edge between a pair of genes indicates that the expression levels of these two genes are conditionally dependent, given the expression levels of all other genes. However, GGMs are not appropriate for non-Gaussian data, such as those generated in RNA-seq experiments. We propose a novel statistical framework that maximizes a penalized likelihood, in which the observed count data follow a Poisson log-normal distribution. To overcome the computational challenges, we use Laplace's method to approximate the likelihood and its gradients, and apply the alternating directions method of multipliers to find the penalized maximum likelihood estimates. The proposed method is evaluated and compared with GGMs using both simulated and real RNA-seq data. The proposed method shows improved performance in detecting edges that represent covarying pairs of genes, particularly for edges connecting low-abundant genes and edges around regulatory hubs.

  20. Investigation of time and weather effects on crash types using full Bayesian multivariate Poisson lognormal models.

    Science.gov (United States)

    El-Basyouny, Karim; Barua, Sudip; Islam, Md Tazul

    2014-12-01

    Previous research shows that various weather elements have significant effects on crash occurrence and risk; however, little is known about how these elements affect different crash types. Consequently, this study investigates the impact of weather elements and sudden extreme snow or rain weather changes on crash type. Multivariate models were used for seven crash types using five years of daily weather and crash data collected for the entire City of Edmonton. In addition, the yearly trend and random variation of parameters across the years were analyzed by using four different modeling formulations. The proposed models were estimated in a full Bayesian context via Markov Chain Monte Carlo simulation. The multivariate Poisson lognormal model with yearly varying coefficients provided the best fit for the data according to the Deviance Information Criterion. Overall, results showed that temperature and snowfall were statistically significant with intuitive signs (crashes decrease with increasing temperature; crashes increase as snowfall intensity increases) for all crash types, while rainfall was mostly insignificant. Previous snow showed mixed results, being statistically significant and positively related to certain crash types, while negatively related or insignificant in other cases. Maximum wind gust speed was found mostly insignificant, with a few exceptions that were positively related to crash type. Major snow or rain events following a dry weather condition were highly significant and positively related to three crash types: Follow-Too-Close, Stop-Sign-Violation, and Ran-Off-Road crashes. The day-of-the-week dummy variables were statistically significant, indicating a possible weekly variation in exposure. Transportation authorities might use the above results to improve road safety by providing drivers with information regarding the risk of certain crash types for a particular weather condition.

  1. A test for the parameters of multiple linear regression models ...

    African Journals Online (AJOL)

    A test for the parameters of multiple linear regression models is developed for conducting tests simultaneously on all the parameters of multiple linear regression models. The test is robust relative to the assumptions of homogeneity of variances and absence of serial correlation of the classical F-test. Under certain null and ...

  2. Mixed Frequency Data Sampling Regression Models: The R Package midasr

    Directory of Open Access Journals (Sweden)

    Eric Ghysels

    2016-08-01

    Full Text Available When modeling economic relationships it is increasingly common to encounter data sampled at different frequencies. We introduce the R package midasr which enables estimating regression models with variables sampled at different frequencies within a MIDAS regression framework put forward in work by Ghysels, Santa-Clara, and Valkanov (2002). In this article we define a general autoregressive MIDAS regression model with multiple variables of different frequencies and show how it can be specified using the familiar R formula interface and estimated using various optimization methods chosen by the researcher. We discuss how to check the validity of the estimated model both in terms of numerical convergence and statistical adequacy of a chosen regression specification, how to perform model selection based on an information criterion, how to assess the forecasting accuracy of the MIDAS regression model, and how to obtain a forecast aggregation of different MIDAS regression models. We illustrate the capabilities of the package with a simulated MIDAS regression model and give two empirical examples of application of MIDAS regression.

  3. Impact of multicollinearity on small sample hydrologic regression models

    Science.gov (United States)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
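    As a sketch of the VIF screening mentioned above, for the two-predictor case the variance inflation factor reduces to 1/(1-r²), where r is the correlation between the predictors; the simulated "basin descriptors" below are hypothetical:

    ```python
    import math
    import random

    random.seed(3)

    def pearson_r(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        return cov / math.sqrt(sum((x - ma) ** 2 for x in a) *
                               sum((y - mb) ** 2 for y in b))

    def vif_two_predictors(x1, x2):
        """With two explanatory variables, VIF = 1/(1 - r^2); VIF around 10
        is a common screening threshold for harmful multicollinearity."""
        r = pearson_r(x1, x2)
        return 1.0 / (1.0 - r ** 2)

    # Hypothetical correlated basin descriptors (e.g. area and stream length).
    x1 = [random.gauss(0, 1) for _ in range(200)]
    x2 = [0.95 * a + 0.1 * random.gauss(0, 1) for a in x1]   # nearly collinear
    x3 = [random.gauss(0, 1) for _ in range(200)]            # independent
    print(round(vif_two_predictors(x1, x2), 1), round(vif_two_predictors(x1, x3), 2))
    ```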

  4. Coupling Poisson rectangular pulse and multiplicative microcanonical random cascade models to generate sub-daily precipitation timeseries

    Science.gov (United States)

    Pohle, Ina; Niebisch, Michael; Müller, Hannes; Schümberg, Sabine; Zha, Tingting; Maurer, Thomas; Hinz, Christoph

    2018-07-01

    To simulate the impacts of within-storm rainfall variability on fast hydrological processes, long precipitation time series with high temporal resolution are required. Due to the limited availability of observed data, such time series are typically obtained from stochastic models. However, most existing rainfall models are limited in their ability to conserve the rainfall event statistics which are relevant for hydrological processes. Poisson rectangular pulse models are widely applied to generate long time series of alternating precipitation event durations and mean intensities as well as interstorm period durations. Multiplicative microcanonical random cascade (MRC) models are used to disaggregate precipitation time series from coarse to fine temporal resolution. To overcome the inconsistencies between the temporal structure of the Poisson rectangular pulse model and the MRC model, we developed a new coupling approach by introducing two modifications to the MRC model. These modifications comprise (a) a modified cascade model ("constrained cascade") which preserves the event durations generated by the Poisson rectangular pulse model by constraining the first and last interval of a precipitation event to contain precipitation and (b) continuous sigmoid functions of the multiplicative weights to consider the scale-dependency in the disaggregation of precipitation events of different durations. The constrained cascade model was evaluated in its ability to disaggregate observed precipitation events in comparison to existing MRC models. For that, we used a 20-year record of hourly precipitation at six stations across Germany. The constrained cascade model showed a pronounced better agreement with the observed data in terms of both the temporal pattern of the precipitation time series (e.g. the dry and wet spell durations and autocorrelations) and event characteristics (e.g. intra-event intermittency and intensity fluctuation within events). The constrained cascade model also
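    A minimal sketch of the Poisson rectangular pulse idea described above: exponentially distributed interstorm periods, event durations, and mean event intensities, with each event represented as a constant-intensity pulse. All parameter values are hypothetical, not taken from the paper:

    ```python
    import random

    random.seed(11)

    def poisson_rectangular_pulses(mean_dry_h, mean_dur_h, mean_int_mm_h, horizon_h):
        """Alternating-renewal sketch of a Poisson rectangular pulse model.
        Returns a list of (start_time, duration, intensity) pulses; interstorm
        periods, durations and intensities are all exponentially distributed."""
        t, events = 0.0, []
        while True:
            t += random.expovariate(1.0 / mean_dry_h)      # interstorm period
            dur = random.expovariate(1.0 / mean_dur_h)     # event duration
            if t + dur > horizon_h:
                return events
            events.append((t, dur, random.expovariate(1.0 / mean_int_mm_h)))
            t += dur

    # One year of hourly-scale synthetic rainfall events (illustrative parameters).
    events = poisson_rectangular_pulses(30.0, 6.0, 1.5, 24.0 * 365)
    total_depth = sum(dur * inten for _, dur, inten in events)
    print(len(events), "events,", round(total_depth), "mm total depth")
    ```

    A cascade model such as the constrained MRC would then disaggregate each rectangular pulse into sub-hourly intervals while preserving its duration.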

  5. Mean electrostatic and Poisson-Boltzmann models for multicomponent transport through compacted clay

    International Nuclear Information System (INIS)

    Steefel, C.I.; Galindez, J.M.

    2012-01-01

    Document available in extended abstract form only. Electrical double layer effects in the pore space of clays become increasingly important as the level of compaction increases and intergrain and interlayer spacings shift towards the range of nanometres. At such scales, solute transport can no longer be explained by concentration gradients alone and it becomes necessary to include the electrostatic effects on chemical potentials. In fact, the electrical double layer (EDL) that develops in the neighborhood of the negatively charged clay surfaces can extend well into the aqueous phase, effectively constraining the space available to anions (known as anion exclusion), thus distorting the spatial distribution of ionic species in solution. In this study, we make use of two approaches for addressing the accumulation and transport of charged ionic species in the electrical double layers of compacted bentonite: 1) a mean electrostatic approach based on the assumption of Donnan equilibrium, and 2) a 2D numerical approach based on the multicomponent Poisson-Nernst-Planck (PNP) set of equations. For the mean electrostatic or Donnan approach to the electrical double layer [1], two options are considered: 1) a model in which surface complexation in the Stern layer may partly balance the fixed charge of the montmorillonite making up the bentonite buffer, and 2) a model in which the fixed mineral charge is balanced completely by the diffuse layer. In the mean electrostatic approach, one additional equation that balances the charge between the Stern layer and the diffuse layer is added to the multicomponent reactive transport code CrunchFlow. The only additional unknown that is required is the mean electrostatic potential, although it may be necessary in certain cases to consider the volume (or width) of the electrical double layer as an additional implicit unknown. Both ions and neutral species may diffuse within the diffuse layer according to their gradients and species

  6. Zero-inflated Conway-Maxwell Poisson Distribution to Analyze Discrete Data.

    Science.gov (United States)

    Sim, Shin Zhu; Gupta, Ramesh C; Ong, Seng Huat

    2018-01-09

    In this paper, we study the zero-inflated Conway-Maxwell Poisson (ZICMP) distribution and develop a regression model. Score and likelihood ratio tests are also implemented for testing the inflation/deflation parameter. Simulation studies are carried out to examine the performance of these tests. A data example is presented to illustrate the concepts. In this example, the proposed model is compared to the well-known zero-inflated Poisson (ZIP) and the zero-inflated generalized Poisson (ZIGP) regression models. It is shown that the fit by ZICMP is comparable to or better than that of these models.
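The zero-inflated comparison described in this record is easy to reproduce in miniature for the plain ZIP case (the CMP variant only changes the normalization of the count pmf). A stdlib-Python sketch, with simulated data and parameter values that are purely illustrative:

```python
import math
import random

def rpois(lam, rng):
    """Sample a Poisson(lam) variate by Knuth's product-of-uniforms method."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf: extra probability mass pi at zero."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson

def log_lik(counts, lam, pi):
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)

rng = random.Random(42)
# Simulate counts with 30% structural zeros on top of a Poisson(3) component
counts = [0 if rng.random() < 0.3 else rpois(3.0, rng) for _ in range(500)]

# The ZIP likelihood at the generating parameters beats a plain Poisson fit (pi = 0)
plain_ll = log_lik(counts, sum(counts) / len(counts), 0.0)
zip_ll = log_lik(counts, 3.0, 0.3)
```

In practice lam and pi would be estimated by maximum likelihood; the score and likelihood ratio tests of the record compare exactly such nested fits.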

  7. A generalized multivariate regression model for modelling ocean wave heights

    Science.gov (United States)

    Wang, X. L.; Feng, Y.; Swail, V. R.

    2012-04-01

    In this study, a generalized multivariate linear regression model is developed to represent the relationship between 6-hourly ocean significant wave heights (Hs) and the corresponding 6-hourly mean sea level pressure (MSLP) fields. The model is calibrated using the ERA-Interim reanalysis of Hs and MSLP fields for 1981-2000, and is validated using the ERA-Interim reanalysis for 2001-2010 and the ERA40 reanalysis of Hs and MSLP for 1958-2001. The performance of the fitted model is evaluated in terms of the Peirce skill score, frequency bias index, and correlation skill score. Because wave heights are not normally distributed, they are subjected to a data-adaptive Box-Cox transformation before being used in the model fitting. Also, since 6-hourly data are being modelled, lag-1 autocorrelation must be, and is, accounted for. The models with and without Box-Cox transformation, and with and without accounting for autocorrelation, are inter-compared in terms of their prediction skills. The fitted MSLP-Hs relationship is then used to reconstruct the historical wave height climate from the 6-hourly MSLP fields taken from the Twentieth Century Reanalysis (20CR, Compo et al. 2011), and to project possible future wave height climates using CMIP5 model simulations of MSLP fields. The reconstructed and projected wave heights, both seasonal means and maxima, are subject to a trend analysis that allows for non-linear (polynomial) trends.
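The data-adaptive Box-Cox step mentioned here amounts to choosing the power lambda that maximizes a profile log-likelihood; a stdlib-Python sketch (the grid and the simulated data are illustrative):

```python
import math
import random

def boxcox(x, lam):
    """Box-Cox power transform for x > 0; continuous in lam at lam = 0."""
    return math.log(x) if lam == 0.0 else (x ** lam - 1.0) / lam

def boxcox_loglik(xs, lam):
    """Profile log-likelihood of lam, assuming normality after transformation."""
    n = len(xs)
    ys = [boxcox(x, lam) for x in xs]
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * math.log(var) + (lam - 1.0) * sum(map(math.log, xs))

rng = random.Random(7)
# Lognormal-ish positive data: the adaptive choice of lam should land near 0
data = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(300)]
grid = [i / 4 for i in range(-4, 9)]   # lam in {-1.0, -0.75, ..., 2.0}
best_lam = max(grid, key=lambda lam: boxcox_loglik(data, lam))
```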

  8. Poisson-Nernst-Planck-Fermi theory for modeling biological ion channels

    International Nuclear Information System (INIS)

    Liu, Jinn-Liang; Eisenberg, Bob

    2014-01-01

    A Poisson-Nernst-Planck-Fermi (PNPF) theory is developed for studying ionic transport through biological ion channels. Our goal is to deal with the finite size of particles using a Fermi-like distribution without calculating the forces between the particles, because they are both expensive and tricky to compute. We include the steric effect of ions and water molecules with nonuniform sizes and interstitial voids, the correlation effect of crowded ions with different valences, and the screening effect of water molecules in an inhomogeneous aqueous electrolyte. Including the finite volume of water and the voids between particles is an important new part of the theory presented here. Fermi-like distributions of all particle species are derived from the volume exclusion of classical particles. Volume exclusion and the resulting saturation phenomena are especially important to describe the binding and permeation mechanisms of ions in a narrow channel pore. The Gibbs free energy of the Fermi distribution reduces to that of a Boltzmann distribution when these effects are not considered. The classical Gibbs entropy is extended to a new entropy form — called Gibbs-Fermi entropy — that describes mixing configurations of all finite size particles and voids in a thermodynamic system where microstates do not have equal probabilities. The PNPF model describes the dynamic flow of ions, water molecules, as well as voids with electric fields and protein charges. The model also provides a quantitative mean-field description of the charge/space competition mechanism of particles within the highly charged and crowded channel pore. The PNPF results are in good accord with experimental currents recorded in a 10⁸-fold range of Ca²⁺ concentrations. The results illustrate the anomalous mole fraction effect, a signature of L-type calcium channels. Moreover, numerical results concerning water density, dielectric permittivity, void volume, and steric energy provide useful details to

  9. Nonlocal Poisson-Fermi double-layer models: Effects of nonuniform ion sizes on double-layer structure

    Science.gov (United States)

    Xie, Dexuan; Jiang, Yi

    2018-05-01

    This paper reports a nonuniform ionic size nonlocal Poisson-Fermi double-layer model (nuNPF) and a uniform ionic size nonlocal Poisson-Fermi double-layer model (uNPF) for an electrolyte mixture of multiple ionic species, variable voltages on electrodes, and variable induced charges on boundary segments. The finite element solvers of nuNPF and uNPF are developed and applied to typical double-layer tests defined on a rectangular box, a hollow sphere, and a hollow rectangle with a charged post. Numerical results show that nuNPF can significantly improve the quality of the ionic concentrations and electric fields generated from uNPF, implying that the effect of nonuniform ion sizes is a key consideration in modeling the double-layer structure.

  10. Stochastic interest model driven by compound Poisson process and Brownian motion with applications in life contingencies

    Directory of Open Access Journals (Sweden)

    Shilong Li

    2018-03-01

    Full Text Available In this paper, we introduce a class of stochastic interest models driven by a compound Poisson process and a Brownian motion, in which the jumping times of the force of interest obey a compound Poisson process and the continuous tiny fluctuations are described by Brownian motion, and the adjustment in each jump of the interest force is assumed to be random. Based on the proposed interest model, we discuss the expected discounted function, the validity of the model and actuarial present values of life annuities and life insurances under different parameters and distribution settings. Our numerical results show actuarial values could be sensitive to the parameters and distribution settings, which shows the importance of introducing this kind of interest model.
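A stdlib-Python sketch of such a force-of-interest path (compound Poisson jumps plus Brownian fluctuations), integrated to a discount factor by Euler discretization; the function name and all parameter values are illustrative:

```python
import math
import random

def discount_factor(t, delta0, jump_rate, jump_size, sigma, rng, steps=1000):
    """Approximate exp(-integral of delta(s) ds) on [0, t], where the force of
    interest delta(s) has Poisson-arriving random jumps plus sigma * B(s)."""
    dt = t / steps
    delta, brownian, integral = delta0, 0.0, 0.0
    for _ in range(steps):
        if rng.random() < jump_rate * dt:   # a jump of the interest force arrives
            delta += jump_size(rng)         # random adjustment at the jump
        brownian += rng.gauss(0.0, math.sqrt(dt))
        integral += (delta + sigma * brownian) * dt
    return math.exp(-integral)

rng = random.Random(3)
# Sanity check: no jumps, no diffusion reduces to the constant-rate discount
v_flat = discount_factor(10.0, 0.04, 0.0, lambda r: 0.0, 0.0, rng)
# A jumpy, noisy path, as used for Monte Carlo actuarial present values
v_random = discount_factor(10.0, 0.04, 0.5, lambda r: r.uniform(-0.01, 0.02), 0.005, rng)
```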

  11. A Poisson equation formulation for pressure calculations in penalty finite element models for viscous incompressible flows

    Science.gov (United States)

    Sohn, J. L.; Heinrich, J. C.

    1990-01-01

    The calculation of pressures when the penalty-function approximation is used in finite-element solutions of laminar incompressible flows is addressed. A Poisson equation for the pressure is formulated that involves third derivatives of the velocity field. The second derivatives appearing in the weak formulation of the Poisson equation are calculated from the C0 velocity approximation using a least-squares method. The present scheme is shown to be efficient, free of spurious oscillations, and accurate. Examples of applications are given and compared with results obtained using mixed formulations.

  12. A Family of Poisson Processes for Use in Stochastic Models of Precipitation

    Science.gov (United States)

    Penland, C.

    2013-12-01

    Both modified Poisson processes and compound Poisson processes can be relevant to stochastic parameterization of precipitation. This presentation compares the dynamical properties of these systems and discusses the physical situations in which each might be appropriate. If the parameters describing either class of systems originate in hydrodynamics, then proper consideration of stochastic calculus is required during numerical implementation of the parameterization. It is shown here that an improper numerical treatment can have severe implications for estimating rainfall distributions, particularly in the tails of the distributions and, thus, on the frequency of extreme events.

  13. Modeling spiking behavior of neurons with time-dependent Poisson processes.

    Science.gov (United States)

    Shinomoto, S; Tsubo, Y

    2001-10-01

    Three kinds of interval statistics, as represented by the coefficient of variation, the skewness coefficient, and the correlation coefficient of consecutive intervals, are evaluated for three kinds of time-dependent Poisson processes: pulse regulated, sinusoidally regulated, and doubly stochastic. Among these three processes, the sinusoidally regulated and doubly stochastic Poisson processes, in the case when the spike rate varies slowly compared with the mean interval between spikes, are found to be consistent with the three statistical coefficients exhibited by data recorded from neurons in the prefrontal cortex of monkeys.
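The sinusoidally regulated process in this record is an inhomogeneous Poisson process, which can be simulated by Lewis-Shedler thinning; a stdlib-Python sketch with illustrative rate parameters:

```python
import math
import random

def thinning(rate, rate_max, horizon, rng):
    """Simulate an inhomogeneous Poisson process on [0, horizon] by thinning:
    propose events at the constant rate rate_max, keep each with
    probability rate(t) / rate_max."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_max)
        if t > horizon:
            return events
        if rng.random() < rate(t) / rate_max:
            events.append(t)

rng = random.Random(0)
# Sinusoidally regulated spike rate: mean 5 events/unit time, modulation depth 4
spikes = thinning(lambda t: 5.0 + 4.0 * math.sin(2.0 * math.pi * t), 9.0, 100.0, rng)
```

Interval statistics such as the coefficient of variation of consecutive inter-event intervals can then be computed directly from the simulated spike times.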

  14. Poisson-Gaussian Noise Reduction Using the Hidden Markov Model in Contourlet Domain for Fluorescence Microscopy Images

    Science.gov (United States)

    Yang, Sejung; Lee, Byung-Uk

    2015-01-01

    In certain image acquisition processes, such as in fluorescence microscopy or astronomy, only a limited number of photons can be collected due to various physical constraints. The resulting images suffer from signal-dependent noise, which can be modeled as a Poisson distribution, and a low signal-to-noise ratio. However, the majority of research on noise reduction algorithms focuses on signal-independent Gaussian noise. In this paper, we model noise as a combination of Poisson and Gaussian probability distributions to construct a more accurate model and adopt the contourlet transform, which provides a sparse representation of the directional components in images. We also apply hidden Markov models within a framework that neatly describes the spatial and interscale dependencies that are properties of the transform coefficients of natural images. In this paper, an effective denoising algorithm for Poisson-Gaussian noise is proposed using the contourlet transform, hidden Markov models and noise estimation in the transform domain. We supplement the algorithm with cycle spinning and Wiener filtering for further improvements. We finally show experimental results with simulations and fluorescence microscopy images which demonstrate the improved performance of the proposed approach. PMID:26352138
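The mixed noise model is simple to state generatively: each pixel is a Poisson draw at the signal level plus independent Gaussian read noise, so the noise variance grows with the signal (variance is approximately signal plus sigma squared). A stdlib-Python sketch with illustrative values:

```python
import math
import random
import statistics

def rpois(lam, rng):
    """Knuth's method for a Poisson(lam) sample (adequate for modest lam)."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1

def poisson_gaussian(signal, read_sigma, rng):
    """Photon shot noise (Poisson, signal-dependent) plus Gaussian read noise."""
    return rpois(signal, rng) + rng.gauss(0.0, read_sigma)

rng = random.Random(0)
signal, read_sigma = 20.0, 2.0
pixels = [poisson_gaussian(signal, read_sigma, rng) for _ in range(20000)]
# Empirically: mean close to signal, variance close to signal + read_sigma**2
```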

  15. Antibiotic Resistances in Livestock: A Comparative Approach to Identify an Appropriate Regression Model for Count Data

    Directory of Open Access Journals (Sweden)

    Anke Hüls

    2017-05-01

    Full Text Available Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance and structural zero measurements on the resistance outcome as well as on the exposure side are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i) to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model) and (ii) to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
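The two data features that drive the model choice in this record, overdispersion and excess zeros, have quick one-line diagnostics; a stdlib-Python sketch on made-up counts:

```python
import math
import statistics

def count_diagnostics(counts):
    """Return (dispersion, observed zeros, expected zeros): dispersion is
    variance/mean (about 1 under a Poisson model); expected zeros assumes
    a Poisson with the sample mean."""
    m = statistics.mean(counts)
    dispersion = statistics.pvariance(counts) / m
    expected_zeros = len(counts) * math.exp(-m)
    return dispersion, counts.count(0), expected_zeros

# dispersion >> 1 points to negative binomial / quasi-Poisson models;
# observed zeros far above expected zeros points to zero-inflated / hurdle models
d, obs_zeros, exp_zeros = count_diagnostics([0, 0, 1, 2, 1])
```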

  16. Identification of Influential Points in a Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Jan Grosz

    2011-03-01

    Full Text Available The article deals with the detection and identification of influential points in the linear regression model. Three methods of detection of outliers and leverage points are described. These procedures can also be used for one-sample (independent) datasets. This paper briefly describes theoretical aspects of several robust methods as well. Robust statistics is a powerful tool to increase the reliability and accuracy of statistical modelling and data analysis. A simulation model of the simple linear regression is presented.
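For the simple linear regression case discussed here, leverage and Cook's distance (two standard influence diagnostics) have closed forms; a stdlib-Python sketch on a toy dataset whose last point is deliberately influential:

```python
def cooks_distance(xs, ys):
    """Cook's distances for simple OLS y = a + b*x, via closed-form leverages."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    b = sum((x - xbar) * y for x, y in zip(xs, ys)) / sxx
    a = sum(ys) / n - b * xbar
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in resid) / (n - 2)          # p = 2 parameters
    lev = [1.0 / n + (x - xbar) ** 2 / sxx for x in xs]
    return [e * e * h / (2 * mse * (1 - h) ** 2)
            for e, h in zip(resid, lev)]

# The last point (x=10, y=0) combines high leverage with a large residual
d = cooks_distance([1.0, 2.0, 3.0, 4.0, 10.0], [1.0, 2.0, 3.0, 4.0, 0.0])
```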

  17. Disease Mapping and Regression with Count Data in the Presence of Overdispersion and Spatial Autocorrelation: A Bayesian Model Averaging Approach

    Science.gov (United States)

    Mohebbi, Mohammadreza; Wolfe, Rory; Forbes, Andrew

    2014-01-01

    This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when overdispersion and residual spatial autocorrelation are present. These considerations lead to nine regression models derived from using three probability distributions for count data: Poisson, generalised Poisson and negative binomial, and three different autocorrelation structures. We employ the framework of Bayesian variable selection and a Gibbs sampling based technique to identify significant cancer risk factors. The framework deals with situations where the number of possible models based on different combinations of candidate explanatory variables is large enough such that calculation of posterior probabilities for all models is difficult or infeasible. The evidence from applying the modelling methodology suggests that modelling strategies based on the use of generalised Poisson and negative binomial with spatial autocorrelation work well and provide a robust basis for inference. PMID:24413702

  18. Detection of epistatic effects with logic regression and a classical linear regression model.

    Science.gov (United States)

    Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata

    2014-02-01

    To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, the Cockerham approach is often not capable of identifying them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though with the logic regression approach a larger number of models has to be considered (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase in power to detect such interactions compared to the Cockerham approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.

  19. Birth and Death Process Modeling Leads to the Poisson Distribution: A Journey Worth Taking

    Science.gov (United States)

    Rash, Agnes M.; Winkel, Brian J.

    2009-01-01

    This paper describes details of development of the general birth and death process from which we can extract the Poisson process as a special case. This general process is appropriate for a number of courses and units in courses and can enrich the study of mathematics for students as it touches and uses a diverse set of mathematical topics, e.g.,…

  20. Random regression models for detection of gene by environment interaction

    Directory of Open Access Journals (Sweden)

    Meuwissen Theo HE

    2007-02-01

    Full Text Available Abstract Two random regression models, where the effect of a putative QTL was regressed on an environmental gradient, are described. The first model estimates the correlation between intercept and slope of the random regression, while the other model restricts this correlation to 1 or -1, which is expected under a bi-allelic QTL model. The random regression models were compared to a model assuming no gene by environment interactions. The comparison was done with regard to the models' ability to detect QTL, to position them accurately and to detect possible QTL by environment interactions. A simulation study based on a granddaughter design was conducted, and QTL were assumed, either by assigning an effect independent of the environment or as a linear function of a simulated environmental gradient. It was concluded that the random regression models were suitable for detection of QTL effects, in the presence and absence of interactions with environmental gradients. Fixing the correlation between intercept and slope of the random regression had a positive effect on power when the QTL effects re-ranked between environments.

  1. Multiple regression and beyond an introduction to multiple regression and structural equation modeling

    CERN Document Server

    Keith, Timothy Z

    2014-01-01

    Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another Also includes path analysis, confirmatory factor analysis, and latent growth modeling Figures and tables throughout provide examples and illustrate key concepts and techniques For additional resources, please visit: http://tzkeith.com/.

  2. Leptospirosis disease mapping with standardized morbidity ratio and Poisson-Gamma model: An analysis of Leptospirosis disease in Kelantan, Malaysia

    Science.gov (United States)

    Che Awang, Aznida; Azah Samat, Nor

    2017-09-01

    Leptospirosis is a disease caused by infection with pathogenic species from the genus Leptospira. Humans can be infected by leptospirosis from direct or indirect exposure to the urine of infected animals. The excretion of urine from an animal host that carries pathogenic Leptospira causes the soil or water to be contaminated. Therefore, people can become infected when they are exposed to contaminated soil and water through a cut on the skin or an open wound. The bacteria can also enter the human body through mucous membranes such as the nose, eyes and mouth, for example by splashing contaminated water or urine into the eyes or swallowing contaminated water or food. Currently, there is no vaccine available for the prevention or treatment of leptospirosis, but the disease can be treated if it is diagnosed early to avoid any complication. Disease risk mapping is important for the control and prevention of disease, and a good choice of statistical model will produce a good disease risk map. Therefore, the aim of this study is to estimate the relative risk for leptospirosis disease based initially on the most common statistic used in disease mapping, the Standardized Morbidity Ratio (SMR), and then on the Poisson-gamma model. This paper begins by providing a review of the SMR method and the Poisson-gamma model, which we then applied to leptospirosis data of Kelantan, Malaysia. Both results are displayed and compared using graphs, tables and maps. The result shows that the second method, the Poisson-gamma model, produces better relative risk estimates than the SMR method. This is because the Poisson-gamma model overcomes the drawback of SMR, where the relative risk becomes zero when there are no observed leptospirosis cases in a region. However, the Poisson-gamma model also has problems: covariate adjustment for this model is difficult and there is no possibility of allowing for spatial correlation between risks in neighbouring areas. The problems of this model have
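The zero-count drawback of the SMR, and the Poisson-gamma fix for it, can be shown in a few lines; a stdlib-Python sketch in which the counts, expected cases and gamma hyperparameters are all illustrative:

```python
def smr(observed, expected):
    """Standardized Morbidity Ratio per region: O_i / E_i."""
    return [o / e for o, e in zip(observed, expected)]

def poisson_gamma(observed, expected, alpha, beta):
    """Empirical-Bayes relative risk under a gamma(alpha, beta) prior:
    the posterior mean is (O_i + alpha) / (E_i + beta)."""
    return [(o + alpha) / (e + beta) for o, e in zip(observed, expected)]

obs, exp_cases = [12, 0, 5], [10.0, 4.0, 6.0]
raw = smr(obs, exp_cases)                       # region 2 collapses to 0
smoothed = poisson_gamma(obs, exp_cases, 1.0, 1.0)   # stays positive everywhere
```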

  3. Poisson-Fermi modeling of ion activities in aqueous single and mixed electrolyte solutions at variable temperature

    Science.gov (United States)

    Liu, Jinn-Liang; Eisenberg, Bob

    2018-02-01

    The combinatorial explosion of empirical parameters, numbering in the tens of thousands, presents a tremendous challenge for extended Debye-Hückel models to calculate activity coefficients of aqueous mixtures of the most important salts in chemistry. The explosion of parameters originates from the phenomenological extension of the Debye-Hückel theory that does not take steric and correlation effects of ions and water into account. By contrast, the Poisson-Fermi theory developed in recent years treats ions and water molecules as nonuniform hard spheres of any size with interstitial voids and includes ion-water and ion-ion correlations. We present a Poisson-Fermi model and numerical methods for calculating the individual or mean activity coefficient of electrolyte solutions with any arbitrary number of ionic species in a large range of salt concentrations and temperatures. For each activity-concentration curve, we show that the Poisson-Fermi model requires at most three unchanging parameters to fit the corresponding experimental data well. The three parameters are associated with the Born radius of the solvation energy of an ion in electrolyte solution, which changes with salt concentration in a highly nonlinear manner.

  4. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu; Pourahmadi, Mohsen; Maadooliat, Mehdi

    2014-01-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both

  5. Correlation-regression model for physico-chemical quality of ...

    African Journals Online (AJOL)

    abusaad

    areas, suggesting that groundwater quality in urban areas is closely related with land use ... the ground water, with correlation and regression model is also presented. ...... WHO (World Health Organization) (1985). Health hazards from nitrates.

  6. Macroscopic Modeling of a One-Dimensional Electrochemical Cell using the Poisson-Nernst-Planck Equations

    Science.gov (United States)

    Yan, David

    This thesis presents the one-dimensional equations, numerical method and simulations of a model to characterize the dynamical operation of an electrochemical cell. This model extends the current state of the art in that it accounts, in a primitive way, for the physics of the electrolyte/electrode interface and incorporates diffuse-charge dynamics, temperature coupling, surface coverage, and polarization phenomena. The one-dimensional equations account for a system with one or two mobile ions of opposite charge, and the electrode reaction we consider (when one is needed) is a one-electron electrodeposition reaction. Though the modeled system is far from representing a realistic electrochemical device, our results show a range of dynamics and behaviors which have not been observed previously, and explore the numerical challenges required when adding more complexity to a model. Furthermore, the basic transport equations (which are developed in three spatial dimensions) can in future accommodate the inclusion of additional physics, and coupling to more complex boundary conditions that incorporate two-dimensional surface phenomena and multi-rate reactions. In the model, the Poisson-Nernst-Planck equations are used to model diffusion and electromigration in an electrolyte, and the generalized Frumkin-Butler-Volmer equation is used to model reaction kinetics at electrodes. An energy balance equation is derived and coupled to the diffusion-migration equation. The model also includes dielectric polarization effects by introducing different values of the dielectric permittivity in different regions of the bulk, as well as accounting for surface coverage effects due to adsorption, and finite size "crowding", or steric effects. Advection effects are not modeled but could in future be incorporated. In order to solve the coupled PDEs, we use a variable step size second order scheme in time and finite differencing in space. Numerical tests are performed on a simplified system and

  7. Wavelet regression model in forecasting crude oil price

    Science.gov (United States)

    Hamid, Mohd Helmie; Shabri, Ani

    2017-05-01

    This study presents the performance of the wavelet multiple linear regression (WMLR) technique in daily crude oil forecasting. The WMLR model was developed by integrating the discrete wavelet transform (DWT) and the multiple linear regression (MLR) model. The original time series was decomposed into sub-time series with different scales by wavelet theory. Correlation analysis was conducted to assist in the selection of optimal decomposed components as inputs for the WMLR model. The daily WTI crude oil price series has been used in this study to test the prediction capability of the proposed model. The forecasting performance of the WMLR model was also compared with regular multiple linear regression (MLR), autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroscedasticity (GARCH) models using root mean square error (RMSE) and mean absolute error (MAE). Based on the experimental results, it appears that the WMLR model performs better than the other forecasting techniques tested in this study.

  8. Real estate value prediction using multivariate regression models

    Science.gov (United States)

    Manjula, R.; Jain, Shubham; Srivastava, Sharad; Rajiv Kher, Pranav

    2017-11-01

    The real estate market is one of the most competitive in terms of pricing, and prices tend to vary significantly based on many factors; hence it becomes one of the prime fields in which to apply the concepts of machine learning to optimize and predict prices with high accuracy. Therefore, in this paper, we present various important features to use while predicting housing prices with good accuracy. We have described regression models using various features to achieve a lower residual sum of squares error. When using features in a regression model, some feature engineering is required for better prediction. Often a set of features (multiple regression) or polynomial regression (applying various powers of the features) is used for a better model fit. Because these models are expected to be susceptible to overfitting, ridge regression is used to reduce it. This paper thus points to the best application of regression models, in addition to other techniques, to optimize the result.
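The ridge step described above has the closed form beta = (X'X + lam*I)^(-1) X'y; a stdlib-Python sketch (a small Gaussian elimination stands in for a linear-algebra library, and the toy data are illustrative):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ridge_fit(X, y, lam):
    """beta = (X'X + lam*I)^(-1) X'y; lam = 0 recovers ordinary least squares."""
    n, p = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(p)]
    return solve(XtX, Xty)

# Toy data on an exact line y = 1 + 2x: OLS recovers it, ridge shrinks it
X = [[1.0, float(x)] for x in range(5)]
y = [1.0 + 2.0 * x for x in range(5)]
ols = ridge_fit(X, y, 0.0)
shrunk = ridge_fit(X, y, 5.0)
```

For a larger penalty the coefficient vector is pulled further toward zero, which is exactly the overfitting control the paragraph above appeals to.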

  9. Application of random regression models to the genetic evaluation ...

    African Journals Online (AJOL)

    The model included fixed regression on AM (range from 30 to 138 mo) and the effect of herd-measurement date concatenation. Random parts of the model were RRM coefficients for additive and permanent environmental effects, while residual effects were modelled to account for heterogeneity of variance by AY. Estimates ...

  10. The APT model as reduced-rank regression

    NARCIS (Netherlands)

    Bekker, P.A.; Dobbelstein, P.; Wansbeek, T.J.

    Integrating the two steps of an arbitrage pricing theory (APT) model leads to a reduced-rank regression (RRR) model. So the results on RRR can be used to estimate APT models, making estimation very simple. We give a succinct derivation of estimation of RRR, derive the asymptotic variance of RRR

  11. Microergodicity effects on ebullition of methane modelled by Mixed Poisson process with Pareto mixing variable

    Czech Academy of Sciences Publication Activity Database

    Jordanova, P.; Dušek, Jiří; Stehlík, M.

    2013-01-01

    Roč. 128, OCT 15 (2013), s. 124-134 ISSN 0169-7439 R&D Projects: GA ČR(CZ) GAP504/11/1151; GA MŠk(CZ) ED1.1.00/02.0073 Institutional support: RVO:67179843 Keywords : environmental chemistry * ebullition of methane * mixed poisson processes * renewal process * pareto distribution * moving average process * robust statistics * sedge–grass marsh Subject RIV: EH - Ecology, Behaviour Impact factor: 2.381, year: 2013

  12. Predictions and implications of a poisson process model to describe corrosion of transuranic waste drums

    International Nuclear Information System (INIS)

    Lyon, B.F.; Holmes, J.A.; Wilbert, K.A.

    1995-01-01

    A risk assessment methodology is described in this paper to compare risks associated with immediate or near-term retrieval of transuranic (TRU) waste drums from bermed storage versus delayed retrieval. Assuming a Poisson process adequately describes corrosion, significant breaching of drums is expected to begin at ~15 and 24 yr for pitting and general corrosion, respectively. Because of this breaching, more risk will be incurred by delayed than by immediate retrieval
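Under the homogeneous Poisson assumption used here, the probability that a drum suffers at least one breach by time t is 1 - exp(-lambda*t); a stdlib-Python sketch in which the rate is an illustrative value calibrated so that half the drums are breached by 15 yr:

```python
import math

def breach_probability(rate, t):
    """P(at least one Poisson event in [0, t]) = 1 - exp(-rate * t)."""
    return 1.0 - math.exp(-rate * t)

rate = math.log(2.0) / 15.0   # illustrative: median time to first breach = 15 yr
p15 = breach_probability(rate, 15.0)   # 0.5 by construction
p30 = breach_probability(rate, 30.0)   # 1 - (1/2)**2 = 0.75
```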

  13. Alternative regression models to assess increase in childhood BMI

    Directory of Open Access Journals (Sweden)

    Mansmann Ulrich

    2008-09-01

    Full Text Available Abstract Background Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Methods Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared, including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. Results GAMLSS showed a much better fit regarding the estimation of risk factor effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely significantly associated with body composition in any model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely significantly associated with body composition in GLM models, but in GAMLSS and partly in quantile regression models. Risk factor specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. Conclusion GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.

  14. Alternative regression models to assess increase in childhood BMI.

    Science.gov (United States)

    Beyerlein, Andreas; Fahrmeir, Ludwig; Mansmann, Ulrich; Toschke, André M

    2008-09-08

    Body mass index (BMI) data usually have skewed distributions, for which common statistical modeling approaches such as simple linear or logistic regression have limitations. Different regression approaches to predict childhood BMI by goodness-of-fit measures and means of interpretation were compared, including generalized linear models (GLMs), quantile regression and Generalized Additive Models for Location, Scale and Shape (GAMLSS). We analyzed data of 4967 children participating in the school entry health examination in Bavaria, Germany, from 2001 to 2002. TV watching, meal frequency, breastfeeding, smoking in pregnancy, maternal obesity, parental social class and weight gain in the first 2 years of life were considered as risk factors for obesity. GAMLSS showed a much better fit regarding the estimation of risk factor effects on transformed and untransformed BMI data than common GLMs with respect to the generalized Akaike information criterion. In comparison with GAMLSS, quantile regression allowed for additional interpretation of prespecified distribution quantiles, such as quantiles referring to overweight or obesity. The variables TV watching, maternal BMI and weight gain in the first 2 years were directly, and meal frequency was inversely, significantly associated with body composition in every model type examined. In contrast, smoking in pregnancy was not directly, and breastfeeding and parental social class were not inversely, significantly associated with body composition in GLM models, but were in GAMLSS and partly in quantile regression models. Risk-factor-specific BMI percentile curves could be estimated from GAMLSS and quantile regression models. GAMLSS and quantile regression seem to be more appropriate than common GLMs for risk factor modeling of BMI data.
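
    The contrast the abstract draws between mean-based GLMs and quantile regression can be illustrated with a toy heteroscedastic dataset (synthetic data, not the Bavarian BMI sample; the pinball-loss fit below is a minimal subgradient sketch, not the GAMLSS or quantile-regression machinery used in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(0, 10, n)
# synthetic "BMI-like" outcome whose spread grows with the covariate, so
# upper quantiles have a steeper slope than the median
y = 15.0 + 0.5 * age + rng.normal(0, 0.2 + 0.1 * age, n)

def fit_quantile(x, y, q, lr=0.01, iters=20000):
    """Linear quantile regression via subgradient descent on the pinball loss."""
    a, b = y.mean(), 0.0
    for _ in range(iters):
        r = y - (a + b * x)
        g = np.where(r > 0, -q, 1.0 - q)   # subgradient of the pinball loss
        a -= lr * g.mean()
        b -= lr * (g * x).mean()
    return a, b

a50, b50 = fit_quantile(age, y, 0.5)   # median slope, close to 0.5
a90, b90 = fit_quantile(age, y, 0.9)   # 90th-percentile slope is steeper
```

    A mean regression would report a single slope and miss exactly this divergence of the upper quantiles, which is what makes quantile regression attractive for overweight/obesity cut-offs.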

  15. Robust mislabel logistic regression without modeling mislabel probabilities.

    Science.gov (United States)

    Hung, Hung; Jou, Zhi-Yu; Huang, Su-Yun

    2018-03-01

    Logistic regression is among the most widely used statistical methods for linear discriminant analysis. In many applications, we only observe possibly mislabeled responses. Fitting a conventional logistic regression can then lead to biased estimation. One common resolution is to fit a mislabel logistic regression model, which takes mislabeled responses into consideration. Another common method is to adopt a robust M-estimation by down-weighting suspected instances. In this work, we propose a new robust mislabel logistic regression based on γ-divergence. Our proposal possesses two advantageous features: (1) It does not need to model the mislabel probabilities. (2) The minimum γ-divergence estimation leads to a weighted estimating equation without the need to include any bias correction term; that is, it is automatically bias-corrected. These features make the proposed γ-logistic regression more robust in model fitting and more intuitive for model interpretation through a simple weighting scheme. Our method is also easy to implement, and two types of algorithms are included. Simulation studies and the Pima data application are presented to demonstrate the performance of γ-logistic regression. © 2017, The International Biometric Society.
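
    The weighting idea can be sketched crudely: fit a logistic model, down-weight observations whose observed label the fit finds improbable (weights proportional to the fitted likelihood raised to a power γ), and refit. This is synthetic data and a simplified stand-in for the paper's γ-divergence estimating equations, not a faithful implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(0.5 + 2.0 * x)))
y = rng.binomial(1, p_true).astype(float)
# contaminate: flip 15% of the positive labels to zero
flip = (y == 1) & (rng.random(n) < 0.15)
y_obs = np.where(flip, 0.0, y)

def fit_logistic(x, y, w=None, iters=5000, lr=0.3):
    """Weighted logistic regression by gradient ascent on the log-likelihood."""
    if w is None:
        w = np.ones_like(y)
    b0 = b1 = 0.0
    for _ in range(iters):
        p = 1 / (1 + np.exp(-(b0 + b1 * x)))
        g = w * (y - p)
        b0 += lr * g.mean()
        b1 += lr * (g * x).mean()
    return b0, b1

b0, b1 = fit_logistic(x, y_obs)                  # slope attenuated by mislabels
p_hat = 1 / (1 + np.exp(-(b0 + b1 * x)))
lik = np.where(y_obs == 1, p_hat, 1 - p_hat)     # fitted prob. of observed label
b0r, b1r = fit_logistic(x, y_obs, w=lik ** 1.0)  # one down-weighting pass
```

    The refit slope moves back toward the clean-data value because the flipped high-`x` zeros receive small weights; the actual method iterates this weighting inside the estimating equation.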

  16. A probabilistic Poisson-based model accounts for an extensive set of absolute auditory threshold measurements.

    Science.gov (United States)

    Heil, Peter; Matysiak, Artur; Neubauer, Heinrich

    2017-09-01

    Thresholds for detecting sounds in quiet decrease with increasing sound duration in every species studied. The neural mechanisms underlying this trade-off, often referred to as temporal integration, are not fully understood. Here, we probe the human auditory system with a large set of tone stimuli differing in duration, shape of the temporal amplitude envelope, duration of silent gaps between bursts, and frequency. Duration was varied by varying the plateau duration of plateau-burst (PB) stimuli, the duration of the onsets and offsets of onset-offset (OO) stimuli, and the number of identical bursts of multiple-burst (MB) stimuli. Absolute thresholds for a large number of ears (>230) were measured using a 3-interval-3-alternative forced choice (3I-3AFC) procedure. Thresholds decreased with increasing sound duration in a manner that depended on the temporal envelope. Most commonly, thresholds for MB stimuli were highest followed by thresholds for OO and PB stimuli of corresponding durations. Differences in the thresholds for MB and OO stimuli and in the thresholds for MB and PB stimuli, however, varied widely across ears, were negative in some ears, and were tightly correlated. We show that the variation and correlation of MB-OO and MB-PB threshold differences are linked to threshold microstructure, which affects the relative detectability of the sidebands of the MB stimuli and affects estimates of the bandwidth of auditory filters. We also found that thresholds for MB stimuli increased with increasing duration of the silent gaps between bursts. We propose a new model and show that it accurately accounts for our results and does so considerably better than a leaky-integrator-of-intensity model and a probabilistic model proposed by others. Our model is based on the assumption that sensory events are generated by a Poisson point process with a low rate in the absence of stimulation and higher, time-varying rates in the presence of stimulation. A subject in a 3I-3AFC
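
    The core assumption, detection driven by a Poisson point process with a low spontaneous rate and a higher rate during stimulation, can be illustrated with a toy simulation of a 3I-3AFC trial in which the subject picks the interval containing the most sensory events. All rates here are invented; this is not the authors' fitted model:

```python
import numpy as np

rng = np.random.default_rng(2)

def prop_correct(extra_rate, lam0=1.0, trials=20000):
    """Proportion correct in a simulated 3I-3AFC task: two noise intervals
    with Poisson rate lam0, one signal interval with rate lam0 + extra_rate;
    the subject chooses the interval with the most events (random tie-break)."""
    counts = np.column_stack([
        rng.poisson(lam0 + extra_rate, trials),   # signal interval
        rng.poisson(lam0, trials),                # noise interval 1
        rng.poisson(lam0, trials),                # noise interval 2
    ]).astype(float)
    counts += rng.uniform(0, 0.5, counts.shape)   # break ties at random
    return np.mean(counts.argmax(axis=1) == 0)

pc_chance = prop_correct(0.0)   # no signal: performance at chance (~1/3)
pc_weak = prop_correct(1.0)
pc_strong = prop_correct(4.0)
```

    Longer or more intense stimuli raise the expected event count in the signal interval, which is the mechanism the model uses to account for the duration-threshold trade-off.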

  17. A multi-Poisson dynamic mixture model to cluster developmental patterns of gene expression by RNA-seq.

    Science.gov (United States)

    Ye, Meixia; Wang, Zhong; Wang, Yaqun; Wu, Rongling

    2015-03-01

    Dynamic changes of gene expression reflect an intrinsic mechanism of how an organism responds to developmental and environmental signals. With the increasing availability of expression data across a time-space scale by RNA-seq, the classification of genes by their biological function using RNA-seq data has become one of the most significant challenges in contemporary biology. Here we develop a clustering mixture model to discover distinct groups of genes expressed during a period of organ development. By integrating the density function of the multivariate Poisson distribution, the model accommodates the discrete property of read counts characteristic of RNA-seq data. The temporal dependence of gene expression is modeled by a first-order autoregressive process. The model is implemented with the Expectation-Maximization algorithm and model selection to determine the optimal number of gene clusters and to obtain the estimates of the Poisson parameters that describe the pattern of time-dependent expression of genes from each cluster. The model has been demonstrated by analyzing real data from an experiment aimed at linking the pattern of gene expression to catkin development in white poplar. The usefulness of the model has been validated through computer simulation. The model provides a valuable tool for clustering RNA-seq data, facilitating our global view of expression dynamics and understanding of gene regulation mechanisms. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
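
    The clustering idea, a mixture of Poisson distributions fitted by EM, can be sketched for two clusters of genes with different mean-count profiles over four time points. This toy version drops the paper's first-order autoregressive dependence and model-selection step, and the profiles are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
profiles = np.array([[2.0, 4.0, 8.0, 16.0],      # "rising" expression profile
                     [10.0, 10.0, 10.0, 10.0]])  # flat profile
z_true = rng.integers(0, 2, 300)
counts = rng.poisson(profiles[z_true])           # genes x time points, read counts

def poisson_mixture_em(x, k=2, iters=100):
    n, t = x.shape
    # deterministic, spread-out initialization of the component rates
    lam = np.quantile(x, np.linspace(0.25, 0.75, k), axis=0) + 0.5
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibilities from Poisson log-likelihoods
        # (the log(x!) term is constant across components and can be dropped)
        ll = (x[:, None, :] * np.log(lam[None]) - lam[None]).sum(axis=2)
        ll += np.log(pi)[None]
        r = np.exp(ll - ll.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: mixing weights and responsibility-weighted mean rates
        pi = r.mean(axis=0)
        lam = (r.T @ x) / r.sum(axis=0)[:, None] + 1e-9
    return pi, lam, r

pi, lam, r = poisson_mixture_em(counts)
labels = r.argmax(axis=1)       # hard cluster assignment per gene
```

    Running EM for several values of `k` and comparing an information criterion would recover the model-selection step the paper uses to choose the number of clusters.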

  18. Linear regression models for quantitative assessment of left ...

    African Journals Online (AJOL)

    Changes in left ventricular structures and function have been reported in cardiomyopathies. No prediction models have been established in this environment. This study established regression models for prediction of left ventricular structures in normal subjects. A sample of normal subjects was drawn from a large urban ...

  19. Geographically Weighted Logistic Regression Applied to Credit Scoring Models

    Directory of Open Access Journals (Sweden)

    Pedro Henrique Melo Albuquerque

    Full Text Available Abstract This study used real data from a Brazilian financial institution on transactions involving Consumer Direct Credit (CDC), granted to clients residing in the Distrito Federal (DF), to construct credit scoring models via Logistic Regression and Geographically Weighted Logistic Regression (GWLR) techniques. The aims were: to verify whether the factors that influence credit risk differ according to the borrower’s geographic location; to compare the set of models estimated via GWLR with the global model estimated via Logistic Regression, in terms of predictive power and financial losses for the institution; and to verify the viability of using the GWLR technique to develop credit scoring models. The metrics used to compare the models developed via the two techniques were the AICc informational criterion, the accuracy of the models, the percentage of false positives, the sum of the value of false positive debt, and the expected monetary value of portfolio default compared with the monetary value of defaults observed. The models estimated for each region in the DF were distinct in their variables and coefficients (parameters), and it was concluded that credit risk was influenced differently in each region in the study. The Logistic Regression and GWLR methodologies presented very close results, in terms of predictive power and financial losses for the institution, and the study demonstrated the viability of using the GWLR technique to develop credit scoring models for the target population in the study.
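
    The GWLR idea, a separate logistic fit at each focal location with observations weighted by a spatial kernel, can be sketched in one spatial dimension. The locations, bandwidth and the spatially varying effect are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
loc = rng.uniform(0, 1, n)                 # borrower "location" on a line
x = rng.normal(size=n)                     # one credit attribute
beta_loc = np.where(loc < 0.5, -1.0, 2.0)  # effect of x differs by region
p = 1 / (1 + np.exp(-beta_loc * x))
y = rng.binomial(1, p).astype(float)

def local_slope(u0, bandwidth=0.1, iters=4000, lr=0.5):
    """Logistic slope at focal location u0, with Gaussian-kernel spatial weights."""
    w = np.exp(-0.5 * ((loc - u0) / bandwidth) ** 2)
    b0 = b1 = 0.0
    for _ in range(iters):
        pr = 1 / (1 + np.exp(-(b0 + b1 * x)))
        g = w * (y - pr)
        b0 += lr * g.sum() / w.sum()
        b1 += lr * (g * x).sum() / w.sum()
    return b1

slope_west, slope_east = local_slope(0.2), local_slope(0.8)
# a single global logistic fit would blur these two regimes together
```

    Recovering opposite-signed local slopes is exactly the situation in which the study found region-specific models informative.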

  20. Physics constrained nonlinear regression models for time series

    International Nuclear Information System (INIS)

    Majda, Andrew J; Harlim, John

    2013-01-01

    A central issue in contemporary science is the development of data driven statistical nonlinear dynamical models for time series of partial observations of nature or a complex physical model. It has been established recently that ad hoc quadratic multi-level regression (MLR) models can have finite-time blow up of statistical solutions and/or pathological behaviour of their invariant measure. Here a new class of physics constrained multi-level quadratic regression models are introduced, analysed and applied to build reduced stochastic models from data of nonlinear systems. These models have the advantages of incorporating memory effects in time as well as the nonlinear noise from energy conserving nonlinear interactions. The mathematical guidelines for the performance and behaviour of these physics constrained MLR models as well as filtering algorithms for their implementation are developed here. Data driven applications of these new multi-level nonlinear regression models are developed for test models involving a nonlinear oscillator with memory effects and the difficult test case of the truncated Burgers–Hopf model. These new physics constrained quadratic MLR models are proposed here as process models for Bayesian estimation through Markov chain Monte Carlo algorithms of low frequency behaviour in complex physical data. (paper)

  1. Geographically weighted negative binomial regression applied to zonal level safety performance models.

    Science.gov (United States)

    Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo

    2017-09-01

    Generalized Linear Models (GLM) with a negative binomial distribution for errors have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR is a system that deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in observations of road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to take note of the overdispersion of the data: the first assumes a constant overdispersion for all the traffic zones and the second includes an overdispersion variable for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza/Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated using the frequency of injury crashes as the dependent variable, and the results showed that GWPR and GWNBR achieved a better performance than GLM for the average residuals and likelihood, as well as reducing the spatial autocorrelation of the residuals; the GWNBR model was more able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
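
    The overdispersion that motivates moving from Poisson to negative binomial can be shown numerically: negative binomial counts have variance exceeding the mean, and the quadratic excess gives a simple method-of-moments estimate of the dispersion parameter. All numbers below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(5)
mu, alpha = 5.0, 0.6                    # Var(Y) = mu + alpha * mu**2
# numpy's negative_binomial(n, p) has mean n*(1-p)/p, so n = 1/alpha and
# p = 1/(1 + alpha*mu) reproduce the (mu, alpha) parameterization
crashes = rng.negative_binomial(1 / alpha, 1 / (1 + alpha * mu), size=10000)

m, v = crashes.mean(), crashes.var()
alpha_hat = (v - m) / m**2              # method-of-moments dispersion estimate
# a Poisson model (Var = mean) is clearly contradicted here: v >> m
```

    Fitting a Poisson model to such counts understates the standard errors, which is the failure of GWPR that the GWNBR variants are designed to fix.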

  2. Forecasting daily meteorological time series using ARIMA and regression models

    Science.gov (United States)

    Murat, Małgorzata; Malinowska, Iwona; Gos, Magdalena; Krzyszczak, Jaromir

    2018-04-01

    The daily air temperature and precipitation time series recorded between January 1, 1980 and December 31, 2010 at four European sites (Jokioinen, Dikopshof, Lleida and Lublin) from different climatic zones were modeled and forecasted. In our forecasting we used the Box-Jenkins and Holt-Winters methods, seasonal autoregressive integrated moving-average models, autoregressive integrated moving-average models with external regressors in the form of Fourier terms, and time series regression including trend and seasonality components, all implemented with R software. It was demonstrated that the obtained models are able to capture the dynamics of the time series data and to produce sensible forecasts.
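
    The time-series-regression variant, a linear trend plus Fourier terms at the annual period, reduces to ordinary least squares on a small design matrix. A sketch on synthetic daily temperatures (made-up coefficients, not the station data):

```python
import numpy as np

rng = np.random.default_rng(6)
t = np.arange(4 * 365)                      # four years of daily observations
temp = (10.0 + 0.001 * t
        + 8.0 * np.sin(2 * np.pi * t / 365.25 - 1.5)
        + rng.normal(0, 2.0, t.size))

# design matrix: intercept, linear trend, one Fourier pair (annual period)
X = np.column_stack([np.ones(t.size), t,
                     np.sin(2 * np.pi * t / 365.25),
                     np.cos(2 * np.pi * t / 365.25)])
coef, *_ = np.linalg.lstsq(X, temp, rcond=None)
fitted = X @ coef
amplitude = np.hypot(coef[2], coef[3])      # recovered seasonal amplitude
r2 = 1 - np.sum((temp - fitted) ** 2) / np.sum((temp - temp.mean()) ** 2)
```

    The same Fourier columns can be supplied as external regressors to an ARIMA model, which is the hybrid the paper evaluates alongside pure Box-Jenkins fits.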

  3. Multiple Response Regression for Gaussian Mixture Models with Known Labels.

    Science.gov (United States)

    Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng

    2012-12-01

    Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.

  4. Thermal Efficiency Degradation Diagnosis Method Using Regression Model

    International Nuclear Information System (INIS)

    Jee, Chang Hyun; Heo, Gyun Young; Jang, Seok Won; Lee, In Cheol

    2011-01-01

    This paper proposes an idea for thermal efficiency degradation diagnosis in turbine cycles, which is based on turbine cycle simulation under abnormal conditions and a linear regression model. The correlation between the inputs for representing degradation conditions (normally unmeasured but intrinsic states) and the simulation outputs (normally measured but superficial states) was analyzed with the linear regression model. The regression models can inversely response an associated intrinsic state for a superficial state observed from a power plant. The diagnosis method proposed herein is classified into three processes, 1) simulations for degradation conditions to get measured states (referred as what-if method), 2) development of the linear model correlating intrinsic and superficial states, and 3) determination of an intrinsic state using the superficial states of current plant and the linear regression model (referred as inverse what-if method). The what-if method is to generate the outputs for the inputs including various root causes and/or boundary conditions whereas the inverse what-if method is the process of calculating the inverse matrix with the given superficial states, that is, component degradation modes. The method suggested in this paper was validated using the turbine cycle model for an operating power plant
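
    The forward ("what-if") and inverse ("inverse what-if") steps can be sketched with a linear toy: simulate superficial measurements from sampled intrinsic degradation states, regress to obtain the linear map, then pseudo-invert it to read an intrinsic state off a new measurement. The sensitivity matrix is invented, not taken from any turbine-cycle simulation:

```python
import numpy as np

rng = np.random.default_rng(7)
A = np.array([[1.2, 0.3],        # made-up sensitivity of 3 measured states
              [0.1, 0.9],        # to 2 intrinsic degradation states
              [0.5, 0.4]])

# "what-if": simulate measured states under sampled degradation conditions
intrinsic = rng.uniform(0, 1, (200, 2))
measured = intrinsic @ A.T + rng.normal(0, 0.01, (200, 3))

# fit the linear regression model  measured ~ intrinsic
B, *_ = np.linalg.lstsq(intrinsic, measured, rcond=None)   # (2, 3) coefficient map

# "inverse what-if": recover the intrinsic state behind a new observation
state_true = np.array([0.7, 0.2])
obs = state_true @ A.T                     # superficial states seen at the plant
state_hat = obs @ np.linalg.pinv(B)
```

    The pseudo-inverse plays the role of the paper's inverse matrix calculation: it maps observed superficial states back to the component degradation mode that best explains them.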

  5. Regression modeling strategies with applications to linear models, logistic and ordinal regression, and survival analysis

    CERN Document Server

    Harrell, Jr, Frank E

    2015-01-01

    This highly anticipated second edition features new chapters and sections, 225 new references, and comprehensive R software. In keeping with the previous edition, this book is about the art and science of data analysis and predictive modeling, which entails choosing and using multiple tools. Instead of presenting isolated techniques, this text emphasizes problem solving strategies that address the many issues arising when developing multivariable models using real data and not standard textbook examples. It includes imputation methods for dealing with missing data effectively, methods for fitting nonlinear relationships and for making the estimation of transformations a formal part of the modeling process, methods for dealing with "too many variables to analyze and not enough observations," and powerful model validation techniques based on the bootstrap.  The reader will gain a keen understanding of predictive accuracy, and the harm of categorizing continuous predictors or outcomes.  This text realistically...

  6. A Logistic Regression Based Auto Insurance Rate-Making Model Designed for the Insurance Rate Reform

    Directory of Open Access Journals (Sweden)

    Zhengmin Duan

    2018-02-01

    Full Text Available Using a generalized linear model to determine the claim frequency of auto insurance is a key ingredient in non-life insurance research. Among auto insurance rate-making models, there are very few considering auto types. Therefore, in this paper we propose a model that takes auto types into account by making an innovative use of the auto burden index. Based on this model and data from a Chinese insurance company, we built a clustering model that classifies auto insurance rates into three risk levels. The claim frequency and the claim costs are fitted to select a better loss distribution. Then the Logistic Regression model is employed to fit the claim frequency, with the auto burden index considered. Three key findings can be concluded from our study. First, more than 80% of the autos with an auto burden index of 20 or higher belong to the highest risk level. Secondly, the claim frequency is better fitted using the Poisson distribution, whereas the claim cost is better fitted using the Gamma distribution. Lastly, based on the AIC criterion, the claim frequency is more adequately represented by models that consider the auto burden index than by those that do not. It is believed that insurance policy recommendations based on generalized linear models (GLMs) can benefit from our findings.
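
    Fitting claim frequency with a log-link Poisson GLM reduces to a small IRLS loop; here it is sketched with a synthetic burden index and invented coefficients (not the Chinese insurer's data):

```python
import numpy as np

rng = np.random.default_rng(8)
n = 5000
burden = rng.uniform(0, 40, n)                  # hypothetical auto burden index
claims = rng.poisson(np.exp(-2.5 + 0.05 * burden))

# Poisson regression (log link) of claim counts on the burden index via IRLS
X = np.column_stack([np.ones(n), burden])
beta = np.zeros(2)
for _ in range(30):
    mu = np.exp(X @ beta)
    z = X @ beta + (claims - mu) / mu           # working response
    W = mu                                      # IRLS weights for Poisson
    beta = np.linalg.solve(X.T @ (W[:, None] * X), X.T @ (W * z))
```

    `exp(beta[1])` is then the multiplicative increase in expected claim frequency per unit of burden index, the kind of rate relativity used in rate-making. A Gamma GLM for claim cost would use the same loop with different weights and working response.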

  7. Flexible competing risks regression modeling and goodness-of-fit

    DEFF Research Database (Denmark)

    Scheike, Thomas; Zhang, Mei-Jie

    2008-01-01

    In this paper we consider different approaches for estimation and assessment of covariate effects for the cumulative incidence curve in the competing risks model. The classic approach is to model all cause-specific hazards and then estimate the cumulative incidence curve based on these cause...... models that is easy to fit and contains the Fine-Gray model as a special case. One advantage of this approach is that our regression modeling allows for non-proportional hazards. This leads to a new simple goodness-of-fit procedure for the proportional subdistribution hazards assumption that is very easy...... of the flexible regression models to analyze competing risks data when non-proportionality is present in the data....

  8. The art of regression modeling in road safety

    CERN Document Server

    Hauer, Ezra

    2015-01-01

    This unique book explains how to fashion useful regression models from commonly available data to erect models essential for evidence-based road safety management and research. Composed from techniques and best practices presented over many years of lectures and workshops, The Art of Regression Modeling in Road Safety illustrates that fruitful modeling cannot be done without substantive knowledge about the modeled phenomenon. Class-tested in courses and workshops across North America, the book is ideal for professionals, researchers, university professors, and graduate students with an interest in, or responsibilities related to, road safety. This book also: · Presents for the first time a powerful analytical tool for road safety researchers and practitioners · Includes problems and solutions in each chapter as well as data and spreadsheets for running models and PowerPoint presentation slides · Features pedagogy well-suited for graduate courses and workshops including problems, solutions, and PowerPoint p...

  9. Model building strategy for logistic regression: purposeful selection.

    Science.gov (United States)

    Zhang, Zhongheng

    2016-03-01

    Logistic regression is one of the most commonly used models to account for confounders in the medical literature. The article introduces how to perform the purposeful selection model building strategy with R. I stress the use of the likelihood ratio test to see whether deleting a variable will have a significant impact on model fit. A deleted variable should also be checked for whether it is an important adjustment of the remaining covariates. Interactions should be checked to disentangle complex relationships between covariates and their synergistic effect on the response variable. The model should be checked for goodness-of-fit (GOF); in other words, how well the fitted model reflects the real data. The Hosmer-Lemeshow GOF test is the most widely used for logistic regression models.
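
    The likelihood ratio step of purposeful selection can be sketched: fit nested logistic models, compare twice the log-likelihood gap to a chi-squared distribution, and keep variables whose removal significantly worsens fit. The data are synthetic, and the logistic fits use a plain gradient-ascent helper rather than R's glm:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(9)
n = 1000
x1 = rng.normal(size=n)                 # a genuine risk factor
x2 = rng.normal(size=n)                 # an irrelevant candidate variable
p = 1 / (1 + np.exp(-(-0.5 + 1.2 * x1)))
y = rng.binomial(1, p).astype(float)

def max_loglik(X, y, iters=5000, lr=0.5):
    """Maximized Bernoulli log-likelihood of a logistic fit (gradient ascent)."""
    b = np.zeros(X.shape[1])
    for _ in range(iters):
        pr = 1 / (1 + np.exp(-(X @ b)))
        b += lr * X.T @ (y - pr) / len(y)
    pr = 1 / (1 + np.exp(-(X @ b)))
    return np.sum(y * np.log(pr) + (1 - y) * np.log(1 - pr))

ones = np.ones((n, 1))
ll_null = max_loglik(ones, y)
ll_x1 = max_loglik(np.column_stack([ones, x1]), y)
ll_x1x2 = max_loglik(np.column_stack([ones, x1, x2]), y)

p_keep_x1 = chi2.sf(2 * (ll_x1 - ll_null), df=1)   # x1 clearly improves fit
p_add_x2 = chi2.sf(2 * (ll_x1x2 - ll_x1), df=1)    # x2 typically does not
```

    Purposeful selection iterates this test over candidate variables, with the additional checks on confounding and interactions described in the article.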

  10. Linear and Poisson models for genetic evaluation of tick resistance in cross-bred Hereford x Nellore cattle.

    Science.gov (United States)

    Ayres, D R; Pereira, R J; Boligon, A A; Silva, F F; Schenkel, F S; Roso, V M; Albuquerque, L G

    2013-12-01

    Cattle resistance to ticks is measured by the number of ticks infesting the animal. The model used for the genetic analysis of cattle resistance to ticks frequently requires logarithmic transformation of the observations. The objective of this study was to evaluate the predictive ability and goodness of fit of different models for the analysis of this trait in cross-bred Hereford x Nellore cattle. Three models were tested: a linear model using logarithmic transformation of the observations (MLOG); a linear model without transformation of the observations (MLIN); and a generalized linear Poisson model with a residual term (MPOI). All models included the classificatory effects of contemporary group and genetic group and the covariates age of animal at the time of recording and individual heterozygosis, as well as additive genetic effects as random effects. Heritability estimates were 0.08 ± 0.02, 0.10 ± 0.02 and 0.14 ± 0.04 for the MLIN, MLOG and MPOI models, respectively. The model fit quality, assessed by the deviance information criterion (DIC) and the residual mean square, indicated the superiority of the MPOI model. The predictive ability of the models was compared by a validation test in an independent sample. The MPOI model was slightly superior in terms of goodness of fit and predictive ability, whereas the correlations between observed and predicted tick counts were practically the same for all models. A higher rank correlation between breeding values was observed between the MLOG and MPOI models. The Poisson model can be used for the selection of tick-resistant animals. © 2013 Blackwell Verlag GmbH.
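
    The study's contrast between log-transformed linear models and a Poisson model can be illustrated numerically: for counts, a group effect estimated on log(count + 1) is biased for the multiplicative effect, while the Poisson log-link estimate recovers it. Synthetic tick counts with a single binary covariate (for which the Poisson GLM slope MLE is simply the log ratio of group means):

```python
import numpy as np

rng = np.random.default_rng(10)
n = 4000
grp = rng.binomial(1, 0.5, n)                   # binary genetic-group covariate
ticks = rng.poisson(np.exp(1.0 + 0.4 * grp))    # true log rate ratio: 0.4

# (a) linear model on the transformed counts log(y + 1)
z = np.log(ticks + 1.0)
effect_log = z[grp == 1].mean() - z[grp == 0].mean()

# (b) Poisson log-link estimate: log ratio of the two group mean counts
effect_pois = np.log(ticks[grp == 1].mean() / ticks[grp == 0].mean())
```

    The log(y + 1) estimate is systematically attenuated (by Jensen's inequality), one reason the Poisson model ranked best here for count data.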

  11. Regression analysis of a chemical reaction fouling model

    International Nuclear Information System (INIS)

    Vasak, F.; Epstein, N.

    1996-01-01

    A previously reported mathematical model for the initial chemical reaction fouling of a heated tube is critically examined in the light of the experimental data for which it was developed. A regression analysis of the model with respect to that data shows that the reference point upon which the two adjustable parameters of the model were originally based was well chosen, albeit fortuitously. (author). 3 refs., 2 tabs., 2 figs

  12. Spatial stochastic regression modelling of urban land use

    International Nuclear Information System (INIS)

    Arshad, S H M; Jaafar, J; Abiden, M Z Z; Latif, Z A; Rasam, A R A

    2014-01-01

    Urbanization is very closely linked to industrialization, commercialization or overall economic growth and development. It brings innumerable benefits to the quantity and quality of the urban environment and lifestyle but, on the other hand, contributes to unbounded development, urban sprawl, overcrowding and a decreasing standard of living. Regulation and observation of urban development activities is crucial. An understanding of the urban systems that promote urban growth is also essential for the purposes of policy making, formulating development strategies and preparing development plans. This study aims to compare two different stochastic regression modeling techniques for spatial structure models of urban growth in the same study area. Both techniques utilize the same datasets and their results are analyzed. The work starts by producing an urban growth model using two stochastic regression modeling techniques, namely Ordinary Least Squares (OLS) and Geographically Weighted Regression (GWR). The two techniques are compared and it is found that GWR is the more significant stochastic regression model: it gives a smaller AICc (corrected Akaike Information Criterion) value and its output is more spatially explainable.

  13. Direction of Effects in Multiple Linear Regression Models.

    Science.gov (United States)

    Wiedermann, Wolfgang; von Eye, Alexander

    2015-01-01

    Previous studies analyzed asymmetric properties of the Pearson correlation coefficient using higher than second order moments. These asymmetric properties can be used to determine the direction of dependence in a linear regression setting (i.e., establish which of two variables is more likely to be on the outcome side) within the framework of cross-sectional observational data. Extant approaches are restricted to the bivariate regression case. The present contribution extends the direction of dependence methodology to a multiple linear regression setting by analyzing distributional properties of residuals of competing multiple regression models. It is shown that, under certain conditions, the third central moments of estimated regression residuals can be used to decide upon direction of effects. In addition, three different approaches for statistical inference are discussed: a combined D'Agostino normality test, a skewness difference test, and a bootstrap difference test. Type I error and power of the procedures are assessed using Monte Carlo simulations, and an empirical example is provided for illustrative purposes. In the discussion, issues concerning the quality of psychological data, possible extensions of the proposed methods to the fourth central moment of regression residuals, and potential applications are addressed.
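
    The residual-moment idea can be demonstrated in the bivariate case the extension builds on: when the true predictor is skewed and the error is normal, residuals of the correctly specified regression stay (nearly) symmetric, while residuals of the reversed regression inherit skewness. A sketch on synthetic data (the paper's tests extend this to multiple regression and formal inference):

```python
import numpy as np

rng = np.random.default_rng(11)
n = 5000
x = rng.exponential(1.0, n) - 1.0       # skewed explanatory variable
y = 0.8 * x + rng.normal(0, 0.5, n)     # linear effect, symmetric error

def residual_skewness(a, b):
    """Third standardized moment of residuals from the OLS fit of b on a."""
    slope = np.cov(a, b)[0, 1] / np.var(a)
    r = (b - b.mean()) - slope * (a - a.mean())
    return np.mean(r**3) / np.std(r) ** 3

skew_correct = residual_skewness(x, y)   # y regressed on x: near zero
skew_reversed = residual_skewness(y, x)  # x regressed on y: clearly skewed
```

    Comparing the two third moments is the informal version of the skewness-difference and bootstrap tests the paper formalizes.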

  14. Logistic regression for risk factor modelling in stuttering research.

    Science.gov (United States)

    Reed, Phil; Wu, Yaqionq

    2013-06-01

    To outline the uses of logistic regression and other statistical methods for risk factor analysis in the context of research on stuttering. The principles underlying the application of a logistic regression are illustrated, and the types of questions to which such a technique has been applied in the stuttering field are outlined. The assumptions and limitations of the technique are discussed with respect to existing stuttering research, and with respect to formulating appropriate research strategies to accommodate these considerations. Finally, some alternatives to the approach are briefly discussed. The way the statistical procedures are employed is demonstrated with some hypothetical data. Research into several practical issues concerning stuttering could benefit if risk factor modelling were used. Important examples are early diagnosis, prognosis (whether a child will recover or persist) and assessment of treatment outcome. After reading this article you will: (a) Summarize the situations in which logistic regression can be applied to a range of issues about stuttering; (b) Follow the steps in performing a logistic regression analysis; (c) Describe the assumptions of the logistic regression technique and the precautions that need to be checked when it is employed; (d) Be able to summarize its advantages over other techniques like estimation of group differences and simple regression. Copyright © 2012 Elsevier Inc. All rights reserved.

  15. Modeling and prediction of flotation performance using support vector regression

    Directory of Open Access Journals (Sweden)

    Despotović Vladimir

    2017-01-01

    Full Text Available Continuous efforts have been made in recent years to improve the process of paper recycling, as it is of critical importance for saving wood, water and energy resources. Flotation deinking is considered to be one of the key methods for separating ink particles from the cellulose fibres. Attempts to model the flotation deinking process have often resulted in complex models that are difficult to implement and use. In this paper a model for prediction of flotation performance based on Support Vector Regression (SVR) is presented. Representative data samples were created in the laboratory under a variety of practical control variables for the flotation deinking process, including different reagents, pH values and flotation residence times. A predictive model trained on these data samples was created, and the flotation performance was assessed, showing that Support Vector Regression is a promising method even when the dataset used for training the model is limited.
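
    A minimal stand-in for an SVR model: linear ε-insensitive regression trained by subgradient descent on synthetic "process variables". The reagent dose, pH and residence-time stand-ins are invented, and a production SVR would normally use a library implementation (e.g. scikit-learn) with a nonlinear kernel rather than this linear sketch:

```python
import numpy as np

rng = np.random.default_rng(12)
n = 400
X = rng.uniform(0, 1, (n, 3))            # e.g. reagent dose, pH (scaled), time
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + 0.3 + rng.normal(0, 0.05, n)

def linear_svr(X, y, C=10.0, eps=0.05, iters=20000, lr=0.01):
    """Linear epsilon-SVR: subgradient descent on the regularized
    epsilon-insensitive loss  ||w||^2/(2Cn) + mean(max(0, |r| - eps))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(iters):
        r = y - (X @ w + b)
        g = np.where(r > eps, -1.0, np.where(r < -eps, 1.0, 0.0))
        w -= lr * (w / (C * len(y)) + X.T @ g / len(y))
        b -= lr * g.mean()
    return w, b

w, b = linear_svr(X, y)
pred = X @ w + b
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
```

    The ε-tube is what makes SVR tolerant of small measurement noise, a useful property when, as here, the training dataset is limited.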

  16. Variable Selection for Regression Models of Percentile Flows

    Science.gov (United States)

    Fouad, G.

    2017-12-01

    Percentile flows describe the flow magnitude equaled or exceeded for a given percent of time, and are widely used in water resource management. However, these statistics are normally unavailable since most basins are ungauged. Percentile flows of ungauged basins are often predicted using regression models based on readily observable basin characteristics, such as mean elevation. The number of these independent variables is too large to evaluate all possible models. A subset of models is typically evaluated using automatic procedures, like stepwise regression. This ignores a large variety of methods from the field of feature (variable) selection and physical understanding of percentile flows. A study of 918 basins in the United States was conducted to compare an automatic regression procedure to the following variable selection methods: (1) principal component analysis, (2) correlation analysis, (3) random forests, (4) genetic programming, (5) Bayesian networks, and (6) physical understanding. The automatic regression procedure only performed better than principal component analysis. Poor performance of the regression procedure was due to a commonly used filter for multicollinearity, which rejected the strongest models because they had cross-correlated independent variables. Multicollinearity did not decrease model performance in validation because of a representative set of calibration basins. Variable selection methods based strictly on predictive power (numbers 2-5 from above) performed similarly, likely indicating a limit to the predictive power of the variables. Similar performance was also reached using variables selected based on physical understanding, a finding that substantiates recent calls to emphasize physical understanding in modeling for predictions in ungauged basins. The strongest variables highlighted the importance of geology and land cover, whereas widely used topographic variables were the weakest predictors. Variables suffered from a high
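
    Exhaustive subset search with an information criterion, one of the simpler alternatives to the stepwise procedure criticized above, can be sketched for a handful of candidate basin characteristics. The data are synthetic and the variable names are placeholders, not the study's 918-basin dataset:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(13)
n = 300
X = rng.normal(size=(n, 4))     # e.g. elevation, area, geology, land-cover scores
y = 1.5 * X[:, 0] + 0.8 * X[:, 2] + rng.normal(0, 1.0, n)  # vars 0 and 2 matter

def bic(subset):
    """BIC of the OLS model using the given predictor columns."""
    Xs = np.column_stack([np.ones(n), X[:, list(subset)]])
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    return n * np.log(rss / n) + (len(subset) + 1) * np.log(n)

subsets = [s for k in range(1, 5) for s in combinations(range(4), k)]
best = min(subsets, key=bic)    # typically recovers exactly {0, 2}
```

    With only a few candidates all subsets can be scored directly; the feature-selection methods compared in the study (random forests, genetic programming, Bayesian networks) are ways of searching this space when the candidate pool is too large to enumerate.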

  17. Linearity and Misspecification Tests for Vector Smooth Transition Regression Models

    DEFF Research Database (Denmark)

    Teräsvirta, Timo; Yang, Yukai

    The purpose of the paper is to derive Lagrange multiplier and Lagrange multiplier type specification and misspecification tests for vector smooth transition regression models. We report results from simulation studies in which the size and power properties of the proposed asymptotic tests in small...

  18. Application of multilinear regression analysis in modeling of soil ...

    African Journals Online (AJOL)

    The application of the Multi-Linear Regression Analysis (MLRA) model for predicting soil properties in Calabar South offers a technical guide and solution to foundation design problems in the area. Forty-five soil samples were collected from fifteen different boreholes at different depths, and 270 tests were carried out for CBR, ...

  19. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2009-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  20. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2010-01-01

    In this paper two kernel-based nonparametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a viable alternative to the method of De Gooijer and Zerom (2003). By

  1. Efficient estimation of an additive quantile regression model

    NARCIS (Netherlands)

    Cheng, Y.; de Gooijer, J.G.; Zerom, D.

    2011-01-01

    In this paper, two non-parametric estimators are proposed for estimating the components of an additive quantile regression model. The first estimator is a computationally convenient approach which can be viewed as a more viable alternative to existing kernel-based approaches. The second estimator
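The objective underlying these quantile regression records is the check (pinball) loss. The sketch below is illustrative only (it is not the paper's kernel-based estimator): minimizing the pinball loss over a constant recovers the corresponding sample quantile.

```python
import numpy as np

def pinball_loss(y, q, tau):
    # Check (pinball) loss; its minimizer over constants is the tau-th sample quantile
    u = y - q
    return float(np.mean(np.maximum(tau * u, (tau - 1.0) * u)))

rng = np.random.default_rng(2)
y = rng.normal(size=20000)
grid = np.linspace(-3.0, 3.0, 121)
est = [grid[np.argmin([pinball_loss(y, g, tau) for g in grid])]
       for tau in (0.1, 0.5, 0.9)]   # approximate the 10%, 50%, 90% quantiles
```

For standard normal data the estimates land near -1.28, 0, and 1.28, the theoretical quantiles.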

  2. A binary logistic regression model with complex sampling design of ...

    African Journals Online (AJOL)

    2017-09-03

    Sep 3, 2017 ... Bi-variable and multi-variable binary logistic regression models with complex sampling design were fitted. ... Data were entered into STATA-12 and analyzed using SPSS-21. ...

  3. Transpiration of glasshouse rose crops: evaluation of regression models

    NARCIS (Netherlands)

    Baas, R.; Rijssel, van E.

    2006-01-01

    Regression models of transpiration (T) based on global radiation inside the greenhouse (G), with or without energy input from heating pipes (Eh) and/or vapor pressure deficit (VPD) were parameterized. Therefore, data on T, G, temperatures from air, canopy and heating pipes, and VPD from both a

  4. Harnessing the theoretical foundations of the exponential and beta-Poisson dose-response models to quantify parameter uncertainty using Markov Chain Monte Carlo.

    Science.gov (United States)

    Schmidt, Philip J; Pintar, Katarina D M; Fazil, Aamir M; Topp, Edward

    2013-09-01

    Dose-response models are the essential link between exposure assessment and computed risk values in quantitative microbial risk assessment, yet the uncertainty that is inherent to computed risks because the dose-response model parameters are estimated using limited epidemiological data is rarely quantified. Second-order risk characterization approaches incorporating uncertainty in dose-response model parameters can provide more complete information to decisionmakers by separating variability and uncertainty to quantify the uncertainty in computed risks. Therefore, the objective of this work is to develop procedures to sample from posterior distributions describing uncertainty in the parameters of exponential and beta-Poisson dose-response models using Bayes's theorem and Markov Chain Monte Carlo (in OpenBUGS). The theoretical origins of the beta-Poisson dose-response model are used to identify a decomposed version of the model that enables Bayesian analysis without the need to evaluate Kummer confluent hypergeometric functions. Herein, it is also established that the beta distribution in the beta-Poisson dose-response model cannot address variation among individual pathogens, criteria to validate use of the conventional approximation to the beta-Poisson model are proposed, and simple algorithms to evaluate actual beta-Poisson probabilities of infection are investigated. The developed MCMC procedures are applied to analysis of a case study data set, and it is demonstrated that an important region of the posterior distribution of the beta-Poisson dose-response model parameters is attributable to the absence of low-dose data. This region includes beta-Poisson models for which the conventional approximation is especially invalid and in which many beta distributions have an extreme shape with questionable plausibility. © Her Majesty the Queen in Right of Canada 2013. Reproduced with the permission of the Minister of the Public Health Agency of Canada.
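The exponential and (approximate) beta-Poisson dose-response curves discussed in this record have simple closed forms. The sketch below uses illustrative parameter values, not estimates from the cited study:

```python
import math

def p_exponential(dose, r):
    # Exponential dose-response: each ingested organism independently
    # survives to infect with fixed probability r
    return 1.0 - math.exp(-r * dose)

def p_beta_poisson_approx(dose, alpha, beta):
    # Conventional approximation to the beta-Poisson model; the record notes
    # this approximation needs validation criteria and can be invalid at low doses
    return 1.0 - (1.0 + dose / beta) ** (-alpha)

# Illustrative parameter values only
p_exp = p_exponential(100.0, 0.01)                 # = 1 - e^{-1}
p_bp = p_beta_poisson_approx(100.0, 0.15, 1000.0)
```

Both curves are monotone in dose and bounded in (0, 1), which the assertions below check numerically.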

  5. Adjusting for overdispersion in piecewise exponential regression models to estimate excess mortality rate in population-based research.

    Science.gov (United States)

    Luque-Fernandez, Miguel Angel; Belot, Aurélien; Quaresma, Manuela; Maringe, Camille; Coleman, Michel P; Rachet, Bernard

    2016-10-01

    In population-based cancer research, piecewise exponential regression models are used to derive adjusted estimates of excess mortality due to cancer using the Poisson generalized linear modelling framework. However, the assumption that the conditional mean and variance of the rate parameter given the set of covariates x_i are equal is strong and may fail to account for overdispersion given the variability of the rate parameter (the variance exceeds the mean). Using an empirical example, we aimed to describe simple methods to test and correct for overdispersion. We used a regression-based score test for overdispersion under the relative survival framework and proposed different approaches to correct for overdispersion, including a quasi-likelihood, robust standard errors estimation, negative binomial regression and flexible piecewise modelling. All piecewise exponential regression models showed the presence of significant inherent overdispersion (p-value < 0.001). Flexible piecewise regression modelling, with either a quasi-likelihood or robust standard errors, was the best approach, as it deals with both overdispersion due to model misspecification and true or inherent overdispersion.
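The basic dispersion diagnostic described in this record (Pearson chi-square over residual degrees of freedom, with quasi-likelihood scaling of standard errors) can be sketched on synthetic overdispersed counts. The IRLS fit and all parameter values below are illustrative, not the study's relative-survival setup:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
mu_true = np.exp(X @ np.array([1.0, 0.5]))
# Negative binomial counts: variance mu*(1 + mu/5) exceeds the mean -> overdispersion
y = rng.negative_binomial(5, 5.0 / (5.0 + mu_true))

# Poisson GLM (log link) fitted by IRLS
beta = np.array([np.log(y.mean()), 0.0])
for _ in range(50):
    eta = X @ beta
    m = np.exp(eta)
    z = eta + (y - m) / m                              # working response
    beta = np.linalg.solve(X.T @ (m[:, None] * X), X.T @ (m * z))

m = np.exp(X @ beta)
dispersion = np.sum((y - m) ** 2 / m) / (n - 2)        # Pearson X^2 / df, ~1 if Poisson holds
cov = np.linalg.inv(X.T @ (m[:, None] * X))
se_poisson = np.sqrt(np.diag(cov))
se_quasi = se_poisson * np.sqrt(dispersion)            # quasi-Poisson correction
```

A dispersion statistic well above 1 signals overdispersion, and the quasi-likelihood correction inflates the naive Poisson standard errors accordingly.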

  6. Extending minimal repair models for repairable systems: A comparison of dynamic and heterogeneous extensions of a nonhomogeneous Poisson process

    International Nuclear Information System (INIS)

    Asfaw, Zeytu Gashaw; Lindqvist, Bo Henry

    2015-01-01

    For many applications of repairable systems, the minimal repair assumption, which leads to nonhomogeneous Poisson processes (NHPP), is not adequate. We review and study two extensions of the NHPP, the dynamic NHPP and the heterogeneous NHPP. Both extensions are motivated by specific aspects of potential applications. It has long been known, however, that the two paradigms are essentially indistinguishable in an analysis of failure data. We investigate the connection between the two approaches for extending NHPP models, both theoretically and numerically in a data example and a simulation study. - Highlights: • Review of dynamic extension of a minimal repair model (LEYP), introduced by Le Gat. • Derivation of likelihood function and comparison to NHPP model with heterogeneity. • Likelihood functions and conditional intensities are similar for the models. • ML estimation is considered for both models using a power law baseline. • A simulation study illustrates and confirms findings of the theoretical study

  7. Testing a Poisson counter model for visual identification of briefly presented, mutually confusable single stimuli in pure accuracy tasks.

    Science.gov (United States)

    Kyllingsbæk, Søren; Markussen, Bo; Bundesen, Claus

    2012-06-01

    The authors propose and test a simple model of the time course of visual identification of briefly presented, mutually confusable single stimuli in pure accuracy tasks. The model implies that during stimulus analysis, tentative categorizations that stimulus i belongs to category j are made at a constant Poisson rate, v(i, j). The analysis is continued until the stimulus disappears, and the overt response is based on the categorization made the greatest number of times. The model was evaluated by Monte Carlo tests of goodness of fit against observed probability distributions of responses in two extensive experiments and also by quantifications of the information loss of the model compared with the observed data by use of information theoretic measures. The model provided a close fit to individual data on identification of digits and an apparently perfect fit to data on identification of Landolt rings.
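A minimal simulation of the proposed Poisson counter model (tentative categorizations accumulating at constant rates, overt response given by the maximum count) shows accuracy rising with exposure duration. The rates v(i, j) below are invented for illustration, not the fitted values from the experiments:

```python
import numpy as np

rng = np.random.default_rng(0)
v = np.array([[8.0, 2.0],
              [2.0, 8.0]])       # hypothetical categorization rates v(i, j) per second

def identify(stim, t, n_trials=2000):
    # Tentative categorizations accumulate as Poisson(v[stim, j] * t)
    counts = rng.poisson(v[stim] * t, size=(n_trials, 2))
    resp = np.argmax(counts, axis=1)
    ties = counts[:, 0] == counts[:, 1]
    resp[ties] = rng.integers(0, 2, ties.sum())   # guess on ties
    return float(np.mean(resp == stim))

acc_short = identify(0, 0.05)   # brief exposure: few counts, many ties
acc_long = identify(0, 0.50)    # longer exposure: accuracy approaches ceiling
```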

  8. Approximating prediction uncertainty for random forest regression models

    Science.gov (United States)

    John W. Coulston; Christine E. Blinn; Valerie A. Thomas; Randolph H. Wynne

    2016-01-01

    Machine learning approaches such as random forest have increased for the spatial modeling and mapping of continuous variables. Random forest is a non-parametric ensemble approach, and unlike traditional regression approaches there is no direct quantification of prediction error. Understanding prediction uncertainty is important when using model-based continuous maps as...

  9. CICAAR - Convolutive ICA with an Auto-Regressive Inverse Model

    DEFF Research Database (Denmark)

    Dyrholm, Mads; Hansen, Lars Kai

    2004-01-01

    We invoke an auto-regressive IIR inverse model for convolutive ICA and derive expressions for the likelihood and its gradient. We argue that optimization will give a stable inverse. When there are more sensors than sources the mixing model parameters are estimated in a second step by least square...... estimation. We demonstrate the method on synthetic data and finally separate speech and music in a real room recording....

  10. On concurvity in nonlinear and nonparametric regression models

    Directory of Open Access Journals (Sweden)

    Sonia Amodio

    2014-12-01

    Full Text Available When data are affected by multicollinearity in the linear regression framework, then concurvity will be present in fitting a generalized additive model (GAM). The term concurvity describes nonlinear dependencies among the predictor variables. Just as collinearity results in inflated variance of the estimated regression coefficients in the linear regression model, the presence of concurvity leads to instability of the estimated coefficients in GAMs. Even if the backfitting algorithm will always converge to a solution, in case of concurvity the final solution of the backfitting procedure in fitting a GAM is influenced by the starting functions. While exact concurvity is highly unlikely, approximate concurvity, the analogue of multicollinearity, is of practical concern as it can lead to upwardly biased estimates of the parameters and to underestimation of their standard errors, increasing the risk of committing type I error. We compare the existing approaches to detect concurvity, pointing out their advantages and drawbacks, using simulated and real data sets. As a result, this paper provides a general criterion to detect concurvity in nonlinear and nonparametric regression models.
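In the linear analogue mentioned by this record, multicollinearity among predictors is commonly quantified by variance inflation factors. A hedged numpy sketch on synthetic data (the predictors are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)              # independent predictor
X = np.column_stack([x1, x2, x3])

def vif(X, j):
    # Regress column j on the remaining columns; VIF_j = 1 / (1 - R^2_j)
    y = X[:, j]
    Z = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    r = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(y.var() / r.var())   # equals 1 / (1 - R^2_j), since mean(r) = 0

vifs = [vif(X, j) for j in range(3)]
```

A VIF near 1 indicates an unproblematic predictor; values above roughly 10 flag the kind of near-collinearity whose GAM counterpart is approximate concurvity.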

  11. Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate

    Directory of Open Access Journals (Sweden)

    Minh Vu Trieu

    2017-03-01

    Full Text Available This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally, a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.

  12. Regression Models and Fuzzy Logic Prediction of TBM Penetration Rate

    Science.gov (United States)

    Minh, Vu Trieu; Katushin, Dmitri; Antonov, Maksim; Veinthal, Renno

    2017-03-01

    This paper presents statistical analyses of rock engineering properties and the measured penetration rate of tunnel boring machine (TBM) based on the data of an actual project. The aim of this study is to analyze the influence of rock engineering properties including uniaxial compressive strength (UCS), Brazilian tensile strength (BTS), rock brittleness index (BI), the distance between planes of weakness (DPW), and the alpha angle (Alpha) between the tunnel axis and the planes of weakness on the TBM rate of penetration (ROP). Four (4) statistical regression models (two linear and two nonlinear) are built to predict the ROP of TBM. Finally a fuzzy logic model is developed as an alternative method and compared to the four statistical regression models. Results show that the fuzzy logic model provides better estimations and can be applied to predict the TBM performance. The R-squared value (R2) of the fuzzy logic model scores the highest value of 0.714 over the second runner-up of 0.667 from the multiple variables nonlinear regression model.

  13. Detection of Outliers in Regression Model for Medical Data

    Directory of Open Access Journals (Sweden)

    Stephen Raj S

    2017-07-01

    Full Text Available In regression analysis, an outlier is an observation for which the residual is large in magnitude compared to other observations in the data set. The detection of outliers and influential points is an important step of the regression analysis. Outlier detection methods have been used to detect and remove anomalous values from data. In this paper, we detect the presence of outliers in simple linear regression models for medical data set. Chatterjee and Hadi mentioned that the ordinary residuals are not appropriate for diagnostic purposes; a transformed version of them is preferable. First, we investigate the presence of outliers based on existing procedures of residuals and standardized residuals. Next, we have used the new approach of standardized scores for detecting outliers without the use of predicted values. The performance of the new approach was verified with the real-life data.
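The standardized (internally studentized) residual diagnostic discussed in this record can be sketched as follows; the data and the |r| > 3 cutoff are illustrative choices, not the paper's medical data set:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50
x = np.linspace(0.0, 10.0, n)
y = 2.0 + 1.5 * x + rng.normal(0.0, 0.5, n)
y[20] += 8.0                       # inject one gross outlier

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Internally studentized residuals: e_i / (s * sqrt(1 - h_ii))
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)   # leverages
s = np.sqrt(np.sum(resid ** 2) / (n - 2))
r_std = resid / (s * np.sqrt(1.0 - h))

outliers = np.where(np.abs(r_std) > 3.0)[0]     # conventional |r| > 3 flag
```

The leverage correction sqrt(1 - h_ii) is the "transformed version" of the ordinary residuals that Chatterjee and Hadi recommend for diagnostics.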

  14. Hierarchical Neural Regression Models for Customer Churn Prediction

    Directory of Open Access Journals (Sweden)

    Golshan Mohammadi

    2013-01-01

    Full Text Available As customers are the main assets of each industry, customer churn prediction is becoming a major task for companies to remain in competition with competitors. In the literature, the better applicability and efficiency of hierarchical data mining techniques has been reported. This paper considers three hierarchical models by combining four different data mining techniques for churn prediction, which are backpropagation artificial neural networks (ANN), self-organizing maps (SOM), alpha-cut fuzzy c-means (α-FCM), and the Cox proportional hazards regression model. The hierarchical models are ANN + ANN + Cox, SOM + ANN + Cox, and α-FCM + ANN + Cox. In particular, the first component of the models aims to cluster data into churner and nonchurner groups and also to filter out unrepresentative data or outliers. Then, the clustered data as the outputs are used to assign customers to churner and nonchurner groups by the second technique. Finally, the correctly classified data are used to create the Cox proportional hazards model. To evaluate the performance of the hierarchical models, an Iranian mobile dataset is considered. The experimental results show that the hierarchical models outperform the single Cox regression baseline model in terms of prediction accuracy, Types I and II errors, RMSE, and MAD metrics. In addition, the α-FCM + ANN + Cox model significantly performs better than the two other hierarchical models.

  15. The PIT-trap-A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

    Science.gov (United States)

    Warton, David I; Thibaut, Loïc; Wang, Yi Alice

    2017-01-01

    Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping)-common examples including logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.
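The key ingredient of the PIT-trap, randomized probability integral transform residuals that are uniform under the true model, can be sketched for a single Poisson margin. This is a simplified illustration of the residual construction only, not the full multivariate resampling procedure of the paper:

```python
import numpy as np
from math import exp

def pois_cdf(k, mu):
    # P(Y <= k) by direct summation (adequate for moderate mu)
    if k < 0:
        return 0.0
    term, total = exp(-mu), exp(-mu)
    for i in range(1, int(k) + 1):
        term *= mu / i
        total += term
    return total

rng = np.random.default_rng(7)
mu = 3.0
y = rng.poisson(mu, size=5000)
# Randomized PIT residual: uniform on [F(y-1), F(y)], hence U(0,1) under the true model
u = np.array([pois_cdf(yi - 1, mu) + rng.uniform() * (pois_cdf(yi, mu) - pois_cdf(yi - 1, mu))
              for yi in y])
# The PIT-trap resamples (rows of) such residuals, then maps them back through the
# fitted quantile function to obtain bootstrap counts with the right marginal.
u_star = rng.choice(u, size=u.size, replace=True)
```

The randomization smooths the discreteness of the counts, which is what makes the residuals (asymptotically) pivotal and hence resamplable.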

  16. Electricity consumption forecasting in Italy using linear regression models

    Energy Technology Data Exchange (ETDEWEB)

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio [DIAM, Seconda Universita degli Studi di Napoli, Via Roma 29, 81031 Aversa (CE) (Italy)

    2009-09-15

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)
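Elasticity estimates of the kind reported in this record come from log-log regressions, where the slope coefficients are directly the elasticities. A minimal sketch on synthetic data (the assumed elasticities of 0.9 for GDP and -0.06 for price are chosen only for illustration, echoing the order of magnitude in the record):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 38   # e.g. annual observations, 1970-2007
log_gdp = np.linspace(13.0, 14.0, n) + 0.01 * rng.normal(size=n)
log_price = 4.0 + 0.10 * rng.normal(size=n)    # fluctuating real price index
# Synthetic consumption, assuming GDP elasticity 0.9 and price elasticity -0.06
log_cons = 1.0 + 0.9 * log_gdp - 0.06 * log_price + 0.005 * rng.normal(size=n)

X = np.column_stack([np.ones(n), log_gdp, log_price])
coef, *_ = np.linalg.lstsq(X, log_cons, rcond=None)
gdp_elasticity, price_elasticity = coef[1], coef[2]   # log-log slopes are elasticities
```

With trending, non-stationary series, the co-integration checks mentioned in the record would be needed before interpreting such a regression; the sketch above sidesteps that issue by construction.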

  17. Electricity consumption forecasting in Italy using linear regression models

    International Nuclear Information System (INIS)

    Bianco, Vincenzo; Manca, Oronzio; Nardini, Sergio

    2009-01-01

    The influence of economic and demographic variables on the annual electricity consumption in Italy has been investigated with the intention to develop a long-term consumption forecasting model. The time period considered for the historical data is from 1970 to 2007. Different regression models were developed, using historical electricity consumption, gross domestic product (GDP), gross domestic product per capita (GDP per capita) and population. A first part of the paper considers the estimation of GDP, price and GDP per capita elasticities of domestic and non-domestic electricity consumption. The domestic and non-domestic short run price elasticities are found to be both approximately equal to -0.06, while long run elasticities are equal to -0.24 and -0.09, respectively. On the contrary, the elasticities of GDP and GDP per capita present higher values. In the second part of the paper, different regression models, based on co-integrated or stationary data, are presented. Different statistical tests are employed to check the validity of the proposed models. A comparison with national forecasts, based on complex econometric models, such as Markal-Time, was performed, showing that the developed regressions are congruent with the official projections, with deviations of ±1% for the best case and ±11% for the worst. These deviations are to be considered acceptable in relation to the time span taken into account. (author)

  18. Derivation of Poisson and Nernst-Planck equations in a bath and channel from a molecular model.

    Science.gov (United States)

    Schuss, Z; Nadler, B; Eisenberg, R S

    2001-09-01

    Permeation of ions from one electrolytic solution to another, through a protein channel, is a biological process of considerable importance. Permeation occurs on a time scale of micro- to milliseconds, far longer than the femtosecond time scales of atomic motion. Direct simulations of atomic dynamics are not yet possible for such long-time scales; thus, averaging is unavoidable. The question is what and how to average. In this paper, we average a Langevin model of ionic motion in a bulk solution and protein channel. The main result is a coupled system of averaged Poisson and Nernst-Planck equations (CPNP) involving conditional and unconditional charge densities and conditional potentials. The resulting NP equations contain the averaged force on a single ion, which is the sum of two components. The first component is the gradient of a conditional electric potential that is the solution of Poisson's equation with conditional and permanent charge densities and boundary conditions of the applied voltage. The second component is the self-induced force on an ion due to surface charges induced only by that ion at dielectric interfaces. The ion induces surface polarization charge that exerts a significant force on the ion itself, not present in earlier PNP equations. The proposed CPNP system is not complete, however, because the electric potential satisfies Poisson's equation with conditional charge densities, conditioned on the location of an ion, while the NP equations contain unconditional densities. The conditional densities are closely related to the well-studied pair-correlation functions of equilibrium statistical mechanics. We examine a specific closure relation, which on the one hand replaces the conditional charge densities by the unconditional ones in the Poisson equation, and on the other hand replaces the self-induced force in the NP equation by an effective self-induced force. This effective self-induced force is nearly zero in the baths but is approximately

  19. Regression Model to Predict Global Solar Irradiance in Malaysia

    Directory of Open Access Journals (Sweden)

    Hairuniza Ahmed Kutty

    2015-01-01

    Full Text Available A novel regression model is developed to estimate the monthly global solar irradiance in Malaysia. The model is developed based on different available meteorological parameters, including temperature, cloud cover, rainfall, relative humidity, wind speed, pressure, and gust speed, by implementing regression analysis. This paper reports on the details of the analysis of the effect of each prediction parameter to identify the parameters that are relevant to estimating global solar irradiance. In addition, the proposed model is compared in terms of the root mean square error (RMSE), mean bias error (MBE), and the coefficient of determination (R2) with other models available from literature studies. Seven models based on single parameters (PM1 to PM7) and five multiple-parameter models (PM8 to PM12) are proposed. The new models perform well, with RMSE ranging from 0.429% to 1.774%, R2 ranging from 0.942 to 0.992, and MBE ranging from −0.1571% to 0.6025%. In general, cloud cover significantly affects the estimation of global solar irradiance. However, cloud cover in Malaysia lacks sufficient influence when included in multiple-parameter models, although it performs fairly well in single-parameter prediction models.
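The three comparison metrics used in this record (RMSE, MBE, R2) are straightforward to compute; the irradiance values below are invented for illustration:

```python
import numpy as np

def rmse(obs, pred):
    # Root mean square error
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def mbe(obs, pred):
    # Mean bias error: positive values indicate systematic over-prediction
    return float(np.mean(pred - obs))

def r2(obs, pred):
    # Coefficient of determination
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Invented monthly irradiance values (kWh/m^2/day), purely illustrative
obs = np.array([4.8, 5.1, 5.6, 6.0, 5.4])
pred = np.array([4.9, 5.0, 5.7, 5.9, 5.5])
```

RMSE measures overall error magnitude, MBE its direction, and R2 the share of variance explained, so the three are complementary rather than interchangeable.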

  20. Two-step variable selection in quantile regression models

    Directory of Open Access Journals (Sweden)

    FAN Yali

    2015-06-01

    Full Text Available We propose a two-step variable selection procedure for high dimensional quantile regressions, in which the dimension of the covariates, pn, is much larger than the sample size n. In the first step, we apply an ℓ1 penalty, and we demonstrate that the first-step penalized estimator with the LASSO penalty can reduce the model from ultra-high dimensional to a model whose size has the same order as that of the true model, and the selected model can cover the true model. The second step excludes the remaining irrelevant covariates by applying the adaptive LASSO penalty to the reduced model obtained from the first step. Under some regularity conditions, we show that our procedure enjoys model selection consistency. We conduct a simulation study and a real data analysis to evaluate the finite sample performance of the proposed approach.

  1. Reducing Monte Carlo error in the Bayesian estimation of risk ratios using log-binomial regression models.

    Science.gov (United States)

    Salmerón, Diego; Cano, Juan A; Chirlaque, María D

    2015-08-30

    In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations is inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
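The Poisson reparameterization in this record rests on a well-known device: fitting a Poisson log-link model to a binary outcome recovers the risk ratio (robust sandwich standard errors are then needed in the frequentist version, since a binary outcome does not have Poisson variance). A hedged numpy sketch with invented cohort counts, fitted by IRLS rather than the record's Bayesian machinery:

```python
import numpy as np

# Invented cohort: 100 exposed (30 events), 100 unexposed (10 events)
x = np.repeat([1, 1, 0, 0], [30, 70, 10, 90]).astype(float)
y = np.repeat([1, 0, 1, 0], [30, 70, 10, 90]).astype(float)
X = np.column_stack([np.ones_like(x), x])

# Poisson (log link) IRLS on the binary outcome: exp(slope) is the risk ratio
beta = np.zeros(2)
for _ in range(100):
    eta = X @ beta
    m = np.exp(eta)
    z = eta + (y - m) / m
    beta = np.linalg.solve(X.T @ (m[:, None] * X), X.T @ (m * z))

risk_ratio = np.exp(beta[1])   # (30/100) / (10/100) = 3.0
```

With a single binary covariate the fitted group means equal the observed risks, so the estimated risk ratio matches the crude (30/100)/(10/100) exactly.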

  2. Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.

    Science.gov (United States)

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E

    2018-03-01

    Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.
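A doubly stochastic Poisson process with a given realization of its intensity can be simulated by Lewis-Shedler thinning. The intensity below is an invented one-term stand-in for the paper's high-dimensional basis expansion of the log intensity:

```python
import numpy as np

rng = np.random.default_rng(11)

def thin_poisson(intensity, t_max, lam_max, rng):
    """Lewis-Shedler thinning for an inhomogeneous Poisson process on [0, t_max]."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)         # candidate from a rate-lam_max process
        if t > t_max:
            break
        if rng.uniform() < intensity(t) / lam_max:  # keep with probability lambda(t)/lam_max
            events.append(t)
    return np.array(events)

# Log intensity as a (here, one-term) basis expansion, cf. the study-level model
events = thin_poisson(lambda t: np.exp(1.0 + np.sin(t)), 100.0, np.exp(2.0), rng)
```

Thinning requires only an upper bound lam_max on the intensity, which is why a log-linear basis expansion (bounded on a compact domain) is convenient to simulate from.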

  3. Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data

    Science.gov (United States)

    Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.

    2017-01-01

    Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564

  4. New robust statistical procedures for the polytomous logistic regression models.

    Science.gov (United States)

    Castilla, Elena; Ghosh, Abhik; Martin, Nirian; Pardo, Leandro

    2018-05-17

    This article derives a new family of estimators, namely the minimum density power divergence estimators, as a robust generalization of the maximum likelihood estimator for the polytomous logistic regression model. Based on these estimators, a family of Wald-type test statistics for linear hypotheses is introduced. Robustness properties of both the proposed estimators and the test statistics are theoretically studied through the classical influence function analysis. Appropriate real life examples are presented to justify the requirement of suitable robust statistical procedures in place of the likelihood based inference for the polytomous logistic regression model. The validity of the theoretical results established in the article is further confirmed empirically through suitable simulation studies. Finally, an approach for the data-driven selection of the robustness tuning parameter is proposed with empirical justifications. © 2018, The International Biometric Society.

  5. Stiffness and Poisson ratio in longitudinal compression of fiber yarns in meso-FE modelling of composite reinforcement forming

    Science.gov (United States)

    Wang, D.; Naouar, N.; Vidal-Salle, E.; Boisse, P.

    2018-05-01

    In meso-scale finite element modeling, the yarns of the reinforcement are considered to be solids made of a continuous material in contact with their neighbors. The present paper considers the mechanical behavior of these yarns under longitudinal compression, a loading that can occur during some forming operations of the reinforcement. The yarns present a specific mechanical behavior under longitudinal compression because they are made up of a large number of fibers. Local buckling of the fibers causes the compressive stiffness of the continuous material representing the yarn to be much weaker than its tensile stiffness. In addition, longitudinal compression causes an important transverse expansion. It is shown that this transverse expansion can be described by a Poisson ratio that remains roughly constant as the yarn length and the compression strain vary. Buckling of the fibers significantly increases the transverse dimensions of the yarn, which leads to a large Poisson ratio (up to 12 for a yarn analyzed in the present study). Meso-scale finite element simulations of reinforcements with binder yarns submitted to longitudinal compression showed that these improvements led to results in good agreement with micro-CT analyses.
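The transverse-expansion measure used here is the classical Poisson ratio. A minimal numerical illustration, with hypothetical strain values chosen only to reproduce the order of magnitude quoted in the abstract:

```python
def poisson_ratio(eps_transverse, eps_longitudinal):
    """Classical definition: nu = -(transverse strain) / (longitudinal strain)."""
    return -eps_transverse / eps_longitudinal

# Under longitudinal compression (negative strain), fibre buckling produces a
# large transverse expansion, hence a very large apparent Poisson ratio.
# The strain values below are hypothetical, not measurements from the paper.
nu = poisson_ratio(eps_transverse=0.12, eps_longitudinal=-0.01)  # nu = 12.0
```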

  6. THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE

    OpenAIRE

    Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan

    2015-01-01

    Background: The purpose of this study was to draw a regression model of the organizational climate of the central libraries of Iran's universities. Methods: This study is an applied research. The statistical population consisted of 96 employees of the central libraries of Iran's public universities, selected from among the 117 universities affiliated to the Ministry of Health by stratified sampling (510 people). The localized ClimateQual questionnaire was used as the research tool. For pr...

  7. Online Statistical Modeling (Regression Analysis) for Independent Responses

    Science.gov (United States)

    Made Tirta, I.; Anggraeni, Dian; Pandutama, Martinus

    2017-06-01

    Regression analysis (statistical modelling) is among the statistical methods most frequently needed for analyzing quantitative data, especially to model the relationship between response and explanatory variables. Nowadays, statistical models have been developed in various directions to handle various types of data and complex relationships. Rich varieties of advanced and recent statistical models are mostly available in open source software (one of them being R). However, these advanced statistical models are not very friendly to novice R users, since they are based on programming scripts or a command line interface. Our research aims to develop a web interface (based on R and Shiny) so that the most recent and advanced statistical modelling is readily available, accessible and applicable on the web. We have previously built an interface in the form of an e-tutorial for several modern and advanced statistical models in R, especially for independent responses (including linear models/LM, generalized linear models/GLM, generalized additive models/GAM and generalized additive models for location, scale and shape/GAMLSS). In this research we unified them in the form of data analysis, including models using computer-intensive statistics (bootstrap and Markov chain Monte Carlo/MCMC). All are readily accessible on our online Virtual Statistics Laboratory. The web interface makes statistical modelling easier to apply, and makes it easier to compare models in order to find the most appropriate one for the data.

  8. Reconstruction of missing daily streamflow data using dynamic regression models

    Science.gov (United States)

    Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault

    2015-12-01

    River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short gaps in these data can lead to markedly different analysis outputs. Therefore, reconstructing the missing parts of incomplete data sets is an important step for the performance of environmental models, engineering, and research applications, and it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access only to daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average (ARIMA) models, called a dynamic regression model. This model uses the linear relationship with neighbouring, correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow series from the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
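The two-stage idea, a linear regression on a correlated neighbouring station with an autoregressive correction of the residuals, can be sketched in miniature. This is a simplified illustration with synthetic data, using an AR(1) residual term in place of the full ARIMA structure:

```python
from statistics import mean

def fit_dynamic_regression(y, x):
    """Fit y_t = a + b*x_t + e_t by least squares, then model the
    residuals as AR(1): e_t = phi * e_{t-1} + noise."""
    xb, yb = mean(x), mean(y)
    b = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) \
        / sum((xi - xb) ** 2 for xi in x)
    a = yb - b * xb
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    phi = sum(resid[t] * resid[t - 1] for t in range(1, len(resid))) \
          / sum(r * r for r in resid[:-1])
    return a, b, phi, resid

def reconstruct(a, b, phi, x_next, last_resid):
    """Fill a missing value from the neighbour's value plus the AR(1)-propagated residual."""
    return a + b * x_next + phi * last_resid

# synthetic records: target station roughly 2 + 3 * neighbour, with decaying residuals
x = list(range(10))
y = [2 + 3 * xi + 0.8 ** (t + 1) for t, xi in enumerate(x)]
a, b, phi, resid = fit_dynamic_regression(y, x)
y_hat = reconstruct(a, b, phi, x_next=10, last_resid=resid[-1])
```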

  9. Predicting and Modelling of Survival Data when Cox's Regression Model does not hold

    DEFF Research Database (Denmark)

    Scheike, Thomas H.; Zhang, Mei-Jie

    2002-01-01

    Aalen model; additive risk model; counting processes; competing risk; Cox regression; flexible modeling; goodness of fit; prediction of survival; survival analysis; time-varying effects

  10. Extended cox regression model: The choice of timefunction

    Science.gov (United States)

    Isik, Hatice; Tutkun, Nihal Ata; Karasoy, Durdu

    2017-07-01

    The Cox regression model (CRM), which takes into account the effect of censored observations, is one of the most widely applied models in survival analysis for evaluating the effects of covariates. Proportional hazards (PH), which requires a constant hazard ratio over time, is the key assumption of the CRM. The extended CRM provides a test of the PH assumption, by including a time-dependent covariate, or an alternative model in the case of non-proportional hazards. In this study, different types of real data sets are used to choose the time function, and the differences between time functions are analyzed and discussed.

  11. Multivariate Frequency-Severity Regression Models in Insurance

    Directory of Open Access Journals (Sweden)

    Edward W. Frees

    2016-02-01

    In insurance and related industries including healthcare, it is common to have several outcome measures that the analyst wishes to understand using explanatory variables. For example, in automobile insurance, an accident may result in payments for damage to one’s own vehicle, damage to another party’s vehicle, or personal injury. It is also common to be interested in the frequency of accidents in addition to the severity of the claim amounts. This paper synthesizes and extends the literature on multivariate frequency-severity regression modeling with a focus on insurance industry applications. Regression models for understanding the distribution of each outcome continue to be developed yet there now exists a solid body of literature for the marginal outcomes. This paper contributes to this body of literature by focusing on the use of a copula for modeling the dependence among these outcomes; a major advantage of this tool is that it preserves the body of work established for marginal models. We illustrate this approach using data from the Wisconsin Local Government Property Insurance Fund. This fund offers insurance protection for (i) property, (ii) motor vehicle, and (iii) contractors’ equipment claims. In addition to several claim types and frequency-severity components, outcomes can be further categorized by time and space, requiring complex dependency modeling. We find significant dependencies for these data; specifically, we find that dependencies among lines are stronger than the dependencies between the frequency and average severity within each line.
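The copula idea, marginal models preserved and dependence layered on top, can be sketched with a Gaussian copula linking a Poisson claim count to a lognormal average severity. The parameters below are illustrative, not the fitted model for the Wisconsin fund:

```python
import math
import random
from statistics import NormalDist

def poisson_ppf(u, lam):
    """Inverse CDF of Poisson(lam): the smallest k with P(N <= k) >= u."""
    k = 0
    p = cdf = math.exp(-lam)
    while cdf < u:
        k += 1
        p *= lam / k
        cdf += p
    return k

def sample_claim(rho, lam, mu, sigma, rng):
    """Draw a (frequency, average severity) pair whose dependence comes from a
    Gaussian copula with correlation rho; marginals stay Poisson and lognormal."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    n = poisson_ppf(NormalDist().cdf(z1), lam)   # Poisson(lam) marginal
    sev = math.exp(mu + sigma * z2)              # lognormal marginal
    return n, sev

rng = random.Random(1)
draws = [sample_claim(rho=0.5, lam=2.0, mu=0.0, sigma=0.5, rng=rng) for _ in range(2000)]
```

Because the copula only transforms the uniforms, the claim count keeps its Poisson(lam) marginal no matter how strong the dependence is, which is exactly the property the paper exploits.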

  12. Augmented Beta rectangular regression models: A Bayesian perspective.

    Science.gov (United States)

    Wang, Jue; Luo, Sheng

    2016-01-01

    Mixed effects Beta regression models based on Beta distributions have been widely used to analyze longitudinal percentage or proportional data ranging between zero and one. However, Beta distributions are not flexible enough to accommodate extreme outliers or excessive events around the tail areas, and they do not account for the presence of the boundary values zero and one, because these values are not in the support of the Beta distribution. To address these issues, we propose a mixed effects model using the Beta rectangular distribution and augment it with the probabilities of zero and one. We conduct extensive simulation studies to assess the performance of mixed effects models based on both the Beta and Beta rectangular distributions under various scenarios. The simulation studies suggest that the regression models based on Beta rectangular distributions improve the accuracy of parameter estimates in the presence of outliers and heavy tails. The proposed models are applied to the motivating Neuroprotection Exploratory Trials in Parkinson's Disease (PD) Long-term Study-1 (LS-1 study, n = 1741), developed by The National Institute of Neurological Disorders and Stroke Exploratory Trials in Parkinson's Disease (NINDS NET-PD) network. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  13. Bayesian semiparametric regression models to characterize molecular evolution

    Directory of Open Access Journals (Sweden)

    Datta Saheli

    2012-10-01

    Background: Statistical models and methods that associate changes in the physicochemical properties of amino acids with natural selection at the molecular level typically do not take into account the correlations between such properties. We propose a Bayesian hierarchical regression model with a generalization of the Dirichlet process prior on the distribution of the regression coefficients that describes the relationship between the changes in amino acid distances and natural selection in protein-coding DNA sequence alignments. Results: The Bayesian semiparametric approach is illustrated with simulated data and the abalone lysin sperm data. Our method identifies groups of properties which, for this particular dataset, have a similar effect on evolution. The model also provides nonparametric site-specific estimates for the strength of conservation of these properties. Conclusions: The model described here is distinguished by its ability to handle a large number of amino acid properties simultaneously, while taking into account that such data can be correlated. The multi-level clustering ability of the model allows for appealing interpretations of the results in terms of properties that are roughly equivalent from the standpoint of molecular evolution.

  14. Regularized multivariate regression models with skew-t error distributions

    KAUST Repository

    Chen, Lianfu

    2014-06-01

    We consider regularization of the parameters in multivariate linear regression models with the errors having a multivariate skew-t distribution. An iterative penalized likelihood procedure is proposed for constructing sparse estimators of both the regression coefficient and inverse scale matrices simultaneously. The sparsity is introduced through penalizing the negative log-likelihood by adding L1-penalties on the entries of the two matrices. Taking advantage of the hierarchical representation of skew-t distributions, and using the expectation conditional maximization (ECM) algorithm, we reduce the problem to penalized normal likelihood and develop a procedure to minimize the ensuing objective function. Using a simulation study the performance of the method is assessed, and the methodology is illustrated using a real data set with a 24-dimensional response vector. © 2014 Elsevier B.V.

  15. Risk Estimation for Lung Cancer in Libya: Analysis Based on Standardized Morbidity Ratio, Poisson-Gamma Model, BYM Model and Mixture Model

    Science.gov (United States)

    Alhdiri, Maryam Ahmed; Samat, Nor Azah; Mohamed, Zulkifley

    2017-03-01

    Cancer is the most rapidly spreading disease in the world, especially in developing countries, including Libya. Cancer represents a significant burden on patients, families, and their societies. This disease can be controlled if detected early. Therefore, disease mapping has recently become an important method in the fields of public health research and disease epidemiology. The correct choice of statistical model is a very important step towards producing a good disease map. Libya was selected for this work, to examine the geographical variation in the incidence of lung cancer. The objective of this paper is to estimate the relative risk for lung cancer. Four statistical models for estimating the relative risk, together with population censuses of the study area for the period 2006 to 2011, were used in this work: the Standardized Morbidity Ratio (SMR), the most popular statistic in the field of disease mapping; the Poisson-gamma model, one of the earliest applications of Bayesian methodology; the Besag, York and Mollie (BYM) model; and the Mixture model. As an initial step, this study provides a review of all the proposed models, which are then applied to lung cancer data in Libya. Maps, tables, graphs and goodness-of-fit (GOF) criteria were used to compare and present the preliminary results; such GOF measures are commonly used in statistical modelling to compare fitted models. The main results show that the Poisson-gamma, BYM and Mixture models can overcome the problem of the first model (SMR) when there are no observed lung cancer cases in certain districts. The results show that the Mixture model is the most robust and provides better relative risk estimates across the range of models. Creative Commons Attribution License
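The contrast between the SMR and the Poisson-gamma model can be made concrete. Under O_i ~ Poisson(theta_i * E_i) with theta_i ~ Gamma(alpha, beta), the posterior-mean relative risk is (alpha + O_i) / (beta + E_i), which stays informative even in districts with zero observed cases. A minimal sketch with hypothetical counts, not the Libyan data:

```python
def smr(observed, expected):
    """Standardized Morbidity Ratio O_i / E_i for each district
    (collapses to 0 in districts with no observed cases)."""
    return [o / e for o, e in zip(observed, expected)]

def poisson_gamma_rr(observed, expected, alpha, beta):
    """Empirical-Bayes relative risk: posterior mean (alpha + O_i) / (beta + E_i)
    under O_i ~ Poisson(theta_i * E_i), theta_i ~ Gamma(alpha, rate=beta)."""
    return [(alpha + o) / (beta + e) for o, e in zip(observed, expected)]

observed = [0, 5, 12]        # hypothetical case counts; one zero-count district
expected = [2.0, 4.0, 10.0]  # expected counts from the reference rates
ratios_smr = smr(observed, expected)                          # 0, 1.25, 1.2
ratios_pg = poisson_gamma_rr(observed, expected, 1.0, 1.0)    # 1/3, 6/5, 13/11
# the Poisson-gamma estimate shrinks the zero-count district toward the
# prior mean alpha/beta instead of declaring its risk to be exactly zero
```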

  16. Dynamic Regression Intervention Modeling for the Malaysian Daily Load

    Directory of Open Access Journals (Sweden)

    Fadhilah Abdrazak

    2014-05-01

    Malaysia is a unique country in having both fixed and moving holidays. These moving holidays may overlap with other fixed holidays and therefore increase the complexity of load forecasting activities. The errors due to holiday effects in load forecasting are known to be higher than those from other factors. If these effects can be estimated and removed, the behavior of the series can be better viewed. Thus, the aim of this paper is to improve the forecasting errors by using a dynamic regression model with intervention analysis. Based on the linear transfer function method, a daily load model for either the peak or the average load is developed. The developed model outperformed the seasonal ARIMA model in estimating the effects of fixed and moving holidays and achieved a smaller Mean Absolute Percentage Error (MAPE) in the load forecast.

  17. Learning Supervised Topic Models for Classification and Regression from Crowds.

    Science.gov (United States)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete; Pereira, Francisco C

    2017-12-01

    The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, makes learning under a single-annotator assumption unrealistic or impractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

  18. Continuous validation of ASTEC containment models and regression testing

    International Nuclear Information System (INIS)

    Nowack, Holger; Reinke, Nils; Sonnenkalb, Martin

    2014-01-01

    The focus of the ASTEC (Accident Source Term Evaluation Code) development at GRS is primarily on the containment module CPA (Containment Part of ASTEC), whose modelling is to a large extent based on the GRS containment code COCOSYS (COntainment COde SYStem). Validation is usually understood as the approval of the modelling capabilities through calculations of appropriate experiments by external users different from the code developers. During the development process of ASTEC CPA, bugs and unintended side effects may occur, which leads to changes in the results of the initially conducted validation. Due to the involvement of a considerable number of developers in the coding of ASTEC modules, validation of the code alone, even if executed repeatedly, is not sufficient. Therefore, a regression testing procedure has been implemented in order to ensure that the initially obtained validation results remain valid in succeeding code versions. Within the regression testing procedure, calculations of experiments and plant sequences are performed with the same input deck but applying two different code versions. For every test case, the up-to-date code version is compared to the preceding one on the basis of physical parameters deemed to be characteristic for the test case under consideration. In the case of post-calculations of experiments, a comparison to experimental data is also carried out. Three validation cases from the regression testing procedure are presented in this paper. The very good post-calculation of the HDR E11.1 experiment shows the high quality of the thermal-hydraulics modelling in ASTEC CPA. Aerosol behaviour is validated on the BMC VANAM M3 experiment, and the results also show very good agreement with the experimental data. Finally, iodine behaviour is checked in the validation test case of the THAI IOD-11 experiment, within which the comparison of the ASTEC versions V2.0r1 and V2.0r2 shows how an error was detected by the regression testing procedure.

  19. Modeling of the Monthly Rainfall-Runoff Process Through Regressions

    Directory of Open Access Journals (Sweden)

    Campos-Aranda Daniel Francisco

    2014-10-01

    To solve the problems associated with assessing the water resources of a river, modeling of the rainfall-runoff process (RRP) allows the deduction of missing runoff data and the extension of the runoff record, since the information available on precipitation is generally more extensive. It also enables the estimation of inputs to reservoirs when their construction led to the suppression of the gauging station. The simplest mathematical model that can be set up for the RRP is the linear regression, or curve, on a monthly basis. Such a model is described in detail and is calibrated with the simultaneous record of monthly rainfall and runoff at the Ballesmi hydrometric station, which covers 35 years. Since the runoff at this station has an important contribution from spring discharge, the record is first corrected by removing that contribution. To do this, a procedure was developed based either on the monthly average regional runoff coefficients or on a nearby and similar watershed; in this case the Tancuilín gauging station was used. Both stations belong to Partial Hydrologic Region No. 26 (Lower Rio Panuco) and are located within the state of San Luis Potosí, México. The study indicates that the monthly regression model, due to its conceptual approach, faithfully reproduces monthly average runoff volumes and achieves an excellent approximation of the dispersion, as verified by calculation of the means and standard deviations.
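The monthly linear-regression model of the RRP reduces to an ordinary least-squares fit, and one reason it reproduces monthly average runoff so faithfully is structural: an intercept-and-slope least-squares fit always matches the mean of the calibration record exactly. A sketch with hypothetical monthly data, not the Ballesmi record:

```python
from statistics import mean

def calibrate(rainfall, runoff):
    """Least-squares fit of runoff = a + b * rainfall; returns (a, b)."""
    pb, qb = mean(rainfall), mean(runoff)
    b = sum((p - pb) * (q - qb) for p, q in zip(rainfall, runoff)) \
        / sum((p - pb) ** 2 for p in rainfall)
    return qb - b * pb, b

rain = [10, 40, 80, 120, 150, 90, 30, 5]  # hypothetical monthly rainfall, mm
flow = [2, 9, 20, 31, 38, 22, 7, 1]       # hypothetical monthly runoff volumes
a, b = calibrate(rain, flow)
fitted_mean = mean(a + b * p for p in rain)  # equals mean(flow) by construction
```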

  20. Genetic evaluation of European quails by random regression models

    Directory of Open Access Journals (Sweden)

    Flaviana Miranda Gonçalves

    2012-09-01

    The objective of this study was to compare different random regression models, defined from different classes of heterogeneity of variance combined with different Legendre polynomial orders, for the estimation of the (co)variances of quails. The data came from 28,076 observations of 4,507 female meat quails of the LF1 lineage. Quail body weights were determined at birth and at 1, 14, 21, 28, 35 and 42 days of age. Six different classes of residual variance were fitted to Legendre polynomial functions (orders ranging from 2 to 6) to determine which model best described the (co)variance structures as a function of time. According to the evaluated criteria (AIC, BIC and LRT), the model with six classes of residual variance and a sixth-order Legendre polynomial was the best fit. The estimated additive genetic variance increased from birth to 28 days of age, and dropped slightly from 35 to 42 days. The heritability estimates decreased along the growth curve, from 0.51 (1 day) to 0.16 (42 days). Animal genetic and permanent environmental correlation estimates between weights and age classes were always high and positive, except for birth weight. The sixth-order Legendre polynomial, along with the residual variance divided into six classes, gave the best fit for the growth curve of meat quails; therefore, it should be considered in breeding evaluation processes using random regression models.

  1. Interpreting parameters in the logistic regression model with random effects

    DEFF Research Database (Denmark)

    Larsen, Klaus; Petersen, Jørgen Holm; Budtz-Jørgensen, Esben

    2000-01-01

    interpretation, interval odds ratio, logistic regression, median odds ratio, normally distributed random effects

  2. [Use of multiple regression models in observational studies (1970-2013) and requirements of the STROBE guidelines in Spanish scientific journals].

    Science.gov (United States)

    Real, J; Cleries, R; Forné, C; Roso-Llorach, A; Martínez-Sánchez, J M

    In medicine and biomedical research, statistical techniques like logistic, linear, Cox and Poisson regression are widely known. The main objective is to describe the evolution of the multivariate techniques used in observational studies indexed in PubMed (1970-2013), and to check the requirements of the STROBE guidelines in the author guidelines of Spanish journals indexed in PubMed. A targeted PubMed search was performed to identify papers that used logistic, linear, Cox and Poisson models. Furthermore, a review was also made of the author guidelines of journals published in Spain and indexed in PubMed and Web of Science. Only 6.1% of the indexed manuscripts included a term related to multivariate analysis, increasing from 0.14% in 1980 to 12.3% in 2013. In 2013, 6.7%, 2.5%, 3.5%, and 0.31% of the manuscripts contained terms related to logistic, linear, Cox and Poisson regression, respectively. On the other hand, 12.8% of the journals' author guidelines explicitly recommend following the STROBE guidelines, and 35.9% recommend the CONSORT guideline. A low percentage of Spanish scientific journals indexed in PubMed include the STROBE statement requirement in their author guidelines. Multivariate regression models in published observational studies, such as logistic, linear, Cox and Poisson regression, are increasingly used both at the international level and in journals published in Spanish. Copyright © 2015 Sociedad Española de Médicos de Atención Primaria (SEMERGEN). Publicado por Elsevier España, S.L.U. All rights reserved.

  3. Learning Supervised Topic Models for Classification and Regression from Crowds

    DEFF Research Database (Denmark)

    Rodrigues, Filipe; Lourenco, Mariana; Ribeiro, Bernardete

    2017-01-01

    ...annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages...

  4. Preference learning with evolutionary Multivariate Adaptive Regression Spline model

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Shaker, Noor; Christensen, Mads Græsbøll

    2015-01-01

    This paper introduces a novel approach for pairwise preference learning through combining an evolutionary method with Multivariate Adaptive Regression Spline (MARS). Collecting users' feedback through pairwise preferences is recommended over other ranking approaches as this method is more appealing...... for function approximation as well as being relatively easy to interpret. MARS models are evolved based on their efficiency in learning pairwise data. The method is tested on two datasets that collectively provide pairwise preference data of five cognitive states expressed by users. The method is analysed...

  5. Predicting Performance on MOOC Assessments using Multi-Regression Models

    OpenAIRE

    Ren, Zhiyun; Rangwala, Huzefa; Johri, Aditya

    2016-01-01

    The past few years have seen the rapid growth of data mining approaches for the analysis of data obtained from Massive Open Online Courses (MOOCs). The objectives of this study are to develop approaches to predict the scores a student may achieve on a given grade-related assessment based on information considered as prior performance or prior activity in the course. We develop a personalized linear multiple regression (PLMR) model to predict the grade for a student, prior to attempt...

  6. Analytical and regression models of glass rod drawing process

    Science.gov (United States)

    Alekseeva, L. B.

    2018-03-01

    The process of drawing glass rods (light guides) is being studied. The parameters of the process affecting the quality of the light guide have been determined. To solve the problem, mathematical models based on general equations of continuum mechanics are used. The conditions for the stable flow of the drawing process have been found, which are determined by the stability of the motion of the glass mass in the formation zone to small uncontrolled perturbations. The sensitivity of the formation zone to perturbations of the drawing speed and viscosity is estimated. Experimental models of the drawing process, based on the regression analysis methods, have been obtained. These models make it possible to customize a specific production process to obtain light guides of the required quality. They allow one to find the optimum combination of process parameters in the chosen area and to determine the required accuracy of maintaining them at a specified level.

  7. Regression Models for Predicting Force Coefficients of Aerofoils

    Directory of Open Access Journals (Sweden)

    Mohammed ABDUL AKBAR

    2015-09-01

    Renewable sources of energy are attractive and advantageous in many different ways. Among the renewable energy sources, wind energy is the fastest growing type. Among wind energy converters, vertical axis wind turbines (VAWTs) have received renewed interest in the past decade due to some of the advantages they possess over their horizontal axis counterparts. VAWTs have evolved into complex 3-D shapes. A key component in predicting the output of VAWTs through analytical studies is obtaining the values of the lift and drag coefficients, which are functions of the shape of the aerofoil, the angle of attack of the wind, and the Reynolds number of the flow. Sandia National Laboratories have carried out extensive experiments on aerofoils for Reynolds numbers in the range of those experienced by VAWTs. The volume of experimental data thus obtained is huge. The current paper discusses three regression models developed so that lift and drag coefficients can be found using simple formulas, without having to deal with the bulk of the data. Drag and lift coefficients were successfully estimated by the regression models, with R2 values as high as 0.98.

  8. Review and Recommendations for Zero-inflated Count Regression Modeling of Dental Caries Indices in Epidemiological Studies

    Science.gov (United States)

    Stamm, John W.; Long, D. Leann; Kincade, Megan E.

    2012-01-01

    Over the past five to ten years, zero-inflated count regression models have been increasingly applied to the analysis of dental caries indices (e.g., DMFT, dfms, etc.). The main reason for this is the broad decline in children's caries experience, such that dmf and DMF indices more frequently generate low or even zero counts. This article specifically reviews the application of zero-inflated Poisson and zero-inflated negative binomial regression models to dental caries, with emphasis on the description of the models and the interpretation of fitted model results given the study goals. The review finds that the interpretations provided in the published caries research are often imprecise or inadvertently misleading, particularly with respect to failing to discriminate between inference for the class of susceptible persons defined by such models and inference for the sampled population in terms of overall exposure effects. Recommendations are provided to enhance the use, interpretation and reporting of results of count regression models when applied to epidemiological studies of dental caries. PMID:22710271
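The zero-inflated Poisson model mixes a point mass at zero (the non-susceptible class) with a Poisson count for the susceptible class, and the distinction between the susceptible-class mean lam and the overall mean (1 - pi) * lam is exactly the inferential trap the review warns about. A minimal sketch:

```python
import math

def zip_pmf(k, pi, lam):
    """P(Y = k) for zero-inflated Poisson: with probability pi the subject is
    non-susceptible (always zero); otherwise the count is Poisson(lam)."""
    pois = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * pois

def zip_loglik(counts, pi, lam):
    """Log-likelihood of a sample of caries counts under the ZIP model."""
    return sum(math.log(zip_pmf(k, pi, lam)) for k in counts)

pi, lam = 0.3, 2.0  # illustrative parameters
overall_mean = sum(k * zip_pmf(k, pi, lam) for k in range(60))
# overall population mean is (1 - pi) * lam = 1.4, not the
# susceptible-class mean lam = 2.0
```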

  9. Seasonally adjusted birth frequencies follow the Poisson distribution.

    Science.gov (United States)

    Barra, Mathias; Lindstrøm, Jonas C; Adams, Samantha S; Augestad, Liv A

    2015-12-15

    Variations in birth frequencies have an impact on activity planning in maternity wards. Previous studies of this phenomenon have commonly included elective births. A Danish study of spontaneous births found that birth frequencies were well modelled by a Poisson process. Somewhat unexpectedly, there were also weekly variations in the frequency of spontaneous births. Another study claimed that birth frequencies follow the Benford distribution. Our objective was to test these results. We analysed 50,017 spontaneous births at Akershus University Hospital in the period 1999-2014. To investigate the Poisson distribution of these births, we plotted their variance over a sliding average. We specified various Poisson regression models, with the number of births on a given day as the outcome variable. The explanatory variables included various combinations of years, months, days of the week and the digit sum of the date. The relationship between the variance and the average fits well with an underlying Poisson process. A Benford distribution was disproved by a goodness-of-fit test. Birth frequencies follow a Poisson process once monthly and day-of-the-week variation is included. The frequency is highest in summer, towards June and July; Friday and Tuesday stand out as particularly busy days; and the activity level is at its lowest during weekends.
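The variance-over-average diagnostic works because a Poisson count has variance equal to its mean, so the ratio should hover around one. A simulation sketch with a hypothetical daily rate, not the Akershus figures:

```python
import math
import random
from statistics import mean, variance

def poisson_draw(lam, rng):
    """Knuth's multiplication method for a Poisson(lam) variate."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(7)
daily_births = [poisson_draw(9.0, rng) for _ in range(5000)]  # ~9 births/day
ratio = variance(daily_births) / mean(daily_births)
# for Poisson-distributed counts this variance/mean ratio stays close to 1;
# a ratio well above 1 would signal overdispersion instead
```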

  10. Using Poisson-gamma model to evaluate the duration of recruitment process when historical trials are available.

    Science.gov (United States)

    Minois, Nathan; Lauwers-Cances, Valérie; Savy, Stéphanie; Attal, Michel; Andrieu, Sandrine; Anisimov, Vladimir; Savy, Nicolas

    2017-10-15

    At the design stage of a clinical trial, a question of paramount interest is how long it will take to recruit a given number of patients. Modelling the recruitment dynamics is the necessary step to answer this question. The Poisson-gamma model provides a very convenient, flexible and realistic approach. This model allows predicting the trial duration using data collected at an interim time with very good accuracy. A natural question arises: how to evaluate the parameters of the recruitment model before the trial begins? The question is harder to handle as there are no recruitment data available for this trial. However, if there exist similar completed trials, it is appealing to use data from these trials to investigate the feasibility of the recruitment process. In this paper, the authors explore the recruitment data of two similar clinical trials (Intergroupe Francais du Myélome 2005 and 2009). It is shown that the natural idea of plugging the historical rates estimated from the completed trial into the same centres of the new trial for predicting recruitment is not a relevant strategy. In contrast, using the parameters of a gamma distribution of the rates estimated from the completed trial in the recruitment dynamic model of the new trial provides reasonable predictive properties with relevant confidence intervals. Copyright © 2017 John Wiley & Sons, Ltd.
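A minimal simulation sketch of the Poisson-gamma idea: each centre's recruitment rate is drawn from a gamma distribution (whose parameters would be estimated from the completed trial), and the time to reach the target sample size follows from the summed rate. All numbers below are illustrative, not values from the trials studied in the paper:

```python
import random
import statistics

random.seed(0)

def predicted_duration(n_target, n_centres, shape, scale, n_sims=2000):
    """Mean time (days) to recruit n_target patients when centre rates are
    Gamma(shape, scale) and recruitment is a Poisson process: the waiting
    time for n_target events at total rate R is Gamma(n_target, 1/R)."""
    durations = []
    for _ in range(n_sims):
        total_rate = sum(random.gammavariate(shape, scale)
                         for _ in range(n_centres))
        durations.append(random.gammavariate(n_target, 1.0 / total_rate))
    return statistics.mean(durations)

# 40 centres with mean rate shape * scale = 0.25 patients/centre/day,
# so roughly 10 patients/day overall and ~30 days for 300 patients.
days = predicted_duration(n_target=300, n_centres=40, shape=2.0, scale=0.125)
```

Propagating the gamma distribution of rates, rather than plugging in fixed historical point estimates, is what yields the paper's more honest confidence intervals.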

  11. Complex Environmental Data Modelling Using Adaptive General Regression Neural Networks

    Science.gov (United States)

    Kanevski, Mikhail

    2015-04-01

    The research deals with an adaptation and application of Adaptive General Regression Neural Networks (GRNN) to high dimensional environmental data. GRNN [1,2,3] are efficient modelling tools both for spatial and temporal data and are based on nonparametric kernel methods closely related to the classical Nadaraya-Watson estimator. Adaptive GRNN, using anisotropic kernels, can also be applied to feature selection tasks when working with high dimensional data [1,3]. In the present research Adaptive GRNN are used to study geospatial data predictability and relevant feature selection using both simulated and real data case studies. The original raw data were either three dimensional monthly precipitation data or monthly wind speeds embedded into 13 dimensional space constructed by geographical coordinates and geo-features calculated from digital elevation model. GRNN were applied in two different ways: 1) adaptive GRNN with the resulting list of features ordered according to their relevancy; and 2) adaptive GRNN applied to evaluate all possible models N [in case of wind fields N=(2^13 -1)=8191] and rank them according to the cross-validation error. In both cases, training was carried out using a leave-one-out procedure. An important result of the study is that the set of the most relevant features depends on the month (strong seasonal effect) and year. The predictabilities of precipitation and wind field patterns, estimated using the cross-validation and testing errors of raw and shuffled data, were studied in detail. The results of both approaches were qualitatively and quantitatively compared. In conclusion, Adaptive GRNN with their ability to select features and efficient modelling of complex high dimensional data can be widely used in automatic/on-line mapping and as an integrated part of environmental decision support systems. 1. Kanevski M., Pozdnoukhov A., Timonin V. Machine Learning for Spatial Environmental Data. Theory, applications and software. EPFL Press
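GRNN predictions are essentially Nadaraya-Watson kernel-weighted averages of the training outputs. A one-dimensional sketch with a Gaussian kernel; the data and bandwidth are illustrative, and the adaptive variant described above would tune an anisotropic bandwidth per input dimension:

```python
import math

def nadaraya_watson(x0, xs, ys, bandwidth):
    """Kernel-weighted average of the observed outputs (Gaussian kernel)."""
    weights = [math.exp(-(x0 - x) ** 2 / (2.0 * bandwidth ** 2)) for x in xs]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 1.1, 1.9, 3.2, 3.9]          # noisy samples of roughly y = x
estimate = nadaraya_watson(2.0, xs, ys, bandwidth=0.5)
```

Shrinking the bandwidth makes the fit more local (toward interpolation); widening it smooths toward the global mean. Choosing that trade-off is exactly what the leave-one-out training in the study does, per kernel direction.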

  12. Conditional Monte Carlo randomization tests for regression models.

    Science.gov (United States)

    Parhat, Parwen; Rosenberger, William F; Diao, Guoqing

    2014-08-15

    We discuss the computation of randomization tests for clinical trials of two treatments when the primary outcome is based on a regression model. We begin by revisiting the seminal paper of Gail, Tan, and Piantadosi (1988), and then describe a method based on Monte Carlo generation of randomization sequences. The tests based on this Monte Carlo procedure are design based, in that they incorporate the particular randomization procedure used. We discuss permuted block designs, complete randomization, and biased coin designs. We also use a new technique by Plamadeala and Rosenberger (2012) for simple computation of conditional randomization tests. Like Gail, Tan, and Piantadosi, we focus on residuals from generalized linear models and martingale residuals from survival models. Such techniques do not apply to longitudinal data analysis, and we introduce a method for computation of randomization tests based on the predicted rate of change from a generalized linear mixed model when outcomes are longitudinal. We show, by simulation, that these randomization tests preserve the size and power well under model misspecification. Copyright © 2014 John Wiley & Sons, Ltd.
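The core Monte Carlo idea of the paper is to hold the outcomes fixed, regenerate the randomization sequence many times under the actual allocation procedure, and recompute the test statistic each time. A simplified sketch for a difference in means under complete randomization (a fair coin per subject); the paper's tests instead use model residuals as the statistic, and the data here are invented:

```python
import random
import statistics

random.seed(7)

def mc_randomization_pvalue(outcomes, assignment, n_rand=4000):
    """Two-sided Monte Carlo randomization p-value for a mean difference
    under complete randomization."""
    def mean_diff(groups):
        a = [y for y, g in zip(outcomes, groups) if g == 1]
        b = [y for y, g in zip(outcomes, groups) if g == 0]
        if not a or not b:          # degenerate sequence: one empty arm
            return 0.0
        return statistics.mean(a) - statistics.mean(b)

    observed = abs(mean_diff(assignment))
    hits = sum(1 for _ in range(n_rand)
               if abs(mean_diff([random.randint(0, 1) for _ in outcomes]))
               >= observed)
    return (hits + 1) / (n_rand + 1)    # add-one to avoid a zero p-value

y = [5.1, 4.8, 6.0, 5.5, 9.7, 9.1, 10.2, 9.9]
g = [0, 0, 0, 0, 1, 1, 1, 1]            # treated group clearly higher
p = mc_randomization_pvalue(y, g)
```

Because the reference distribution is generated by the same procedure used in the trial, the test is design based: swapping in permuted blocks or a biased coin only changes how the candidate sequences are drawn.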

  13. Genomic breeding value estimation using nonparametric additive regression models

    Directory of Open Access Journals (Sweden)

    Solberg Trygve

    2009-01-01

    Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors) separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped) were predicted using data from the next-to-last generation (genotyped and phenotyped). The results show moderate to high accuracies of the predicted breeding values. A determination of predictor-specific degree of smoothing increased the accuracy.

  14. Global Land Use Regression Model for Nitrogen Dioxide Air Pollution.

    Science.gov (United States)

    Larkin, Andrew; Geddes, Jeffrey A; Martin, Randall V; Xiao, Qingyang; Liu, Yang; Marshall, Julian D; Brauer, Michael; Hystad, Perry

    2017-06-20

    Nitrogen dioxide is a common air pollutant with growing evidence of health impacts independent of other common pollutants such as ozone and particulate matter. However, the worldwide distribution of NO 2 exposure and associated impacts on health is still largely uncertain. To advance global exposure estimates we created a global nitrogen dioxide (NO 2 ) land use regression model for 2011 using annual measurements from 5,220 air monitors in 58 countries. The model captured 54% of global NO 2 variation, with a mean absolute error of 3.7 ppb. Regional performance varied from R 2 = 0.42 (Africa) to 0.67 (South America). Repeated 10% cross-validation using bootstrap sampling (n = 10,000) demonstrated a robust performance with respect to air monitor sampling in North America, Europe, and Asia (adjusted R 2 within 2%) but not for Africa and Oceania (adjusted R 2 within 11%) where NO 2 monitoring data are sparse. The final model included 10 variables that captured both between and within-city spatial gradients in NO 2 concentrations. Variable contributions differed between continental regions, but major roads within 100 m and satellite-derived NO 2 were consistently the strongest predictors. The resulting model can be used for global risk assessments and health studies, particularly in countries without existing NO 2 monitoring data or models.

  15. Drought Patterns Forecasting using an Auto-Regressive Logistic Model

    Science.gov (United States)

    del Jesus, M.; Sheffield, J.; Méndez Incera, F. J.; Losada, I. J.; Espejo, A.

    2014-12-01

    Drought is characterized by a water deficit that may manifest across a large range of spatial and temporal scales. Drought may create important socio-economic consequences, many times of catastrophic dimensions. A quantifiable definition of drought is elusive because, depending on its impacts, consequences and generation mechanism, different water deficit periods may be identified as a drought by virtue of some definitions but not by others. Droughts are linked to the water cycle and, although a climate change signal may not have emerged yet, they are also intimately linked to climate. In this work we develop an auto-regressive logistic model for drought prediction at different temporal scales that makes use of a spatially explicit framework. Our model allows the inclusion of covariates, continuous or categorical, to improve the performance of the auto-regressive component. Our approach makes use of dimensionality reduction (principal component analysis) and classification techniques (K-means and maximum dissimilarity) to simplify the representation of complex climatic patterns, such as sea surface temperature (SST) and sea level pressure (SLP), while including information on their spatial structure, i.e. considering their spatial patterns. This procedure allows us to include in the analysis multivariate representations of complex climatic phenomena, such as the El Niño-Southern Oscillation. We also explore the impact of other climate-related variables such as sunspots. The model allows the uncertainty of the forecasts to be quantified and can be easily adapted to make predictions under future climatic scenarios. The framework herein presented may be extended to other applications such as flash flood analysis or risk assessment of natural hazards.

  16. A Gompertz regression model for fern spores germination

    Directory of Open Access Journals (Sweden)

    Gabriel y Galán, Jose María

    2015-06-01

    Full Text Available Germination is one of the most important biological processes for both seed and spore plants, as well as for fungi. At present, mathematical models of germination have been developed for fungi, bryophytes and several plant species. However, ferns are the only group whose germination has never been modelled. In this work we develop a regression model of the germination of fern spores. We have found that for the species Blechnum serrulatum, Blechnum yungense, Cheilanthes pilosa, Niphidium macbridei and Polypodium feuillei the Gompertz growth model describes cumulative germination satisfactorily. An important result is that the regression parameters are independent of fern species and the model is not affected by intraspecific variation. Our results show that the Gompertz curve represents a general germination model for all the non-green spore leptosporangiate ferns, and we include a discussion of the physiological and ecological meaning of the model.
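A sketch of the Gompertz curve used for cumulative germination, G(t) = a·exp(−b·exp(−k·t)), with illustrative parameter values (a is the final germination fraction, b a displacement term, k the rate; none are the paper's fitted values):

```python
import math

def gompertz(t, a, b, k):
    """Cumulative germination fraction at time t (e.g. days)."""
    return a * math.exp(-b * math.exp(-k * t))

# A sigmoidal trajectory: near zero at t = 0, rising to the asymptote a.
curve = [gompertz(t, a=0.9, b=5.0, k=0.4) for t in range(31)]
```

The claim that the regression parameters are species-independent means a single (a, b, k) triple, fitted once, describes the cumulative germination of all the species listed.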

  17. THE REGRESSION MODEL OF IRAN LIBRARIES ORGANIZATIONAL CLIMATE.

    Science.gov (United States)

    Jahani, Mohammad Ali; Yaminfirooz, Mousa; Siamian, Hasan

    2015-10-01

    The purpose of this study was to draw a regression model of the organizational climate of the central libraries of Iran's universities. This study is an applied research. The statistical population consisted of 96 employees of the central libraries of Iran's public universities, selected by stratified sampling from the 117 universities affiliated to the Ministry of Health (510 people). The localized ClimateQual questionnaire was used as the research tool. Multivariate linear regression and a path diagram were used to predict the organizational climate pattern of the libraries. Of the 9 variables affecting organizational climate, the 5 variables of innovation, teamwork, customer service, psychological safety and deep diversity play a major role in predicting the organizational climate of Iran's libraries. The results also indicate that each of these variables, with different coefficients, has the power to predict organizational climate, but the climate score of psychological safety (0.94) plays a very crucial role in the prediction. The path diagram showed that the five variables of teamwork, customer service, psychological safety, deep diversity and innovation directly affect the organizational climate variable, with teamwork contributing more to this influence than any other variable. Accordingly, reinforcement of teamwork in academic libraries can be especially effective in improving their organizational climate.

  18. Extending the Solvation-Layer Interface Condition Continuum Electrostatic Model to a Linearized Poisson-Boltzmann Solvent.

    Science.gov (United States)

    Molavi Tabrizi, Amirhossein; Goossens, Spencer; Mehdizadeh Rahimi, Ali; Cooper, Christopher D; Knepley, Matthew G; Bardhan, Jaydeep P

    2017-06-13

    We extend the linearized Poisson-Boltzmann (LPB) continuum electrostatic model for molecular solvation to address charge-hydration asymmetry. Our new solvation-layer interface condition (SLIC)/LPB corrects for first-shell response by perturbing the traditional continuum-theory interface conditions at the protein-solvent and the Stern-layer interfaces. We also present a GPU-accelerated treecode implementation capable of simulating large proteins, and our results demonstrate that the new model exhibits significant accuracy improvements over traditional LPB models, while reducing the number of fitting parameters from dozens (atomic radii) to just five parameters, which have physical meanings related to first-shell water behavior at an uncharged interface. In particular, atom radii in the SLIC model are not optimized but uniformly scaled from their Lennard-Jones radii. Compared to explicit-solvent free-energy calculations of individual atoms in small molecules, SLIC/LPB is significantly more accurate than standard parametrizations (RMS error 0.55 kcal/mol for SLIC, compared to RMS error of 3.05 kcal/mol for standard LPB). On parametrizing the electrostatic model with a simple nonpolar component for total molecular solvation free energies, our model predicts octanol/water transfer free energies with an RMS error 1.07 kcal/mol. A more detailed assessment illustrates that standard continuum electrostatic models reproduce total charging free energies via a compensation of significant errors in atomic self-energies; this finding offers a window into improving the accuracy of Generalized-Born theories and other coarse-grained models. Most remarkably, the SLIC model also reproduces positive charging free energies for atoms in hydrophobic groups, whereas standard PB models are unable to generate positive charging free energies regardless of the parametrized radii. The GPU-accelerated solver is freely available online, as is a MATLAB implementation.

  19. Meta-Modeling by Symbolic Regression and Pareto Simulated Annealing

    NARCIS (Netherlands)

    Stinstra, E.; Rennen, G.; Teeuwen, G.J.A.

    2006-01-01

    The subject of this paper is a new approach to Symbolic Regression. Other publications on Symbolic Regression use Genetic Programming. This paper describes an alternative method based on Pareto Simulated Annealing. Our method is based on linear regression for the estimation of constants. Interval

  20. Modeling Information Content Via Dirichlet-Multinomial Regression Analysis.

    Science.gov (United States)

    Ferrari, Alberto

    2017-01-01

    Shannon entropy is being increasingly used in biomedical research as an index of complexity and information content in sequences of symbols, e.g. languages, amino acid sequences, DNA methylation patterns and animal vocalizations. Yet, distributional properties of information entropy as a random variable have seldom been the object of study, leading to researchers mainly using linear models or simulation-based analytical approach to assess differences in information content, when entropy is measured repeatedly in different experimental conditions. Here a method to perform inference on entropy in such conditions is proposed. Building on results coming from studies in the field of Bayesian entropy estimation, a symmetric Dirichlet-multinomial regression model, able to deal efficiently with the issue of mean entropy estimation, is formulated. Through a simulation study the model is shown to outperform linear modeling in a vast range of scenarios and to have promising statistical properties. As a practical example, the method is applied to a data set coming from a real experiment on animal communication.
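A minimal sketch of the contrast the paper draws: the plug-in Shannon entropy of observed symbol counts versus a simple Bayesian point estimate, here the entropy of the posterior-mean probabilities under a symmetric Dirichlet(α) prior. The choice α = 1 and the counts are illustrative, not the paper's estimator or data:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def dirichlet_smoothed_entropy(counts, alpha=1.0):
    """Entropy of the posterior-mean probabilities under a symmetric
    Dirichlet(alpha) prior: p_i = (n_i + alpha) / (n + K * alpha)."""
    n, k = sum(counts), len(counts)
    return shannon_entropy([(c + alpha) / (n + k * alpha) for c in counts])

counts = [10, 5, 0, 1]                   # observed symbol frequencies
plug_in = shannon_entropy([c / sum(counts) for c in counts])
smoothed = dirichlet_smoothed_entropy(counts)
```

Smoothing pulls unseen symbols away from zero probability, so the Bayesian point estimate exceeds the plug-in value here; the downward bias of plug-in entropy at small samples is one motivation for the Dirichlet-multinomial treatment.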

  1. Variable selection in Logistic regression model with genetic algorithm.

    Science.gov (United States)

    Zhang, Zhongheng; Trevino, Victor; Hoseini, Sayed Shahabuddin; Belciug, Smaranda; Boopathi, Arumugam Manivanna; Zhang, Ping; Gorunescu, Florin; Subha, Velappan; Dai, Songshi

    2018-02-01

    Variable or feature selection is one of the most important steps in model specification. Especially in the case of medical decision-making, the direct use of a medical database, without a previous analysis and preprocessing step, is often counterproductive. In this way, variable selection represents the method of choosing the most relevant attributes from the database in order to build robust learning models and, thus, to improve the performance of the models used in the decision process. In biomedical research, the purpose of variable selection is to select clinically important and statistically significant variables, while excluding unrelated or noise variables. A variety of methods exist for variable selection, but none of them is without limitations. For example, the stepwise approach, which is highly used, adds the best variable in each cycle, generally producing an acceptable set of variables. Nevertheless, it is limited by the fact that it is commonly trapped in local optima. The best subset approach can systematically search the entire covariate pattern space, but the solution pool can be extremely large with tens to hundreds of variables, which is the case in present-day clinical data. Genetic algorithms (GA) are heuristic optimization approaches and can be used for variable selection in multivariable regression models. This tutorial paper aims to provide a step-by-step approach to the use of GA in variable selection. The R code provided in the text can be extended and adapted to other data analysis needs.
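A toy sketch of GA-based subset selection on bit-string chromosomes. The fitness function below is a stand-in for a real model criterion such as AIC from a fitted regression (it rewards three arbitrarily designated "informative" variables and penalizes noise variables); the GA machinery itself — selection, one-point crossover, bit-flip mutation — is the part the tutorial describes:

```python
import random

random.seed(3)

N_VARS = 10
INFORMATIVE = {0, 2, 5}    # toy ground truth standing in for real signal

def fitness(mask):
    """Stand-in for a model criterion such as AIC: reward informative
    variables, penalize each included noise variable."""
    hits = sum(1 for i in INFORMATIVE if mask[i])
    noise = sum(1 for i in range(N_VARS)
                if mask[i] and i not in INFORMATIVE)
    return hits - 0.5 * noise

def genetic_search(pop_size=30, generations=40, p_mut=0.1):
    pop = [[random.randint(0, 1) for _ in range(N_VARS)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]          # elitist truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, N_VARS)  # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(N_VARS):            # bit-flip mutation
                if random.random() < p_mut:
                    child[i] = 1 - child[i]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = genetic_search()
```

Unlike stepwise selection, the population explores many subsets in parallel, and mutation lets it escape the local optima mentioned above.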

  2. Electricity prices forecasting by automatic dynamic harmonic regression models

    International Nuclear Information System (INIS)

    Pedregal, Diego J.; Trapero, Juan R.

    2007-01-01

    The changes experienced by electricity markets in recent years have created the necessity for more accurate forecast tools of electricity prices, both for producers and consumers. Many methodologies have been applied to this aim, but in the view of the authors, state space models are not yet fully exploited. The present paper proposes a univariate dynamic harmonic regression model set up in a state space framework for forecasting prices in these markets. The advantages of the approach are threefold. Firstly, a fast automatic identification and estimation procedure is proposed based on the frequency domain. Secondly, the recursive algorithms applied offer adaptive predictions that compare favourably with respect to other techniques. Finally, since the method is based on unobserved components models, explicit information about trend, seasonal and irregular behaviours of the series can be extracted. This information is of great value to the electricity companies' managers in order to improve their strategies, i.e. it provides management innovations. The good forecast performance and the rapid adaptability of the model to changes in the data are illustrated with actual prices taken from the PJM interconnection in the US and for the Spanish market for the year 2002. (author)

  3. Characteristics and Properties of a Simple Linear Regression Model

    Directory of Open Access Journals (Sweden)

    Kowal Robert

    2016-12-01

    Full Text Available A simple linear regression model is one of the pillars of classic econometrics. Despite the passage of time, it continues to raise interest both from the theoretical side as well as from the application side. One of the many fundamental questions in the model concerns determining derivative characteristics and studying the properties existing in their scope, referring to the first of these aspects. The literature of the subject provides several classic solutions in that regard. In the paper, a completely new design is proposed, based on the direct application of variance and its properties, resulting from the non-correlation of certain estimators with the mean, within the scope of which some fundamental dependencies of the model characteristics are obtained in a much more compact manner. The apparatus allows for a simple and uniform demonstration of multiple dependencies and fundamental properties in the model, and it does it in an intuitive manner. The results were obtained in a classic, traditional area, where everything, as it might seem, has already been thoroughly studied and discovered.
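The variance-based derivation the paper refers to yields the familiar closed forms: slope = cov(x, y)/var(x) and intercept = ȳ − slope·x̄. A minimal sketch, with example data chosen to lie exactly on y = 1 + 2x:

```python
def ols_fit(xs, ys):
    """Simple linear regression: slope = cov(x, y) / var(x); the
    common normalizer of cov and var cancels, so sums suffice."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    cov = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    var = sum((x - xbar) ** 2 for x in xs)
    slope = cov / var
    return ybar - slope * xbar, slope      # (intercept, slope)

intercept, slope = ols_fit([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])
```

The non-correlation of the residuals with the fitted values, which the paper exploits, follows directly from this construction.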

  4. Bayesian Regression of Thermodynamic Models of Redox Active Materials

    Energy Technology Data Exchange (ETDEWEB)

    Johnston, Katherine [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-09-01

    Finding a suitable functional redox material is a critical challenge to achieving scalable, economically viable technologies for storing concentrated solar energy in the form of a defected oxide. Demonstrating effectiveness for thermal storage or solar fuel is largely accomplished by using a thermodynamic model derived from experimental data. The purpose of this project is to test the accuracy of our regression model on representative data sets. Determining the accuracy of the model includes fitting the model parameters to the data, comparing models using different numbers of parameters, and analyzing the entropy and enthalpy calculated from the model. Three data sets were considered in this project: two demonstrating materials for solar fuels by water splitting and the other of a material for thermal storage. Using Bayesian inference and Markov chain Monte Carlo (MCMC), parameter estimation was performed on the three data sets. Good results were achieved, although there were some deviations at the edges of the data input ranges. The evidence values were then calculated in a variety of ways and used to compare models with different numbers of parameters. It was believed that at least one of the parameters was unnecessary, and comparing evidence values demonstrated that the parameter was needed on one data set and not significantly helpful on another. The entropy was calculated by taking the derivative in one variable and integrating over another, and its uncertainty was also calculated by evaluating the entropy over multiple MCMC samples. Afterwards, all the parts were written up as a tutorial for the Uncertainty Quantification Toolkit (UQTk).
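A minimal sketch of the Bayesian parameter-estimation step with a random-walk Metropolis sampler. The "model" here is just a Gaussian mean with known noise and a flat prior, a stand-in for the report's thermodynamic model and the UQTk machinery; the data and tuning values are invented for illustration:

```python
import math
import random

random.seed(42)

data = [4.8, 5.2, 5.1, 4.9, 5.3, 4.7, 5.0, 5.2]   # synthetic observations

def log_posterior(mu, sigma=0.2):
    """Gaussian log-likelihood (known sigma) plus a flat prior on mu."""
    return -sum((y - mu) ** 2 for y in data) / (2.0 * sigma ** 2)

def metropolis(n_steps=20000, step=0.1, start=4.0):
    mu, samples = start, []
    for _ in range(n_steps):
        proposal = mu + random.gauss(0.0, step)
        # Accept with probability min(1, posterior ratio).
        if math.log(random.random()) < log_posterior(proposal) - log_posterior(mu):
            mu = proposal
        samples.append(mu)
    return samples[n_steps // 2:]          # discard the burn-in half

posterior = metropolis()
estimate = sum(posterior) / len(posterior)
```

With a flat prior the posterior mean should match the data mean (about 5.03 here); the evidence comparisons between models with different parameter counts described above are layered on top of such chains.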

  5. Poisson's spot and Gouy phase

    Science.gov (United States)

    da Paz, I. G.; Soldati, Rodolfo; Cabral, L. A.; de Oliveira, J. G. G.; Sampaio, Marcos

    2016-12-01

    Recently there have been experimental results on Poisson spot matter-wave interferometry followed by theoretical models describing the relative importance of the wave and particle behaviors for the phenomenon. We propose an analytical theoretical model for Poisson's spot with matter waves based on the Babinet principle, in which we use the results for free propagation and single-slit diffraction. We take into account effects of loss of coherence and finite detection area using the propagator for a quantum particle interacting with an environment. We observe that the matter-wave Gouy phase plays a role in the existence of the central peak and thus corroborates the predominantly wavelike character of the Poisson's spot. Our model shows remarkable agreement with the experimental data for deuterium (D2) molecules.

  6. Convergence diagnostics for Eigenvalue problems with linear regression model

    International Nuclear Information System (INIS)

    Shi, Bo; Petrovic, Bojan

    2011-01-01

    Although the Monte Carlo method has been extensively used for criticality/Eigenvalue problems, a reliable, robust, and efficient convergence diagnostics method is still desired. Most methods are based on integral parameters (multiplication factor, entropy) and either condense the local distribution information into a single value (e.g., entropy) or even disregard it. We propose to employ the detailed cycle-by-cycle local flux evolution obtained by using mesh tally mechanism to assess the source and flux convergence. By applying a linear regression model to each individual mesh in a mesh tally for convergence diagnostics, a global convergence criterion can be obtained. We exemplify this method on two problems and obtain promising diagnostics results. (author)
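A sketch of the proposed diagnostic: fit a least-squares line to a tally's cycle-by-cycle values and treat that mesh as converged when the fitted slope is indistinguishable from zero. The synthetic "tally histories" below are illustrative, not Monte Carlo transport output:

```python
import math
import random

random.seed(5)

def fitted_slope(values):
    """Least-squares slope of values regressed on cycle index 0..n-1."""
    n = len(values)
    xbar = (n - 1) / 2.0
    ybar = sum(values) / n
    num = sum((i - xbar) * (v - ybar) for i, v in enumerate(values))
    den = sum((i - xbar) ** 2 for i in range(n))
    return num / den

# A still-converging tally drifts toward its asymptote; a stationary
# (converged) one only jitters around it.
converging = [1.0 - math.exp(-0.1 * i) + random.gauss(0, 0.01)
              for i in range(100)]
stationary = [1.0 + random.gauss(0, 0.01) for _ in range(100)]
```

Applying this per mesh cell and requiring every slope to fall below a tolerance gives the global convergence criterion described above, rather than condensing the field into a single scalar such as entropy.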

  7. The R Package threg to Implement Threshold Regression Models

    Directory of Open Access Journals (Sweden)

    Tao Xiao

    2015-08-01

    This new package includes four functions: threg, and the methods hr, predict and plot for threg objects returned by threg. The threg function is the model-fitting function which is used to calculate regression coefficient estimates, asymptotic standard errors and p values. The hr method for threg objects is the hazard-ratio calculation function which provides the estimates of hazard ratios at selected time points for specified scenarios (based on given categories or value settings of covariates). The predict method for threg objects is used for prediction. The plot method for threg objects provides plots for curves of estimated hazard functions, survival functions and probability density functions of the first-hitting-time; function curves corresponding to different scenarios can be overlaid in the same plot for comparison to give additional research insights.

  8. A Poisson-like closed-form expression for the steady-state wealth distribution in a kinetic model of gambling

    Science.gov (United States)

    Garcia, Jane Bernadette Denise M.; Esguerra, Jose Perico H.

    2017-08-01

    An approximate but closed-form expression for a Poisson-like steady-state wealth distribution in a kinetic model of gambling was formulated from a finite number of its moments, which were generated from a β(a,b) exchange distribution. The obtained steady-state wealth distributions have tails which are qualitatively similar to those observed in actual wealth distributions.

  9. A modified Poisson-Boltzmann model including charge regulation for the adsorption of ionizable polyelectrolytes to charged interfaces, applied to lysozyme adsorption on silica

    NARCIS (Netherlands)

    Biesheuvel, P.M.; Veen, van der M.; Norde, W.

    2005-01-01

    The equilibrium adsorption of polyelectrolytes with multiple types of ionizable groups is described using a modified Poisson-Boltzmann equation including charge regulation of both the polymer and the interface. A one-dimensional mean-field model is used in which the electrostatic potential is

  10. Relationship between the generalized equivalent uniform dose formulation and the Poisson statistics-based tumor control probability model

    International Nuclear Information System (INIS)

    Zhou Sumin; Das, Shiva; Wang Zhiheng; Marks, Lawrence B.

    2004-01-01

    The generalized equivalent uniform dose (GEUD) model uses a power-law formalism, where the outcome is related to the dose via a power law. We herein investigate the mathematical compatibility between this GEUD model and the Poisson statistics-based tumor control probability (TCP) model. The GEUD and TCP formulations are combined and subjected to a compatibility constraint equation. This compatibility constraint equates tumor control probability from the original heterogeneous target dose distribution to that from the homogeneous dose from the GEUD formalism. It is shown that this constraint equation possesses a unique, analytical closed-form solution which relates radiation dose to the tumor cell survival fraction. It is further demonstrated that, when there is no positive threshold or finite critical dose in the tumor response to radiation, this relationship is not bounded within the realistic cell survival limits of 0%-100%. Thus, the GEUD and TCP formalisms are, in general, mathematically inconsistent. However, when a threshold dose or finite critical dose exists in the tumor response to radiation, there is a unique mathematical solution for the tumor cell survival fraction that allows the GEUD and TCP formalisms to coexist, provided that all portions of the tumor are confined within certain specific dose ranges.
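A sketch of the two formalisms being compared. GEUD reduces a heterogeneous dose distribution to a single equivalent uniform dose via a power mean; the Poisson TCP multiplies per-voxel probabilities that no clonogen survives. The survival model and all numbers below are illustrative (a simple exp(−αd) survival, not the paper's general formulation):

```python
import math

def geud(doses, a):
    """Generalized EUD: the a-th power mean of the voxel doses."""
    return (sum(d ** a for d in doses) / len(doses)) ** (1.0 / a)

def poisson_tcp(doses, clonogens_per_voxel, alpha):
    """Poisson TCP = exp(-sum_i N_i * SF(d_i)), here with SF(d) = exp(-alpha*d)."""
    survivors = sum(clonogens_per_voxel * math.exp(-alpha * d) for d in doses)
    return math.exp(-survivors)

doses = [58.0, 60.0, 62.0, 64.0]              # Gy, heterogeneous target dose
tcp = poisson_tcp(doses, clonogens_per_voxel=2.5e6, alpha=0.3)
equivalent = geud(doses, a=-20.0)             # negative a: cold spots dominate
```

For tumors, a = 1 gives the mean dose, large positive a approaches the maximum, and large negative a approaches the minimum, reflecting that cold spots drive loss of control; the paper's compatibility constraint asks when poisson_tcp(doses, ...) can equal poisson_tcp applied to the uniform GEUD dose.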

  11. Using Poisson mixed-effects model to quantify transcript-level gene expression in RNA-Seq.

    Science.gov (United States)

    Hu, Ming; Zhu, Yu; Taylor, Jeremy M G; Liu, Jun S; Qin, Zhaohui S

    2012-01-01

    RNA sequencing (RNA-Seq) is a powerful new technology for mapping and quantifying transcriptomes using ultra high-throughput next-generation sequencing technologies. Using deep sequencing, gene expression levels of all transcripts including novel ones can be quantified digitally. Although extremely promising, the massive amounts of data generated by RNA-Seq, substantial biases and uncertainty in short read alignment pose challenges for data analysis. In particular, large base-specific variation and between-base dependence make simple approaches, such as those that use averaging to normalize RNA-Seq data and quantify gene expressions, ineffective. In this study, we propose a Poisson mixed-effects (POME) model to characterize base-level read coverage within each transcript. The underlying expression level is included as a key parameter in this model. Since the proposed model is capable of incorporating base-specific variation as well as between-base dependence that affect read coverage profile throughout the transcript, it can lead to improved quantification of the true underlying expression level. Availability: POME can be freely downloaded at http://www.stat.purdue.edu/~yuzhu/pome.html. Contact: yuzhu@purdue.edu; zhaohui.qin@emory.edu. Supplementary data are available at Bioinformatics online.

  12. Some findings on zero-inflated and hurdle poisson models for disease mapping.

    Science.gov (United States)

    Corpas-Burgos, Francisca; García-Donato, Gonzalo; Martinez-Beneito, Miguel A

    2018-05-27

    Zero excess in the study of geographically referenced mortality data sets has been the focus of considerable attention in the literature, with zero-inflation being the most common procedure to handle this lack of fit. Although hurdle models have also been used in disease mapping studies, their use is rarer. We show in this paper that models using particular treatments of zero excesses are often required for achieving appropriate fits in regular mortality studies since, otherwise, geographical units with low expected counts are oversmoothed. However, as also shown, an indiscriminate treatment of zero excesses may be unnecessary and has a problematic implementation. In this regard, we find that naive zero-inflation and hurdle models, without an explicit modeling of the probabilities of zeroes, do not fix zero-excess problems well enough and are clearly unsatisfactory. Results sharply suggest the need for an explicit modeling of the probabilities, which should vary across areal units. Unfortunately, these more flexible modeling strategies can easily lead to improper posterior distributions, as we prove in several theoretical results. Those procedures have been repeatedly used in the disease mapping literature, and one should bear these issues in mind in order to propose valid models. We finally propose several valid modeling alternatives, according to the results mentioned, that are suitable for fitting zero excesses. We show that those proposals fix zero-excess problems and correct the aforementioned oversmoothing of risks in low-population units, depicting geographic patterns more suited to the data. Copyright © 2018 John Wiley & Sons, Ltd.
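The zero-inflated Poisson at the heart of such models mixes a point mass at zero with an ordinary Poisson. A minimal sketch of its probability mass function, with made-up parameters (this illustrates only the ZIP ingredient, not the spatial disease-mapping models of the paper):

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson (ZIP): an extra point
    mass at zero with probability pi, else a Poisson(lam) draw."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi + (1.0 - pi) * poisson if k == 0 else (1.0 - pi) * poisson

lam, pi = 2.0, 0.3                   # made-up rate and zero-inflation
probs = [zip_pmf(k, lam, pi) for k in range(60)]
zip_mean = sum(k * p for k, p in enumerate(probs))
# The ZIP mean is (1 - pi) * lam: structural zeroes deflate the mean,
# which is exactly the zero excess a plain Poisson model cannot absorb.
```

The moment identity E[Y] = (1 − π)λ is what lets extra zeroes coexist with a given mean count; the paper's point is that π should itself be modeled across areal units rather than held constant.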

  13. Multiple linear regression and regression with time series error models in forecasting PM10 concentrations in Peninsular Malaysia.

    Science.gov (United States)

    Ng, Kar Yong; Awang, Norhashidah

    2018-01-06

    Frequent haze occurrences in Malaysia have made the management of PM10 (particulate matter with aerodynamic diameter less than 10 μm) pollution a critical task. This requires knowledge of the factors associated with PM10 variation and good forecasts of PM10 concentrations. Hence, this paper demonstrates the prediction of 1-day-ahead daily average PM10 concentrations based on predictor variables including meteorological parameters and gaseous pollutants. Three different models were built: a multiple linear regression (MLR) model with lagged predictor variables (MLR1), an MLR model with lagged predictor variables and PM10 concentrations (MLR2), and a regression with time series error (RTSE) model. The findings revealed that humidity, temperature, wind speed, wind direction, carbon monoxide and ozone were the main factors explaining the PM10 variation in Peninsular Malaysia. Comparison among the three models showed that the MLR2 model was on a par with the RTSE model in terms of forecasting accuracy, while the MLR1 model was the worst.

  14. Nonlinear Poisson equation for heterogeneous media.

    Science.gov (United States)

    Hu, Langhua; Wei, Guo-Wei

    2012-08-22

    The Poisson equation is a widely accepted model for electrostatic analysis. However, the Poisson equation is derived based on electric polarizations in a linear, isotropic, and homogeneous dielectric medium. This article introduces a nonlinear Poisson equation to take into account hyperpolarization effects due to intensive charges and possible nonlinear, anisotropic, and heterogeneous media. A variational principle is utilized to derive the nonlinear Poisson model from an electrostatic energy functional. To apply the proposed nonlinear Poisson equation to solvation analysis, we also construct a nonpolar solvation energy functional based on the nonlinear Poisson equation by using geometric measure theory. At a fixed temperature, the proposed nonlinear Poisson theory is extensively validated by the electrostatic analysis of the Kirkwood model and a set of 20 proteins, and by the solvation analysis of a set of 17 small molecules for which experimental measurements are also available for comparison. Moreover, the nonlinear Poisson equation is further applied to the solvation analysis of 21 compounds at different temperatures. Numerical results are compared to theoretical predictions, experimental measurements, and those obtained from other theoretical methods in the literature. A good agreement between our results and experimental data as well as theoretical results suggests that the proposed nonlinear Poisson model is a potentially useful model for electrostatic analysis involving hyperpolarization effects. Copyright © 2012 Biophysical Society. Published by Elsevier Inc. All rights reserved.

  15. (Quasi-)Poisson enveloping algebras

    OpenAIRE

    Yang, Yan-Hong; Yao, Yuan; Ye, Yu

    2010-01-01

    We introduce the quasi-Poisson enveloping algebra and Poisson enveloping algebra for a non-commutative Poisson algebra. We prove that for a non-commutative Poisson algebra, the category of quasi-Poisson modules is equivalent to the category of left modules over its quasi-Poisson enveloping algebra, and the category of Poisson modules is equivalent to the category of left modules over its Poisson enveloping algebra.

  16. Functional form comparison between the population and the individual Poisson based TCP models

    International Nuclear Information System (INIS)

    Schinkel, C.; Stavreva, N.; Stavrev, P.; Carlone, M.; Fallone, B.G.

    2007-01-01

    In this work, the functional form similarity between the individual and fundamental population TCP models is investigated. Using the fact that both models can be expressed in terms of the geometric parameters γ50 and D50, we show that they have almost identical functional form for values of γ50 ≥ 1. The conceptual inadequacy of applying an individual model to clinical data is also discussed. A general individual response TCP expression is given, parameterized by Df and γf, the dose corresponding to a control level of f and the normalized slope at that point. It is shown that the dose-response may be interpreted as an individual response only if γ50 is sufficiently high. Based on the functional form equivalency between the individual and the population TCP models, we discuss the possibility of applying the individual TCP model for the case of heterogeneous irradiations. Due to the fact that the fundamental population TCP model is derived for homogeneous irradiations only, we propose the use of the EUD, given by the generalized mean dose, when the fundamental population TCP model is used to fit clinical data. (author)

  17. Ultracentrifuge separative power modeling with multivariate regression using covariance matrix

    International Nuclear Information System (INIS)

    Migliavacca, Elder

    2004-01-01

    In this work, the least-squares methodology with covariance matrix is applied to determine a data curve fitting to obtain a performance function for the separative power δU of an ultracentrifuge as a function of variables that are experimentally controlled. The experimental data refer to 460 experiments on the ultracentrifugation process for uranium isotope separation. The experimental uncertainties related to these independent variables are considered in the calculation of the experimental separative power values, determining an experimental data input covariance matrix. The process variables that significantly influence the δU values are chosen in order to give information on the ultracentrifuge behaviour when submitted to several levels of feed flow rate F, cut θ and product line pressure Pp. After the model goodness-of-fit validation, a residual analysis is carried out to verify the assumed basis concerning randomness and independence and, mainly, the existence of residual heteroscedasticity with respect to any explanatory variable of the regression model. Surface curves are drawn relating the separative power to the control variables F, θ and Pp to compare the fitted model with the experimental data and finally to calculate their optimized values. (author)

  18. Modeling Pan Evaporation for Kuwait by Multiple Linear Regression

    Science.gov (United States)

    Almedeij, Jaber

    2012-01-01

    Evaporation is an important parameter for many projects related to hydrology and water resources systems. This paper constitutes the first study conducted in Kuwait to obtain empirical relations for the estimation of daily and monthly pan evaporation as functions of available meteorological data of temperature, relative humidity, and wind speed. The data used here for the modeling are daily measurements with substantially continuous coverage, within a period of 17 years between January 1993 and December 2009, which can be considered representative of the desert climate of the urban zone of the country. The multiple linear regression technique is used with a procedure of variable selection for fitting the best model forms. The correlations of evaporation with temperature and relative humidity are also transformed in order to linearize the existing curvilinear patterns of the data, by using power and exponential functions, respectively. The evaporation models suggested with the best variable combinations were shown to produce results that are in reasonable agreement with observed values. PMID:23226984
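The linearization step described above can be sketched as follows. On noise-free synthetic data with made-up coefficients (not the paper's fitted Kuwait models), taking logs turns a power/exponential form into a multiple linear regression that ordinary least squares recovers exactly:

```python
import numpy as np

# Hypothetical power/exponential form: E = a * T^b * exp(c * RH).
# Taking logs gives a multiple *linear* regression:
#   log E = log a + b * log T + c * RH, fit by ordinary least squares.
a, b, c = 2.0, 0.5, -0.01                      # made-up coefficients
T = np.array([15.0, 20.0, 25.0, 30.0, 35.0])   # temperature
RH = np.array([60.0, 50.0, 40.0, 30.0, 20.0])  # relative humidity
E = a * T ** b * np.exp(c * RH)                # noise-free "evaporation"

X = np.column_stack([np.ones_like(T), np.log(T), RH])
coef, *_ = np.linalg.lstsq(X, np.log(E), rcond=None)
# On noise-free data, coef recovers [log a, b, c] to machine precision.
```

With real measurements the recovery is of course only approximate, but the same design matrix of transformed predictors is what the variable-selection procedure operates on.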

  19. Bayesian Poisson log-bilinear models for mortality projections with multiple populations

    NARCIS (Netherlands)

    Antonio, K.; Bardoutsos, A.; Ouburg, W.

    2015-01-01

    Life insurers, pension funds, health care providers and social security institutions face increasing expenses due to continuing improvements of mortality rates. The actuarial and demographic literature has introduced a myriad of (deterministic and stochastic) models to forecast mortality rates of

  20. Short-term electricity prices forecasting based on support vector regression and Auto-regressive integrated moving average modeling

    International Nuclear Information System (INIS)

    Che Jinxing; Wang Jianzhou

    2010-01-01

    In this paper, we present the use of different mathematical models to forecast electricity prices in deregulated power markets. A successful prediction tool for electricity prices can help both power producers and consumers plan their bidding strategies. Inspired by the fact that the support vector regression (SVR) model, with its ε-insensitive loss function, tolerates residuals within the boundaries of the ε-tube, we propose a hybrid model that combines both SVR and auto-regressive integrated moving average (ARIMA) models to take advantage of the unique strengths of SVR and ARIMA in nonlinear and linear modeling, which we call SVRARIMA. A nonlinear analysis of the time series indicates the suitability of nonlinear modeling, so the SVR is applied to capture the nonlinear patterns. ARIMA models have been successfully applied to the regression estimation of the residuals. The experimental results demonstrate that the proposed model outperforms the existing neural-network approaches, the traditional ARIMA models and other hybrid models in terms of root mean square error and mean absolute percentage error.

  1. Application of Negative Binomial Regression for Assessing Public ...

    African Journals Online (AJOL)

    Because the variance was nearly two times greater than the mean, the negative binomial regression model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumed that the mean and variance are the same. The level of education and race were found
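The overdispersion check described in this record amounts to comparing the sample variance with the mean. A toy sketch on made-up counts, including the method-of-moments negative binomial dispersion α (under the parameterization Var(Y) = μ + αμ²):

```python
# Quick overdispersion diagnostic on hypothetical count data.
# A Poisson model assumes Var(Y) = E[Y]; the negative binomial allows
# Var(Y) = mu + alpha * mu^2, with alpha > 0 indicating overdispersion.
counts = [0, 0, 1, 1, 2, 2, 3, 7]      # made-up observations
n = len(counts)
mean = sum(counts) / n
var = sum((y - mean) ** 2 for y in counts) / (n - 1)  # sample variance
ratio = var / mean                     # ≈ 1 if a Poisson model fits
alpha = (var - mean) / mean ** 2       # method-of-moments NB dispersion
```

Here the variance is more than twice the mean, the situation the abstract describes, so a positive α (i.e. a negative binomial) accounts for the extra spread that a Poisson model with equal mean and variance cannot.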

  2. Applied Prevalence Ratio estimation with different Regression models: An example from a cross-national study on substance use research.

    Science.gov (United States)

    Espelt, Albert; Marí-Dell'Olmo, Marc; Penelo, Eva; Bosque-Prous, Marina

    2016-06-14

    To examine the differences between Prevalence Ratio (PR) and Odds Ratio (OR) in a cross-sectional study and to provide tools to calculate PR using two statistical packages widely used in substance use research (STATA and R). We used cross-sectional data from 41,263 participants of 16 European countries participating in the Survey on Health, Ageing and Retirement in Europe (SHARE). The dependent variable, hazardous drinking, was calculated using the Alcohol Use Disorders Identification Test - Consumption (AUDIT-C). The main independent variable was gender. Other variables used were: age, educational level and country of residence. The PR of hazardous drinking in men relative to women was estimated using the Mantel-Haenszel method, log-binomial regression models and Poisson regression models with robust variance. These estimates were compared to the OR calculated using logistic regression models. Prevalence of hazardous drinkers varied among countries. Generally, men have a higher prevalence of hazardous drinking than women [PR=1.43 (1.38-1.47)]. The estimated PR was identical regardless of the method and the statistical package used. However, the OR overestimated the PR, depending on the prevalence of hazardous drinking in the country. In cross-sectional studies, where comparisons between countries with differences in the prevalence of the disease or condition are made, it is advisable to use PR instead of OR.
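The PR-versus-OR gap is easy to see on a single hypothetical 2×2 table (unadjusted case; the log-binomial and robust-variance Poisson regressions in the paper generalize this to adjusted estimates, and the numbers below are made up, not SHARE data):

```python
# Hypothetical 2x2 table: rows = gender, columns = hazardous drinking.
#            drinker  non-drinker
men   = (30, 70)
women = (15, 85)

a, b = men
c, d = women
pr = (a / (a + b)) / (c / (c + d))   # prevalence ratio: 0.30 / 0.15 = 2.0
odds_ratio = (a * d) / (b * c)       # odds ratio: 2550 / 1050 ≈ 2.43
# With a common outcome (30% vs 15% prevalence), the OR overstates the PR,
# which is the paper's argument for reporting PR in cross-sectional studies.
```

As the outcome becomes rare, the OR converges to the PR; the overestimation the authors observe is a direct consequence of hazardous drinking being common in some countries.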

  3. A compound Poisson EOQ model for perishable items with intermittent high and low demand periods

    NARCIS (Netherlands)

    Boxma, O.J.; Perry, D.; Stadje, W.; Zacks, S.

    2012-01-01

    We consider a stochastic EOQ-type model, with demand operating in a two-state random environment. This environment alternates between exponentially distributed periods of high demand and generally distributed periods of low demand. The inventory level starts at some level q, and decreases according

  4. An Ordered Regression Model to Predict Transit Passengers’ Behavioural Intentions

    Energy Technology Data Exchange (ETDEWEB)

    Oña, J. de; Oña, R. de; Eboli, L.; Forciniti, C.; Mazzulla, G.

    2016-07-01

    Passengers’ behavioural intentions after experiencing transit services can be viewed as signals that show whether a customer will continue to utilise a company’s service. Users’ behavioural intentions can depend on a series of aspects that are difficult to measure directly. More recently, transit passengers’ behavioural intentions have been considered jointly with the concepts of service quality and customer satisfaction. Given the characteristics of the ways in which passengers’ behavioural intentions, service quality and customer satisfaction are evaluated, we consider that this kind of issue can also be analysed by applying ordered regression models. This work proposes an ordered probit model for analysing the service quality factors that can influence passengers’ behavioural intentions towards the use of transit services. The case study is the LRT of Seville (Spain), where a survey was conducted in order to collect the opinions of the passengers about the existing transit service, and to obtain a measure of the aspects that can influence the intentions of the users to continue using the transit service in the future. (Author)

  5. Heterogeneous Breast Phantom Development for Microwave Imaging Using Regression Models

    Directory of Open Access Journals (Sweden)

    Camerin Hahn

    2012-01-01

    Full Text Available As new algorithms for microwave imaging emerge, it is important to have standard accurate benchmarking tests. Currently, most researchers use homogeneous phantoms for testing new algorithms. These simple structures lack the heterogeneity of the dielectric properties of human tissue and are inadequate for testing these algorithms for medical imaging. To adequately test breast microwave imaging algorithms, the phantom has to resemble different breast tissues physically and in terms of dielectric properties. We propose a systematic approach to designing phantoms that not only have dielectric properties close to breast tissues but also can be easily shaped into realistic physical models. The approach is based on a regression model that matches the phantom's dielectric properties to the breast tissue dielectric properties reported in Lazebnik et al. (2007). However, the methodology proposed here can be used to create phantoms for any tissue type as long as ex vivo, in vitro, or in vivo tissue dielectric properties are measured and available. Therefore, using this method, accurate benchmarking phantoms for testing emerging microwave imaging algorithms can be developed.

  6. A Multi-Model Stereo Similarity Function Based on Monogenic Signal Analysis in Poisson Scale Space

    Directory of Open Access Journals (Sweden)

    Jinjun Li

    2011-01-01

    Full Text Available A stereo similarity function based on local multi-model monogenic image feature descriptors (LMFD is proposed to match interest points and estimate the disparity map for stereo images. Local multi-model monogenic image features include the local orientation and instantaneous phase of the gray monogenic signal, the local color phase of the color monogenic signal, and local mean colors in the multiscale color monogenic signal framework. The gray monogenic signal, which is the extension of the analytic signal to gray-level images using the Dirac operator and Laplace equation, consists of the local amplitude, local orientation, and instantaneous phase of a 2D image signal. The color monogenic signal is the extension of the monogenic signal to color images based on Clifford algebras. The local color phase can be estimated by computing the geometric product between the color monogenic signal and a unit reference vector in RGB color space. Experimental results on synthetic and natural stereo images show the performance of the proposed approach.

  7. Square root approximation to the poisson channel

    NARCIS (Netherlands)

    Tsiatmas, A.; Willems, F.M.J.; Baggen, C.P.M.J.

    2013-01-01

    Starting from the Poisson model we present a channel model for optical communications, called the Square Root (SR) Channel, in which the noise is additive Gaussian with constant variance. Initially, we prove that for large peak or average power, the transmission rate of a Poisson Channel when coding

  8. Solution of the equations of motion for a super non-Abelian sigma model in curved background by the super Poisson-Lie T-duality

    International Nuclear Information System (INIS)

    Eghbali, Ali

    2015-01-01

    The equations of motion of a super non-Abelian T-dual sigma model on the Lie supergroup (C^1_1+A) in the curved background are explicitly solved by the super Poisson-Lie T-duality. To find the solution of the flat model we use the transformation of supercoordinates, transforming the metric into a constant one, which is shown to be a supercanonical transformation. Then, using the super Poisson-Lie T-duality transformations and the dual decomposition of elements of the Drinfel’d superdouble, the solution of the equations of motion for the dual sigma model is obtained. The general form of the dilaton fields satisfying the vanishing β-function equations of the sigma models is found. In this respect, conformal invariance of the sigma models built on the Drinfel’d superdouble ((C^1_1+A), I_(2|2)) is guaranteed up to one loop, at least.

  9. Relative age and birthplace effect in Japanese professional sports: a quantitative evaluation using a Bayesian hierarchical Poisson model.

    Science.gov (United States)

    Ishigami, Hideaki

    2016-01-01

    Relative age effect (RAE) in sports has been well documented. Recent studies investigate the effect of birthplace in addition to the RAE. The first objective of this study was to show the magnitude of the RAE in two major professional sports in Japan, baseball and soccer. Second, we examined the birthplace effect and compared its magnitude with that of the RAE. The effect sizes were estimated using a Bayesian hierarchical Poisson model with the number of players as dependent variable. The RAEs were 9.0% and 7.7% per month for soccer and baseball, respectively. These estimates imply that children born in the first month of a school year have about three times greater chance of becoming a professional player than those born in the last month of the year. Over half of the difference in likelihoods of becoming a professional player between birthplaces was accounted for by weather conditions, with the likelihood decreasing by 1% per snow day. An effect of population size was not detected in the data. By investigating different samples, we demonstrated that using quarterly data leads to underestimation and that the age range of sampled athletes should be set carefully.

  10. How many dinosaur species were there? Fossil bias and true richness estimated using a Poisson sampling model.

    Science.gov (United States)

    Starrfelt, Jostein; Liow, Lee Hsiang

    2016-04-05

    The fossil record is a rich source of information about biological diversity in the past. However, the fossil record is not only incomplete but has also inherent biases due to geological, physical, chemical and biological factors. Our knowledge of past life is also biased because of differences in academic and amateur interests and sampling efforts. As a result, not all individuals or species that lived in the past are equally likely to be discovered at any point in time or space. To reconstruct temporal dynamics of diversity using the fossil record, biased sampling must be explicitly taken into account. Here, we introduce an approach that uses the variation in the number of times each species is observed in the fossil record to estimate both sampling bias and true richness. We term our technique TRiPS (True Richness estimated using a Poisson Sampling model) and explore its robustness to violation of its assumptions via simulations. We then venture to estimate sampling bias and absolute species richness of dinosaurs in the geological stages of the Mesozoic. Using TRiPS, we estimate that 1936 (1543-2468) species of dinosaurs roamed the Earth during the Mesozoic. We also present improved estimates of species richness trajectories of the three major dinosaur clades: the sauropodomorphs, ornithischians and theropods, casting doubt on the Jurassic-Cretaceous extinction event and demonstrating that all dinosaur groups are subject to considerable sampling bias throughout the Mesozoic. © 2016 The Authors.
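The idea behind TRiPS — using how often each species is re-observed to correct observed richness — can be caricatured with a zero-truncated Poisson. This toy sketch (made-up counts, not the published TRiPS implementation) estimates the sampling rate from the positive counts and scales up the observed richness:

```python
import math

def truncated_poisson_rate(mean_positive, lo=1e-9, hi=50.0):
    """Solve m = lam / (1 - exp(-lam)) for lam by bisection, where m
    is the mean occurrence count among species observed at least once.
    The left side is increasing in lam, so bisection applies."""
    for _ in range(200):
        lam = 0.5 * (lo + hi)
        if lam / (1.0 - math.exp(-lam)) < mean_positive:
            lo = lam
        else:
            hi = lam
    return 0.5 * (lo + hi)

# Hypothetical fossil occurrence counts for the observed species.
counts = [1, 1, 1, 2, 2, 3, 4]        # each observed species seen >= 1 time
m = sum(counts) / len(counts)         # mean of the positive counts
lam = truncated_poisson_rate(m)       # estimated Poisson sampling rate
p_detect = 1.0 - math.exp(-lam)       # P(a species is sampled at least once)
richness = len(counts) / p_detect     # scaled-up richness estimate
```

Because some species were plausibly never sampled, the estimate exceeds the observed count of 7; heavier re-observation of the same species would raise `lam`, raise `p_detect`, and shrink the correction.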

  11. Modelling carcinogenesis after radiotherapy using Poisson statistics: implications for IMRT, protons and ions

    Energy Technology Data Exchange (ETDEWEB)

    Jones, Bleddyn [Gray Institute for Radiation Oncology and Biology, University of Oxford, Old Road Campus, Headington, Oxford OX3 7DQ (United Kingdom)], E-mail: Bleddyn.Jones@rob.ox.ac.uk

    2009-06-01

    Current technical radiotherapy advances aim to (a) better conform the dose contours to cancers and (b) reduce the integral dose exposure and thereby minimise unnecessary dose exposure to normal tissues unaffected by the cancer. Various types of conformal and intensity modulated radiotherapy (IMRT) using x-rays can achieve (a) while charged particle therapy (CPT)-using proton and ion beams-can achieve both (a) and (b), but at greater financial cost. Not only is the long term risk of radiation related normal tissue complications important, but so is the risk of carcinogenesis. Physical dose distribution plans can be generated to show the differences between the above techniques. IMRT is associated with a dose bath of low to medium dose due to fluence transfer: dose is effectively transferred from designated organs at risk to other areas; thus dose and risk are transferred. Many clinicians are concerned that there may be additional carcinogenesis many years after IMRT. CPT reduces the total energy deposition in the body and offers many potential advantages in terms of the prospects for better quality of life along with cancer cure. With C ions there is a tail of dose beyond the Bragg peaks, due to nuclear fragmentation; this is not found with protons. CPT generally uses higher linear energy transfer (which varies with particle and energy), which carries a higher relative risk of malignant induction, but also of cell death quantified by the relative biological effect concept, so at higher dose levels the frank development of malignancy should be reduced. Standard linear radioprotection models have been used to show a reduction in carcinogenesis risk of between two- and 15-fold depending on the CPT location. But the standard risk models make no allowance for fractionation and some have a dose limit at 4 Gy. Alternatively, tentative application of the linear quadratic model and Poissonian statistics to chromosome breakage and cell kill simultaneously allows estimation of

  12. application of multilinear regression analysis in modeling of soil

    African Journals Online (AJOL)

    Windows User

    Accordingly [1, 3] in their work, they applied linear regression ... (MLRA) is a statistical technique that uses several explanatory ... order to check this, they adopted bivariate correlation analysis .... groups, namely A-1 through A-7, based on their relative expected ..... Multivariate Regression in Gorgan Province North of Iran” ...

  13. "Testing a Poisson counter model for visual identification of briefly presented, mutually confusable single stimuli in pure accuracy tasks": Correction to Kyllingsbæk, Markussen, and Bundesen (2012)

    DEFF Research Database (Denmark)

    Nielsen, Carsten Søren; Kyllingsbæk, Søren; Markussen, Bo

    2015-01-01

    The article “Testing a Poisson Counter Model for Visual Identification of Briefly Presented, Mutually Confusable Single Stimuli in Pure Accuracy Tasks” by Søren Kyllingsbæk, Bo Markussen and Claus Bundesen (Journal of Experimental Psychology: Human Perception and Performance, 2012, Vol. 38, No. 3, pp. 628–642. http://dx.doi.org/10.1037/a0024751) used a computational shortcut (Equation A5) that strongly reduced the time needed to fit the Poisson counter model to experimental data. Unfortunately, the computational shortcut built on an approximation that was not well-founded in the Poisson counter model. To measure the actual deviation, the authors refitted both the computational shortcut and the Poisson counter model (Equations A1-A4) to the experimental data reported in the article. The Poisson counter model fits did, fortunately, not deviate noticeably from those produced...

  14. Wheat flour dough Alveograph characteristics predicted by Mixolab regression models.

    Science.gov (United States)

    Codină, Georgiana Gabriela; Mironeasa, Silvia; Mironeasa, Costel; Popa, Ciprian N; Tamba-Berehoiu, Radiana

    2012-02-01

    In Romania, the Alveograph is the most used device to evaluate the rheological properties of wheat flour dough, but lately the Mixolab device has begun to play an important role in the breadmaking industry. These two instruments are based on different principles but there are some correlations that can be found between the parameters determined by the Mixolab and the rheological properties of wheat dough measured with the Alveograph. Statistical analysis on 80 wheat flour samples using the backward stepwise multiple regression method showed that Mixolab values using the ‘Chopin S’ protocol (40 samples) and ‘Chopin+’ protocol (40 samples) can be used to elaborate predictive models for estimating the value of the rheological properties of wheat dough: baking strength (W), dough tenacity (P) and extensibility (L). The correlation analysis confirmed significant findings, with R²(adjusted) > 0.70 for P, R²(adjusted) > 0.70 for W and R²(adjusted) > 0.38 for L, at a 95% confidence interval. Copyright © 2011 Society of Chemical Industry.

  15. Application of regression model on stream water quality parameters

    International Nuclear Information System (INIS)

    Suleman, M.; Maqbool, F.; Malik, A.H.; Bhatti, Z.A.

    2012-01-01

    Statistical analysis was conducted to evaluate the effect of solid waste leachate from the open solid waste dumping site of Salhad on the stream water quality. Five sites were selected along the stream: two prior to the mixing of leachate with the surface water, one of the leachate itself, and two affected by leachate. Samples were analyzed for pH, water temperature, electrical conductivity (EC), total dissolved solids (TDS), biological oxygen demand (BOD), chemical oxygen demand (COD), dissolved oxygen (DO) and total bacterial load (TBL). In this study, the correlation coefficient r between pairs of water quality parameters was calculated for the various sites using the Pearson model, and the average of each correlation was also calculated, which shows that TDS-EC and pH-BOD have significantly increasing r values, while temperature-TDS, temperature-EC, DO-TBL and DO-COD have decreasing r values. Single-factor ANOVA at the 5% level of significance showed that EC, TDS, TBL and COD differ significantly among the sites. By the application of these two statistical approaches, TDS and EC show a strongly positive correlation because the ions from the dissolved solids in water influence the ability of that water to conduct an electrical current. These two parameters vary significantly among the five sites, which was further confirmed by linear regression. (author)

  16. Identifying overrepresented concepts in gene lists from literature: a statistical approach based on Poisson mixture model

    Directory of Open Access Journals (Sweden)

    Zhai Chengxiang

    2010-05-01

    Full Text Available Background: Large-scale genomic studies often identify large gene lists, for example, the genes sharing the same expression patterns. The interpretation of these gene lists is generally achieved by extracting concepts overrepresented in the gene lists. This analysis often depends on manual annotation of genes based on controlled vocabularies, in particular, Gene Ontology (GO. However, the annotation of genes is a labor-intensive process; and the vocabularies are generally incomplete, leaving some important biological domains inadequately covered. Results: We propose a statistical method that uses the primary literature, i.e. free-text, as the source to perform overrepresentation analysis. The method is based on a statistical framework of mixture model and addresses the methodological flaws in several existing programs. We implemented this method within a literature mining system, BeeSpace, taking advantage of its analysis environment and added features that facilitate the interactive analysis of gene sets. Through experimentation with several datasets, we showed that our program can effectively summarize the important conceptual themes of large gene sets, even when traditional GO-based analysis does not yield informative results. Conclusions: We conclude that the current work will provide biologists with a tool that effectively complements the existing ones for overrepresentation analysis from genomic experiments. Our program, Genelist Analyzer, is freely available at: http://workerbee.igb.uiuc.edu:8080/BeeSpace/Search.jsp

  17. The microcomputer scientific software series 2: general linear model--regression.

    Science.gov (United States)

    Harold M. Rauscher

    1983-01-01

    The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...

  18. MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION

    Science.gov (United States)

Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR), which does not require ...

  19. Methods of Detecting Outliers in A Regression Analysis Model ...

    African Journals Online (AJOL)

    PROF. O. E. OSUAGWU

    2013-06-01

    Jun 1, 2013 ... especially true in observational studies .... Simple linear regression and multiple ... The simple linear ..... Grubbs,F.E (1950): Sample Criteria for Testing Outlying observations: Annals of ... In experimental design, the Relative.

  20. 231 Using Multiple Regression Analysis in Modelling the Role of ...

    African Journals Online (AJOL)

    User

    of Internal Revenue, Tourism Bureau and hotel records. The multiple regression .... additional guest facilities such as restaurant, a swimming pool or child care and social function ... and provide good quality service to the public. Conclusion.

  1. Modeling Fire Occurrence at the City Scale: A Comparison between Geographically Weighted Regression and Global Linear Regression.

    Science.gov (United States)

    Song, Chao; Kwan, Mei-Po; Zhu, Jiping

    2017-04-08

An increasing number of fires are occurring with the rapid development of cities, resulting in increased risk for human beings and the environment. This study compares geographically weighted regression-based models, including geographically weighted regression (GWR) and geographically and temporally weighted regression (GTWR), which integrates spatial and temporal effects, with global linear regression models (LM) for modeling fire risk at the city scale. The results show that road density and the spatial distribution of enterprises have the strongest influences on fire risk, which implies that we should focus on areas where roads and enterprises are densely clustered. In addition, locations with a large number of enterprises have fewer fire ignition records, probably because of strict management and prevention measures. A changing number of significant variables across space indicates that heterogeneity mainly exists in the northern and eastern rural and suburban areas of Hefei city, where human-related facilities and road construction are clustered only in the city sub-centers. GTWR can capture small changes in the spatiotemporal heterogeneity of the variables while GWR and LM cannot. An approach that integrates space and time enables us to better understand the dynamic changes in fire risk. Thus, governments can use the results to manage fire safety at the city scale.
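
The core of the GWR approach described above is a weighted least-squares fit at each location, with weights from a distance kernel. A minimal numpy sketch under an assumed Gaussian kernel; the data, bandwidth, and variable names are invented for illustration, not taken from the paper:

```python
import numpy as np

def gwr_local_fit(X, y, coords, u, bandwidth):
    """Weighted least-squares fit at location u, the building block of
    geographically weighted regression (GWR)."""
    d = np.linalg.norm(coords - u, axis=1)        # distances to u
    w = np.exp(-0.5 * (d / bandwidth) ** 2)       # Gaussian kernel weights
    Xw = X * w[:, None]                           # apply weights to rows
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)    # (X'WX)^-1 X'Wy

# Toy data: one covariate whose effect drifts from west to east
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))
x = rng.normal(size=200)
slope = 1.0 + 0.3 * coords[:, 0]                  # spatially varying effect
y = 2.0 + slope * x + rng.normal(scale=0.1, size=200)
X = np.column_stack([np.ones(200), x])

beta_west = gwr_local_fit(X, y, coords, np.array([1.0, 5.0]), bandwidth=2.0)
beta_east = gwr_local_fit(X, y, coords, np.array([9.0, 5.0]), bandwidth=2.0)
```

Refitting at each location yields a surface of local coefficients, which is how GWR exposes the spatial heterogeneity the paper discusses.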

  2. Analyzing musculoskeletal neck pain, measured as present pain and periods of pain, with three different regression models: a cohort study

    Directory of Open Access Journals (Sweden)

    Hagberg Mats

    2009-06-01

Abstract Background In the literature there are discussions on the choice of outcome and the need for more longitudinal studies of musculoskeletal disorders. The general aim of this longitudinal study was to analyze musculoskeletal neck pain in a group of young adults. Specific aims were to determine whether psychosocial factors, computer use, high work/study demands, and lifestyle are long-term or short-term factors for musculoskeletal neck pain, and whether these factors are important for developing or ongoing musculoskeletal neck pain. Methods Three regression models were used to analyze the different outcomes. Pain at present was analyzed with a marginal logistic model, number of years with pain with a Poisson regression model, and developing and ongoing pain with a logistic model. Presented results are odds ratios and proportion ratios (logistic models) and rate ratios (Poisson model). The material consisted of web-based questionnaires answered by 1204 Swedish university students from a prospective cohort recruited in 2002. Results Perceived stress was a risk factor for pain at present (PR = 1.6), for developing pain (PR = 1.7) and for number of years with pain (RR = 1.3). High work/study demands were associated with pain at present (PR = 1.6), and with number of years with pain when the demands negatively affect home life (RR = 1.3). Computer use pattern (number of times/week with a computer session ≥ 4 h without break) was a risk factor for developing pain (PR = 1.7), but was also associated with pain at present (PR = 1.4) and number of years with pain (RR = 1.2). Among lifestyle factors, smoking (PR = 1.8) was found to be associated with pain at present. The difference between men and women in prevalence of musculoskeletal pain was confirmed in this study. It was smallest for the outcome ongoing pain (PR = 1.4) compared to pain at present (PR = 2.4) and developing pain (PR = 2.5). Conclusion By using different regression models different ...
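
As a sketch of how a Poisson model like the one above yields rate ratios (RRs): a minimal numpy implementation of Poisson regression fitted by iteratively reweighted least squares, on invented data (the covariate and rates are illustrative, not the study's):

```python
import numpy as np

def poisson_irls(X, y, n_iter=25):
    """Poisson regression (log link) via iteratively reweighted least
    squares -- a minimal sketch, not the study's software."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)            # fitted means
        W = mu                           # Poisson working weights
        z = X @ beta + (y - mu) / mu     # working response
        XtW = X.T * W
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Toy data: exposure to a risk factor doubles the expected count
rng = np.random.default_rng(1)
exposed = rng.integers(0, 2, size=5000)
lam = np.exp(0.5 + np.log(2.0) * exposed)
y = rng.poisson(lam)
X = np.column_stack([np.ones(5000), exposed])

beta = poisson_irls(X, y)
rate_ratio = np.exp(beta[1])   # interpreted like the RRs in the abstract
```

Exponentiating a Poisson coefficient gives the multiplicative change in the expected count, which is why the study reports RRs for "number of years with pain".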

  3. Accuracy assessment of the linear Poisson-Boltzmann equation and reparametrization of the OBC generalized Born model for nucleic acids and nucleic acid-protein complexes.

    Science.gov (United States)

    Fogolari, Federico; Corazza, Alessandra; Esposito, Gennaro

    2015-04-05

The generalized Born model in the Onufriev, Bashford, and Case (Onufriev et al., Proteins: Struct Funct Genet 2004, 55, 383) implementation has emerged as one of the best compromises between accuracy and speed of computation. For simulations of nucleic acids, however, a number of issues should be addressed: (1) the generalized Born model is based on a linear model, and the linearization of the reference Poisson-Boltzmann equation may be questioned for highly charged systems such as nucleic acids; (2) although much attention has been given to potentials, solvation forces could be much less sensitive to linearization than the potentials; and (3) the accuracy of the Onufriev-Bashford-Case (OBC) model for nucleic acids depends on fine tuning of parameters. Here, we show that the linearization of the Poisson-Boltzmann equation has mild effects on computed forces, and that with an optimal choice of the OBC model parameters, solvation forces, essential for molecular dynamics simulations, agree well with those computed using the reference Poisson-Boltzmann model. © 2015 Wiley Periodicals, Inc.
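
The linearization the authors examine is, in standard notation (not taken from the paper), the Debye-Hueckel approximation of the Poisson-Boltzmann equation for a symmetric z:z electrolyte:

```latex
% Nonlinear Poisson-Boltzmann equation (symmetric z:z electrolyte)
\nabla^2 \psi \;=\; \frac{2 z e n_0}{\varepsilon}\,
                    \sinh\!\left(\frac{z e \psi}{k_B T}\right)

% Linearization: \sinh(x) \approx x for |x| \ll 1, i.e. ze\psi \ll k_B T
\nabla^2 \psi \;\approx\; \kappa^2 \psi,
\qquad
\kappa^2 = \frac{2 z^2 e^2 n_0}{\varepsilon k_B T}
```

The condition zeψ ≪ k_BT is exactly what is strained near a highly charged nucleic acid backbone, which is why the paper checks whether forces, rather than potentials, survive the approximation.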

  4. Direct modeling of regression effects for transition probabilities in the progressive illness-death model

    DEFF Research Database (Denmark)

    Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo

    2017-01-01

    In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...

  5. A logistic regression model for Ghana National Health Insurance claims

    Directory of Open Access Journals (Sweden)

    Samuel Antwi

    2013-07-01

In August 2003, the Ghanaian Government made history by implementing the first National Health Insurance System (NHIS) in Sub-Saharan Africa. Within three years, over half of the country's population had voluntarily enrolled in the National Health Insurance Scheme. This study had three objectives: 1) to estimate the risk factors that influence Ghana national health insurance claims; 2) to estimate the magnitude of each of the risk factors in relation to the claims; ... In this work, data was collected from the policyholders of the Ghana National Health Insurance Scheme with the help of the National Health Insurance database and the patients' attendance register of the Koforidua Regional Hospital, from 1st January to 31st December 2011. Quantitative analysis was done using generalized linear regression (GLR) models. The results indicate that risk factors such as sex, age, marital status, distance and length of stay at the hospital were important predictors of health insurance claims. However, it was found that the risk factors health status, billed charges and income level are not good predictors of national health insurance claims. The outcome of the study shows that sex, age, marital status, distance and length of stay at the hospital are statistically significant in the determination of Ghana National Health Insurance premiums, since they considerably influence claims. We recommend, among other things, that the National Health Insurance Authority facilitate the institutionalization of the collection of appropriate data on a continuous basis to help in the determination of future premiums.

  6. Homogeneous Poisson structures

    International Nuclear Information System (INIS)

    Shafei Deh Abad, A.; Malek, F.

    1993-09-01

We provide an algebraic definition for the Schouten product and give a decomposition for any homogeneous Poisson structure in any n-dimensional vector space. A large class of n-homogeneous Poisson structures in R^k is also characterized. (author). 4 refs

  7. A generalized additive regression model for survival times

    DEFF Research Database (Denmark)

    Scheike, Thomas H.

    2001-01-01

Additive Aalen model; counting process; disability model; illness-death model; generalized additive models; multiple time-scales; non-parametric estimation; survival data; varying-coefficient models

  8. Singular Poisson tensors

    International Nuclear Information System (INIS)

    Littlejohn, R.G.

    1982-01-01

    The Hamiltonian structures discovered by Morrison and Greene for various fluid equations were obtained by guessing a Hamiltonian and a suitable Poisson bracket formula, expressed in terms of noncanonical (but physical) coordinates. In general, such a procedure for obtaining a Hamiltonian system does not produce a Hamiltonian phase space in the usual sense (a symplectic manifold), but rather a family of symplectic manifolds. To state the matter in terms of a system with a finite number of degrees of freedom, the family of symplectic manifolds is parametrized by a set of Casimir functions, which are characterized by having vanishing Poisson brackets with all other functions. The number of independent Casimir functions is the corank of the Poisson tensor J/sup ij/, the components of which are the Poisson brackets of the coordinates among themselves. Thus, these Casimir functions exist only when the Poisson tensor is singular

  9. Comparison of INAR(1)-Poisson model and Markov prediction model in forecasting the number of DHF patients in west java Indonesia

    Science.gov (United States)

    Ahdika, Atina; Lusiyana, Novyan

    2017-02-01

The World Health Organization (WHO) has noted Indonesia as the country with the highest number of dengue hemorrhagic fever (DHF) cases in Southeast Asia. There is no vaccine or specific treatment for DHF, so prevention is one of the efforts that both government and residents can undertake. In statistics, there are several methods for predicting the number of DHF cases that can serve as a reference for prevention. In this paper, a discrete time series model, specifically the INAR(1)-Poisson model, and a Markov prediction model (MPM) are used to predict the number of DHF patients in West Java, Indonesia. The result shows that MPM is the best model since it has the smallest values of MAE (mean absolute error) and MAPE (mean absolute percentage error).
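
The INAR(1)-Poisson model named above is usually written X_t = α ∘ X_{t-1} + ε_t, where ∘ denotes binomial thinning and ε_t ~ Poisson(λ). A small simulation sketch with invented parameters, including the one-step conditional-mean forecast and the MAE used in comparisons like the paper's:

```python
import numpy as np

def simulate_inar1(alpha, lam, n, rng):
    """Simulate an INAR(1)-Poisson series:
    X_t = alpha o X_{t-1} + eps_t, where 'o' is binomial thinning
    (each of the X_{t-1} counts survives with probability alpha)
    and eps_t ~ Poisson(lam)."""
    x = np.empty(n, dtype=int)
    x[0] = rng.poisson(lam / (1 - alpha))          # start near stationarity
    for t in range(1, n):
        survivors = rng.binomial(x[t - 1], alpha)  # thinning step
        x[t] = survivors + rng.poisson(lam)        # Poisson innovation
    return x

rng = np.random.default_rng(2)
alpha, lam = 0.6, 2.0                              # illustrative parameters
x = simulate_inar1(alpha, lam, 5000, rng)

# One-step-ahead conditional-mean forecasts and their MAE
forecasts = alpha * x[:-1] + lam
mae = np.mean(np.abs(x[1:] - forecasts))
```

The stationary mean is λ/(1-α) (here 5), and thinning keeps the series integer-valued, which is the point of INAR models for count data such as case numbers.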

  10. Extending the linear model with R generalized linear, mixed effects and nonparametric regression models

    CERN Document Server

    Faraway, Julian J

    2005-01-01

Linear models are central to the practice of statistics and form the foundation of a vast range of statistical methodologies. Julian J. Faraway's critically acclaimed Linear Models with R examined regression and analysis of variance, demonstrated the different methods available, and showed in which situations each one applies. Following in those footsteps, Extending the Linear Model with R surveys the techniques that grow from the regression model, presenting three extensions to that framework: generalized linear models (GLMs), mixed effect models, and nonparametric regression models. The author's treatment is thoroughly modern and covers topics that include GLM diagnostics, generalized linear mixed models, trees, and even the use of neural networks in statistics. To demonstrate the interplay of theory and practice, throughout the book the author weaves the use of the R software environment to analyze the data of real examples, providing all of the R commands necessary to reproduce the analyses. All of the ...

  11. On poisson-stopped-sums that are mixed poisson

    OpenAIRE

    Valero Baya, Jordi; Pérez Casany, Marta; Ginebra Molins, Josep

    2013-01-01

Maceda (1948) characterized the mixed Poisson distributions that are Poisson-stopped-sum distributions based on the mixing distribution. In an alternative characterization of the same set of distributions, the Poisson-stopped-sum distributions that are mixed Poisson distributions are here proved to be the set of Poisson-stopped-sums of either a mixture of zero-truncated Poisson distributions or a zero-modification of it.
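
The classic member of the overlap studied above is the negative binomial: it is both a mixed Poisson (gamma mixing distribution) and a Poisson-stopped sum (of logarithmic summands). A Monte Carlo sketch of the mixed-Poisson representation, with illustrative parameters r and p (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
r, p = 3.0, 0.4
n = 200_000

# Mixed-Poisson construction: lambda ~ Gamma(r, scale=(1-p)/p),
# then X | lambda ~ Poisson(lambda) gives X ~ NegBin(r, p).
lam = rng.gamma(shape=r, scale=(1 - p) / p, size=n)
x = rng.poisson(lam)

# Negative binomial moments to check against:
mean_nb = r * (1 - p) / p          # 4.5
var_nb = r * (1 - p) / p ** 2      # 11.25 > mean: overdispersion
```

The variance exceeding the mean is the signature of mixed Poisson distributions generally, since mixing always adds variance to the conditional Poisson.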

  12. Zeroth Poisson Homology, Foliated Cohomology and Perfect Poisson Manifolds

    Science.gov (United States)

    Martínez-Torres, David; Miranda, Eva

    2018-01-01

    We prove that, for compact regular Poisson manifolds, the zeroth homology group is isomorphic to the top foliated cohomology group, and we give some applications. In particular, we show that, for regular unimodular Poisson manifolds, top Poisson and foliated cohomology groups are isomorphic. Inspired by the symplectic setting, we define what a perfect Poisson manifold is. We use these Poisson homology computations to provide families of perfect Poisson manifolds.

  13. A Bayesian Nonparametric Causal Model for Regression Discontinuity Designs

    Science.gov (United States)

    Karabatsos, George; Walker, Stephen G.

    2013-01-01

    The regression discontinuity (RD) design (Thistlewaite & Campbell, 1960; Cook, 2008) provides a framework to identify and estimate causal effects from a non-randomized design. Each subject of a RD design is assigned to the treatment (versus assignment to a non-treatment) whenever her/his observed value of the assignment variable equals or…

  14. Parametric vs. Nonparametric Regression Modelling within Clinical Decision Support

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan; Zvárová, Jana

    2017-01-01

    Roč. 5, č. 1 (2017), s. 21-27 ISSN 1805-8698 R&D Projects: GA ČR GA17-01251S Institutional support: RVO:67985807 Keywords : decision support systems * decision rules * statistical analysis * nonparametric regression Subject RIV: IN - Informatics, Computer Science OBOR OECD: Statistics and probability

  15. Logistic regression modelling: procedures and pitfalls in developing and interpreting prediction models

    Directory of Open Access Journals (Sweden)

    Nataša Šarlija

    2017-01-01

This study sheds light on the most common issues related to applying logistic regression in prediction models for company growth. The purpose of the paper is 1) to provide a detailed demonstration of the steps in developing a growth prediction model based on logistic regression analysis, 2) to discuss common pitfalls and methodological errors in developing a model, and 3) to provide solutions and possible ways of overcoming these issues. Special attention is devoted to the questions of satisfying logistic regression assumptions, selecting and defining dependent and independent variables, using classification tables and ROC curves for reporting model strength, interpreting odds ratios as effect measures and evaluating performance of the prediction model. Development of a logistic regression model in this paper focuses on a prediction model of company growth. The analysis is based on predominantly financial data from a sample of 1471 small and medium-sized Croatian companies active between 2009 and 2014. The financial data is presented in the form of financial ratios divided into nine main groups depicting the following areas of business: liquidity, leverage, activity, profitability, research and development, investing and export. The growth prediction model indicates aspects of a business critical for achieving high growth. In that respect, the contribution of this paper is twofold: first, methodological, in terms of pointing out pitfalls and potential solutions in logistic regression modelling, and second, theoretical, in terms of identifying factors responsible for high growth of small and medium-sized companies.

  16. Modelling subject-specific childhood growth using linear mixed-effect models with cubic regression splines.

    Science.gov (United States)

    Grajeda, Laura M; Ivanescu, Andrada; Saito, Mayuko; Crainiceanu, Ciprian; Jaganath, Devan; Gilman, Robert H; Crabtree, Jean E; Kelleher, Dermott; Cabrera, Lilia; Cama, Vitaliano; Checkley, William

    2016-01-01

Childhood growth is a cornerstone of pediatric research. Statistical models need to consider individual trajectories to adequately describe growth outcomes. Specifically, well-defined longitudinal models are essential to characterize both population and subject-specific growth. Linear mixed-effect models with cubic regression splines can account for the nonlinearity of growth curves and provide reasonable estimators of population and subject-specific growth, velocity and acceleration. We provide a stepwise approach that builds from simple to complex models, and accounts for the intrinsic complexity of the data. We start with standard cubic splines regression models and build up to a model that includes subject-specific random intercepts and slopes and residual autocorrelation. We then compared cubic regression splines vis-à-vis linear piecewise splines, and with varying number of knots and positions. Statistical code is provided to ensure reproducibility and improve dissemination of methods. Models are applied to longitudinal height measurements in a cohort of 215 Peruvian children followed from birth until their fourth year of life. Unexplained variability, as measured by the variance of the regression model, was reduced from 7.34 when using ordinary least squares to 0.81 (p ...) when using linear mixed-effect models with random slopes and a first order continuous autoregressive error term. There was substantial heterogeneity in both the intercept and slope (p ...); residual autocorrelation was modeled with a first order continuous autoregressive error term, as evidenced by the variogram of the residuals and by a lack of association among residuals. The final model provides a parametric linear regression equation for both estimation and prediction of population- and individual-level growth in height. We show that cubic regression splines are superior to linear regression splines for the case of a small number of knots in both estimation and prediction with the full linear mixed effect model (AIC 19,352 vs. 19 ...

  17. Evaluation of Logistic Regression and Multivariate Adaptive Regression Spline Models for Groundwater Potential Mapping Using R and GIS

    Directory of Open Access Journals (Sweden)

    Soyoung Park

    2017-07-01

This study mapped and analyzed groundwater potential using two different models, logistic regression (LR) and multivariate adaptive regression splines (MARS), and compared the results. A spatial database was constructed for groundwater well data and groundwater influence factors. Groundwater well data with a high potential yield of ≥ 70 m³/d were extracted, and 859 locations (70%) were used for model training, whereas the other 365 locations (30%) were used for model validation. We analyzed 16 groundwater influence factors including altitude, slope degree, slope aspect, plan curvature, profile curvature, topographic wetness index, stream power index, sediment transport index, distance from drainage, drainage density, lithology, distance from fault, fault density, distance from lineament, lineament density, and land cover. Groundwater potential maps (GPMs) were constructed using the LR and MARS models and tested using a receiver operating characteristics curve. Based on this analysis, the area under the curve (AUC) for the success rate curve of GPMs created using the MARS and LR models was 0.867 and 0.838, and the AUC for the prediction rate curve was 0.836 and 0.801, respectively. This implies that the MARS model is useful and effective for groundwater potential analysis in the study area.
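
AUC values like those reported above can be computed without plotting a curve: the AUC equals the Mann-Whitney probability that a randomly chosen positive case outscores a randomly chosen negative one. A small sketch with invented scores (not the study's data):

```python
import numpy as np

def auc(scores, labels):
    """ROC AUC via the Mann-Whitney statistic: the probability that a
    random positive scores above a random negative (ties count 1/2)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).mean()   # positive wins
    ties = (pos[:, None] == neg[None, :]).mean()     # ties
    return greater + 0.5 * ties

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]
auc(scores, labels)   # 8 of 9 positive-negative pairs correctly ordered
```

An AUC of 0.5 means the model ranks cases no better than chance, which is why values such as 0.867 vs. 0.838 are compared directly.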

  18. Cumulative Poisson Distribution Program

    Science.gov (United States)

    Bowerman, Paul N.; Scheuer, Ernest M.; Nolty, Robert

    1990-01-01

Overflow and underflow in sums prevented. Cumulative Poisson Distribution Program, CUMPOIS, one of two computer programs that make calculations involving cumulative Poisson distributions. Both programs, CUMPOIS (NPO-17714) and NEWTPOIS (NPO-17715), used independently of one another. CUMPOIS determines cumulative Poisson distribution, used to evaluate cumulative distribution function (cdf) for gamma distributions with integer shape parameters and cdf for chi-square (χ²) distributions with even degrees of freedom. Used by statisticians and others concerned with probabilities of independent events occurring over specific units of time, area, or volume. Written in C.
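
A sketch of the numerical concern the record highlights: accumulating the cumulative Poisson probability in log space so that large means neither overflow nor underflow. This is an illustration of the technique, not the NPO-17714 source itself:

```python
import math

def cumpois(k, mu):
    """Cumulative Poisson P(X <= k) for rate mu, summed in log space.
    Each term uses P(X=j) = P(X=j-1) * mu / j, and terms are combined
    with a running log-sum-exp, so mu^j / j! is never formed directly."""
    log_term = -mu                 # log P(X = 0)
    log_sum = log_term
    for j in range(1, k + 1):
        log_term += math.log(mu) - math.log(j)
        hi, lo = max(log_sum, log_term), min(log_sum, log_term)
        log_sum = hi + math.log1p(math.exp(lo - hi))   # log(exp(a)+exp(b))
    return math.exp(log_sum)
```

A naive sum of mu**j / math.factorial(j) overflows long before mu = 1000; the log-space version handles such rates routinely, which matches the record's link between the Poisson cdf and gamma/χ² cdfs at large arguments.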

  19. Modifications to POISSON

    International Nuclear Information System (INIS)

    Harwood, L.H.

    1981-01-01

    At MSU we have used the POISSON family of programs extensively for magnetic field calculations. In the presently super-saturated computer situation, reducing the run time for the program is imperative. Thus, a series of modifications have been made to POISSON to speed up convergence. Two of the modifications aim at having the first guess solution as close as possible to the final solution. The other two aim at increasing the convergence rate. In this discussion, a working knowledge of POISSON is assumed. The amount of new code and expected time saving for each modification is discussed

  20. Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering

    OpenAIRE

    Xiang, Sijia; Yao, Weixin

    2017-01-01

    In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...

  1. Semiparametric nonlinear quantile regression model for financial returns

    Czech Academy of Sciences Publication Activity Database

    Avdulaj, Krenar; Baruník, Jozef

    2017-01-01

Roč. 21, č. 1 (2017), s. 81-97 ISSN 1081-1826 R&D Projects: GA ČR(CZ) GBP402/12/G097 Institutional support: RVO:67985556 Keywords : copula quantile regression * realized volatility * value-at-risk Subject RIV: AH - Economics OBOR OECD: Applied Economics, Econometrics Impact factor: 0.649, year: 2016 http://library.utia.cas.cz/separaty/2017/E/avdulaj-0472346.pdf

  2. Poisson Processes in Free Probability

    OpenAIRE

    An, Guimei; Gao, Mingchu

    2015-01-01

We prove a multidimensional Poisson limit theorem in free probability and define joint free Poisson distributions in a non-commutative probability space. We define (compound) free Poisson processes explicitly, similar to the definitions of (compound) Poisson processes in classical probability. We prove that the sum of finitely many freely independent compound free Poisson processes is a compound free Poisson process. We give a step by step procedure for constructing a (compound) free Poisso...

  3. Independence of the effective dielectric constant of an electrolytic solution on the ionic distribution in the linear Poisson-Nernst-Planck model.

    Science.gov (United States)

    Alexe-Ionescu, A L; Barbero, G; Lelidis, I

    2014-08-28

    We consider the influence of the spatial dependence of the ions distribution on the effective dielectric constant of an electrolytic solution. We show that in the linear version of the Poisson-Nernst-Planck model, the effective dielectric constant of the solution has to be considered independent of any ionic distribution induced by the external field. This result follows from the fact that, in the linear approximation of the Poisson-Nernst-Planck model, the redistribution of the ions in the solvent due to the external field gives rise to a variation of the dielectric constant that is of the first order in the effective potential, and therefore it has to be neglected in the Poisson's equation that relates the actual electric potential across the electrolytic cell to the bulk density of ions. The analysis is performed in the case where the electrodes are perfectly blocking and the adsorption at the electrodes is negligible, and in the absence of any ion dissociation-recombination effect.

  4. Use of multiple linear regression and logistic regression models to investigate changes in birthweight for term singleton infants in Scotland.

    Science.gov (United States)

    Bonellie, Sandra R

    2012-10-01

To illustrate the use of regression and logistic regression models to investigate changes over time in size of babies, particularly in relation to social deprivation, age of the mother and smoking. Mean birthweight has been found to be increasing in many countries in recent years, but there is still a group of babies who are born with low birthweights. Population-based retrospective cohort study. Multiple linear regression and logistic regression models are used to analyse data on term singleton births from Scottish hospitals between 1994-2003. Mothers who smoke are shown to give birth to lighter babies on average, a difference of approximately 0.57 standard deviations (95% confidence interval 0.55-0.58) when adjusted for sex and parity. These mothers are also more likely to have babies that are low birthweight (odds ratio 3.46, 95% confidence interval 3.30-3.63) compared with non-smokers. Low birthweight is 30% more likely where the mother lives in the most deprived areas compared with the least deprived (odds ratio 1.30, 95% confidence interval 1.21-1.40). Smoking during pregnancy is shown to have a detrimental effect on the size of infants at birth. This effect explains some, though not all, of the observed socioeconomic differences in birthweight. It also explains much of the observed birthweight differences by the age of the mother. Identifying mothers at greater risk of having a low birthweight baby has important implications for the care and advice this group receives. © 2012 Blackwell Publishing Ltd.
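
Odds ratios like those quoted above are typically reported with Wald confidence intervals computed on the log scale. A minimal sketch from a 2x2 table; the counts below are invented for illustration and are not the study's data:

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and 95% Wald CI from a 2x2 table:
    a, b = outcome yes/no in the exposed group;
    c, d = outcome yes/no in the unexposed group."""
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    lo = math.exp(math.log(or_) - z * se)
    hi = math.exp(math.log(or_) + z * se)
    return or_, lo, hi

# e.g. low birthweight among smokers vs non-smokers (hypothetical counts)
or_, lo, hi = odds_ratio_ci(120, 880, 60, 1440)
```

Working on the log scale keeps the interval asymmetric around the OR, matching the shape of intervals such as 3.46 (3.30-3.63) in the abstract.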

  5. On Poisson Nonlinear Transformations

    Directory of Open Access Journals (Sweden)

    Nasir Ganikhodjaev

    2014-01-01

We construct a family of Poisson nonlinear transformations defined on the countable sample space of nonnegative integers and investigate their trajectory behavior. We prove that these nonlinear transformations are regular.

  6. Scaling the Poisson Distribution

    Science.gov (United States)

    Farnsworth, David L.

    2014-01-01

    We derive the additive property of Poisson random variables directly from the probability mass function. An important application of the additive property to quality testing of computer chips is presented.
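
The additive property described above can be checked numerically: convolving the pmfs of Poisson(2) and Poisson(3) reproduces the pmf of Poisson(5). A self-contained sketch (the rates 2 and 3 are just an example):

```python
import math
import numpy as np

def poisson_pmf(lam, k_max):
    """pmf of Poisson(lam) on 0..k_max, computed in log space."""
    k = np.arange(k_max + 1)
    return np.exp(-lam + k * math.log(lam)
                  - np.array([math.lgamma(i + 1) for i in k]))

# If X ~ Poisson(2) and Y ~ Poisson(3) are independent, X + Y ~ Poisson(5):
p2 = poisson_pmf(2.0, 60)
p3 = poisson_pmf(3.0, 60)
conv = np.convolve(p2, p3)[:61]   # distribution of X + Y on 0..60
p5 = poisson_pmf(5.0, 60)         # should match conv term by term
```

The convolution sum Σ_j P(X=j)P(Y=k-j) collapses via the binomial theorem to the Poisson(λ₁+λ₂) pmf, which is the derivation the article carries out directly from the mass function.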

  7. Extended Poisson Exponential Distribution

    Directory of Open Access Journals (Sweden)

    Anum Fatima

    2015-09-01

A new mixture of the Modified Exponential (ME) and Poisson distributions is introduced in this paper. Taking the maximum of Modified Exponential random variables, where the sample size follows a zero-truncated Poisson distribution, we derive the new distribution, named the Extended Poisson Exponential distribution. This distribution possesses increasing and decreasing failure rates. The Poisson-Exponential, Modified Exponential and Exponential distributions are special cases of this distribution. We also investigate some mathematical properties of the distribution, along with information entropies and order statistics. The parameters are estimated by maximum likelihood. Finally, we illustrate a real data application of the distribution.

  8. Poisson branching point processes

    International Nuclear Information System (INIS)

    Matsuo, K.; Teich, M.C.; Saleh, B.E.A.

    1984-01-01

    We investigate the statistical properties of a special branching point process. The initial process is assumed to be a homogeneous Poisson point process (HPP). The initiating events at each branching stage are carried forward to the following stage. In addition, each initiating event independently contributes a nonstationary Poisson point process (whose rate is a specified function) located at that point. The additional contributions from all points of a given stage constitute a doubly stochastic Poisson point process (DSPP) whose rate is a filtered version of the initiating point process at that stage. The process studied is a generalization of a Poisson branching process in which random time delays are permitted in the generation of events. Particular attention is given to the limit in which the number of branching stages is infinite while the average number of added events per event of the previous stage is infinitesimal. In the special case when the branching is instantaneous this limit of continuous branching corresponds to the well-known Yule--Furry process with an initial Poisson population. The Poisson branching point process provides a useful description for many problems in various scientific disciplines, such as the behavior of electron multipliers, neutron chain reactions, and cosmic ray showers

  9. Beta Regression Finite Mixture Models of Polarization and Priming

    Science.gov (United States)

    Smithson, Michael; Merkle, Edgar C.; Verkuilen, Jay

    2011-01-01

    This paper describes the application of finite-mixture general linear models based on the beta distribution to modeling response styles, polarization, anchoring, and priming effects in probability judgments. These models, in turn, enhance our capacity for explicitly testing models and theories regarding the aforementioned phenomena. The mixture…

  10. A spatially-explicit count data regression for modeling the density of forest cockchafer (Melolontha hippocastani) larvae in the Hessian Ried (Germany)

    Directory of Open Access Journals (Sweden)

    Matthias Schmidt

    2014-10-01

Background In this paper, a regression model for predicting the spatial distribution of forest cockchafer larvae in the Hessian Ried region (Germany) is presented. The forest cockchafer, a native biotic pest, is a major cause of damage in forests in this region, particularly during the regeneration phase. The model developed in this study is based on a systematic sample inventory of forest cockchafer larvae by excavation across the Hessian Ried. These larva count data were characterized by excess zeros and overdispersion. Methods Using specific generalized additive regression models, different discrete distributions, including the Poisson, negative binomial and zero-inflated Poisson distributions, were compared. The methodology employed allowed the simultaneous estimation of non-linear model effects of causal covariates and, to account for spatial autocorrelation, of a 2-dimensional spatial trend function. In the validation of the models, both the Akaike information criterion (AIC) and more detailed graphical procedures based on randomized quantile residuals were used. Results The negative binomial distribution was superior to the Poisson and the zero-inflated Poisson distributions, providing a near perfect fit to the data, which was proven in an extensive validation process. The causal predictors found to affect larva density significantly were distance to water table and percentage of pure clay layer in the soil to a depth of 1 m. Model predictions showed that larva density increased with increasing distance to the water table up to almost 4 m, after which it remained constant, and decreased with an increasing percentage of pure clay layer. However, this latter correlation was weak and requires further investigation. The 2-dimensional trend function indicated a strong spatial effect, and thus explained by far the highest proportion of variation in larva density. Conclusions As such the model can be used to support forest ...
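
Two quick diagnostics motivate the move the paper makes from a plain Poisson model to negative binomial or zero-inflated alternatives: the variance-to-mean ratio and a comparison of observed zeros with the zeros a Poisson fit implies. A sketch on simulated zero-inflated counts (all parameters invented, not the larva data):

```python
import math
import numpy as np

rng = np.random.default_rng(4)

# Simulated counts with structural zeros plus a Poisson component
n, pi_zero, lam = 10_000, 0.4, 3.0
structural_zero = rng.random(n) < pi_zero
y = np.where(structural_zero, 0, rng.poisson(lam, size=n))

# Diagnostic 1: overdispersion (Poisson implies variance == mean)
dispersion = y.var() / y.mean()              # > 1 signals overdispersion

# Diagnostic 2: excess zeros relative to a Poisson with the same mean
expected_zeros = n * math.exp(-y.mean())     # zeros a Poisson fit implies
observed_zeros = int((y == 0).sum())         # excess => zero inflation
```

When both diagnostics fire, AIC comparisons among Poisson, negative binomial and zero-inflated fits, as in the paper, decide which departure from the Poisson matters more.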

  11. A generalized exponential time series regression model for electricity prices

    DEFF Research Database (Denmark)

    Haldrup, Niels; Knapik, Oskar; Proietti, Tomasso

    on the estimated model, the best linear predictor is constructed. Our modeling approach provides good fit within sample and outperforms competing benchmark predictors in terms of forecasting accuracy. We also find that building separate models for each hour of the day and averaging the forecasts is a better...

  12. Forecast Model of Urban Stagnant Water Based on Logistic Regression

    Directory of Open Access Journals (Sweden)

    Liu Pan

    2017-01-01

    Full Text Available With the development of information technology, the construction of water resource systems has gradually been carried out. Against the background of big data, water information work needs to move from quantitative to qualitative analysis. Analyzing correlations in the data and exploring its deeper value are the keys to water information research. Building on research into water big data and the traditional data warehouse architecture, we try to find connections between different data sources. Based on the temporal and spatial correlation of stagnant water and rainfall, we use spatial interpolation to integrate stagnant water and rainfall data from different data sources and sensors, and then use logistic regression to find the relationship between them.

  13. Parental Vaccine Acceptance: A Logistic Regression Model Using Previsit Decisions.

    Science.gov (United States)

    Lee, Sara; Riley-Behringer, Maureen; Rose, Jeanmarie C; Meropol, Sharon B; Lazebnik, Rina

    2017-07-01

    This study explores how parents' intentions regarding vaccination prior to their children's visit were associated with actual vaccine acceptance. A convenience sample of parents accompanying 6-week-old to 17-year-old children completed a written survey at 2 pediatric practices. Using hierarchical logistic regression, for hospital-based participants (n = 216), vaccine refusal history (P < .01) and a vaccine decision made before the visit (P < .05) explained 87% of vaccine refusals. In community-based participants (n = 100), vaccine refusal history (P < .01) explained 81% of refusals. Over 1 in 5 parents changed their minds about vaccination during the visit. Thirty parents who were previous vaccine refusers accepted current vaccines, and 37 who had intended not to vaccinate chose vaccination. Twenty-nine parents without a refusal history declined vaccines, and 32 who did not intend to refuse before the visit declined vaccination. Future research should identify key factors to nudge parent decision making in favor of vaccination.

  14. Improving sub-pixel imperviousness change prediction by ensembling heterogeneous non-linear regression models

    Directory of Open Access Journals (Sweden)

    Drzewiecki Wojciech

    2016-12-01

    Full Text Available In this work nine non-linear regression models were compared for sub-pixel impervious surface area mapping from Landsat images. The comparison was done in three study areas, both for the accuracy of imperviousness coverage evaluation at individual points in time and for the accuracy of imperviousness change assessment. The performance of individual machine learning algorithms (Cubist, Random Forest, stochastic gradient boosting of regression trees, k-nearest neighbors regression, random k-nearest neighbors regression, Multivariate Adaptive Regression Splines, averaged neural networks, and support vector machines with polynomial and radial kernels) was also compared with the performance of heterogeneous model ensembles constructed from the best models trained using particular techniques.

  15. Modelling the Frequency of Operational Risk Losses under the Basel II Capital Accord: A Comparative study of Poisson and Negative Binomial Distributions

    OpenAIRE

    Silver, Toni O.

    2013-01-01

    2013 dissertation for MSc in Finance and Risk Management. Selected by academic staff as a good example of a masters level dissertation. This study investigated the two major methods of modelling the frequency of operational losses under the BCBS Accord of 1998 known as Basel II Capital Accord. It compared the Poisson method of modelling the frequency of losses to that of the Negative Binomial. The frequency of operational losses was investigated using a cross section of se...

  16. Additive Intensity Regression Models in Corporate Default Analysis

    DEFF Research Database (Denmark)

    Lando, David; Medhat, Mamdouh; Nielsen, Mads Stenbo

    2013-01-01

    We consider additive intensity (Aalen) models as an alternative to the multiplicative intensity (Cox) models for analyzing the default risk of a sample of rated, nonfinancial U.S. firms. The setting allows for estimating and testing the significance of time-varying effects. We use a variety of mo...

  17. Logistic regression model for detecting radon prone areas in Ireland.

    Science.gov (United States)

    Elío, J; Crowley, Q; Scanlon, R; Hodgson, J; Long, S

    2017-12-01

    A new high spatial resolution radon risk map of Ireland has been developed, based on a combination of indoor radon measurements (n=31,910) and relevant geological information (i.e. Bedrock Geology, Quaternary Geology, soil permeability and aquifer type). Logistic regression was used to predict the probability of having an indoor radon concentration above the national reference level of 200 Bq m-3 in Ireland. The four geological datasets evaluated were found to be statistically significant, and, based on combinations of these four variables, the predicted probabilities ranged from 0.57% to 75.5%. Results show that the Republic of Ireland may be divided into three main radon risk categories: High (HR), Medium (MR) and Low (LR). The probability of having an indoor radon concentration above 200 Bq m-3 in each area was found to be 19%, 8% and 3%, respectively. In the Republic of Ireland, the population affected by radon concentrations above 200 Bq m-3 is estimated at ca. 460k (about 10% of the total population). Of these, 57% (265k), 35% (160k) and 8% (35k) are in High, Medium and Low Risk Areas, respectively. Our results provide a high spatial resolution utility which permits customised radon-awareness information to be targeted at specific geographic areas. Copyright © 2017 Elsevier B.V. All rights reserved.
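As a toy illustration of the approach in this record (a logistic model for the probability that indoor radon exceeds a reference level, given geological covariates), the sketch below fits a logistic regression by plain batch gradient descent. The two "geological" predictors, the data-generating rule and all coefficients are hypothetical:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Plain batch gradient descent for logistic regression (illustrative)."""
    w, b = [0.0] * len(X[0]), 0.0
    n = len(X)
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                gw[j] += err * xj
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

random.seed(0)
# Two hypothetical geological indicators (e.g. bedrock class, soil permeability)
# and whether indoor radon exceeded the 200 Bq/m3 reference level.
X = [[random.random(), random.random()] for _ in range(200)]
y = [1 if 2.0 * x1 + 1.0 * x2 - 1.8 + random.gauss(0, 0.3) > 0 else 0
     for x1, x2 in X]

w, b = fit_logistic(X, y)
p_high = sigmoid(w[0] * 0.9 + w[1] * 0.9 + b)  # "high-risk" covariate profile
p_low = sigmoid(w[0] * 0.1 + w[1] * 0.1 + b)   # "low-risk" covariate profile
print(p_high > p_low)
```

Mapping such fitted probabilities over a spatial grid of covariates is what produces the risk-category map described in the abstract.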

  18. Predicting recycling behaviour: Comparison of a linear regression model and a fuzzy logic model.

    Science.gov (United States)

    Vesely, Stepan; Klöckner, Christian A; Dohnal, Mirko

    2016-03-01

    In this paper we demonstrate that fuzzy logic can provide a better tool for predicting recycling behaviour than the customarily used linear regression. To show this, we take a set of empirical data on recycling behaviour (N=664), which we randomly divide into two halves. The first half is used to estimate a linear regression model of recycling behaviour, and to develop a fuzzy logic model of recycling behaviour. As the first comparison, the fit of both models to the data included in estimation of the models (N=332) is evaluated. As the second comparison, predictive accuracy of both models for "new" cases (hold-out data not included in building the models, N=332) is assessed. In both cases, the fuzzy logic model significantly outperforms the regression model in terms of fit. To conclude, when accurate predictions of recycling and possibly other environmental behaviours are needed, fuzzy logic modelling seems to be a promising technique. Copyright © 2015 Elsevier Ltd. All rights reserved.
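The split-half protocol this record uses (estimate on one random half of N=664, assess predictive accuracy on the held-out half) can be sketched for the regression side as follows; the data-generating model and the "intention" predictor are assumptions for illustration:

```python
import random

def fit_ols(xs, ys):
    """Closed-form simple linear regression."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def rmse(xs, ys, slope, intercept):
    n = len(xs)
    return (sum((y - (slope * x + intercept)) ** 2
                for x, y in zip(xs, ys)) / n) ** 0.5

random.seed(1)
# Hypothetical "recycling intention" score vs observed behaviour, N = 664.
data = []
for _ in range(664):
    x = random.uniform(0, 10)
    data.append((x, 0.7 * x + 1.0 + random.gauss(0, 1)))
random.shuffle(data)
train, hold = data[:332], data[332:]

slope, intercept = fit_ols([d[0] for d in train], [d[1] for d in train])
fit_train = rmse([d[0] for d in train], [d[1] for d in train], slope, intercept)
fit_hold = rmse([d[0] for d in hold], [d[1] for d in hold], slope, intercept)
print(fit_train, fit_hold)  # in-sample vs hold-out error
```

Comparing `fit_train` with `fit_hold` is the second, stricter comparison in the abstract: accuracy on "new" cases not used to build the model.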

  19. Modeling and analysis of surface potential of single gate fully depleted SOI MOSFET using 2D-Poisson's equation

    Science.gov (United States)

    Mani, Prashant; Tyagi, Chandra Shekhar; Srivastav, Nishant

    2016-03-01

    In this paper the analytical solution of the 2D Poisson's equation for single gate Fully Depleted SOI (FDSOI) MOSFETs is derived by using a Green's function solution technique. The surface potential is calculated and the threshold voltage of the device is minimized for low power consumption. Due to the minimization of the threshold voltage, the short channel effect of the device is suppressed, and observation shows that the device is kink-free. The structure and characteristics of the Single Gate FDSOI MOSFET were matched by using MathCAD and Silvaco, respectively.
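While the record derives an analytical Green's-function solution, the same kind of 2D potential boundary-value problem can be illustrated numerically with finite-difference Jacobi relaxation. The grid size, the boundary potentials and the neglect of the charge term are illustrative assumptions, not the paper's device model:

```python
# Jacobi relaxation for the 2D Laplace problem (Poisson with the charge term
# set to zero), a crude stand-in for a device potential distribution.
N = 21
phi = [[0.0] * N for _ in range(N)]
for j in range(N):
    phi[0][j] = 1.0  # hypothetical gate-side boundary held at 1 V

for _ in range(2000):
    new = [row[:] for row in phi]
    for i in range(1, N - 1):
        for j in range(1, N - 1):
            # Each interior node relaxes toward the mean of its 4 neighbours.
            new[i][j] = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                                + phi[i][j - 1] + phi[i][j + 1])
    phi = new

centre = phi[N // 2][N // 2]
print(0.0 < centre < 1.0)  # interior potential lies between the boundary values
```

The maximum principle guarantees the converged interior potential stays between the imposed boundary values, which is a useful sanity check on any such solver.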

  20. Logistic Regression Modeling of Diminishing Manufacturing Sources for Integrated Circuits

    National Research Council Canada - National Science Library

    Gravier, Michael

    1999-01-01

    .... This thesis draws on available data from the electronics integrated circuit industry to attempt to assess whether statistical modeling offers a viable method for predicting the presence of DMSMS...

  1. Data to support "Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations & Biological Condition"

    Data.gov (United States)

    U.S. Environmental Protection Agency — Spreadsheets are included here to support the manuscript "Boosted Regression Tree Models to Explain Watershed Nutrient Concentrations and Biological Condition". This...

  2. Martingale Regressions for a Continuous Time Model of Exchange Rates

    OpenAIRE

    Guo, Zi-Yi

    2017-01-01

    One of the daunting problems in international finance is the weak explanatory power of existing theories of the nominal exchange rates, the so-called “foreign exchange rate determination puzzle”. We propose a continuous-time model to study the impact of order flow on foreign exchange rates. The model is estimated by a newly developed econometric tool based on a time-change sampling from calendar to volatility time. The estimation results indicate that the effect of order flow on exchange rate...

  3. Focused information criterion and model averaging based on weighted composite quantile regression

    KAUST Repository

    Xu, Ganggang; Wang, Suojin; Huang, Jianhua Z.

    2013-01-01

    We study the focused information criterion and frequentist model averaging and their application to post-model-selection inference for weighted composite quantile regression (WCQR) in the context of the additive partial linear models. With the non

  4. Cox's regression model for dynamics of grouped unemployment data

    Czech Academy of Sciences Publication Activity Database

    Volf, Petr

    2003-01-01

    Roč. 10, č. 19 (2003), s. 151-162 ISSN 1212-074X R&D Projects: GA ČR GA402/01/0539 Institutional research plan: CEZ:AV0Z1075907 Keywords : mathematical statistics * survival analysis * Cox's model Subject RIV: BB - Applied Statistics, Operational Research

  5. Multiple Linear Regression Model for Estimating the Price of a ...

    African Journals Online (AJOL)

    Ghana Mining Journal ... In the modeling, the Ordinary Least Squares (OLS) normality assumption which could introduce errors in the statistical analyses was dealt with by log transformation of the data, ensuring the data is normally ... The resultant MLRM is: Ŷi(MLRM) = xi′(X′X)−1X′Y, where X is the sample data matrix.

  6. Inflation, Forecast Intervals and Long Memory Regression Models

    NARCIS (Netherlands)

    C.S. Bos (Charles); Ph.H.B.F. Franses (Philip Hans); M. Ooms (Marius)

    2001-01-01

    textabstractWe examine recursive out-of-sample forecasting of monthly postwar U.S. core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading

  7. Inflation, Forecast Intervals and Long Memory Regression Models

    NARCIS (Netherlands)

    Ooms, M.; Bos, C.S.; Franses, P.H.

    2003-01-01

    We examine recursive out-of-sample forecasting of monthly postwar US core inflation and log price levels. We use the autoregressive fractionally integrated moving average model with explanatory variables (ARFIMAX). Our analysis suggests a significant explanatory power of leading indicators

  8. Data-driven modelling of LTI systems using symbolic regression

    NARCIS (Netherlands)

    Khandelwal, D.; Toth, R.; Van den Hof, P.M.J.

    2017-01-01

    The aim of this project is to automate the task of data-driven identification of dynamical systems. The underlying goal is to develop an identification tool that models a physical system without distinguishing between classes of systems such as linear, nonlinear or possibly even hybrid systems. Such

  9. Nonparametric Estimation of Regression Parameters in Measurement Error Models

    Czech Academy of Sciences Publication Activity Database

    Ehsanes Saleh, A.K.M.D.; Picek, J.; Kalina, Jan

    2009-01-01

    Roč. 67, č. 2 (2009), s. 177-200 ISSN 0026-1424 Grant - others:GA AV ČR(CZ) IAA101120801; GA MŠk(CZ) LC06024 Institutional research plan: CEZ:AV0Z10300504 Keywords: asymptotic relative efficiency (ARE) * asymptotic theory * emaculate mode * ME model * R-estimation * Reliability ratio (RR) Subject RIV: BB - Applied Statistics, Operational Research

  10. Shaofu Zhuyu Decoction Regresses Endometriotic Lesions in a Rat Model

    Directory of Open Access Journals (Sweden)

    Guanghui Zhu

    2018-01-01

    Full Text Available The current therapies for endometriosis are restricted by various side effects and treatment outcome has been less than satisfactory. Shaofu Zhuyu Decoction (SZD), a classic traditional Chinese medicine (TCM) prescription for dysmenorrhea, has been widely used in clinical practice by TCM doctors to relieve symptoms of endometriosis. The present study aimed to investigate the effects of SZD on a rat model of endometriosis. Forty-eight female Sprague-Dawley rats with regular estrous cycles underwent an autotransplantation operation to establish the endometriosis model. Then 38 rats with successful ectopic implants were randomized into two groups: vehicle- and SZD-treated groups. The latter were administered SZD through oral gavage for 4 weeks. By the end of the treatment period, the volume of the endometriotic lesions was measured, the histopathological properties of the ectopic endometrium were evaluated, and levels of proliferating cell nuclear antigen (PCNA), CD34, and hypoxia inducible factor- (HIF-) 1α in the ectopic endometrium were detected with immunohistochemistry. Furthermore, apoptosis was assessed using the terminal deoxynucleotidyl transferase (TdT) deoxyuridine 5′-triphosphate (dUTP) nick-end labeling (TUNEL) assay. In this study, SZD significantly reduced the size of ectopic lesions in rats with endometriosis, inhibited cell proliferation, increased cell apoptosis, and reduced microvessel density and HIF-1α expression. It suggested that SZD could be an effective therapy for the treatment and prevention of endometriosis recurrence.

  11. Linking Simple Economic Theory Models and the Cointegrated Vector AutoRegressive Model

    DEFF Research Database (Denmark)

    Møller, Niels Framroze

    This paper attempts to clarify the connection between simple economic theory models and the approach of the Cointegrated Vector-Auto-Regressive model (CVAR). By considering (stylized) examples of simple static equilibrium models, it is illustrated in detail how the theoretical model and its structure relate to the CVAR. Moreover, it is demonstrated how other controversial hypotheses such as Rational Expectations can be formulated directly as restrictions on the CVAR parameters. A simple example of a "Neoclassical synthetic" AS-AD model is also formulated, and the partial versus general equilibrium distinction is related to the CVAR as well. Further fundamental extensions and advances to more sophisticated theory models, such as those related to dynamics and expectations (in the structural relations), are left for future papers.

  12. Fractional Poisson process (II)

    International Nuclear Information System (INIS)

    Wang Xiaotian; Wen Zhixiong; Zhang Shiying

    2006-01-01

    In this paper, we propose a stochastic process W_H(t) (H ∈ (1/2, 1]) which we call a fractional Poisson process. The process W_H(t) is self-similar in the wide sense, displays long-range dependence, and has a fatter tail than a Gaussian process. In addition, it converges to fractional Brownian motion in distribution.

  13. Using the classical linear regression model in analysis of the dependences of conveyor belt life

    Directory of Open Access Journals (Sweden)

    Miriam Andrejiová

    2013-12-01

    Full Text Available The paper deals with the classical linear regression model of the dependence of conveyor belt life on some selected parameters: thickness of paint layer, width and length of the belt, conveyor speed and quantity of transported material. The first part of the article covers the design of the regression model, point and interval estimation of its parameters, and verification of the statistical significance of the model. The second part deals with the identification of influential and extreme values that can have an impact on the estimation of the regression model parameters. The third part focuses on the assumptions of the classical regression model, i.e. verification of the independence, normality and homoscedasticity of the residuals.
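The workflow in this record (fit a linear model, then check the residual assumptions) can be sketched with a single hypothetical predictor. The conveyor-belt data below are simulated for illustration, not the paper's:

```python
import random
import statistics

random.seed(42)
# Hypothetical conveyor-belt data: belt speed (m/s) vs belt life (months).
speed = [random.uniform(1.0, 5.0) for _ in range(120)]
life = [30.0 - 3.0 * s + random.gauss(0, 1.5) for s in speed]

# Point estimates via closed-form simple OLS.
ms, ml = statistics.mean(speed), statistics.mean(life)
beta = (sum((s - ms) * (l - ml) for s, l in zip(speed, life))
        / sum((s - ms) ** 2 for s in speed))
alpha = ml - beta * ms
resid = [l - (alpha + beta * s) for s, l in zip(speed, life)]

# Crude assumption checks: residuals centred on zero, and comparable spread in
# the lower and upper halves of the predictor range (rough homoscedasticity).
pairs = sorted(zip(speed, resid))
lower = [r for _, r in pairs[:60]]
upper = [r for _, r in pairs[60:]]
spread_ratio = statistics.stdev(lower) / statistics.stdev(upper)
print(statistics.mean(resid), spread_ratio)
```

A spread ratio far from 1 would hint at heteroscedasticity; with an intercept in the model, the mean residual is zero by construction, so it only checks the arithmetic.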

  14. Two levels ARIMAX and regression models for forecasting time series data with calendar variation effects

    Science.gov (United States)

    Suhartono; Lee, Muhammad Hisyam; Prastyo, Dedy Dwi

    2015-12-01

    The aim of this research is to develop a calendar variation model for forecasting retail sales data with the Eid ul-Fitr effect. The proposed model is based on two methods, namely two-level ARIMAX and regression methods. Two-level ARIMAX and regression models are built by using ARIMAX for the first level and regression for the second level. Monthly men's jeans and women's trousers sales in a retail company for the period January 2002 to September 2009 are used as a case study. In general, the two-level calendar variation model yields two models, namely a first model to reconstruct the sales pattern that has already occurred, and a second model to forecast the increase in sales due to Eid ul-Fitr, which affects sales in the same and the previous months. The results show that the proposed two-level calendar variation model based on ARIMAX and regression methods yields better forecasts than the seasonal ARIMA model and neural networks.

  15. Statistical approach for selection of regression model during validation of bioanalytical method

    Directory of Open Access Journals (Sweden)

    Natalija Nakov

    2014-06-01

    Full Text Available The selection of an adequate regression model is the basis for obtaining accurate and reproducible results during bioanalytical method validation. Given the wide concentration range frequently present in bioanalytical assays, heteroscedasticity of the data may be expected. Several weighted linear and quadratic regression models were evaluated during the selection of the adequate curve fit using nonparametric statistical tests: the one-sample rank test and the Wilcoxon signed rank test for two independent groups of samples. The results obtained with the one-sample rank test could not give statistical justification for the selection of linear vs. quadratic regression models because only slight differences in the error (presented through the relative residuals) were obtained. Estimation of the significance of the differences in the relative residuals was achieved using the Wilcoxon signed rank test, where the linear and quadratic regression models were treated as two independent groups. The application of this simple non-parametric statistical test provides statistical confirmation of the choice of an adequate regression model.
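The record's use of the Wilcoxon signed rank test to compare the residuals of two candidate calibration fits can be sketched as follows. The implementation uses the normal approximation to the signed-rank statistic, and the paired |relative residual| values are invented for illustration:

```python
import math

def wilcoxon_signed_rank(a, b):
    """Wilcoxon signed-rank test (normal approximation), a sketch.
    Returns (W+, two-sided p-value) for paired samples a, b."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    # Rank absolute differences, averaging ranks over ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mu) / sigma
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return w_plus, p

# Hypothetical |relative residuals| from a linear vs a quadratic calibration fit.
lin = [0.051, 0.048, 0.062, 0.055, 0.049, 0.058, 0.060, 0.053, 0.057, 0.052]
quad = [0.031, 0.030, 0.041, 0.033, 0.029, 0.036, 0.040, 0.032, 0.035, 0.031]

w, p = wilcoxon_signed_rank(lin, quad)
print(p < 0.05)  # quadratic residuals are consistently smaller in this toy data
```

For small n, exact tables rather than the normal approximation would normally be used; the sketch only shows the paired-comparison logic.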

  16. Improving EWMA Plans for Detecting Unusual Increases in Poisson Counts

    Directory of Open Access Journals (Sweden)

    R. S. Sparks

    2009-01-01

    An adaptive exponentially weighted moving average (EWMA) plan is developed for signalling unusually high incidence when monitoring a time series of nonhomogeneous daily disease counts. A Poisson transitional regression model is used to fit the background/expected trend in counts and provides "one-day-ahead" forecasts of the next day's count. Departures of counts from their forecasts are monitored. The paper outlines an approach for improving early outbreak data signals by dynamically adjusting the exponential weights to be efficient at signalling local persistent high-side changes. We emphasise outbreak signals in steady-state situations; that is, changes that occur after the EWMA statistic has run through several in-control counts.
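A minimal version of the record's scheme — monitor the departures of daily counts from their one-day-ahead expected values with an EWMA — might look like this. The smoothing constant, control limit and counts are illustrative assumptions, and the paper's adaptive weighting is not reproduced:

```python
import math

def ewma_monitor(counts, expected, lam=0.2, L=3.0):
    """One-sided EWMA monitoring of Poisson counts against expected values.
    Flags days where the smoothed exceedance passes L sigma (a sketch)."""
    z = 0.0
    sigma_z = math.sqrt(lam / (2 - lam))  # asymptotic EWMA standard deviation
    alarms = []
    for t, (y, mu) in enumerate(zip(counts, expected)):
        resid = (y - mu) / math.sqrt(mu)  # Pearson residual under Poisson
        z = lam * resid + (1 - lam) * z
        if z > L * sigma_z:
            alarms.append(t)
    return alarms

# Hypothetical daily disease counts: background mean 10, outbreak from day 20.
expected = [10.0] * 30
counts = [10, 9, 11, 10, 12, 8, 10, 11, 9, 10,
          10, 12, 9, 11, 10, 10, 9, 11, 10, 10,
          18, 20, 22, 25, 24, 26, 23, 25, 27, 26]

alarms = ewma_monitor(counts, expected)
print(alarms)  # first alarm appears shortly after the outbreak starts
```

In the paper's setting the `expected` series would come from the Poisson transitional regression forecasts rather than a constant background.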

  17. On a Robust MaxEnt Process Regression Model with Sample-Selection

    Directory of Open Access Journals (Sweden)

    Hea-Jung Kim

    2018-04-01

    Full Text Available In a regression analysis, a sample-selection bias arises when a dependent variable is partially observed as a result of the sample selection. This study introduces a Maximum Entropy (MaxEnt) process regression model that assumes a MaxEnt prior distribution for its nonparametric regression function and finds that the MaxEnt process regression model includes the well-known Gaussian process regression (GPR) model as a special case. Then, this special MaxEnt process regression model, i.e., the GPR model, is generalized to obtain a robust sample-selection Gaussian process regression (RSGPR) model that deals with non-normal data in the sample selection. Various properties of the RSGPR model are established, including the stochastic representation, distributional hierarchy, and magnitude of the sample-selection bias. These properties are used in the paper to develop a hierarchical Bayesian methodology to estimate the model. This involves a simple and computationally feasible Markov chain Monte Carlo algorithm that avoids analytical or numerical derivatives of the log-likelihood function of the model. The performance of the RSGPR model in terms of the sample-selection bias correction, robustness to non-normality, and prediction is demonstrated through simulation results that attest to its good finite-sample performance.

  18. Predicting Antitumor Activity of Peptides by Consensus of Regression Models Trained on a Small Data Sample

    Directory of Open Access Journals (Sweden)

    Ivanka Jerić

    2011-11-01

    Full Text Available Predicting the antitumor activity of compounds using regression models trained on a small number of compounds with measured biological activity is an ill-posed inverse problem. Yet, it occurs very often within the academic community. To counteract, to some extent, the overfitting problems caused by a small training set, we propose to use a consensus of six regression models for prediction of the biological activity of a virtual library of compounds. The QSAR descriptors of 22 compounds related to the opioid growth factor (OGF, Tyr-Gly-Gly-Phe-Met) with known antitumor activity were used to train the regression models: the feed-forward artificial neural network, the k-nearest neighbor, sparseness-constrained linear regression, and the linear and nonlinear (with polynomial and Gaussian kernels) support vector machines. The regression models were applied to a virtual library of 429 compounds, which resulted in six lists of candidate compounds ranked by predicted antitumor activity. The highly ranked candidate compounds were synthesized, characterized and tested for antiproliferative activity. Some of the prepared peptides showed more pronounced activity compared with the native OGF; however, they were less active than the highly ranked compounds selected previously by the radial basis function support vector machine (RBF SVM) regression model. The ill-posedness of the related inverse problem causes unstable behavior of trained regression models on test data. These results point to the high complexity of prediction based on regression models trained on a small data sample.
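The consensus idea in this record (average several imperfect regression models to stabilize predictions from a small training sample) can be illustrated with three noisy stand-in "models"; everything below is synthetic, not the paper's QSAR setup:

```python
import random

random.seed(7)

def make_model(bias, noise_sd):
    """A stand-in 'trained model': the true signal plus its own bias and noise."""
    def predict(x):
        return 2.0 * x + bias + random.gauss(0, noise_sd)
    return predict

def true_activity(x):
    return 2.0 * x

models = [make_model(0.5, 0.5), make_model(-0.4, 0.5), make_model(0.1, 0.5)]

def consensus(x):
    # Average the individual predictions, as in the paper's consensus ranking.
    return sum(m(x) for m in models) / len(models)

xs = [random.uniform(0, 5) for _ in range(200)]

def mse(pred):
    return sum((pred(x) - true_activity(x)) ** 2 for x in xs) / len(xs)

errs = [mse(m) for m in models]
err_consensus = mse(consensus)
print(err_consensus < min(errs))  # averaging cancels part of bias and noise
```

Because the individual biases partly cancel and the independent noise variance is divided by the number of models, the consensus error is lower than any single model's here; the same intuition motivates using six models instead of one on a 22-compound training set.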

  19. Testing for constant nonparametric effects in general semiparametric regression models with interactions

    KAUST Repository

    Wei, Jiawei; Carroll, Raymond J.; Maity, Arnab

    2011-01-01

    We consider the problem of testing for a constant nonparametric effect in a general semi-parametric regression model when there is the potential for interaction between the parametrically and nonparametrically modeled variables. The work

  20. Formal equivalence of Poisson structures around Poisson submanifolds

    NARCIS (Netherlands)

    Marcut, I.T.

    2012-01-01

    Let (M,π) be a Poisson manifold. A Poisson submanifold P ⊂ M gives rise to a Lie algebroid AP → P. Formal deformations of π around P are controlled by certain cohomology groups associated to AP. Assuming that these groups vanish, we prove that π is formally rigid around P; that is, any other Poisson