Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
Application of Negative Binomial Regression for Assessing Public ...
African Journals Online (AJOL)
Because the variance was nearly two times greater than the mean, the negative binomial regression model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumed that the mean and variance are the same. The level of education and race were found
Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni
2017-12-01
Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province is the highest in Indonesia until 2015. Tetanus Neonatorum data contain over dispersion and big enough proportion of zero-inflation. Negative Binomial (NB) regression is an alternative method when over dispersion happens in Poisson regression. However, the data containing over dispersion and zero-inflation are more appropriately analyzed by using Zero-Inflated Negative Binomial (ZINB) regression. The purpose of this study are: (1) to model Tetanus Neonatorum cases in East Java Province with 71.05 percent proportion of zero-inflation by using NB and ZINB regression, (2) to obtain the best model. The result of this study indicates that ZINB is better than NB regression with smaller AIC.
Predicting Cumulative Incidence Probability by Direct Binomial Regression
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard......Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard...
Estimation of adjusted rate differences using additive negative binomial regression.
Donoghoe, Mark W; Marschner, Ian C
2016-08-15
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Dynamic prediction of cumulative incidence functions by direct binomial regression.
Grand, Mia K; de Witte, Theo J M; Putter, Hein
2018-03-25
In recent years there have been a series of advances in the field of dynamic prediction. Among those is the development of methods for dynamic prediction of the cumulative incidence function in a competing risk setting. These models enable the predictions to be updated as time progresses and more information becomes available, for example when a patient comes back for a follow-up visit after completing a year of treatment, the risk of death, and adverse events may have changed since treatment initiation. One approach to model the cumulative incidence function in competing risks is by direct binomial regression, where right censoring of the event times is handled by inverse probability of censoring weights. We extend the approach by combining it with landmarking to enable dynamic prediction of the cumulative incidence function. The proposed models are very flexible, as they allow the covariates to have complex time-varying effects, and we illustrate how to investigate possible time-varying structures using Wald tests. The models are fitted using generalized estimating equations. The method is applied to bone marrow transplant data and the performance is investigated in a simulation study. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To Eexplore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries used to fitted modified Poisson regression and negative binomial regression model. The risk factors incurring the increase of unintentional injury frequency for juvenile students was explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might be the results of higher injury frequencies. On a tendency of clustered frequency data on injury event, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) to from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
Censored Hurdle Negative Binomial Regression (Case Study: Neonatorum Tetanus Case in Indonesia)
Yuli Rusdiana, Riza; Zain, Ismaini; Wulan Purnami, Santi
2017-06-01
Hurdle negative binomial model regression is a method that can be used for discreate dependent variable, excess zero and under- and overdispersion. It uses two parts approach. The first part estimates zero elements from dependent variable is zero hurdle model and the second part estimates not zero elements (non-negative integer) from dependent variable is called truncated negative binomial models. The discrete dependent variable in such cases is censored for some values. The type of censor that will be studied in this research is right censored. This study aims to obtain the parameter estimator hurdle negative binomial regression for right censored dependent variable. In the assessment of parameter estimation methods used Maximum Likelihood Estimator (MLE). Hurdle negative binomial model regression for right censored dependent variable is applied on the number of neonatorum tetanus cases in Indonesia. The type data is count data which contains zero values in some observations and other variety value. This study also aims to obtain the parameter estimator and test statistic censored hurdle negative binomial model. Based on the regression results, the factors that influence neonatorum tetanus case in Indonesia is the percentage of baby health care coverage and neonatal visits.
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Marginalized zero-inflated negative binomial regression with application to dental caries.
Preisser, John S; Das, Kalyan; Long, D Leann; Divaris, Kimon
2016-05-10
The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared with marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. Copyright © 2015 John Wiley & Sons, Ltd.
Zero inflated Poisson and negative binomial regression models: application in education.
Salehi, Masoud; Roudbari, Masoud
2015-01-01
The number of failed courses and semesters in students are indicators of their performance. These amounts have zero inflated (ZI) distributions. Using ZI Poisson and negative binomial distributions we can model these count data to find the associated factors and estimate the parameters. This study aims at to investigate the important factors related to the educational performance of students. This cross-sectional study performed in 2008-2009 at Iran University of Medical Sciences (IUMS) with a population of almost 6000 students, 670 students selected using stratified random sampling. The educational and demographical data were collected using the University records. The study design was approved at IUMS and the students' data kept confidential. The descriptive statistics and ZI Poisson and negative binomial regressions were used to analyze the data. The data were analyzed using STATA. In the number of failed semesters, Poisson and negative binomial distributions with ZI, students' total average and quota system had the most roles. For the number of failed courses, total average, and being in undergraduate or master levels had the most effect in both models. In all models the total average have the most effect on the number of failed courses or semesters. The next important factor is quota system in failed semester and undergraduate and master levels in failed courses. Therefore, average has an important inverse effect on the numbers of failed courses and semester.
Lu, Hai-Xia; Wong, May Chun Mei; Lo, Edward Chin Man; McGrath, Colman
2013-08-19
Limited information on oral health status for young adults aged 18 year-olds is known, and no available data exists in Hong Kong. The aims of this study were to investigate the oral health status and its risk indicators among young adults in Hong Kong using negative binomial regression. A survey was conducted in a representative sample of Hong Kong young adults aged 18 years. Clinical examinations were taken to assess oral health status using DMFT index and Community Periodontal Index (CPI) according to WHO criteria. Negative binomial regressions for DMFT score and the number of sextants with healthy gums were performed to identify the risk indicators of oral health status. A total of 324 young adults were examined. Prevalence of dental caries experience among the subjects was 59% and the overall mean DMFT score was 1.4. Most subjects (95%) had a score of 2 as their highest CPI score. Negative binomial regression analyses revealed that subjects who had a dental visit within 3 years had significantly higher DMFT scores (IRR = 1.68, p < 0.001). Subjects who brushed their teeth more frequently (IRR = 1.93, p < 0.001) and those with better dental knowledge (IRR = 1.09, p = 0.002) had significantly more sextants with healthy gums. Dental caries experience of the young adults aged 18 years in Hong Kong was not high but their periodontal condition was unsatisfactory. Their oral health status was related to their dental visit behavior, oral hygiene habit, and oral health knowledge.
Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo
2017-09-01
Generalized Linear Models (GLM) with negative binomial distribution for errors, have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR is a system that deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in the observations on road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to take note of the overdispersion of the data: the first examines the constant overdispersion for all the traffic zones and the second includes the variable for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza/Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated by using the frequency of injury crashes as a dependent variable and the results showed that GWPR and GWNBR achieved a better performance than GLM for the average residuals and likelihood as well as reducing the spatial autocorrelation of the residuals, and the GWNBR model was more able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh
2015-12-01
Binary phenotypes commonly arise due to multiple underlying quantitative precursors and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al. []), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with the genotype-level test MultiPhen's. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE
Martina, R; Kay, R; van Maanen, R; Ridder, A
2015-01-01
Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non-parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.
Mi, Gu; Di, Yanming; Schafer, Daniel W
2015-01-01
This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
Robust inference in the negative binomial regression model with an application to falls data.
Aeberhard, William H; Cantoni, Eva; Heritier, Stephane
2014-12-01
A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well-known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinsons disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
Directory of Open Access Journals (Sweden)
Xuedong Yan
2012-01-01
Full Text Available In this study, the traffic crash rate, total crash frequency, and injury and fatal crash frequency were taken into consideration for distinguishing between rural and urban road segment safety. The GIS-based crash data during four and half years in Pikes Peak Area, US were applied for the analyses. The comparative statistical results show that the crash rates in rural segments are consistently lower than urban segments. Further, the regression results based on Zero-Inflated Negative Binomial (ZINB regression models indicate that the urban areas have a higher crash risk in terms of both total crash frequency and injury and fatal crash frequency, compared to rural areas. Additionally, it is found that crash frequencies increase as traffic volume and segment length increase, though the higher traffic volume lower the likelihood of severe crash occurrence; compared to 2-lane roads, the 4-lane roads have lower crash frequencies but have a higher probability of severe crash occurrence; and better road facilities with higher free flow speed can benefit from high standard design feature thus resulting in a lower total crash frequency, but they cannot mitigate the severe crash risk.
Salmerón, Diego; Cano, Juan A; Chirlaque, María D
2015-08-30
In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations are inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
Moghimbeigi, Abbas
2015-05-07
Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regression may be useful for QTL mapping of count traits. Added genetic variables to the negative binomial part equation, may also affect extra zero data. In this study, to overcome these challenges, I apply two-part ZINB model. The EM algorithm with Newton-Raphson method in the M-step uses for estimating parameters. An application of the two-part ZINB model for QTL mapping is considered to detect associations between the formation of gallstone and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.
Najera-Zuloaga, Josu; Lee, Dae-Jin; Arostegui, Inmaculada
2017-01-01
Health-related quality of life has become an increasingly important indicator of health status in clinical trials and epidemiological research. Moreover, the study of the relationship of health-related quality of life with patients and disease characteristics has become one of the primary aims of many health-related quality of life studies. Health-related quality of life scores are usually assumed to be distributed as binomial random variables and often highly skewed. The use of the beta-binomial distribution in the regression context has been proposed to model such data; however, the beta-binomial regression has been performed by means of two different approaches in the literature: (i) beta-binomial distribution with a logistic link; and (ii) hierarchical generalized linear models. None of the existing literature in the analysis of health-related quality of life survey data has performed a comparison of both approaches in terms of adequacy and regression parameter interpretation context. This paper is motivated by the analysis of a real data application of health-related quality of life outcomes in patients with Chronic Obstructive Pulmonary Disease, where the use of both approaches yields to contradictory results in terms of covariate effects significance and consequently the interpretation of the most relevant factors in health-related quality of life. We present an explanation of the results in both methodologies through a simulation study and address the need to apply the proper approach in the analysis of health-related quality of life survey data for practitioners, providing an R package.
Buonaccorsi, John; Prochenka, Agnieszka; Thoresen, Magne; Ploski, Rafal
2016-09-30
Motivated by a genetic application, this paper addresses the problem of fitting regression models when the predictor is a proportion measured with error. While the problem of dealing with additive measurement error in fitting regression models has been extensively studied, the problem where the additive error is of a binomial nature has not been addressed. The measurement errors here are heteroscedastic for two reasons; dependence on the underlying true value and changing sampling effort over observations. While some of the previously developed methods for treating additive measurement error with heteroscedasticity can be used in this setting, other methods need modification. A new version of simulation extrapolation is developed, and we also explore a variation on the standard regression calibration method that uses a beta-binomial model based on the fact that the true value is a proportion. Although most of the methods introduced here can be used for fitting non-linear models, this paper will focus primarily on their use in fitting a linear model. While previous work has focused mainly on estimation of the coefficients, we will, with motivation from our example, also examine estimation of the variance around the regression line. In addressing these problems, we also discuss the appropriate manner in which to bootstrap for both inferences and bias assessment. The various methods are compared via simulation, and the results are illustrated using our motivating data, for which the goal is to relate the methylation rate of a blood sample to the age of the individual providing the sample. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Tang, Yongqiang
2015-01-01
A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which is easy to compute. The upper bound is generally only slightly larger than the required size, and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates, and follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
Ali, Asad; Zaidi, Farrah; Fatima, Syeda Hira; Adnan, Muhammad; Ullah, Saleem
2018-03-24
In this study, we propose to develop a geostatistical computational framework to model the distribution of rat bite infestation of epidemic proportion in Peshawar valley, Pakistan. Two species Rattus norvegicus and Rattus rattus are suspected to spread the infestation. The framework combines strengths of maximum entropy algorithm and binomial kriging with logistic regression to spatially model the distribution of infestation and to determine the individual role of environmental predictors in modeling the distribution trends. Our results demonstrate the significance of a number of social and environmental factors in rat infestations such as (I) high human population density; (II) greater dispersal ability of rodents due to the availability of better connectivity routes such as roads, and (III) temperature and precipitation influencing rodent fecundity and life cycle.
Batra, Manu; Shah, Aasim Farooq; Rajput, Prashant; Shah, Ishrat Aasim
2016-01-01
Dental caries among children has been described as a pandemic disease with a multifactorial nature. Various sociodemographic factors and oral hygiene practices are commonly tested for their influence on dental caries. In recent years, a recent statistical model that allows for covariate adjustment has been developed and is commonly referred zero-inflated negative binomial (ZINB) models. To compare the fit of the two models, the conventional linear regression (LR) model and ZINB model to assess the risk factors associated with dental caries. A cross-sectional survey was conducted on 1138 12-year-old school children in Moradabad Town, Uttar Pradesh during months of February-August 2014. Selected participants were interviewed using a questionnaire. Dental caries was assessed by recording decayed, missing, or filled teeth (DMFT) index. To assess the risk factor associated with dental caries in children, two approaches have been applied - LR model and ZINB model. The prevalence of caries-free subjects was 24.1%, and mean DMFT was 3.4 ± 1.8. In LR model, all the variables were statistically significant. Whereas in ZINB model, negative binomial part showed place of residence, father's education level, tooth brushing frequency, and dental visit statistically significant implying that the degree of being caries-free (DMFT = 0) increases for group of children who are living in urban, whose father is university pass out, who brushes twice a day and if have ever visited a dentist. The current study report that the LR model is a poorly fitted model and may lead to spurious conclusions whereas ZINB model has shown better goodness of fit (Akaike information criterion values - LR: 3.94; ZINB: 2.39) and can be preferred if high variance and number of an excess of zeroes are present.
2014-01-01
Background Whole-genome bisulfite sequencing currently provides the highest-precision view of the epigenome, with quantitative information about populations of cells down to single nucleotide resolution. Several studies have demonstrated the value of this precision: meaningful features that correlate strongly with biological functions can be found associated with only a few CpG sites. Understanding the role of DNA methylation, and more broadly the role of DNA accessibility, requires that methylation differences between populations of cells are identified with extreme precision and in complex experimental designs. Results In this work we investigated the use of beta-binomial regression as a general approach for modeling whole-genome bisulfite data to identify differentially methylated sites and genomic intervals. Conclusions The regression-based analysis can handle medium- and large-scale experiments where it becomes critical to accurately model variation in methylation levels between replicates and account for influence of various experimental factors like cell types or batch effects. PMID:24962134
An, Qingyu; Wu, Jun; Fan, Xuesong; Pan, Liyang; Sun, Wei
2016-01-01
The hand foot and mouth disease (HFMD) is a human syndrome caused by intestinal viruses like that coxsackie A virus 16, enterovirus 71 and easily developed into outbreak in kindergarten and school. Scientifically and accurately early detection of the start time of HFMD epidemic is a key principle in planning of control measures and minimizing the impact of HFMD. The objective of this study was to establish a reliable early detection model for start timing of hand foot mouth disease epidemic in Dalian and to evaluate the performance of model by analyzing the sensitivity in detectability. The negative binomial regression model was used to estimate the weekly baseline case number of HFMD and identified the optimal alerting threshold between tested difference threshold values during the epidemic and non-epidemic year. Circular distribution method was used to calculate the gold standard of start timing of HFMD epidemic. From 2009 to 2014, a total of 62022 HFMD cases were reported (36879 males and 25143 females) in Dalian, Liaoning Province, China, including 15 fatal cases. The median age of the patients was 3 years. The incidence rate of epidemic year ranged from 137.54 per 100,000 population to 231.44 per 100,000population, the incidence rate of non-epidemic year was lower than 112 per 100,000 population. The negative binomial regression model with AIC value 147.28 was finally selected to construct the baseline level. The threshold value was 100 for the epidemic year and 50 for the non- epidemic year had the highest sensitivity(100%) both in retrospective and prospective early warning and the detection time-consuming was 2 weeks before the actual starting of HFMD epidemic. The negative binomial regression model could early warning the start of a HFMD epidemic with good sensitivity and appropriate detection time in Dalian.
Najaf, Pooya; Duddu, Venkata R; Pulugurtha, Srinivas S
2018-03-01
Machine learning (ML) techniques have higher prediction accuracy compared to conventional statistical methods for crash frequency modelling. However, their black-box nature limits the interpretability. The objective of this research is to combine both ML and statistical methods to develop hybrid link-level crash frequency models with high predictability and interpretability. For this purpose, M5' model trees method (M5') is introduced and applied to classify the crash data and then calibrate a model for each homogenous class. The data for 1134 and 345 randomly selected links on urban arterials in the city of Charlotte, North Carolina was used to develop and validate models, respectively. The outputs from the hybrid approach are compared with the outputs from cluster-based negative binomial regression (NBR) and general NBR models. Findings indicate that M5' has high predictability and is very reliable to interpret the role of different attributes on crash frequency compared to other developed models.
Directory of Open Access Journals (Sweden)
Glauco H.S. Mendes
2013-09-01
Full Text Available Critical success factors in new product development (NPD in the Brazilian small and medium enterprises (SMEs are identified and analyzed. Critical success factors are best practices that can be used to improve NPD management and performance in a company. However, the traditional method for identifying these factors is survey methods. Subsequently, the collected data are reduced through traditional multivariate analysis. The objective of this work is to develop a logistic regression model for predicting the success or failure of the new product development. This model allows for an evaluation and prioritization of resource commitments. The results will be helpful for guiding management actions, as one way to improve NPD performance in those industries.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
2014-01-01
Background Large-scale public health interventions with rapid scale-up are increasingly being implemented worldwide. Such implementation allows for a large target population to be reached in a short period of time. But when the time comes to investigate the effectiveness of these interventions, the rapid scale-up creates several methodological challenges, such as the lack of baseline data and the absence of control groups. One example of such an intervention is Avahan, the India HIV/AIDS initiative of the Bill & Melinda Gates Foundation. One question of interest is the effect of Avahan on condom use by female sex workers with their clients. By retrospectively reconstructing condom use and sex work history from survey data, it is possible to estimate how condom use rates evolve over time. However formal inference about how this rate changes at a given point in calendar time remains challenging. Methods We propose a new statistical procedure based on a mixture of binomial regression and Cox regression. We compare this new method to an existing approach based on generalized estimating equations through simulations and application to Indian data. Results Both methods are unbiased, but the proposed method is more powerful than the existing method, especially when initial condom use is high. When applied to the Indian data, the new method mostly agrees with the existing method, but seems to have corrected some implausible results of the latter in a few districts. We also show how the new method can be used to analyze the data of all districts combined. Conclusions The use of both methods can be recommended for exploratory data analysis. However for formal statistical inference, the new method has better power. PMID:24397563
Kondo, Yumi; Zhao, Yinshan; Petkau, John
2015-06-15
We develop a new modeling approach to enhance a recently proposed method to detect increases of contrast-enhancing lesions (CELs) on repeated magnetic resonance imaging, which have been used as an indicator for potential adverse events in multiple sclerosis clinical trials. The method signals patients with unusual increases in CEL activity by estimating the probability of observing CEL counts as large as those observed on a patient's recent scans conditional on the patient's CEL counts on previous scans. This conditional probability index (CPI), computed based on a mixed-effect negative binomial regression model, can vary substantially depending on the choice of distribution for the patient-specific random effects. Therefore, we relax this parametric assumption to model the random effects with an infinite mixture of beta distributions, using the Dirichlet process, which effectively allows any form of distribution. To our knowledge, no previous literature considers a mixed-effect regression for longitudinal count variables where the random effect is modeled with a Dirichlet process mixture. As our inference is in the Bayesian framework, we adopt a meta-analytic approach to develop an informative prior based on previous clinical trials. This is particularly helpful at the early stages of trials when less data are available. Our enhanced method is illustrated with CEL data from 10 previous multiple sclerosis clinical trials. Our simulation study shows that our procedure estimates the CPI more accurately than parametric alternatives when the patient-specific random effect distribution is misspecified and that an informative prior improves the accuracy of the CPI estimates. Copyright © 2015 John Wiley & Sons, Ltd.
Wei, Feng; Lovegrove, Gordon
2013-12-01
Today, North American governments are more willing to consider compact neighborhoods with increased use of sustainable transportation modes. Bicycling, one of the most effective modes for short trips with distances less than 5km is being encouraged. However, as vulnerable road users (VRUs), cyclists are more likely to be injured when involved in collisions. In order to create a safe road environment for them, evaluating cyclists' road safety at a macro level in a proactive way is necessary. In this paper, different generalized linear regression methods for collision prediction model (CPM) development are reviewed and previous studies on micro-level and macro-level bicycle-related CPMs are summarized. On the basis of insights gained in the exploration stage, this paper also reports on efforts to develop negative binomial models for bicycle-auto collisions at a community-based, macro-level. Data came from the Central Okanagan Regional District (CORD), of British Columbia, Canada. The model results revealed two types of statistical associations between collisions and each explanatory variable: (1) An increase in bicycle-auto collisions is associated with an increase in total lane kilometers (TLKM), bicycle lane kilometers (BLKM), bus stops (BS), traffic signals (SIG), intersection density (INTD), and arterial-local intersection percentage (IALP). (2) A decrease in bicycle collisions was found to be associated with an increase in the number of drive commuters (DRIVE), and in the percentage of drive commuters (DRP). These results support our hypothesis that in North America, with its current low levels of bicycle use (macro-level CPMs. Copyright © 2012. Published by Elsevier Ltd.
Applications of MIDAS regression in analysing trends in water quality
Penev, Spiridon; Leonte, Daniela; Lazarov, Zdravetz; Mann, Rob A.
2014-04-01
We discuss novel statistical methods in analysing trends in water quality. Such analysis uses complex data sets of different classes of variables, including water quality, hydrological and meteorological. We analyse the effect of rainfall and flow on trends in water quality utilising a flexible model called Mixed Data Sampling (MIDAS). This model arises because of the mixed frequency in the data collection. Typically, water quality variables are sampled fortnightly, whereas the rain data is sampled daily. The advantage of using MIDAS regression is in the flexible and parsimonious modelling of the influence of the rain and flow on trends in water quality variables. We discuss the model and its implementation on a data set from the Shoalhaven Supply System and Catchments in the state of New South Wales, Australia. Information criteria indicate that MIDAS modelling improves upon simplistic approaches that do not utilise the mixed data sampling nature of the data.
Analysis of hypoglycemic events using negative binomial models.
Luo, Junxiang; Qu, Yongming
2013-01-01
Negative binomial regression is a standard model to analyze hypoglycemic events in diabetes clinical trials. Adjusting for baseline covariates could potentially increase the estimation efficiency of negative binomial regression. However, adjusting for covariates raises concerns about model misspecification, in which the negative binomial regression is not robust because of its requirement for strong model assumptions. In some literature, it was suggested to correct the standard error of the maximum likelihood estimator through introducing overdispersion, which can be estimated by the Deviance or Pearson Chi-square. We proposed to conduct the negative binomial regression using Sandwich estimation to calculate the covariance matrix of the parameter estimates together with Pearson overdispersion correction (denoted by NBSP). In this research, we compared several commonly used negative binomial model options with our proposed NBSP. Simulations and real data analyses showed that NBSP is the most robust to model misspecification, and the estimation efficiency will be improved by adjusting for baseline hypoglycemia. Copyright © 2013 John Wiley & Sons, Ltd.
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.
Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H
2016-04-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.
DEFF Research Database (Denmark)
Elmasry, Amr; Jensen, Claus; Katajainen, Jyrki
2017-01-01
the (total) number of elements stored in the data structure(s) prior to the operation. As the resulting data structure consists of two components that are different variants of binomial heaps, we call it a bipartite binomial heap. Compared to its counterpart, a multipartite binomial heap, the new structure...
Statistical and regression analyses of detected extrasolar systems
Czech Academy of Sciences Publication Activity Database
Pintr, Pavel; Peřinová, V.; Lukš, A.; Pathak, A.
2013-01-01
Roč. 75, č. 1 (2013), s. 37-45 ISSN 0032-0633 Institutional support: RVO:61389021 Keywords : Exoplanets * Kepler candidates * Regression analysis Subject RIV: BN - Astronomy, Celestial Mechanics, Astrophysics Impact factor: 1.630, year: 2013 http://www.sciencedirect.com/science/article/pii/S0032063312003066
Chain binomial models and binomial autoregressive processes.
Weiss, Christian H; Pollett, Philip K
2012-09-01
We establish a connection between a class of chain-binomial models of use in ecology and epidemiology and binomial autoregressive (AR) processes. New results are obtained for the latter, including expressions for the lag-conditional distribution and related quantities. We focus on two types of chain-binomial model, extinction-colonization and colonization-extinction models, and present two approaches to parameter estimation. The asymptotic distributions of the resulting estimators are studied, as well as their finite-sample performance, and we give an application to real data. A connection is made with standard AR models, which also has implications for parameter estimation. © 2011, The International Biometric Society.
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies
Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.
2016-01-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epide...
Analysing inequalities in Germany a structured additive distributional regression approach
Silbersdorff, Alexander
2017-01-01
This book seeks new perspectives on the growing inequalities that our societies face, putting forward Structured Additive Distributional Regression as a means of statistical analysis that circumvents the common problem of analytical reduction to simple point estimators. This new approach allows the observed discrepancy between the individuals’ realities and the abstract representation of those realities to be explicitly taken into consideration using the arithmetic mean alone. In turn, the method is applied to the question of economic inequality in Germany.
Stochastic background of negative binomial distribution
International Nuclear Information System (INIS)
Suzuki, N.; Biyajima, M.; Wilk, G.
1991-01-01
A branching equations of the birth process with immigration is taken as a model for the particle production process. Using it we investigate cases in which its solution becomes the negative binomial distribution. Furthermore, we compare our approach with the modified negative binomial distribution proposed recently by Chliapnikov and Tchikilev and use it to analyse the observed multiplicity distributions. (orig.)
Standardized binomial models for risk or prevalence ratios and differences.
Richardson, David B; Kinlaw, Alan C; MacLehose, Richard F; Cole, Stephen R
2015-10-01
Epidemiologists often analyse binary outcomes in cohort and cross-sectional studies using multivariable logistic regression models, yielding estimates of adjusted odds ratios. It is widely known that the odds ratio closely approximates the risk or prevalence ratio when the outcome is rare, and it does not do so when the outcome is common. Consequently, investigators may decide to directly estimate the risk or prevalence ratio using a log binomial regression model. We describe the use of a marginal structural binomial regression model to estimate standardized risk or prevalence ratios and differences. We illustrate the proposed approach using data from a cohort study of coronary heart disease status in Evans County, Georgia, USA. The approach reduces problems with model convergence typical of log binomial regression by shifting all explanatory variables except the exposures of primary interest from the linear predictor of the outcome regression model to a model for the standardization weights. The approach also facilitates evaluation of departures from additivity in the joint effects of two exposures. Epidemiologists should consider reporting standardized risk or prevalence ratios and differences in cohort and cross-sectional studies. These are readily-obtained using the SAS, Stata and R statistical software packages. The proposed approach estimates the exposure effect in the total population. © The Author 2015; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.
Khan, Iftekhar; Morris, Stephen
2014-11-12
The performance of the Beta Binomial (BB) model is compared with several existing models for mapping the EORTC QLQ-C30 (QLQ-C30) on to the EQ-5D-3L using data from lung cancer trials. Data from 2 separate non small cell lung cancer clinical trials (TOPICAL and SOCCAR) are used to develop and validate the BB model. Comparisons with Linear, TOBIT, Quantile, Quadratic and CLAD models are carried out. The mean prediction error, R(2), proportion predicted outside the valid range, clinical interpretation of coefficients, model fit and estimation of Quality Adjusted Life Years (QALY) are reported and compared. Monte-Carlo simulation is also used. The Beta-Binomial regression model performed 'best' among all models. For TOPICAL and SOCCAR trials, respectively, residual mean square error (RMSE) was 0.09 and 0.11; R(2) was 0.75 and 0.71; observed vs. predicted means were 0.612 vs. 0.608 and 0.750 vs. 0.749. Mean difference in QALY's (observed vs. predicted) were 0.051 vs. 0.053 and 0.164 vs. 0.162 for TOPICAL and SOCCAR respectively. Models tested on independent data show simulated 95% confidence from the BB model containing the observed mean more often (77% and 59% for TOPICAL and SOCCAR respectively) compared to the other models. All algorithms over-predict at poorer health states but the BB model was relatively better, particularly for the SOCCAR data. The BB model may offer superior predictive properties amongst mapping algorithms considered and may be more useful when predicting EQ-5D-3L at poorer health states. We recommend the algorithm derived from the TOPICAL data due to better predictive properties and less uncertainty.
Tripepi, Giovanni; Jager, Kitty J.; Stel, Vianda S.; Dekker, Friedo W.; Zoccali, Carmine
2011-01-01
Because of some limitations of stratification methods, epidemiologists frequently use multiple linear and logistic regression analyses to address specific epidemiological questions. If the dependent variable is a continuous one (for example, systolic pressure and serum creatinine), the researcher
Longitudinal beta-binomial modeling using GEE for overdispersed binomial data.
Wu, Hongqian; Zhang, Ying; Long, Jeffrey D
2017-03-15
Longitudinal binomial data are frequently generated from multiple questionnaires and assessments in various scientific settings for which the binomial data are often overdispersed. The standard generalized linear mixed effects model may result in severe underestimation of standard errors of estimated regression parameters in such cases and hence potentially bias the statistical inference. In this paper, we propose a longitudinal beta-binomial model for overdispersed binomial data and estimate the regression parameters under a probit model using the generalized estimating equation method. A hybrid algorithm of the Fisher scoring and the method of moments is implemented for computing the method. Extensive simulation studies are conducted to justify the validity of the proposed method. Finally, the proposed method is applied to analyze functional impairment in subjects who are at risk of Huntington disease from a multisite observational study of prodromal Huntington disease. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Negative binomial multiplicity distribution from binomial cluster production
International Nuclear Information System (INIS)
Iso, C.; Mori, K.
1990-01-01
Two-step interpretation of negative binomial multiplicity distribution as a compound of binomial cluster production and negative binomial like cluster decay distribution is proposed. In this model we can expect the average multiplicity for the cluster production increases with increasing energy, different from a compound Poisson-Logarithmic distribution. (orig.)
Binomial collisions and near collisions
Blokhuis, Aart; Brouwer, Andries; de Weger, Benne
2017-01-01
We describe efficient algorithms to search for cases in which binomial coefficients are equal or almost equal, give a conjecturally complete list of all cases where two binomial coefficients differ by 1, and give some identities for binomial coefficients that seem to be new.
USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES
Directory of Open Access Journals (Sweden)
Constantin ANGHELACHE
2011-10-01
Full Text Available The article presents the fundamental aspects of the linear regression, as a toolbox which can be used in macroeconomic analyses. The article describes the estimation of the parameters, the statistical tests used, the homoscesasticity and heteroskedasticity. The use of econometrics instrument in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretation that can be drawn at this level.
Statistical inference involving binomial and negative binomial parameters.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2009-05-01
Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.
CUMBIN - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CUMBIN, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), can be used independently of one another. CUMBIN can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CUMBIN calculates the probability that a system of n components has at least k operating if the probability that any one operating is p and the components are independent. Equivalently, this is the reliability of a k-out-of-n system having independent components with common reliability p. CUMBIN can evaluate the incomplete beta distribution for two positive integer arguments. CUMBIN can also evaluate the cumulative F distribution and the negative binomial distribution, and can determine the sample size in a test design. CUMBIN is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. The CUMBIN program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMBIN was developed in 1988.
CROSSER - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CROSSER, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), can be used independently of one another. CROSSER can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CROSSER calculates the point at which the reliability of a k-out-of-n system equals the common reliability of the n components. It is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The CROSSER program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CROSSER was developed in 1988.
NEWTONP - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, NEWTONP, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), can be used independently of one another. NEWTONP can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. NEWTONP calculates the probably p required to yield a given system reliability V for a k-out-of-n system. It can also be used to determine the Clopper-Pearson confidence limits (either one-sided or two-sided) for the parameter p of a Bernoulli distribution. NEWTONP can determine Bayesian probability limits for a proportion (if the beta prior has positive integer parameters). It can determine the percentiles of incomplete beta distributions with positive integer parameters. It can also determine the percentiles of F distributions and the midian plotting positions in probability plotting. NEWTONP is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. NEWTONP is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The NEWTONP program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. NEWTONP was developed in 1988.
Distinguishing between Binomial, Hypergeometric and Negative Binomial Distributions
Wroughton, Jacqueline; Cole, Tarah
2013-01-01
Recognizing the differences between three discrete distributions (Binomial, Hypergeometric and Negative Binomial) can be challenging for students. We present an activity designed to help students differentiate among these distributions. In addition, we present assessment results in the form of pre- and post-tests that were designed to assess the…
On a Fractional Binomial Process
Cahoy, Dexter O.; Polito, Federico
2012-02-01
The classical binomial process has been studied by Jakeman (J. Phys. A 23:2815-2825, 1990) (and the references therein) and has been used to characterize a series of radiation states in quantum optics. In particular, he studied a classical birth-death process where the chance of birth is proportional to the difference between a larger fixed number and the number of individuals present. It is shown that at large times, an equilibrium is reached which follows a binomial process. In this paper, the classical binomial process is generalized using the techniques of fractional calculus and is called the fractional binomial process. The fractional binomial process is shown to preserve the binomial limit at large times while expanding the class of models that include non-binomial fluctuations (non-Markovian) at regular and small times. As a direct consequence, the generality of the fractional binomial model makes the proposed model more desirable than its classical counterpart in describing real physical processes. More statistical properties are also derived.
Binomial Rings: Axiomatisation, Transfer and Classification
Xantcha, Qimh Richey
2011-01-01
Hall's binomial rings, rings with binomial coefficients, are given an axiomatisation and proved identical to the numerical rings studied by Ekedahl. The Binomial Transfer Principle is established, enabling combinatorial proofs of algebraical identities. The finitely generated binomial rings are completely classified. An application to modules over binomial rings is given.
Karakawa, Ayako; Murata, Hiroshi; Hirasawa, Hiroyo; Mayama, Chihiro; Asaoka, Ryo
2013-01-01
To compare the performance of newly proposed point-wise linear regression (PLR) with the binomial test (binomial PLR) against mean deviation (MD) trend analysis and permutation analyses of PLR (PoPLR), in detecting global visual field (VF) progression in glaucoma. 15 VFs (Humphrey Field Analyzer, SITA standard, 24-2) were collected from 96 eyes of 59 open angle glaucoma patients (6.0 ± 1.5 [mean ± standard deviation] years). Using the total deviation of each point on the 2(nd) to 16(th) VFs (VF2-16), linear regression analysis was carried out. The numbers of VF test points with a significant trend at various probability levels (pbinomial test (one-side). A VF series was defined as "significant" if the median p-value from the binomial test was binomial PLR method (0.14 to 0.86) was significantly higher than MD trend analysis (0.04 to 0.89) and PoPLR (0.09 to 0.93). The PIS of the proposed method (0.0 to 0.17) was significantly lower than the MD approach (0.0 to 0.67) and PoPLR (0.07 to 0.33). The PBNS of the three approaches were not significantly different. The binomial BLR method gives more consistent results than MD trend analysis and PoPLR, hence it will be helpful as a tool to 'flag' possible VF deterioration.
Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E
2017-12-01
1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from second until sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The most optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability and such a breeding program would have no negative genetic impact on egg production.
Directory of Open Access Journals (Sweden)
Esther Leushuis
2016-12-01
Full Text Available Background: Standardization of the semen analysis may improve reproducibility. We assessed variability between laboratories in semen analyses and evaluated whether a transformation using Z scores and regression statistics was able to reduce this variability. Materials and Methods: We performed a retrospective cohort study. We calculated between-laboratory coefficients of variation (CVB for sperm concentration and for morphology. Subsequently, we standardized the semen analysis results by calculating laboratory specific Z scores, and by using regression. We used analysis of variance for four semen parameters to assess systematic differences between laboratories before and after the transformations, both in the circulation samples and in the samples obtained in the prospective cohort study in the Netherlands between January 2002 and February 2004. Results: The mean CVB was 7% for sperm concentration (range 3 to 13% and 32% for sperm morphology (range 18 to 51%. The differences between the laboratories were statistically significant for all semen parameters (all P<0.001. Standardization using Z scores did not reduce the differences in semen analysis results between the laboratories (all P<0.001. Conclusion: There exists large between-laboratory variability for sperm morphology and small, but statistically significant, between-laboratory variation for sperm concentration. Standardization using Z scores does not eliminate between-laboratory variability.
The Binomial Distribution in Shooting
Chalikias, Miltiadis S.
2009-01-01
The binomial distribution is used to predict the winner of the 49th International Shooting Sport Federation World Championship in double trap shooting held in 2006 in Zagreb, Croatia. The outcome of the competition was definitely unexpected.
Negative binomial properties and clan structure in multiplicity distributions
International Nuclear Information System (INIS)
Giovannini, A.; Van Hove, L.
1988-01-01
We review the negative binomial properties measured recently for many multiplicity distributions of high energy hadronic, semi-leptonic reactions in selected rapidity intervals. We analyse them in terms of the ''clan'' structure which can be defined for any negative binomial distribution. By comparing reactions we exhibit a number of regularities for the average number N-bar of clans and the average charged multiplicity (n-bar) c per clan. 22 refs., 6 figs. (author)
International Nuclear Information System (INIS)
Bhowmik, K.R.; Islam, S.
2016-01-01
Logistic regression (LR) analysis is the most common statistical methodology to find out the determinants of childhood mortality. However, the significant predictors cannot be ranked according to their influence on the response variable. Multiple classification (MC) analysis can be applied to identify the significant predictors with a priority index which helps to rank the predictors. The main objective of the study is to find the socio-demographic determinants of childhood mortality at neonatal, post-neonatal, and post-infant period by fitting LR model as well as to rank those through MC analysis. The study is conducted using the data of Bangladesh Demographic and Health Survey 2007 where birth and death information of children were collected from their mothers. Three dichotomous response variables are constructed from children age at death to fit the LR and MC models. Socio-economic and demographic variables significantly associated with the response variables separately are considered in LR and MC analyses. Both the LR and MC models identified the same significant predictors for specific childhood mortality. For both the neonatal and child mortality, biological factors of children, regional settings, and parents socio-economic status are found as 1st, 2nd, and 3rd significant groups of predictors respectively. Mother education and household environment are detected as major significant predictors of post-neonatal mortality. This study shows that MC analysis with or without LR analysis can be applied to detect determinants with rank which help the policy makers taking initiatives on a priority basis. (author)
PENERAPAN REGRESI BINOMIAL NEGATIF UNTUK MENGATASI OVERDISPERSI PADA REGRESI POISSON
Directory of Open Access Journals (Sweden)
PUTU SUSAN PRADAWATI
2013-09-01
Full Text Available Poisson regression was used to analyze the count data which Poisson distributed. Poisson regression analysis requires state equidispersion, in which the mean value of the response variable is equal to the value of the variance. However, there are deviations in which the value of the response variable variance is greater than the mean. This is called overdispersion. If overdispersion happens and Poisson Regression analysis is being used, then underestimated standard errors will be obtained. Negative Binomial Regression can handle overdispersion because it contains a dispersion parameter. From the simulation data which experienced overdispersion in the Poisson Regression model it was found that the Negative Binomial Regression was better than the Poisson Regression model.
The number of subjects per variable required in linear regression analyses
P.C. Austin (Peter); E.W. Steyerberg (Ewout)
2015-01-01
textabstractObjectives To determine the number of independent variables that can be included in a linear regression model. Study Design and Setting We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression
The number of subjects per variable required in linear regression analyses.
Austin, Peter C; Steyerberg, Ewout W
2015-06-01
To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R(2) of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R(2), although adjusted R(2) estimates behaved well. The bias in estimating the model R(2) statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Integer Solutions of Binomial Coefficients
Gilbertson, Nicholas J.
2016-01-01
A good formula is like a good story, rich in description, powerful in communication, and eye-opening to readers. The formula presented in this article for determining the coefficients of the binomial expansion of (x + y)n is one such "good read." The beauty of this formula is in its simplicity--both describing a quantitative situation…
[Using log-binomial model for estimating the prevalence ratio].
Ye, Rong; Gao, Yan-hui; Yang, Yi; Chen, Yue
2010-05-01
To estimate the prevalence ratios, using a log-binomial model with or without continuous covariates. Prevalence ratios for individuals' attitude towards smoking-ban legislation associated with smoking status, estimated by using a log-binomial model were compared with odds ratios estimated by logistic regression model. In the log-binomial modeling, maximum likelihood method was used when there were no continuous covariates and COPY approach was used if the model did not converge, for example due to the existence of continuous covariates. We examined the association between individuals' attitude towards smoking-ban legislation and smoking status in men and women. Prevalence ratio and odds ratio estimation provided similar results for the association in women since smoking was not common. In men however, the odds ratio estimates were markedly larger than the prevalence ratios due to a higher prevalence of outcome. The log-binomial model did not converge when age was included as a continuous covariate and COPY method was used to deal with the situation. All analysis was performed by SAS. Prevalence ratio seemed to better measure the association than odds ratio when prevalence is high. SAS programs were provided to calculate the prevalence ratios with or without continuous covariates in the log-binomial regression analysis.
Kromhout, D.
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements of the
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Wu, Dane W.
2002-01-01
The year 2000 US presidential election between Al Gore and George Bush has been the most intriguing and controversial one in American history. The state of Florida was the trigger for the controversy, mainly, due to the use of the misleading "butterfly ballot". Using prediction (or confidence) intervals for least squares regression lines…
Directory of Open Access Journals (Sweden)
Giuliano de Oliveira Freitas
2013-10-01
Full Text Available PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV, assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assorted among two phacoemulsification groups: one assigned to receive AcrySof®Toric intraocular lens (IOL in both eyes and another assigned to have AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio and its linear regression to Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: Significant negative correlation between the ratio of post- and preoperative Thibos APVratio and Alpins percentage of success (%Success was found (Spearman's ρ=-0.93; linear regression is given by the following equation: %Success = (-APVratio + 1.00x100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
Check-all-that-apply data analysed by Partial Least Squares regression
DEFF Research Database (Denmark)
Rinnan, Åsmund; Giacalone, Davide; Frøst, Michael Bom
2015-01-01
are analysed by multivariate techniques. CATA data can be analysed both by setting the CATA as the X and the Y. The former is the PLS-Discriminant Analysis (PLS-DA) version, while the latter is the ANOVA-PLS (A-PLS) version. We investigated the difference between these two approaches, concluding...
Smoothness in Binomial Edge Ideals
Directory of Open Access Journals (Sweden)
Hamid Damadi
2016-06-01
Full Text Available In this paper we study some geometric properties of the algebraic set associated to the binomial edge ideal of a graph. We study the singularity and smoothness of the algebraic set associated to the binomial edge ideal of a graph. Some of these algebraic sets are irreducible and some of them are reducible. If every irreducible component of the algebraic set is smooth we call the graph an edge smooth graph, otherwise it is called an edge singular graph. We show that complete graphs are edge smooth and introduce two conditions such that the graph G is edge singular if and only if it satisfies these conditions. Then, it is shown that cycles and most of trees are edge singular. In addition, it is proved that complete bipartite graphs are edge smooth.
DEFF Research Database (Denmark)
Scott, Neil W; Fayers, Peter M; Aaronson, Neil K
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....
Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.
Directory of Open Access Journals (Sweden)
David S Boukal
Full Text Available Temperature drives development in insects and other ectotherms because their metabolic rate and growth depends directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI in insects, appears to be widespread and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
International Nuclear Information System (INIS)
Slutskaya, N.G.; Mosseh, I.B.
2006-01-01
Data about genetic mutations under radiation and chemical treatment for different types of cells have been analyzed with correlation and regression analyses. Linear correlation between different genetic effects in sex cells and somatic cells have found. The results may be extrapolated on sex cells of human and mammals. (authors)
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements...... of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study......-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...
Tutorial on Using Regression Models with Count Outcomes Using R
Directory of Open Access Journals (Sweden)
A. Alexander Beaujean
2016-02-01
Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used run the example analyses are included in the Appendix.
Seyoum, Awoke; Ndlovu, Principal; Zewotir, Temesgen
2016-01-01
CD4 cells are a type of white blood cells that plays a significant role in protecting humans from infectious diseases. Lack of information on associated factors on CD4 cell count reduction is an obstacle for improvement of cells in HIV positive adults. Therefore, the main objective of this study was to investigate baseline factors that could affect initial CD4 cell count change after highly active antiretroviral therapy had been given to adult patients in North West Ethiopia. A retrospective cross-sectional study was conducted among 792 HIV positive adult patients who already started antiretroviral therapy for 1 month of therapy. A Chi square test of association was used to assess of predictor covariates on the variable of interest. Data was secondary source and modeled using generalized linear models, especially Quasi-Poisson regression. The patients' CD4 cell count changed within a month ranged from 0 to 109 cells/mm 3 with a mean of 15.9 cells/mm 3 and standard deviation 18.44 cells/mm 3 . The first month CD4 cell count change was significantly affected by poor adherence to highly active antiretroviral therapy (aRR = 0.506, P value = 2e -16 ), fair adherence (aRR = 0.592, P value = 0.0120), initial CD4 cell count (aRR = 1.0212, P value = 1.54e -15 ), low household income (aRR = 0.63, P value = 0.671e -14 ), middle income (aRR = 0.74, P value = 0.629e -12 ), patients without cell phone (aRR = 0.67, P value = 0.615e -16 ), WHO stage 2 (aRR = 0.91, P value = 0.0078), WHO stage 3 (aRR = 0.91, P value = 0.0058), WHO stage 4 (0876, P value = 0.0214), age (aRR = 0.987, P value = 0.000) and weight (aRR = 1.0216, P value = 3.98e -14 ). Adherence to antiretroviral therapy, initial CD4 cell count, household income, WHO stages, age, weight and owner of cell phone played a major role for the variation of CD4 cell count in our data. Hence, we recommend a close follow-up of patients to adhere the prescribed medication for
Huang, Banglian; Yang, Yiming; Luo, Tingting; Wu, S.; Du, Xuezhu; Cai, Detian; Loo, van, E.N.; Huang Bangquan
2013-01-01
In the present study correlation, regression and path analyses were carried out to decide correlations among the agro- nomic traits and their contributions to seed yield per plant in Crambe abyssinica. Partial correlation analysis indicated that plant height (X1) was significantly correlated with branching height and the number of first branches (P <0.01); Branching height (X2) was significantly correlated with pod number of primary inflorescence (P <0.01) and number of secondary branch...
Chromosome aberration analysis based on a beta-binomial distribution
International Nuclear Information System (INIS)
Otake, Masanori; Prentice, R.L.
1983-10-01
Analyses carried out here generalized on earlier studies of chromosomal aberrations in the populations of Hiroshima and Nagasaki, by allowing extra-binomial variation in aberrant cell counts corresponding to within-subject correlations in cell aberrations. Strong within-subject correlations were detected with corresponding standard errors for the average number of aberrant cells that were often substantially larger than was previously assumed. The extra-binomial variation is accomodated in the analysis in the present report, as described in the section on dose-response models, by using a beta-binomial (B-B) variance structure. It is emphasized that we have generally satisfactory agreement between the observed and the B-B fitted frequencies by city-dose category. The chromosomal aberration data considered here are not extensive enough to allow a precise discrimination between competing dose-response models. A quadratic gamma ray and linear neutron model, however, most closely fits the chromosome data. (author)
Zero-truncated negative binomial - Erlang distribution
Bodhisuwan, Winai; Pudprommarat, Chookait; Bodhisuwan, Rujira; Saothayanun, Luckhana
2017-11-01
The zero-truncated negative binomial-Erlang distribution is introduced. It is developed from negative binomial-Erlang distribution. In this work, the probability mass function is derived and some properties are included. The parameters of the zero-truncated negative binomial-Erlang distribution are estimated by using the maximum likelihood estimation. Finally, the proposed distribution is applied to real data, the number of methamphetamine in the Bangkok, Thailand. Based on the results, it shows that the zero-truncated negative binomial-Erlang distribution provided a better fit than the zero-truncated Poisson, zero-truncated negative binomial, zero-truncated generalized negative-binomial and zero-truncated Poisson-Lindley distributions for this data.
Jumps in binomial AR(1) processes
Weiß , Christian H.
2009-01-01
Abstract We consider the binomial AR(1) model for serially dependent processes of binomial counts. After a review of its definition and known properties, we investigate marginal and serial properties of jumps in such processes. Based on these results, we propose the jumps control chart for monitoring a binomial AR(1) process. We show how to evaluate the performance of this control chart and give design recommendations. correspondance: Tel.: +49 931 31 84968; ...
Using the {Beta}-binomial distribution to characterize forest health
Energy Technology Data Exchange (ETDEWEB)
Zarnoch, S. J. [USDA Forest Service, Southern Research Station, Athens, GA (United States); Anderson, R.L.; Sheffield, R. M. [USDA Forest Service, Southern Research Station, Asheville, NC (United States)
1995-03-01
Forest health monitoring programs often use base variables which are dichotomous (i e. alive/dead, damaged/undamaged) to describe the health of trees. Typical sampling designs usually consist of randomly or systematically chosen clusters of trees for observation.It was claimed that contagiousness of diseases for example may result in non-uniformity of affected trees, so that distribution of the proportions, rather than simply the mean proportion, becomes important. The use of the {Beta}-binomial model was suggested for such cases. Use of the {Beta}-binomial distribution model applied in forest health analyses, was described.. Data on dogwood anthracnose (caused by Discula destructiva), a disease of flowering dogwood (Cornus florida L.), was used to illustrate the utility of the model. The {Beta}-binomial model allowed the detection of different distributional patterns of dogwood anthracnose over time and space. Results led to further speculation regarding the cause of the patterns. Traditional proportion analyses like ANOVA would not have detected the trends found using the {Beta}-binomial model, until more distinct patterns had evolved at a later date. The model was said to be flexible and require no special weighting or transformations of data.Another advantage claimed was its ability to handle unequal sample sizes.
Problems on Divisibility of Binomial Coefficients
Osler, Thomas J.; Smoak, James
2004-01-01
Twelve unusual problems involving divisibility of the binomial coefficients are represented in this article. The problems are listed in "The Problems" section. All twelve problems have short solutions which are listed in "The Solutions" section. These problems could be assigned to students in any course in which the binomial theorem and Pascal's…
System-Reliability Cumulative-Binomial Program
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, NEWTONP, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), used independently of one another. Program finds probability required to yield given system reliability. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
A class of orthogonal nonrecursive binomial filters.
Haddad, R. A.
1971-01-01
The time- and frequency-domain properties of the orthogonal binomial sequences are presented. It is shown that these sequences, or digital filters based on them, can be generated using adders and delay elements only. The frequency-domain behavior of these nonrecursive binomial filters suggests a number of applications as low-pass Gaussian filters or as inexpensive bandpass filters.
Common-Reliability Cumulative-Binomial Program
Scheuer, Ernest, M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CROSSER, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), used independently of one another. Point of equality between reliability of system and common reliability of components found. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
Analyses of non-fatal accidents in an opencast mine by logistic regression model - a case study.
Onder, Seyhan; Mutlu, Mert
2017-09-01
Accidents cause major damage for both workers and enterprises in the mining industry. To reduce the number of occupational accidents, these incidents should be properly registered and carefully analysed. This study efficiently examines the Aegean Lignite Enterprise (ELI) of Turkish Coal Enterprises (TKI) in Soma between 2006 and 2011, and opencast coal mine occupational accident records were used for statistical analyses. A total of 231 occupational accidents were analysed for this study. The accident records were categorized into seven groups: area, reason, occupation, part of body, age, shift hour and lost days. The SPSS package program was used in this study for logistic regression analyses, which predicted the probability of accidents resulting in greater or less than 3 lost workdays for non-fatal injuries. Social facilities-area of surface installations, workshops and opencast mining areas are the areas with the highest probability for accidents with greater than 3 lost workdays for non-fatal injuries, while the reasons with the highest probability for these types of accidents are transporting and manual handling. Additionally, the model was tested for such reported accidents that occurred in 2012 for the ELI in Soma and estimated the probability of exposure to accidents with lost workdays correctly by 70%.
International Nuclear Information System (INIS)
Arneodo, M.; Ferrero, M.I.; Peroni, C.; Bee, C.P.; Bird, I.; Coughlan, J.; Sloan, T.; Braun, H.; Brueck, H.; Drees, J.; Edwards, A.; Krueger, J.; Montgomery, H.E.; Peschel, H.; Pietrzyk, U.; Poetsch, M.; Schneider, A.; Dreyer, T.; Ernst, T.; Haas, J.; Kabuss, E.M.; Landgraf, U.; Mohr, W.; Rith, K.; Schlagboehmer, A.; Schroeder, T.; Stier, H.E.; Wallucks, W.
1987-01-01
The multiplicity distributions of charged hadrons produced in the deep inelastic muon-proton scattering at 280 GeV are analysed in various rapidity intervals, as a function of the total hadronic centre of mass energy W ranging from 4-20 GeV. Multiplicity distributions for the backward and forward hemispheres are also analysed separately. The data can be well parameterized by binomial distributions, extending their range of applicability to the case of lepton-proton scattering. The energy and the rapidity dependence of the parameters is presented and a smooth transition from the binomial distribution via Poissonian to the ordinary binomial is observed. (orig.)
Directory of Open Access Journals (Sweden)
Kevin D. Cashman
2017-05-01
Full Text Available Dietary Reference Values (DRVs for vitamin D have a key role in the prevention of vitamin D deficiency. However, despite adopting similar risk assessment protocols, estimates from authoritative agencies over the last 6 years have been diverse. This may have arisen from diverse approaches to data analysis. Modelling strategies for pooling of individual subject data from cognate vitamin D randomized controlled trials (RCTs are likely to provide the most appropriate DRV estimates. Thus, the objective of the present work was to undertake the first-ever individual participant data (IPD-level meta-regression, which is increasingly recognized as best practice, from seven winter-based RCTs (with 882 participants ranging in age from 4 to 90 years of the vitamin D intake–serum 25-hydroxyvitamin D (25(OHD dose-response. Our IPD-derived estimates of vitamin D intakes required to maintain 97.5% of 25(OHD concentrations >25, 30, and 50 nmol/L across the population are 10, 13, and 26 µg/day, respectively. In contrast, standard meta-regression analyses with aggregate data (as used by several agencies in recent years from the same RCTs estimated that a vitamin D intake requirement of 14 µg/day would maintain 97.5% of 25(OHD >50 nmol/L. These first IPD-derived estimates offer improved dietary recommendations for vitamin D because the underpinning modeling captures the between-person variability in response of serum 25(OHD to vitamin D intake.
Correlated binomial models and correlation structures
International Nuclear Information System (INIS)
Hisakado, Masato; Kitsukawa, Kenji; Mori, Shintaro
2006-01-01
We discuss a general method to construct correlated binomial distributions by imposing several consistent relations on the joint probability function. We obtain self-consistency relations for the conditional correlations and conditional probabilities. The beta-binomial distribution is derived by a strong symmetric assumption on the conditional correlations. Our derivation clarifies the 'correlation' structure of the beta-binomial distribution. It is also possible to study the correlation structures of other probability distributions of exchangeable (homogeneous) correlated Bernoulli random variables. We study some distribution functions and discuss their behaviours in terms of their correlation structures
Understanding logistic regression analysis
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using ex...
Newton Binomial Formulas in Schubert Calculus
Cordovez, Jorge; Gatto, Letterio; Santiago, Taise
2008-01-01
We prove Newton's binomial formulas for Schubert Calculus to determine numbers of base point free linear series on the projective line with prescribed ramification divisor supported at given distinct points.
The Binomial Coefficient for Negative Arguments
Kronenburg, M. J.
2011-01-01
The definition of the binomial coefficient in terms of gamma functions also allows non-integer arguments. For nonnegative integer arguments the gamma functions reduce to factorials, leading to the well-known Pascal triangle. Using a symmetry formula for the gamma function, this definition is extended to negative integer arguments, making the symmetry identity for binomial coefficients valid for all integer arguments. The agreement of this definition with some other identities and with the bin...
Calculating Cumulative Binomial-Distribution Probabilities
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CUMBIN, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), used independently of one another. Reliabilities and availabilities of k-out-of-n systems analyzed. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Used for calculations of reliability and availability. Program written in C.
Cashman, Kevin D.; Ritz, Christian; Kiely, Mairead
2017-01-01
Dietary Reference Values (DRVs) for vitamin D have a key role in the prevention of vitamin D deficiency. However, despite adopting similar risk assessment protocols, estimates from authoritative agencies over the last 6 years have been diverse. This may have arisen from diverse approaches to data analysis. Modelling strategies for pooling of individual subject data from cognate vitamin D randomized controlled trials (RCTs) are likely to provide the most appropriate DRV estimates. Thus, the objective of the present work was to undertake the first-ever individual participant data (IPD)-level meta-regression, which is increasingly recognized as best practice, from seven winter-based RCTs (with 882 participants ranging in age from 4 to 90 years) of the vitamin D intake–serum 25-hydroxyvitamin D (25(OH)D) dose-response. Our IPD-derived estimates of vitamin D intakes required to maintain 97.5% of 25(OH)D concentrations >25, 30, and 50 nmol/L across the population are 10, 13, and 26 µg/day, respectively. In contrast, standard meta-regression analyses with aggregate data (as used by several agencies in recent years) from the same RCTs estimated that a vitamin D intake requirement of 14 µg/day would maintain 97.5% of 25(OH)D >50 nmol/L. These first IPD-derived estimates offer improved dietary recommendations for vitamin D because the underpinning modeling captures the between-person variability in response of serum 25(OH)D to vitamin D intake. PMID:28481259
Log-binomial models: exploring failed convergence.
Williamson, Tyler; Eliasziw, Misha; Fick, Gordon Hilton
2013-12-13
Relative risk is a summary metric that is commonly used in epidemiological investigations. Increasingly, epidemiologists are using log-binomial models to study the impact of a set of predictor variables on a single binary outcome, as they naturally offer relative risks. However, standard statistical software may report failed convergence when attempting to fit log-binomial models in certain settings. The methods that have been proposed in the literature for dealing with failed convergence use approximate solutions to avoid the issue. This research looks directly at the log-likelihood function for the simplest log-binomial model where failed convergence has been observed, a model with a single linear predictor with three levels. The possible causes of failed convergence are explored and potential solutions are presented for some cases. Among the principal causes is a failure of the fitting algorithm to converge despite the log-likelihood function having a single finite maximum. Despite these limitations, log-binomial models are a viable option for epidemiologists wishing to describe the relationship between a set of predictors and a binary outcome where relative risk is the desired summary measure. Epidemiologists are encouraged to continue to use log-binomial models and advocate for improvements to the fitting algorithms to promote the widespread use of log-binomial models.
The Validation of a Beta-Binomial Model for Overdispersed Binomial Data.
Kim, Jongphil; Lee, Ji-Hyun
2017-01-01
The beta-binomial model has been widely used as an analytically tractable alternative that captures the overdispersion of an intra-correlated, binomial random variable, X . However, the model validation for X has been rarely investigated. As a beta-binomial mass function takes on a few different shapes, the model validation is examined for each of the classified shapes in this paper. Further, the mean square error (MSE) is illustrated for each shape by the maximum likelihood estimator (MLE) based on a beta-binomial model approach and the method of moments estimator (MME) in order to gauge when and how much the MLE is biased.
Directory of Open Access Journals (Sweden)
Luise A Seeker
Full Text Available Telomeres cap the ends of linear chromosomes and shorten with age in many organisms. In humans short telomeres have been linked to morbidity and mortality. With the accumulation of longitudinal datasets the focus shifts from investigating telomere length (TL to exploring TL change within individuals over time. Some studies indicate that the speed of telomere attrition is predictive of future disease. The objectives of the present study were to 1 characterize the change in bovine relative leukocyte TL (RLTL across the lifetime in Holstein Friesian dairy cattle, 2 estimate genetic parameters of RLTL over time and 3 investigate the association of differences in individual RLTL profiles with productive lifespan. RLTL measurements were analysed using Legendre polynomials in a random regression model to describe TL profiles and genetic variance over age. The analyses were based on 1,328 repeated RLTL measurements of 308 female Holstein Friesian dairy cattle. A quadratic Legendre polynomial was fitted to the fixed effect of age in months and to the random effect of the animal identity. Changes in RLTL, heritability and within-trait genetic correlation along the age trajectory were calculated and illustrated. At a population level, the relationship between RLTL and age was described by a positive quadratic function. Individuals varied significantly regarding the direction and amount of RLTL change over life. The heritability of RLTL ranged from 0.36 to 0.47 (SE = 0.05-0.08 and remained statistically unchanged over time. The genetic correlation of RLTL at birth with measurements later in life decreased with the time interval between samplings from near unity to 0.69, indicating that TL later in life might be regulated by different genes than TL early in life. Even though animals differed in their RLTL profiles significantly, those differences were not correlated with productive lifespan (p = 0.954.
Directory of Open Access Journals (Sweden)
Lusi Eka Afri
2017-03-01
Full Text Available Regresi Binomial Negatif dan regresi Conway-Maxwell-Poisson merupakan solusi untuk mengatasi overdispersi pada regresi Poisson. Kedua model tersebut merupakan perluasan dari model regresi Poisson. Menurut Hinde dan Demetrio (2007, terdapat beberapa kemungkinan terjadi overdispersi pada regresi Poisson yaitu keragaman hasil pengamatan keragaman individu sebagai komponen yang tidak dijelaskan oleh model, korelasi antar respon individu, terjadinya pengelompokan dalam populasi dan peubah teramati yang dihilangkan. Akibatnya dapat menyebabkan pendugaan galat baku yang terlalu rendah dan akan menghasilkan pendugaan parameter yang bias ke bawah (underestimate. Penelitian ini bertujuan untuk membandingan model Regresi Binomial Negatif dan model regresi Conway-Maxwell-Poisson (COM-Poisson dalam mengatasi overdispersi pada data distribusi Poisson berdasarkan statistik uji devians. Data yang digunakan dalam penelitian ini terdiri dari dua sumber data yaitu data simulasi dan data kasus terapan. Data simulasi yang digunakan diperoleh dengan membangkitkan data berdistribusi Poisson yang mengandung overdispersi dengan menggunakan bahasa pemrograman R berdasarkan karakteristik data berupa , peluang munculnya nilai nol (p serta ukuran sampel (n. Data dibangkitkan berguna untuk mendapatkan estimasi koefisien parameter pada regresi binomial negatif dan COM-Poisson. Kata Kunci: overdispersi, regresi binomial negatif, regresi Conway-Maxwell-Poisson Negative binomial regression and Conway-Maxwell-Poisson regression could be used to overcome over dispersion on Poisson regression. Both models are the extension of Poisson regression model. According to Hinde and Demetrio (2007, there will be some over dispersion on Poisson regression: observed variance in individual variance cannot be described by a model, correlation among individual response, and the population group and the observed variables are eliminated. Consequently, this can lead to low standard error
Estimation Parameters And Modelling Zero Inflated Negative Binomial
Directory of Open Access Journals (Sweden)
Cindy Cahyaning Astuti
2016-11-01
Full Text Available Regression analysis is used to determine relationship between one or several response variable (Y with one or several predictor variables (X. Regression model between predictor variables and the Poisson distributed response variable is called Poisson Regression Model. Since, Poisson Regression requires an equality between mean and variance, it is not appropriate to apply this model on overdispersion (variance is higher than mean. Poisson regression model is commonly used to analyze the count data. On the count data type, it is often to encounteredd some observations that have zero value with large proportion of zero value on the response variable (zero Inflation. Poisson regression can be used to analyze count data but it has not been able to solve problem of excess zero value on the response variable. An alternative model which is more suitable for overdispersion data and can solve the problem of excess zero value on the response variable is Zero Inflated Negative Binomial (ZINB. In this research, ZINB is applied on the case of Tetanus Neonatorum in East Java. The aim of this research is to examine the likelihood function and to form an algorithm to estimate the parameter of ZINB and also applying ZINB model in the case of Tetanus Neonatorum in East Java. Maximum Likelihood Estimation (MLE method is used to estimate the parameter on ZINB and the likelihood function is maximized using Expectation Maximization (EM algorithm. Test results of ZINB regression model showed that the predictor variable have a partial significant effect at negative binomial model is the percentage of pregnant women visits and the percentage of maternal health personnel assisted, while the predictor variables that have a partial significant effect at zero inflation model is the percentage of neonatus visits.
Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.
Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao
2016-01-15
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using currents methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.
Comparison: Binomial model and Black Scholes model
Directory of Open Access Journals (Sweden)
Amir Ahmad Dar
2018-03-01
Full Text Available The Binomial Model and the Black Scholes Model are the popular methods that are used to solve the option pricing problems. Binomial Model is a simple statistical method and Black Scholes model requires a solution of a stochastic differential equation. Pricing of European call and a put option is a very difficult method used by actuaries. The main goal of this study is to differentiate the Binominal model and the Black Scholes model by using two statistical model - t-test and Tukey model at one period. Finally, the result showed that there is no significant difference between the means of the European options by using the above two models.
Directory of Open Access Journals (Sweden)
Željko V. Račić
2010-12-01
Full Text Available This paper aims to present the specifics of the application of multiple linear regression model. The economic (financial crisis is analyzed in terms of gross domestic product which is in a function of the foreign trade balance (on one hand and the credit cards, i.e. indebtedness of the population on this basis (on the other hand, in the USA (from 1999. to 2008. We used the extended application model which shows how the analyst should run the whole development process of regression model. This process began with simple statistical features and the application of regression procedures, and ended with residual analysis, intended for the study of compatibility of data and model settings. This paper also analyzes the values of some standard statistics used in the selection of appropriate regression model. Testing of the model is carried out with the use of the Statistics PASW 17 program.
Simulation on Poisson and negative binomial models of count road accident modeling
Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.
2016-11-01
Accident count data have often been shown to have overdispersion. On the other hand, the data might contain zero count (excess zeros). The simulation study was conducted to create a scenarios which an accident happen in T-junction with the assumption the dependent variables of generated data follows certain distribution namely Poisson and negative binomial distribution with different sample size of n=30 to n=500. The study objective was accomplished by fitting Poisson regression, negative binomial regression and Hurdle negative binomial model to the simulated data. The model validation was compared and the simulation result shows for each different sample size, not all model fit the data nicely even though the data generated from its own distribution especially when the sample size is larger. Furthermore, the larger sample size indicates that more zeros accident count in the dataset.
Binomial distribution for the charge asymmetry parameter
International Nuclear Information System (INIS)
Chou, T.T.; Yang, C.N.
1984-01-01
It is suggested that for high energy collisions the distribution with respect to the charge asymmetry z = nsub(F) - nsub(B) is binomial, where nsub(F) and nsub(B) are the forward and backward charge multiplicities. (orig.)
Binomial test models and item difficulty
van der Linden, Willem J.
1979-01-01
In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally
Adaptive estimation of binomial probabilities under misclassification
Albers, Willem/Wim; Veldman, H.J.
1984-01-01
If misclassification occurs the standard binomial estimator is usually seriously biased. It is known that an improvement can be achieved by using more than one observer in classifying the sample elements. Here it will be investigated which number of observers is optimal given the total number of
Binomial vs poisson statistics in radiation studies
International Nuclear Information System (INIS)
Foster, J.; Kouris, K.; Spyrou, N.M.; Matthews, I.P.; Welsh National School of Medicine, Cardiff
1983-01-01
The processes of radioactive decay, decay and growth of radioactive species in a radioactive chain, prompt emission(s) from nuclear reactions, conventional activation and cyclic activation are discussed with respect to their underlying statistical density function. By considering the transformation(s) that each nucleus may undergo it is shown that all these processes are fundamentally binomial. Formally, when the number of experiments N is large and the probability of success p is close to zero, the binomial is closely approximated by the Poisson density function. In radiation and nuclear physics, N is always large: each experiment can be conceived of as the observation of the fate of each of the N nuclei initially present. Whether p, the probability that a given nucleus undergoes a prescribed transformation, is close to zero depends on the process and nuclide(s) concerned. Hence, although a binomial description is always valid, the Poisson approximation is not always adequate. Therefore further clarification is provided as to when the binomial distribution must be used in the statistical treatment of detected events. (orig.)
Expansion around half-integer values, binomial sums, and inverse binomial sums
International Nuclear Information System (INIS)
Weinzierl, Stefan
2004-01-01
I consider the expansion of transcendental functions in a small parameter around rational numbers. This includes in particular the expansion around half-integer values. I present algorithms which are suitable for an implementation within a symbolic computer algebra system. The method is an extension of the technique of nested sums. The algorithms allow in addition the evaluation of binomial sums, inverse binomial sums and generalizations thereof
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
Energy Technology Data Exchange (ETDEWEB)
Reddy, T.A. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States)); Claridge, D.E. (Energy Systems Lab., Texas A and M Univ., College Station, TX (United States))
1994-01-01
Multiple regression modeling of monitored building energy use data is often faulted as a reliable means of predicting energy use on the grounds that multicollinearity between the regressor variables can lead both to improper interpretation of the relative importance of the various physical regressor parameters and to a model with unstable regressor coefficients. Principal component analysis (PCA) has the potential to overcome such drawbacks. While a few case studies have already attempted to apply this technique to building energy data, the objectives of this study were to make a broader evaluation of PCA and multiple regression analysis (MRA) and to establish guidelines under which one approach is preferable to the other. Four geographic locations in the US with different climatic conditions were selected and synthetic data sequence representative of daily energy use in large institutional buildings were generated in each location using a linear model with outdoor temperature, outdoor specific humidity and solar radiation as the three regression variables. MRA and PCA approaches were then applied to these data sets and their relative performances were compared. Conditions under which PCA seems to perform better than MRA were identified and preliminary recommendations on the use of either modeling approach formulated. (orig.)
Tomography of binomial states of the radiation field
Bazrafkan, MR; Man'ko, [No Value
2004-01-01
The symplectic, optical, and photon-number tomographic symbols of binomial states of the radiation field are studied. Explicit relations for all tomograms of the binomial states are obtained. Two measures for nonclassical properties of these states are discussed.
Understanding logistic regression analysis.
Sperandei, Sandro
2014-01-01
Logistic regression is used to obtain odds ratio in the presence of more than one explanatory variable. The procedure is quite similar to multiple linear regression, with the exception that the response variable is binomial. The result is the impact of each variable on the odds ratio of the observed event of interest. The main advantage is to avoid confounding effects by analyzing the association of all variables together. In this article, we explain the logistic regression procedure using examples to make it as simple as possible. After definition of the technique, the basic interpretation of the results is highlighted and then some special issues are discussed.
Laszlo, Sarah; Federmeier, Kara D.
2010-01-01
Linking print with meaning tends to be divided into subprocesses, such as recognition of an input's lexical entry and subsequent access of semantics. However, recent results suggest that the set of semantic features activated by an input is broader than implied by a view wherein access serially follows recognition. EEG was collected from participants who viewed items varying in number and frequency of both orthographic neighbors and lexical associates. Regression analysis of single item ERPs replicated past findings, showing that N400 amplitudes are greater for items with more neighbors, and further revealed that N400 amplitudes increase for items with more lexical associates and with higher frequency neighbors or associates. Together, the data suggest that in the N400 time window semantic features of items broadly related to inputs are active, consistent with models in which semantic access takes place in parallel with stimulus recognition. PMID:20624252
International Nuclear Information System (INIS)
Amitabh, J.; Vaccaro, J.A.; Hill, K.E.
1998-01-01
We study the recently defined number-phase Wigner function S NP (n,θ) for a single-mode field considered to be in binomial and negative binomial states. These states interpolate between Fock and coherent states and coherent and quasi thermal states, respectively, and thus provide a set of states with properties ranging from uncertain phase and sharp photon number to sharp phase and uncertain photon number. The distribution function S NP (n,θ) gives a graphical representation of the complimentary nature of the number and phase properties of these states. We highlight important differences between Wigner's quasi probability function, which is associated with the position and momentum observables, and S NP (n,θ), which is associated directly with the photon number and phase observables. We also discuss the number-phase entropic uncertainty relation for the binomial and negative binomial states and we show that negative binomial states give a lower phase entropy than states which minimize the phase variance
Pooling overdispersed binomial data to estimate event rate.
Young-Xu, Yinong; Chan, K Arnold
2008-08-19
The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
Pooling overdispersed binomial data to estimate event rate
Directory of Open Access Journals (Sweden)
Chan K Arnold
2008-08-01
Full Text Available Abstract Background The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. Methods We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. Results In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. Conclusion The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
Directory of Open Access Journals (Sweden)
Abdelfattah M. Selim
2018-03-01
Full Text Available Aim: The present cross-sectional study was conducted to determine the seroprevalence and potential risk factors associated with Bovine viral diarrhea virus (BVDV disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected within November 2012-March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with indirect ELIZA test for antibody detection. Data were analyzed with different statistical approaches such as Chi-square test, odds ratios (OR, univariable, and multivariable LR models. Results: Results revealed a non-significant association between being seropositive with BVDV and all risk factors, except for species of animal. Seroprevalence percentages were 40% and 23% for cattle and buffaloes, respectively. OR for all categories were close to one with the highest OR for cattle relative to buffaloes, which was 2.237. Likelihood ratio tests showed a significant drop of the -2LL from univariable LR to multivariable LR models. Conclusion: There was an evidence of high seroprevalence of BVDV among cattle as compared with buffaloes with the possibility of infection in different age groups of animals. In addition, multivariable LR model was proved to provide more information for association and prediction purposes relative to univariable LR models and Chi-square tests if we have more than one predictor.
Botha, J.; De Ridder, J.H.; Potgieter, J.C.; Steyn, H.S.; Malan, L.
2013-01-01
A recently proposed model for waist circumference cut points (RPWC), driven by increased blood pressure, was demonstrated in an African population. We therefore aimed to validate the RPWC by comparing the RPWC and the Joint Statement Consensus (JSC) models via Logistic Regression (LR) and Neural Networks (NN) analyses. Urban African gender groups (N=171) were stratified according to the JSC and RPWC cut point models. Ultrasound carotid intima media thickness (CIMT), blood pressure (BP) and fa...
Poissonian and binomial models in radionuclide metrology by liquid scintillation counting
International Nuclear Information System (INIS)
Grau Malonda, A.
1990-01-01
Binomial and Poissonian models developed for calculating the counting efficiency from a free parameter is analysed in this paper. This model have been applied to liquid scintillator counting systems with two or three photomultipliers. It is mathematically demostrated that both models are equivalent and that the counting efficiencies calculated either from one or the other model are identical. (Author)
Tran, Phoebe; Waller, Lance
2015-01-01
Lyme disease has been the subject of many studies due to increasing incidence rates year after year and the severe complications that can arise in later stages of the disease. Negative binomial models have been used to model Lyme disease in the past with some success. However, there has been little focus on the reliability and consistency of these models when they are used to study Lyme disease at multiple spatial scales. This study seeks to explore how sensitive/consistent negative binomial models are when they are used to study Lyme disease at different spatial scales (at the regional and sub-regional levels). The study area includes the thirteen states in the Northeastern United States with the highest Lyme disease incidence during the 2002-2006 period. Lyme disease incidence at county level for the period of 2002-2006 was linked with several previously identified key landscape and climatic variables in a negative binomial regression model for the Northeastern region and two smaller sub-regions (the New England sub-region and the Mid-Atlantic sub-region). This study found that negative binomial models, indeed, were sensitive/inconsistent when used at different spatial scales. We discuss various plausible explanations for such behavior of negative binomial models. Further investigation of the inconsistency and sensitivity of negative binomial models when used at different spatial scales is important for not only future Lyme disease studies and Lyme disease risk assessment/management but any study that requires use of this model type in a spatial context. Copyright © 2014 Elsevier Inc. All rights reserved.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Zero inflated negative binomial-Sushila distribution and its application
Yamrubboon, Darika; Thongteeraparp, Ampai; Bodhisuwan, Winai; Jampachaisri, Katechan
2017-11-01
A new zero inflated distribution is proposed in this work, namely the zero inflated negative binomial-Sushila distribution. The new distribution which is a mixture of the Bernoulli and negative binomial-Sushila distributions is an alternative distribution for the excessive zero counts and over-dispersion. Some characteristics of the proposed distribution are derived including probability mass function, mean and variance. The parameter estimation of the zero inflated negative binomial-Sushila distribution is also implemented by maximum likelihood method. In application, the proposed distribution can provide a better fit than traditional distributions: zero inflated Poisson and zero inflated negative binomial distributions.
Understanding poisson regression.
Hayat, Matthew J; Higgins, Melinda
2014-04-01
Nurse investigators often collect study data in the form of counts. Traditional methods of data analysis have historically approached analysis of count data either as if the count data were continuous and normally distributed or with dichotomization of the counts into the categories of occurred or did not occur. These outdated methods for analyzing count data have been replaced with more appropriate statistical methods that make use of the Poisson probability distribution, which is useful for analyzing count data. The purpose of this article is to provide an overview of the Poisson distribution and its use in Poisson regression. Assumption violations for the standard Poisson regression model are addressed with alternative approaches, including addition of an overdispersion parameter or negative binomial regression. An illustrative example is presented with an application from the ENSPIRE study, and regression modeling of comorbidity data is included for illustrative purposes. Copyright 2014, SLACK Incorporated.
Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
Extending the Binomial Checkpointing Technique for Resilience
Energy Technology Data Exchange (ETDEWEB)
Walther, Andrea; Narayanan, Sri Hari Krishna
2016-10-10
In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, re- quired, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional with the operation count of the underlying function, e.g., if algo- rithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulations and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We de- scribe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding imple- mentation and discuss numerical results.
Forward selection two sample binomial test
Wong, Kam-Fai; Wong, Weng-Kee; Lin, Miao-Shan
2016-01-01
Fisher’s exact test (FET) is a conditional method that is frequently used to analyze data in a 2 × 2 table for small samples. This test is conservative and attempts have been made to modify the test to make it less conservative. For example, Crans and Shuster (2008) proposed adding more points in the rejection region to make the test more powerful. We provide another way to modify the test to make it less conservative by using two independent binomial distributions as the reference distribution for the test statistic. We compare our new test with several methods and show that our test has advantages over existing methods in terms of control of the type 1 and type 2 errors. We reanalyze results from an oncology trial using our proposed method and our software which is freely available to the reader. PMID:27335577
Constrained Dynamic Optimality and Binomial Terminal Wealth
DEFF Research Database (Denmark)
Pedersen, J. L.; Peskir, G.
2018-01-01
with interest rate $r \\in {R}$). Letting $P_{t,x}$ denote a probability measure under which $X^u$ takes value $x$ at time $t,$ we study the dynamic version of the nonlinear optimal control problem $\\inf_u\\, Var{t,X_t^u}(X_T^u)$ where the infimum is taken over admissible controls $u$ subject to $X_t^u \\ge e...... a martingale method combined with Lagrange multipliers, we derive the dynamically optimal control $u_*^d$ in closed form and prove that the dynamically optimal terminal wealth $X_T^d$ can only take two values $g$ and $\\beta$. This binomial nature of the dynamically optimal strategy stands in sharp contrast...... with other known portfolio selection strategies encountered in the literature. A direct comparison shows that the dynamically optimal (time-consistent) strategy outperforms the statically optimal (time-inconsistent) strategy in the problem....
Bowden, Jack; Del Greco M, Fabiola; Minelli, Cosetta; Davey Smith, George; Sheehan, Nuala A; Thompson, John R
2016-12-01
: MR-Egger regression has recently been proposed as a method for Mendelian randomization (MR) analyses incorporating summary data estimates of causal effect from multiple individual variants, which is robust to invalid instruments. It can be used to test for directional pleiotropy and provides an estimate of the causal effect adjusted for its presence. MR-Egger regression provides a useful additional sensitivity analysis to the standard inverse variance weighted (IVW) approach that assumes all variants are valid instruments. Both methods use weights that consider the single nucleotide polymorphism (SNP)-exposure associations to be known, rather than estimated. We call this the `NO Measurement Error' (NOME) assumption. Causal effect estimates from the IVW approach exhibit weak instrument bias whenever the genetic variants utilized violate the NOME assumption, which can be reliably measured using the F-statistic. The effect of NOME violation on MR-Egger regression has yet to be studied. An adaptation of the I2 statistic from the field of meta-analysis is proposed to quantify the strength of NOME violation for MR-Egger. It lies between 0 and 1, and indicates the expected relative bias (or dilution) of the MR-Egger causal estimate in the two-sample MR context. We call it IGX2 . The method of simulation extrapolation is also explored to counteract the dilution. Their joint utility is evaluated using simulated data and applied to a real MR example. In simulated two-sample MR analyses we show that, when a causal effect exists, the MR-Egger estimate of causal effect is biased towards the null when NOME is violated, and the stronger the violation (as indicated by lower values of IGX2 ), the stronger the dilution. When additionally all genetic variants are valid instruments, the type I error rate of the MR-Egger test for pleiotropy is inflated and the causal effect underestimated. Simulation extrapolation is shown to substantially mitigate these adverse effects. We
Penggunaan Model Binomial Pada Penentuan Harga Opsi Saham Karyawan
Directory of Open Access Journals (Sweden)
Dara Puspita Anggraeni
2015-11-01
Full Text Available Binomial Model for Valuing Employee Stock Options. Employee Stock Options (ESO differ from standard exchange-traded options. The three main differences in a valuation model for employee stock options : Vesting Period, Exit Rate and Non-Transferability. In this thesis, the model for valuing employee stock options discussed. This model are implement with a generalized binomial model.
Wigner Function of Density Operator for Negative Binomial Distribution
International Nuclear Information System (INIS)
Xu Xinglei; Li Hongqi
2008-01-01
By using the technique of integration within an ordered product (IWOP) of operator we derive Wigner function of density operator for negative binomial distribution of radiation field in the mixed state case, then we derive the Wigner function of squeezed number state, which yields negative binomial distribution by virtue of the entangled state representation and the entangled Wigner operator
Optimized Binomial Quantum States of Complex Oscillators with Real Spectrum
International Nuclear Information System (INIS)
Zelaya, K D; Rosas-Ortiz, O
2016-01-01
Classical and nonclassical states of quantum complex oscillators with real spectrum are presented. Such states are bi-orthonormal superpositions of n +1 energy eigenvectors of the system with binomial-like coefficients. For large values of n these optimized binomial states behave as photon added coherent states when the imaginary part of the potential is cancelled. (paper)
Discovering Binomial Identities with PascGaloisJE
Evans, Tyler J.
2008-01-01
We describe exercises in which students use PascGaloisJE to formulate conjectures about certain binomial identities which hold when the binomial coefficients are interpreted as elements in the cyclic group Z[subscript p] of integers modulo a prime integer "p". In addition to having an appealing visual component, these exercises are open-ended and…
SEPARATION PHENOMENA LOGISTIC REGRESSION
Directory of Open Access Journals (Sweden)
Ikaro Daniel de Carvalho Barreto
2014-03-01
Full Text Available This paper proposes an application of concepts about the maximum likelihood estimation of the binomial logistic regression model to the separation phenomena. It generates bias in the estimation and provides different interpretations of the estimates on the different statistical tests (Wald, Likelihood Ratio and Score and provides different estimates on the different iterative methods (Newton-Raphson and Fisher Score. It also presents an example that demonstrates the direct implications for the validation of the model and validation of variables, the implications for estimates of odds ratios and confidence intervals, generated from the Wald statistics. Furthermore, we present, briefly, the Firth correction to circumvent the phenomena of separation.
Xie, Heping; Wang, Fuxing; Hao, Yanbin; Chen, Jiaxue; An, Jing; Wang, Yuxin; Liu, Huashan
2017-01-01
Cueing facilitates retention and transfer of multimedia learning. From the perspective of cognitive load theory (CLT), cueing has a positive effect on learning outcomes because of the reduction in total cognitive load and avoidance of cognitive overload. However, this has not been systematically evaluated. Moreover, what remains ambiguous is the direct relationship between the cue-related cognitive load and learning outcomes. A meta-analysis and two subsequent meta-regression analyses were conducted to explore these issues. Subjective total cognitive load (SCL) and scores on a retention test and transfer test were selected as dependent variables. Through a systematic literature search, 32 eligible articles encompassing 3,597 participants were included in the SCL-related meta-analysis. Among them, 25 articles containing 2,910 participants were included in the retention-related meta-analysis and the following retention-related meta-regression, while there were 29 articles containing 3,204 participants included in the transfer-related meta-analysis and the transfer-related meta-regression. The meta-analysis revealed a statistically significant cueing effect on subjective ratings of cognitive load (d = -0.11, 95% CI = [-0.19, -0.02], p < 0.05), retention performance (d = 0.27, 95% CI = [0.08, 0.46], p < 0.01), and transfer performance (d = 0.34, 95% CI = [0.12, 0.56], p < 0.01). The subsequent meta-regression analyses showed that dSCL for cueing significantly predicted dretention for cueing (β = -0.70, 95% CI = [-1.02, -0.38], p < 0.001), as well as dtransfer for cueing (β = -0.60, 95% CI = [-0.92, -0.28], p < 0.001). Thus in line with CLT, adding cues in multimedia materials can indeed reduce SCL and promote learning outcomes, and the more SCL is reduced by cues, the better retention and transfer of multimedia learning.
International Nuclear Information System (INIS)
Valor, Alma; Alfonso, Lester; Caleyo, Francisco; Vidal, Julio; Perez-Baruch, Eloy; Hallen, José M.
2015-01-01
Highlights: • Observed external-corrosion defects in underground pipelines revealed a tendency to cluster. • The Poisson distribution is unable to fit extensive count data for these type of defects. • In contrast, the negative binomial distribution provides a suitable count model for them. • Two spatial stochastic processes lead to the negative binomial distribution for defect counts. • They are the Gamma-Poisson mixed process and the compound Poisson process. • A Rogeŕs process also arises as a plausible temporal stochastic process leading to corrosion defect clustering and to negative binomially distributed defect counts. - Abstract: The spatial distribution of external corrosion defects in buried pipelines is usually described as a Poisson process, which leads to corrosion defects being randomly distributed along the pipeline. However, in real operating conditions, the spatial distribution of defects considerably departs from Poisson statistics due to the aggregation of defects in groups or clusters. In this work, the statistical analysis of real corrosion data from underground pipelines operating in southern Mexico leads to conclude that the negative binomial distribution provides a better description for defect counts. The origin of this distribution from several processes is discussed. The analysed processes are: mixed Gamma-Poisson, compound Poisson and Roger’s processes. The physical reasons behind them are discussed for the specific case of soil corrosion.
Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana
2017-08-01
Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.
Microbial comparative pan-genomics using binomial mixture models
Directory of Open Access Journals (Sweden)
Ussery David W
2009-08-01
Full Text Available Abstract Background The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families in Buchnera aphidicola to large (around 43000 gene families in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.
Validity of the negative binomial distribution in particle production
International Nuclear Information System (INIS)
Cugnon, J.; Harouna, O.
1987-01-01
Some aspects of the clan picture for particle production in nuclear and in high-energy processes are examined. In particular, it is shown that the requirement of having logarithmic distribution for the number of particles within a clan in order to generate a negative binomial should not be taken strictly. Large departures are allowed without distorting too much the negative binomial. The question of the undetected particles is also studied. It is shown that, under reasonable circumstances, the latter do not affect the negative binomial character of the multiplicity distribution
Directory of Open Access Journals (Sweden)
Matthias Schmid
Full Text Available Regression analysis with a bounded outcome is a common problem in applied statistics. Typical examples include regression models for percentage outcomes and the analysis of ratings that are measured on a bounded scale. In this paper, we consider beta regression, which is a generalization of logit models to situations where the response is continuous on the interval (0,1. Consequently, beta regression is a convenient tool for analyzing percentage responses. The classical approach to fit a beta regression model is to use maximum likelihood estimation with subsequent AIC-based variable selection. As an alternative to this established - yet unstable - approach, we propose a new estimation technique called boosted beta regression. With boosted beta regression estimation and variable selection can be carried out simultaneously in a highly efficient way. Additionally, both the mean and the variance of a percentage response can be modeled using flexible nonlinear covariate effects. As a consequence, the new method accounts for common problems such as overdispersion and non-binomial variance structures.
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O
2017-03-05
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlights the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and chemometric approach to PAH (anthracene) and chiral-PAH analogue derivatives (1-(9-anthryl)-2,2,2-triflouroethanol (TFE)) analyses are reported. The binding constants (K b ), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated K b and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric of TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81×10 -7 M for anthracene and 3.48×10 -8 M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a polarized
Pythagoras, Binomial, and de Moivre Revisited Through Differential Equations
Singh, Jitender; Bajaj, Renu
2018-01-01
The classical Pythagoras theorem, binomial theorem, de Moivre's formula, and numerous other deductions are made using the uniqueness theorem for the initial value problems in linear ordinary differential equations.
Application of binomial-edited CPMG to shale characterization.
Washburn, Kathryn E; Birdwell, Justin E
2014-09-01
Unconventional shale resources may contain a significant amount of hydrogen in organic solids such as kerogen, but it is not possible to directly detect these solids with many NMR systems. Binomial-edited pulse sequences capitalize on magnetization transfer between solids, semi-solids, and liquids to provide an indirect method of detecting solid organic materials in shales. When the organic solids can be directly measured, binomial-editing helps distinguish between different phases. We applied a binomial-edited CPMG pulse sequence to a range of natural and experimentally-altered shale samples. The most substantial signal loss is seen in shales rich in organic solids while fluids associated with inorganic pores seem essentially unaffected. This suggests that binomial-editing is a potential method for determining fluid locations, solid organic content, and kerogen-bitumen discrimination. Copyright © 2014 Elsevier Inc. All rights reserved.
Multifractal structure of multiplicity distributions and negative binomials
International Nuclear Information System (INIS)
Malik, S.; Delhi, Univ.
1997-01-01
The paper presents experimental results of the multifractal structure analysis in proton-emulsion interactions at 800 GeV. The multiplicity moments have a power law dependence on the mean multiplicity in varying bin sizes of pseudorapidity. The values of generalised dimensions are calculated from the slope value. The multifractal characteristics are also examined in the light of negative binomials. The observed multiplicity moments and those derived from the negative-binomial fits agree well with each other. Also the values of D q , both observed and derived from the negative-binomial fits not only decrease with q typifying multifractality but also agree well each other showing consistency with the negative-binomial form
Peluso, Marco E M; Munnia, Armelle; Ceppi, Marcello
2014-11-05
Exposures to bisphenol-A, a weak estrogenic chemical, largely used for the production of plastic containers, can affect the rodent behaviour. Thus, we examined the relationships between bisphenol-A and the anxiety-like behaviour, spatial skills, and aggressiveness, in 12 toxicity studies of rodent offspring from females orally exposed to bisphenol-A, while pregnant and/or lactating, by median and linear splines analyses. Subsequently, the meta-regression analysis was applied to quantify the behavioural changes. U-shaped, inverted U-shaped and J-shaped dose-response curves were found to describe the relationships between bisphenol-A with the behavioural outcomes. The occurrence of anxiogenic-like effects and spatial skill changes displayed U-shaped and inverted U-shaped curves, respectively, providing examples of effects that are observed at low-doses. Conversely, a J-dose-response relationship was observed for aggressiveness. When the proportion of rodents expressing certain traits or the time that they employed to manifest an attitude was analysed, the meta-regression indicated that a borderline significant increment of anxiogenic-like effects was present at low-doses regardless of sexes (β)=-0.8%, 95% C.I. -1.7/0.1, P=0.076, at ≤120 μg bisphenol-A. Whereas, only bisphenol-A-males exhibited a significant inhibition of spatial skills (β)=0.7%, 95% C.I. 0.2/1.2, P=0.004, at ≤100 μg/day. A significant increment of aggressiveness was observed in both the sexes (β)=67.9,C.I. 3.4, 172.5, P=0.038, at >4.0 μg. Then, bisphenol-A treatments significantly abrogated spatial learning and ability in males (Pbisphenol-A, e.g. ≤120 μg/day, were associated to behavioural aberrations in offspring. Copyright © 2014. Published by Elsevier Ireland Ltd.
Entanglement of Generalized Two-Mode Binomial States and Teleportation
International Nuclear Information System (INIS)
Wang Dongmei; Yu Youhong
2009-01-01
The entanglement of the generalized two-mode binomial states in the phase damping channel is studied by making use of the relative entropy of the entanglement. It is shown that the factors of q and p play the crucial roles in control the relative entropy of the entanglement. Furthermore, we propose a scheme of teleporting an unknown state via the generalized two-mode binomial states, and calculate the mean fidelity of the scheme. (general)
Detecting non-binomial sex allocation when developmental mortality operates.
Wilkinson, Richard D; Kapranas, Apostolos; Hardy, Ian C W
2016-11-07
Optimal sex allocation theory is one of the most intricately developed areas of evolutionary ecology. Under a range of conditions, particularly under population sub-division, selection favours sex being allocated to offspring non-randomly, generating non-binomial variances of offspring group sex ratios. Detecting non-binomial sex allocation is complicated by stochastic developmental mortality, as offspring sex can often only be identified on maturity with the sex of non-maturing offspring remaining unknown. We show that current approaches for detecting non-binomiality have limited ability to detect non-binomial sex allocation when developmental mortality has occurred. We present a new procedure using an explicit model of sex allocation and mortality and develop a Bayesian model selection approach (available as an R package). We use the double and multiplicative binomial distributions to model over- and under-dispersed sex allocation and show how to calculate Bayes factors for comparing these alternative models to the null hypothesis of binomial sex allocation. The ability to detect non-binomial sex allocation is greatly increased, particularly in cases where mortality is common. The use of Bayesian methods allows for the quantification of the evidence in favour of each hypothesis, and our modelling approach provides an improved descriptive capability over existing approaches. We use a simulation study to demonstrate substantial improvements in power for detecting non-binomial sex allocation in situations where current methods fail, and we illustrate the approach in real scenarios using empirically obtained datasets on the sexual composition of groups of gregarious parasitoid wasps. Copyright © 2016 Elsevier Ltd. All rights reserved.
Negative Binomial Distribution and the multiplicity moments at the LHC
International Nuclear Information System (INIS)
Praszalowicz, Michal
2011-01-01
In this work we show that the latest LHC data on multiplicity moments C 2 -C 5 are well described by a two-step model in the form of a convolution of the Poisson distribution with energy-dependent source function. For the source function we take Γ Negative Binomial Distribution. No unexpected behavior of Negative Binomial Distribution parameter k is found. We give also predictions for the higher energies of 10 and 14 TeV.
On some binomial [Formula: see text]-difference sequence spaces.
Meng, Jian; Song, Meimei
2017-01-01
In this paper, we introduce the binomial sequence spaces [Formula: see text], [Formula: see text] and [Formula: see text] by combining the binomial transformation and difference operator. We prove the BK -property and some inclusion relations. Furthermore, we obtain Schauder bases and compute the α -, β - and γ -duals of these sequence spaces. Finally, we characterize matrix transformations on the sequence space [Formula: see text].
Confidence limits for parameters of Poisson and binomial distributions
International Nuclear Information System (INIS)
Arnett, L.M.
1976-04-01
The confidence limits for the frequency in a Poisson process and for the proportion of successes in a binomial process were calculated and tabulated for the situations in which the observed values of the frequency or proportion and an a priori distribution of these parameters are available. Methods are used that produce limits with exactly the stated confidence levels. The confidence interval [a,b] is calculated so that Pr [a less than or equal to lambda less than or equal to b c,μ], where c is the observed value of the parameter, and μ is the a priori hypothesis of the distribution of this parameter. A Bayesian type analysis is used. The intervals calculated are narrower and appreciably different from results, known to be conservative, that are often used in problems of this type. Pearson and Hartley recognized the characteristics of their methods and contemplated that exact methods could someday be used. The calculation of the exact intervals requires involved numerical analyses readily implemented only on digital computers not available to Pearson and Hartley. A Monte Carlo experiment was conducted to verify a selected interval from those calculated. This numerical experiment confirmed the results of the analytical methods and the prediction of Pearson and Hartley that their published tables give conservative results
Comparison of multinomial and binomial proportion methods for analysis of multinomial count data.
Galyean, M L; Wester, D B
2010-10-01
Simulation methods were used to generate 1,000 experiments, each with 3 treatments and 10 experimental units/treatment, in completely randomized (CRD) and randomized complete block designs. Data were counts in 3 ordered or 4 nominal categories from multinomial distributions. For the 3-category analyses, category probabilities were 0.6, 0.3, and 0.1, respectively, for 2 of the treatments, and 0.5, 0.35, and 0.15 for the third treatment. In the 4-category analysis (CRD only), probabilities were 0.3, 0.3, 0.2, and 0.2 for treatments 1 and 2 vs. 0.4, 0.4, 0.1, and 0.1 for treatment 3. The 3-category data were analyzed with generalized linear mixed models as an ordered multinomial distribution with a cumulative logit link or by regrouping the data (e.g., counts in 1 category/sum of counts in all categories), followed by analysis of single categories as binomial proportions. Similarly, the 4-category data were analyzed as a nominal multinomial distribution with a glogit link or by grouping data as binomial proportions. For the 3-category CRD analyses, empirically determined type I error rates based on pair-wise comparisons (F- and Wald chi(2) tests) did not differ between multinomial and individual binomial category analyses with 10 (P = 0.38 to 0.60) or 50 (P = 0.19 to 0.67) sampling units/experimental unit. When analyzed as binomial proportions, power estimates varied among categories, with analysis of the category with the greatest counts yielding power similar to the multinomial analysis. Agreement between methods (percentage of experiments with the same results for the overall test for treatment effects) varied considerably among categories analyzed and sampling unit scenarios for the 3-category CRD analyses. Power (F-test) was 24.3, 49.1, 66.9, 83.5, 86.8, and 99.7% for 10, 20, 30, 40, 50, and 100 sampling units/experimental unit for the 3-category multinomial CRD analyses. Results with randomized complete block design simulations were similar to those with the CRD
Johansen, Mette; Bahrt, Henriette; Altman, Roy D; Bartels, Else M; Juhl, Carsten B; Bliddal, Henning; Lund, Hans; Christensen, Robin
2016-08-01
The aim was to identify factors explaining inconsistent observations concerning the efficacy of intra-articular hyaluronic acid compared to intra-articular sham/control, or non-intervention control, in patients with symptomatic osteoarthritis, based on randomized clinical trials (RCTs). A systematic review and meta-regression analyses of available randomized trials were conducted. The outcome, pain, was assessed according to a pre-specified hierarchy of potentially available outcomes. Hedges׳s standardized mean difference [SMD (95% CI)] served as effect size. REstricted Maximum Likelihood (REML) mixed-effects models were used to combine study results, and heterogeneity was calculated and interpreted as Tau-squared and I-squared, respectively. Overall, 99 studies (14,804 patients) met the inclusion criteria: Of these, only 71 studies (72%), including 85 comparisons (11,216 patients), had adequate data available for inclusion in the primary meta-analysis. Overall, compared with placebo, intra-articular hyaluronic acid reduced pain with an effect size of -0.39 [-0.47 to -0.31; P hyaluronic acid. Based on available trial data, intra-articular hyaluronic acid showed a better effect than intra-articular saline on pain reduction in osteoarthritis. Publication bias and the risk of selective outcome reporting suggest only small clinical effect compared to saline. Copyright © 2016 Elsevier Inc. All rights reserved.
Fossati, Andrea; Widiger, Thomas A; Borroni, Serena; Maffei, Cesare; Somma, Antonella
2017-06-01
To extend the evidence on the reliability and construct validity of the Five-Factor Model Rating Form (FFMRF) in its self-report version, two independent samples of Italian participants, which were composed of 510 adolescent high school students and 457 community-dwelling adults, respectively, were administered the FFMRF in its Italian translation. Adolescent participants were also administered the Italian translation of the Borderline Personality Features Scale for Children-11 (BPFSC-11), whereas adult participants were administered the Italian translation of the Triarchic Psychopathy Measure (TriPM). Cronbach α values were consistent with previous findings; in both samples, average interitem r values indicated acceptable internal consistency for all FFMRF scales. A multidimensional graded item response theory model indicated that the majority of FFMRF items had adequate discrimination parameters; information indices supported the reliability of the FFMRF scales. Both categorical (i.e., item-level) and scale-level regression analyses suggested that the FFMRF scores may predict a nonnegligible amount of variance in the BPFSC-11 total score in adolescent participants, and in the TriPM scale scores in adult participants.
Samdal, Gro Beate; Eide, Geir Egil; Barth, Tom; Williams, Geoffrey; Meland, Eivind
2017-03-28
This systematic review aims to explain the heterogeneity in results of interventions to promote physical activity and healthy eating for overweight and obese adults, by exploring the differential effects of behaviour change techniques (BCTs) and other intervention characteristics. The inclusion criteria specified RCTs with ≥ 12 weeks' duration, from January 2007 to October 2014, for adults (mean age ≥ 40 years, mean BMI ≥ 30). Primary outcomes were measures of healthy diet or physical activity. Two reviewers rated study quality, coded the BCTs, and collected outcome results at short (≤6 months) and long term (≥12 months). Meta-analyses and meta-regressions were used to estimate effect sizes (ES), heterogeneity indices (I 2 ) and regression coefficients. We included 48 studies containing a total of 82 outcome reports. The 32 long term reports had an overall ES = 0.24 with 95% confidence interval (CI): 0.15 to 0.33 and I 2 = 59.4%. The 50 short term reports had an ES = 0.37 with 95% CI: 0.26 to 0.48, and I 2 = 71.3%. The number of BCTs unique to the intervention group, and the BCTs goal setting and self-monitoring of behaviour predicted the effect at short and long term. The total number of BCTs in both intervention arms and using the BCTs goal setting of outcome, feedback on outcome of behaviour, implementing graded tasks, and adding objects to the environment, e.g. using a step counter, significantly predicted the effect at long term. Setting a goal for change; and the presence of reporting bias independently explained 58.8% of inter-study variation at short term. Autonomy supportive and person-centred methods as in Motivational Interviewing, the BCTs goal setting of behaviour, and receiving feedback on the outcome of behaviour, explained all of the between study variations in effects at long term. There are similarities, but also differences in effective BCTs promoting change in healthy eating and physical activity and
Negative-binomial multiplicity distributions in the interaction of light ions with 12C at 4.2 GeV/c
International Nuclear Information System (INIS)
Tucholski, A.; Bogdanowicz, J.; Moroz, Z.; Wojtkowska, J.
1989-01-01
Multiplicity distribution of single-charged particles in the interaction of p, d, α and 12 C projectiles with C target nuclei at 4.2 GeV/c were analysed in terms of the negative binomial distribution. The experimental data obtained by the Dubna Propane Bubble Chamber Group were used. It is shown that the experimental distributions are satisfactorily described by the negative-binomial distribution. Values of the parameters of these distributions are discussed. (orig.)
Extra-binomial variation approach for analysis of pooled DNA sequencing data
Wallace, Chris
2012-01-01
Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: chris.wallace@cimr.cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
Hadronic multiplicity distributions: the negative binomial and its alternatives
International Nuclear Information System (INIS)
Carruthers, P.
1986-01-01
We review properties of the negative binomial distribution, along with its many possible statistical or dynamical origins. Considering the relation of the multiplicity distribution to the density matrix for Boson systems, we re-introduce the partially coherent laser distribution, which allows for coherent as well as incoherent hadronic emission from the k fundamental cells, and provides equally good phenomenological fits to existing data. The broadening of non-single diffractive hadron-hadron distributions can be equally well due to the decrease of coherent with increasing energy as to the large (and rapidly decreasing) values of k deduced from negative binomial fits. Similarly the narrowness of e + -e - multiplicity distribution is due to nearly coherent (therefore nearly Poissonian) emission from a small number of jets, in contrast to the negative binomial with enormous values of k. 31 refs
Sample size calculation for comparing two negative binomial rates.
Zhu, Haiyuan; Lakkis, Hassan
2014-02-10
Negative binomial model has been increasingly used to model the count data in recent clinical trials. It is frequently chosen over Poisson model in cases of overdispersed count data that are commonly seen in clinical trials. One of the challenges of applying negative binomial model in clinical trial design is the sample size estimation. In practice, simulation methods have been frequently used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on different approaches to estimate the variance under null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate dispersion parameter and exposure time. The performance of the formula with each variation is assessed using simulations. Copyright © 2013 John Wiley & Sons, Ltd.
Hadronic multiplicity distributions: the negative binomial and its alternatives
International Nuclear Information System (INIS)
Carruthers, P.
1986-01-01
We review properties of the negative binomial distribution, along with its many possible statistical or dynamical origins. Considering the relation of the multiplicity distribution to the density matrix for boson systems, we re-introduce the partially coherent laser distribution, which allows for coherent as well as incoherent hadronic emission from the k fundamental cells, and provides equally good phenomenological fits to existing data. The broadening of non-single diffractive hadron-hadron distributions can be equally well due to the decrease of coherence with increasing energy as to the large (and rapidly decreasing) values of k deduced from negative binomial fits. Similarly the narrowness of e + -e - multiplicity distribution is due to nearly coherent (therefore nearly Poissonian) emission from a small number of jets, in contrast to the negative binomial with enormous values of k. 31 refs
Speech-discrimination scores modeled as a binomial variable.
Thornton, A R; Raffin, M J
1978-09-01
Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
Spady, Richard; Stouli, Sami
2012-01-01
We propose dual regression as an alternative to the quantile regression process for the global estimation of conditional distribution functions under minimal assumptions. Dual regression provides all the interpretational power of the quantile regression process while avoiding the need for repairing the intersecting conditional quantile surfaces that quantile regression often produces in practice. Our approach introduces a mathematical programming characterization of conditional distribution f...
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-04-04
Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming
de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a
Self-similarity of the negative binomial multiplicity distributions
International Nuclear Information System (INIS)
Calucci, G.; Treleani, D.
1998-01-01
The negative binomial distribution is self-similar: If the spectrum over the whole rapidity range gives rise to a negative binomial, in the absence of correlation and if the source is unique, also a partial range in rapidity gives rise to the same distribution. The property is not seen in experimental data, which are rather consistent with the presence of a number of independent sources. When multiplicities are very large, self-similarity might be used to isolate individual sources in a complex production process. copyright 1997 The American Physical Society
Interpretations and implications of negative binomial distributions of multiparticle productions
International Nuclear Information System (INIS)
Arisawa, Tetsuo
2006-01-01
The number of particles produced in high energy experiments is approximated by a negative binomial distribution. Deriving a representation of the distribution from a stochastic equation, conditions for the process to satisfy the distribution are clarified. Based on them, it is proposed that multiparticle production consists of spontaneous and induced production. The rate of the induced production is proportional to the number of existing particles. The ratio of the two production rates remains constant during the process. The ''NBD space'' is also defined where the number of particles produced in its subspaces follows negative binomial distributions with different parameters
Quantum Theory for the Binomial Model in Finance Thoery
Chen, Zeqian
2001-01-01
In this paper, a quantum model for the binomial market in finance is proposed. We show that its risk-neutral world exhibits an intriguing structure as a disk in the unit ball of ${\\bf R}^3,$ whose radius is a function of the risk-free interest rate with two thresholds which prevent arbitrage opportunities from this quantum market. Furthermore, from the quantum mechanical point of view we re-deduce the Cox-Ross-Rubinstein binomial option pricing formula by considering Maxwell-Boltzmann statist...
López Martínez, Laura Elizabeth
2010-01-01
En este trabajo se realiza inferencia estadística en la distribución Binomial Negativa Generalizada (BNG) y los modelos que anida, los cuales son Binomial, Binomial Negativa y Poisson. Se aborda el problema de estimación de parámetros en la distribución BNG y se propone una prueba de razón de verosimilitud generalizada para discernir si un conjunto de datos se ajusta en particular al modelo Binomial, Binomial Negativa o Poisson. Además, se estudian las potencias y tamaños de la prueba p...
Differential susceptibility among reef-building coral species can lead to community shifts and loss of diversity as a result of temperature-induced mass bleaching events. However, the influence of the local environment on species-specific bleaching susceptibilities has not been ...
Currency lookback options and observation frequency: A binomial approach
T.H.F. Cheuk; A.C.F. Vorst (Ton)
1997-01-01
textabstractIn the last decade, interest in exotic options has been growing, especially in the over-the-counter currency market. In this paper we consider Iookback currency options, which are path-dependent. We show that a one-state variable binomial model for currency Iookback options can
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
Generation of the reciprocal-binomial state for optical fields
International Nuclear Information System (INIS)
Valverde, C.; Avelar, A.T.; Baseia, B.; Malbouisson, J.M.C.
2003-01-01
We compare the efficiencies of two interesting schemes to generate truncated states of the light field in running modes, namely the 'quantum scissors' and the 'beam-splitter array' schemes. The latter is applied to create the reciprocal-binomial state as a travelling wave, required to implement recent experimental proposals of phase-distribution determination and of quantum lithography
Improved binomial charts for monitoring high-quality processes
Albers, Willem/Wim
2009-01-01
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Statistical inference for a class of multivariate negative binomial distributions
DEFF Research Database (Denmark)
Rubak, Ege Holger; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...
Improved binomial charts for high-quality processes
Albers, Willem/Wim
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Estimating negative binomial parameters from occurrence data with detection times.
Hwang, Wen-Han; Huggins, Richard; Stoklosa, Jakub
2016-11-01
The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or through modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to the area of the region, that both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (, Am. Nat. 176, 96-98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, the effect can be simply adjusted for provided that the mean and variance of misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Calculation of generalized secant integral using binomial coefficients
International Nuclear Information System (INIS)
Guseinov, I.I.; Mamedov, B.A.
2004-01-01
A single series expansion relation is derived for the generalized secant (GS) integral in terms of binomial coefficients, exponential integrals and incomplete gamma functions. The convergence of the series is tested by the concrete cases of parameters. The formulas given in this study for the evaluation of GS integral show good rate of convergence and numerical stability
Selecting Tools to Model Integer and Binomial Multiplication
Pratt, Sarah Smitherman; Eddy, Colleen M.
2017-01-01
Mathematics teachers frequently provide concrete manipulatives to students during instruction; however, the rationale for using certain manipulatives in conjunction with concepts may not be explored. This article focuses on area models that are currently used in classrooms to provide concrete examples of integer and binomial multiplication. The…
A Neutrosophic Binomial Factorial Theorem with their Refrains
Directory of Open Access Journals (Sweden)
Huda E. Khalid
2016-12-01
Full Text Available The Neutrosophic Precalculus and the Neutrosophic Calculus can be developed in many ways, depending on the types of indeterminacy one has and on the method used to deal with such indeterminacy. This article is innovative since the form of neutrosophic binomial factorial theorem was constructed in addition to its refrains.
Metaprop: a Stata command to perform meta-analysis of binomial data.
Nyaga, Victoria N; Arbyn, Marc; Aerts, Marc
2014-01-01
Meta-analyses have become an essential tool in synthesizing evidence on clinical and epidemiological questions derived from a multitude of similar studies assessing the particular issue. Appropriate and accessible statistical software is needed to produce the summary statistic of interest. Metaprop is a statistical program implemented to perform meta-analyses of proportions in Stata. It builds further on the existing Stata procedure metan which is typically used to pool effects (risk ratios, odds ratios, differences of risks or means) but which is also used to pool proportions. Metaprop implements procedures which are specific to binomial data and allows computation of exact binomial and score test-based confidence intervals. It provides appropriate methods for dealing with proportions close to or at the margins where the normal approximation procedures often break down, by use of the binomial distribution to model the within-study variability or by allowing Freeman-Tukey double arcsine transformation to stabilize the variances. Metaprop was applied on two published meta-analyses: 1) prevalence of HPV-infection in women with a Pap smear showing ASC-US; 2) cure rate after treatment for cervical precancer using cold coagulation. The first meta-analysis showed a pooled HPV-prevalence of 43% (95% CI: 38%-48%). In the second meta-analysis, the pooled percentage of cured women was 94% (95% CI: 86%-97%). By using metaprop, no studies with 0% or 100% proportions were excluded from the meta-analysis. Furthermore, study specific and pooled confidence intervals always were within admissible values, contrary to the original publication, where metan was used.
Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A
2015-01-01
Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R(2) yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with >0.9 out-of-sample R(2) yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Stochastic analysis of complex reaction networks using binomial moment equations.
Barzel, Baruch; Biham, Ofer
2012-09-01
The stochastic analysis of complex reaction networks is a difficult problem because the number of microscopic states in such systems increases exponentially with the number of reactive species. Direct integration of the master equation is thus infeasible and is most often replaced by Monte Carlo simulations. While Monte Carlo simulations are a highly effective tool, equation-based formulations are more amenable to analytical treatment and may provide deeper insight into the dynamics of the network. Here, we present a highly efficient equation-based method for the analysis of stochastic reaction networks. The method is based on the recently introduced binomial moment equations [Barzel and Biham, Phys. Rev. Lett. 106, 150602 (2011)]. The binomial moments are linear combinations of the ordinary moments of the probability distribution function of the population sizes of the interacting species. They capture the essential combinatorics of the reaction processes reflecting their stoichiometric structure. This leads to a simple and transparent form of the equations, and allows a highly efficient and surprisingly simple truncation scheme. Unlike ordinary moment equations, in which the inclusion of high order moments is prohibitively complicated, the binomial moment equations can be easily constructed up to any desired order. The result is a set of equations that enables the stochastic analysis of complex reaction networks under a broad range of conditions. The number of equations is dramatically reduced from the exponential proliferation of the master equation to a polynomial (and often quadratic) dependence on the number of reactive species in the binomial moment equations. The aim of this paper is twofold: to present a complete derivation of the binomial moment equations; to demonstrate the applicability of the moment equations for a representative set of example networks, in which stochastic effects play an important role.
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
Linking the Negative Binomial and Logarithmic Series Distributions via their Associated Series
SADINLE, MAURICIO
2008-01-01
The negative binomial distribution is associated to the series obtained by taking derivatives of the logarithmic series. Conversely, the logarithmic series distribution is associated to the series found by integrating the series associated to the negative binomial distribution. The parameter of the number of failures of the negative binomial distribution is the number of derivatives needed to obtain the negative binomial series from the logarithmic series. The reasoning in this article could ...
On pricing futures options on random binomial tree
International Nuclear Information System (INIS)
Bayram, Kamola; Ganikhodjaev, Nasir
2013-01-01
The discrete-time approach to real option valuation has typically been implemented in the finance literature using a binomial tree framework. Instead we develop a new model by randomizing the environment and call such model a random binomial tree. Whereas the usual model has only one environment (u, d) where the price of underlying asset can move by u times up and d times down, and pair (u, d) is constant over the life of the underlying asset, in our new model the underlying security is moving in two environments namely (u 1 , d 1 ) and (u 2 , d 2 ). Thus we obtain two volatilities σ 1 and σ 2 . This new approach enables calculations reflecting the real market since it consider the two states of market normal and extra ordinal. In this paper we define and study Futures options for such models.
Fat suppression in MR imaging with binomial pulse sequences
International Nuclear Information System (INIS)
Baudovin, C.J.; Bryant, D.J.; Bydder, G.M.; Young, I.R.
1989-01-01
This paper reports on a study to develop pulse sequences allowing suppression of fat signal on MR images without eliminating signal from other tissues with short T1. They have developed such a technique involving selective excitation of protons in water, based on a binomial pulse sequence. Imaging is performed at 0.15 T. Careful shimming is performed to maximize separation of fat and water peaks. A spin-echo 1,500/80 sequence is used, employing 90 degrees pulse with transit frequency optimized for water with null excitation of 20 H offset, followed by a section-selective 180 degrees pulse. With use of the binomial sequence for imagining, reduction in fat signal is seen on images of the pelvis and legs of volunteers. Patient studies show dramatic improvement in visualization of prostatic carcinoma compared with standard sequences
Determination of finite-difference weights using scaled binomial windows
Chu, Chunlei; Stoffa, Paul L.
2012-01-01
The finite-difference method evaluates a derivative through a weighted summation of function values from neighboring grid nodes. Conventional finite-difference weights can be calculated either from Taylor series expansions or by Lagrange interpolation polynomials. The finite-difference method can be interpreted as a truncated convolutional counterpart of the pseudospectral method in the space domain. For this reason, we also can derive finite-difference operators by truncating the convolution series of the pseudospectral method. Various truncation windows can be employed for this purpose and they result in finite-difference operators with different dispersion properties. We found that there exists two families of scaled binomial windows that can be used to derive conventional finite-difference operators analytically. With a minor change, these scaled binomial windows can also be used to derive optimized finite-difference operators with enhanced dispersion properties. © 2012 Society of Exploration Geophysicists.
Zero inflated negative binomial-generalized exponential distributionand its applications
Directory of Open Access Journals (Sweden)
Sirinapa Aryuyuen
2014-08-01
Full Text Available In this paper, we propose a new zero inflated distribution, namely, the zero inflated negative binomial-generalized exponential (ZINB-GE distribution. The new distribution is used for count data with extra zeros and is an alternative for data analysis with over-dispersed count data. Some characteristics of the distribution are given, such as mean, variance, skewness, and kurtosis. Parameter estimation of the ZINB-GE distribution uses maximum likelihood estimation (MLE method. Simulated and observed data are employed to examine this distribution. The results show that the MLE method seems to have high efficiency for large sample sizes. Moreover, the mean square error of parameter estimation is increased when the zero proportion is higher. For the real data sets, this new zero inflated distribution provides a better fit than the zero inflated Poisson and zero inflated negative binomial distributions.
Determination of finite-difference weights using scaled binomial windows
Chu, Chunlei
2012-05-01
The finite-difference method evaluates a derivative through a weighted summation of function values from neighboring grid nodes. Conventional finite-difference weights can be calculated either from Taylor series expansions or by Lagrange interpolation polynomials. The finite-difference method can be interpreted as a truncated convolutional counterpart of the pseudospectral method in the space domain. For this reason, we also can derive finite-difference operators by truncating the convolution series of the pseudospectral method. Various truncation windows can be employed for this purpose and they result in finite-difference operators with different dispersion properties. We found that there exists two families of scaled binomial windows that can be used to derive conventional finite-difference operators analytically. With a minor change, these scaled binomial windows can also be used to derive optimized finite-difference operators with enhanced dispersion properties. © 2012 Society of Exploration Geophysicists.
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
S. Mori; K. Kitsukawa; M. Hisakado
2006-01-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution h...
e+-e- hadronic multiplicity distributions: negative binomial or Poisson
International Nuclear Information System (INIS)
Carruthers, P.; Shih, C.C.
1986-01-01
On the basis of fits to the multiplicity distributions for variable rapidity windows and the forward backward correlation for the 2 jet subset of e + e - data it is impossible to distinguish between a global negative binomial and its generalization, the partially coherent distribution. It is suggested that intensity interferometry, especially the Bose-Einstein correlation, gives information which will discriminate among dynamical models. 16 refs
Data analysis using the Binomial Failure Rate common cause model
International Nuclear Information System (INIS)
Atwood, C.L.
1983-09-01
This report explains how to use the Binomial Failure Rate (BFR) method to estimate common cause failure rates. The entire method is described, beginning with the conceptual model, and covering practical issues of data preparation, treatment of variation in the failure rates, Bayesian estimation of the quantities of interest, checking the model assumptions for lack of fit to the data, and the ultimate application of the answers
Hits per trial: Basic analysis of binomial data
International Nuclear Information System (INIS)
Atwood, C.L.
1994-09-01
This report presents simple statistical methods for analyzing binomial data, such as the number of failures in some number of demands. It gives point estimates, confidence intervals, and Bayesian intervals for the failure probability. It shows how to compare subsets of the data, both graphically and by statistical tests, and how to look for trends in time. It presents a compound model when the failure probability varies randomly. Examples and SAS programs are given
Statistical Inference for a Class of Multivariate Negative Binomial Distributions
DEFF Research Database (Denmark)
Rubak, Ege H.; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called -permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...... studied in the literature, while this is the first statistical paper on -permanental random fields. The focus is on maximum likelihood estimation, maximum quasi-likelihood estimation and on maximum composite likelihood estimation based on uni- and bivariate distributions. Furthermore, new results...
Hits per trial: Basic analysis of binomial data
Energy Technology Data Exchange (ETDEWEB)
Atwood, C.L.
1994-09-01
This report presents simple statistical methods for analyzing binomial data, such as the number of failures in some number of demands. It gives point estimates, confidence intervals, and Bayesian intervals for the failure probability. It shows how to compare subsets of the data, both graphically and by statistical tests, and how to look for trends in time. It presents a compound model when the failure probability varies randomly. Examples and SAS programs are given.
Generalization of Binomial Coefficients to Numbers on the Nodes of Graphs
Khmelnitskaya, A.; van der Laan, G.; Talman, Dolf
2016-01-01
The triangular array of binomial coefficients, or Pascal's triangle, is formed by starting with an apex of 1. Every row of Pascal's triangle can be seen as a line-graph, to each node of which the corresponding binomial coefficient is assigned. We show that the binomial coefficient of a node is equal
Generalization of binomial coefficients to numbers on the nodes of graphs
Khmelnitskaya, Anna Borisovna; van der Laan, Gerard; Talman, Dolf
The triangular array of binomial coefficients, or Pascal's triangle, is formed by starting with an apex of 1. Every row of Pascal's triangle can be seen as a line-graph, to each node of which the corresponding binomial coefficient is assigned. We show that the binomial coefficient of a node is equal
Abstract knowledge versus direct experience in processing of binomial expressions.
Morgan, Emily; Levy, Roger
2016-12-01
We ask whether word order preferences for binomial expressions of the form A and B (e.g. bread and butter) are driven by abstract linguistic knowledge of ordering constraints referencing the semantic, phonological, and lexical properties of the constituent words, or by prior direct experience with the specific items in questions. Using forced-choice and self-paced reading tasks, we demonstrate that online processing of never-before-seen binomials is influenced by abstract knowledge of ordering constraints, which we estimate with a probabilistic model. In contrast, online processing of highly frequent binomials is primarily driven by direct experience, which we estimate from corpus frequency counts. We propose a trade-off wherein processing of novel expressions relies upon abstract knowledge, while reliance upon direct experience increases with increased exposure to an expression. Our findings support theories of language processing in which both compositional generation and direct, holistic reuse of multi-word expressions play crucial roles. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Typification of the binomial Sedum boissierianum Hausskncht Crassulaceae
Directory of Open Access Journals (Sweden)
Valentin Bârcă
2016-06-01
Full Text Available Genus Sedum, described by Linne in 1753, a polyphyletic genus of flowering plants in the family Crassulaceae, was extensively studied both by scholars and succulents enthusiasts. Despite the great attention it has received over time, many binomials lack types and consistent taxonomic treatments. My currently undergoing process of the comprehensive revision of the genus Sedum in România and the Balkans called for the clarification of taxonomical status and typification of several basionyms described based on plants collected from the Balkans and the region around the Black Sea. Almost a century and a half ago, Haussknecht intensely studied flora of Near East and assigned several binomials to plants of the genus Sedum, some of which were neglected and forgotten although they represent interesting taxa. The binomial Sedum boissierianum Haussknecht first appeared in schedae as nomen nudum around 1892, for specimens originating from Near East, but was later validated by Froederstroem in 1932. Following extensive revision of the relevant literature and herbarium specimens, I located original material for this taxon in several herbaria in Europe (BUCF, LD, P, Z. I hereby designate the lectotype and isolectotypes for this basionym and furthermore provide pictures of the herbarium specimens and pinpoint on a map the original collections sites for the designated types.
Zhang, Hongyang; Welch, William J.; Zamar, Ruben H.
2017-01-01
Tomal et al. (2015) introduced the notion of "phalanxes" in the context of rare-class detection in two-class classification problems. A phalanx is a subset of features that work well for classification tasks. In this paper, we propose a different class of phalanxes for application in regression settings. We define a "Regression Phalanx" - a subset of features that work well together for prediction. We propose a novel algorithm which automatically chooses Regression Phalanxes from high-dimensi...
Wang, Hui; Sui, Weiguo; Xue, Wen; Wu, Junyong; Chen, Jiejing; Dai, Yong
2014-09-01
Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possibility of failure to replicate some single-locus results is that the underlying genetics of IgAN nephropathy is based on multiple genes with minor effects. To learn the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the 23 SNPs genotypes of 21 Han males were detected and analyzed with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. Analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.
Directory of Open Access Journals (Sweden)
Susanne Unverzagt
Full Text Available This study is an in-depth-analysis to explain statistical heterogeneity in a systematic review of implementation strategies to improve guideline adherence of primary care physicians in the treatment of patients with cardiovascular diseases. The systematic review included randomized controlled trials from a systematic search in MEDLINE, EMBASE, CENTRAL, conference proceedings and registers of ongoing studies. Implementation strategies were shown to be effective with substantial heterogeneity of treatment effects across all investigated strategies. Primary aim of this study was to explain different effects of eligible trials and to identify methodological and clinical effect modifiers. Random effects meta-regression models were used to simultaneously assess the influence of multimodal implementation strategies and effect modifiers on physician adherence. Effect modifiers included the staff responsible for implementation, level of prevention and definition pf the primary outcome, unit of randomization, duration of follow-up and risk of bias. Six clinical and methodological factors were investigated as potential effect modifiers of the efficacy of different implementation strategies on guideline adherence in primary care practices on the basis of information from 75 eligible trials. Five effect modifiers were able to explain a substantial amount of statistical heterogeneity. Physician adherence was improved by 62% (95% confidence interval (95% CI 29 to 104% or 29% (95% CI 5 to 60% in trials where other non-medical professionals or nurses were included in the implementation process. Improvement of physician adherence was more successful in primary and secondary prevention of cardiovascular diseases by around 30% (30%; 95% CI -2 to 71% and 31%; 95% CI 9 to 57%, respectively compared to tertiary prevention. This study aimed to identify effect modifiers of implementation strategies on physician adherence. Especially the cooperation of different health
Duda, David P.; Minnis, Patrick
2009-01-01
Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.
Crown, William H
2014-02-01
This paper examines the use of propensity score matching in economic analyses of observational data. Several excellent papers have previously reviewed practical aspects of propensity score estimation and other aspects of the propensity score literature. The purpose of this paper is to compare the conceptual foundation of propensity score models with alternative estimators of treatment effects. References are provided to empirical comparisons among methods that have appeared in the literature. These comparisons are available for a subset of the methods considered in this paper. However, in some cases, no pairwise comparisons of particular methods are yet available, and there are no examples of comparisons across all of the methods surveyed here. Irrespective of the availability of empirical comparisons, the goal of this paper is to provide some intuition about the relative merits of alternative estimators in health economic evaluations where nonlinearity, sample size, availability of pre/post data, heterogeneity, and missing variables can have important implications for choice of methodology. Also considered is the potential combination of propensity score matching with alternative methods such as differences-in-differences and decomposition methods that have not yet appeared in the empirical literature.
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
Mori, Shintaro; Kitsukawa, Kenji; Hisakado, Masato
2008-11-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution has singular correlation structures, reflecting the credit market implications. We point out two possible origins of the singular behavior.
Combinatorial interpretations of binomial coefficient analogues related to Lucas sequences
Sagan, Bruce; Savage, Carla
2009-01-01
Let s and t be variables. Define polynomials {n} in s, t by {0}=0, {1}=1, and {n}=s{n-1}+t{n-2} for n >= 2. If s, t are integers then the corresponding sequence of integers is called a Lucas sequence. Define an analogue of the binomial coefficients by C{n,k}={n}!/({k}!{n-k}!) where {n}!={1}{2}...{n}. It is easy to see that C{n,k} is a polynomial in s and t. The purpose of this note is to give two combinatorial interpretations for this polynomial in terms of statistics on integer partitions in...
Bayesian inference for disease prevalence using negative binomial group testing
Pritchard, Nicholas A.; Tebbs, Joshua M.
2011-01-01
Group testing, also known as pooled testing, and inverse sampling are both widely used methods of data collection when the goal is to estimate a small proportion. Taking a Bayesian approach, we consider the new problem of estimating disease prevalence from group testing when inverse (negative binomial) sampling is used. Using different distributions to incorporate prior knowledge of disease incidence and different loss functions, we derive closed form expressions for posterior distributions and resulting point and credible interval estimators. We then evaluate our new estimators, on Bayesian and classical grounds, and apply our methods to a West Nile Virus data set. PMID:21259308
Negative binomial models for abundance estimation of multiple closed populations
Boyce, Mark S.; MacKenzie, Darry I.; Manly, Bryan F.J.; Haroldson, Mark A.; Moody, David W.
2001-01-01
Counts of uniquely identified individuals in a population offer opportunities to estimate abundance. However, for various reasons such counts may be burdened by heterogeneity in the probability of being detected. Theoretical arguments and empirical evidence demonstrate that the negative binomial distribution (NBD) is a useful characterization for counts from biological populations with heterogeneity. We propose a method that focuses on estimating multiple populations by simultaneously using a suite of models derived from the NBD. We used this approach to estimate the number of female grizzly bears (Ursus arctos) with cubs-of-the-year in the Yellowstone ecosystem, for each year, 1986-1998. Akaike's Information Criteria (AIC) indicated that a negative binomial model with a constant level of heterogeneity across all years was best for characterizing the sighting frequencies of female grizzly bears. A lack-of-fit test indicated the model adequately described the collected data. Bootstrap techniques were used to estimate standard errors and 95% confidence intervals. We provide a Monte Carlo technique, which confirms that the Yellowstone ecosystem grizzly bear population increased during the period 1986-1998.
Low reheating temperatures in monomial and binomial inflationary models
International Nuclear Information System (INIS)
Rehagen, Thomas; Gelmini, Graciela B.
2015-01-01
We investigate the allowed range of reheating temperature values in light of the Planck 2015 results and the recent joint analysis of Cosmic Microwave Background (CMB) data from the BICEP2/Keck Array and Planck experiments, using monomial and binomial inflationary potentials. While the well studied ϕ 2 inflationary potential is no longer favored by current CMB data, as well as ϕ p with p>2, a ϕ 1 potential and canonical reheating (w re =0) provide a good fit to the CMB measurements. In this last case, we find that the Planck 2015 68% confidence limit upper bound on the spectral index, n s , implies an upper bound on the reheating temperature of T re ≲6×10 10 GeV, and excludes instantaneous reheating. The low reheating temperatures allowed by this model open the possibility that dark matter could be produced during the reheating period instead of when the Universe is radiation dominated, which could lead to very different predictions for the relic density and momentum distribution of WIMPs, sterile neutrinos, and axions. We also study binomial inflationary potentials and show the effects of a small departure from a ϕ 1 potential. We find that as a subdominant ϕ 2 term in the potential increases, first instantaneous reheating becomes allowed, and then the lowest possible reheating temperature of T re =4 MeV is excluded by the Planck 2015 68% confidence limit
Confidence Intervals for Asbestos Fiber Counts: Approximate Negative Binomial Distribution.
Bartley, David; Slaven, James; Harper, Martin
2017-03-01
The negative binomial distribution is adopted for analyzing asbestos fiber counts so as to account for both the sampling errors in capturing only a finite number of fibers and the inevitable human variation in identifying and counting sampled fibers. A simple approximation to this distribution is developed for the derivation of quantiles and approximate confidence limits. The success of the approximation depends critically on the use of Stirling's expansion to sufficient order, on exact normalization of the approximating distribution, on reasonable perturbation of quantities from the normal distribution, and on accurately approximating sums by inverse-trapezoidal integration. Accuracy of the approximation developed is checked through simulation and also by comparison to traditional approximate confidence intervals in the specific case that the negative binomial distribution approaches the Poisson distribution. The resulting statistics are shown to relate directly to early research into the accuracy of asbestos sampling and analysis. Uncertainty in estimating mean asbestos fiber concentrations given only a single count is derived. Decision limits (limits of detection) and detection limits are considered for controlling false-positive and false-negative detection assertions and are compared to traditional limits computed assuming normal distributions. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M
2018-01-01
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period of time. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Matson, Johnny L.; Kozlowski, Alison M.
2010-01-01
Autistic regression is one of the many mysteries in the developmental course of autism and pervasive developmental disorders not otherwise specified (PDD-NOS). Various definitions of this phenomenon have been used, further clouding the study of the topic. Despite this problem, some efforts at establishing prevalence have been made. The purpose of…
Freund, Rudolf J; Sa, Ping
2006-01-01
The book provides complete coverage of the classical methods of statistical analysis. It is designed to give students an understanding of the purpose of statistical analyses, to allow the student to determine, at least to some degree, the correct type of statistical analyses to be performed in a given situation, and have some appreciation of what constitutes good experimental design
Assessing the Option to Abandon an Investment Project by the Binomial Options Pricing Model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
2016-01-01
Full Text Available Usually, traditional methods for investment project appraisal such as the net present value (hereinafter NPV do not incorporate in their values the operational flexibility offered by including a real option included in the project. In this paper, real options, and more specifically the option to abandon, are analysed as a complement to cash flow sequence which quantifies the project. In this way, by considering the existing analogy with financial options, a mathematical expression is derived by using the binomial options pricing model. This methodology provides the value of the option to abandon the project within one, two, and in general n periods. Therefore, this paper aims to be a useful tool in determining the value of the option to abandon according to its residual value, thus making easier the control of the uncertainty element within the project.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Yelland, Lisa N; Salter, Amy B; Ryan, Philip
2011-10-15
Modified Poisson regression, which combines a log Poisson regression model with robust variance estimation, is a useful alternative to log binomial regression for estimating relative risks. Previous studies have shown both analytically and by simulation that modified Poisson regression is appropriate for independent prospective data. This method is often applied to clustered prospective data, despite a lack of evidence to support its use in this setting. The purpose of this article is to evaluate the performance of the modified Poisson regression approach for estimating relative risks from clustered prospective data, by using generalized estimating equations to account for clustering. A simulation study is conducted to compare log binomial regression and modified Poisson regression for analyzing clustered data from intervention and observational studies. Both methods generally perform well in terms of bias, type I error, and coverage. Unlike log binomial regression, modified Poisson regression is not prone to convergence problems. The methods are contrasted by using example data sets from 2 large studies. The results presented in this article support the use of modified Poisson regression as an alternative to log binomial regression for analyzing clustered prospective data when clustering is taken into account by using generalized estimating equations.
Covering Resilience: A Recent Development for Binomial Checkpointing
Energy Technology Data Exchange (ETDEWEB)
Walther, Andrea; Narayanan, Sri Hari Krishna
2016-09-12
In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, required, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional with the operation count of the underlying function, e.g., if algorithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulations and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We describe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding implementation and discuss first numerical results.
A Bayesian equivalency test for two independent binomial proportions.
Kawasaki, Yohei; Shimokawa, Asanao; Yamada, Hiroshi; Miyaoka, Etsuo
2016-01-01
In clinical trials, it is often necessary to perform an equivalence study. The equivalence study requires actively denoting equivalence between two different drugs or treatments. Since it is not possible to assert equivalence that is not rejected by a superiority test, statistical methods known as equivalency tests have been suggested. These methods for equivalency tests are based on the frequency framework; however, there are few such methods in the Bayesian framework. Hence, this article proposes a new index that suggests the equivalency of binomial proportions, which is constructed based on the Bayesian framework. In this study, we provide two methods for calculating the index and compare the probabilities that have been calculated by these two calculation methods. Moreover, we apply this index to the results of actual clinical trials to demonstrate the utility of the index.
Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally
2018-02-01
1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. Area under plasma concentration versus time curve (AUC inf ) of dalbavancin is a key parameter and AUC inf /MIC ratio is a critical pharmacodynamic marker. 2. Using end of intravenous infusion concentration (i.e. C max ) C max versus AUC inf relationship for dalbavancin was established by regression analyses (i.e. linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. The predictions of the AUC inf were performed using published C max data by application of regression equations. The quotient of observed/predicted values rendered fold difference. The mean absolute error (MAE)/root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. The C max versus AUC inf exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference) with a RMSE models predicted AUC inf with a RMSE of 3.02-27.46% with fold difference largely contained within 0.64-1.48. 5. Regardless of the regression models, a single time point strategy of using C max (i.e. end of 30-min infusion) is amenable as a prospective tool for predicting AUC inf of dalbavancin in patients.
Negative binomial mixed models for analyzing microbiome count data.
Zhang, Xinyan; Mallick, Himel; Tang, Zaixiang; Zhang, Lei; Cui, Xiangqin; Benson, Andrew K; Yi, Nengjun
2017-01-03
Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to the well-known properties of microbiome count measurements, for example, varied total sequence reads across samples, over-dispersion and zero-inflation, microbiome studies usually collect samples with hierarchical structures, which introduce correlation among the samples and thus further complicate the analysis and interpretation of microbiome count data. In this article, we propose negative binomial mixed models (NBMMs) for detecting the association between the microbiome and host environmental/clinical factors for correlated microbiome count data. Although having not dealt with zero-inflation, the proposed mixed-effects models account for correlation among the samples by incorporating random effects into the commonly used fixed-effects negative binomial model, and can efficiently handle over-dispersion and varying total reads. We have developed a flexible and efficient IWLS (Iterative Weighted Least Squares) algorithm to fit the proposed NBMMs by taking advantage of the standard procedure for fitting the linear mixed models. We evaluate and demonstrate the proposed method via extensive simulation studies and the application to mouse gut microbiome data. The results show that the proposed method has desirable properties and outperform the previously used methods in terms of both empirical power and Type I error. The method has been incorporated into the freely available R package BhGLM ( http://www.ssg.uab.edu/bhglm/ and http://github.com/abbyyan3/BhGLM ), providing a useful tool for analyzing microbiome data.
A new approach to analyse longitudinal epidemiological data with an excess of zeros.
Spriensma, Alette S; Hajos, Tibor R S; de Boer, Michiel R; Heymans, Martijn W; Twisk, Jos W R
2013-02-20
Within longitudinal epidemiological research, 'count' outcome variables with an excess of zeros frequently occur. Although these outcomes are frequently analysed with a linear mixed model, or a Poisson mixed model, a two-part mixed model would be better in analysing outcome variables with an excess of zeros. Therefore, objective of this paper was to introduce the relatively 'new' method of two-part joint regression modelling in longitudinal data analysis for outcome variables with an excess of zeros, and to compare the performance of this method to current approaches. Within an observational longitudinal dataset, we compared three techniques; two 'standard' approaches (a linear mixed model, and a Poisson mixed model), and a two-part joint mixed model (a binomial/Poisson mixed distribution model), including random intercepts and random slopes. Model fit indicators, and differences between predicted and observed values were used for comparisons. The analyses were performed with STATA using the GLLAMM procedure. Regarding the random intercept models, the two-part joint mixed model (binomial/Poisson) performed best. Adding random slopes for time to the models changed the sign of the regression coefficient for both the Poisson mixed model and the two-part joint mixed model (binomial/Poisson) and resulted into a much better fit. This paper showed that a two-part joint mixed model is a more appropriate method to analyse longitudinal data with an excess of zeros compared to a linear mixed model and a Poisson mixed model. However, in a model with random slopes for time a Poisson mixed model also performed remarkably well.
Sauzet, Odile; Peacock, Janet L
2017-07-20
The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins) and maybe of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes but very little is known about their reliability when only a limited number of small clusters are present. Using simulated data based on a dataset of preterm infants we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models and generalised estimating equations were compared. The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and will provide similar estimates to logistic regression. The method which seems to provide the best balance between estimation of the standard error and the parameter for any percentage of twins is the generalised estimating equations. This study has shown that the number of covariates or the level two variance do not necessarily affect the performance of the various methods used to analyse datasets containing twins but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
Directory of Open Access Journals (Sweden)
Odile Sauzet
2017-07-01
Full Text Available Abstract Background The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins and maybe of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes but very little is known about their reliability when only a limited number of small clusters are present. Methods Using simulated data based on a dataset of preterm infants we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models and generalised estimating equations were compared. Results The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and will provide similar estimates to logistic regression. The method which seems to provide the best balance between estimation of the standard error and the parameter for any percentage of twins is the generalised estimating equations. Conclusions This study has shown that the number of covariates or the level two variance do not necessarily affect the performance of the various methods used to analyse datasets containing twins but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Generazio, Edward R.
2014-01-01
Unknown risks are introduced into failure critical systems when probability of detection (POD) capabilities are accepted without a complete understanding of the statistical method applied and the interpretation of the statistical results. The presence of this risk in the nondestructive evaluation (NDE) community is revealed in common statements about POD. These statements are often interpreted in a variety of ways and therefore, the very existence of the statements identifies the need for a more comprehensive understanding of POD methodologies. Statistical methodologies have data requirements to be met, procedures to be followed, and requirements for validation or demonstration of adequacy of the POD estimates. Risks are further enhanced due to the wide range of statistical methodologies used for determining the POD capability. Receiver/Relative Operating Characteristics (ROC) Display, simple binomial, logistic regression, and Bayes' rule POD methodologies are widely used in determining POD capability. This work focuses on Hit-Miss data to reveal the framework of the interrelationships between Receiver/Relative Operating Characteristics Display, simple binomial, logistic regression, and Bayes' Rule methodologies as they are applied to POD. Knowledge of these interrelationships leads to an intuitive and global understanding of the statistical data, procedural and validation requirements for establishing credible POD estimates.
Lea, Amanda J.
2015-01-01
Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596
Irwin, Brian J.; Wagner, Tyler; Bence, James R.; Kepler, Megan V.; Liu, Weihai; Hayes, Daniel B.
2013-01-01
Partitioning total variability into its component temporal and spatial sources is a powerful way to better understand time series and elucidate trends. The data available for such analyses of fish and other populations are usually nonnegative integer counts of the number of organisms, often dominated by many low values with few observations of relatively high abundance. These characteristics are not well approximated by the Gaussian distribution. We present a detailed description of a negative binomial mixed-model framework that can be used to model count data and quantify temporal and spatial variability. We applied these models to data from four fishery-independent surveys of Walleyes Sander vitreus across the Great Lakes basin. Specifically, we fitted models to gill-net catches from Wisconsin waters of Lake Superior; Oneida Lake, New York; Saginaw Bay in Lake Huron, Michigan; and Ohio waters of Lake Erie. These long-term monitoring surveys varied in overall sampling intensity, the total catch of Walleyes, and the proportion of zero catches. Parameter estimation included the negative binomial scaling parameter, and we quantified the random effects as the variations among gill-net sampling sites, the variations among sampled years, and site × year interactions. This framework (i.e., the application of a mixed model appropriate for count data in a variance-partitioning context) represents a flexible approach that has implications for monitoring programs (e.g., trend detection) and for examining the potential of individual variance components to serve as response metrics to large-scale anthropogenic perturbations or ecological changes.
Adjusted Wald Confidence Interval for a Difference of Binomial Proportions Based on Paired Data
Bonett, Douglas G.; Price, Robert M.
2012-01-01
Adjusted Wald intervals for binomial proportions in one-sample and two-sample designs have been shown to perform about as well as the best available methods. The adjusted Wald intervals are easy to compute and have been incorporated into introductory statistics courses. An adjusted Wald interval for paired binomial proportions is proposed here and…
Some normed binomial difference sequence spaces related to the [Formula: see text] spaces.
Song, Meimei; Meng, Jian
2017-01-01
The aim of this paper is to introduce the normed binomial sequence spaces [Formula: see text] by combining the binomial transformation and difference operator, where [Formula: see text]. We prove that these spaces are linearly isomorphic to the spaces [Formula: see text] and [Formula: see text], respectively. Furthermore, we compute Schauder bases and the α -, β - and γ -duals of these sequence spaces.
Revealing Word Order: Using Serial Position in Binomials to Predict Properties of the Speaker
Iliev, Rumen; Smirnova, Anastasia
2016-01-01
Three studies test the link between word order in binomials and psychological and demographic characteristics of a speaker. While linguists have already suggested that psychological, cultural and societal factors are important in choosing word order in binomials, the vast majority of relevant research was focused on general factors and on broadly…
An efficient binomial model-based measure for sequence comparison and its application.
Liu, Xiaoqing; Dai, Qi; Li, Lihua; He, Zerong
2011-04-01
Sequence comparison is one of the major tasks in bioinformatics, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations. There are several similarity/dissimilarity measures for sequence comparison, but challenges remains. This paper presented a binomial model-based measure to analyze biological sequences. With help of a random indicator, the occurrence of a word at any position of sequence can be regarded as a random Bernoulli variable, and the distribution of a sum of the word occurrence is well known to be a binomial one. By using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested by extensive experiments including classification of HEV genotypes and phylogenetic analysis, and further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed measure based on binomial model is more efficient.
Directory of Open Access Journals (Sweden)
Xavier A. Harrison
2015-07-01
Full Text Available Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels, I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial
Harrison, Xavier A
2015-01-01
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed
DEFF Research Database (Denmark)
Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo
2017-01-01
In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...
On the revival of the negative binomial distribution in multiparticle production
International Nuclear Information System (INIS)
Ekspong, G.
1990-01-01
This paper is based on published and some unpublished material pertaining to the revival of interest in and success of applying the negative binomial distribution to multiparticle production since 1983. After a historically oriented introduction going farther back in time, the main part of the paper is devoted to an unpublished derivation of the negative binomial distribution based on empirical observations of forward-backward multiplicity correlations. Some physical processes leading to the negative binomial distribution are mentioned and some comments made on published criticisms
Measured PET Data Characterization with the Negative Binomial Distribution Model.
Santarelli, Maria Filomena; Positano, Vincenzo; Landini, Luigi
2017-01-01
Accurate statistical model of PET measurements is a prerequisite for a correct image reconstruction when using statistical image reconstruction algorithms, or when pre-filtering operations must be performed. Although radioactive decay follows a Poisson distribution, deviation from Poisson statistics occurs on projection data prior to reconstruction due to physical effects, measurement errors, correction of scatter and random coincidences. Modelling projection data can aid in understanding the statistical nature of the data in order to develop efficient processing methods and to reduce noise. This paper outlines the statistical behaviour of measured emission data evaluating the goodness of fit of the negative binomial (NB) distribution model to PET data for a wide range of emission activity values. An NB distribution model is characterized by the mean of the data and the dispersion parameter α that describes the deviation from Poisson statistics. Monte Carlo simulations were performed to evaluate: (a) the performances of the dispersion parameter α estimator, (b) the goodness of fit of the NB model for a wide range of activity values. We focused on the effect produced by correction for random and scatter events in the projection (sinogram) domain, due to their importance in quantitative analysis of PET data. The analysis developed herein allowed us to assess the accuracy of the NB distribution model to fit corrected sinogram data, and to evaluate the sensitivity of the dispersion parameter α to quantify deviation from Poisson statistics. By the sinogram ROI-based analysis, it was demonstrated that deviation on the measured data from Poisson statistics can be quantitatively characterized by the dispersion parameter α, in any noise conditions and corrections.
Distribution-free Inference of Zero-inated Binomial Data for Longitudinal Studies.
He, H; Wang, W J; Hu, J; Gallop, R; Crits-Christoph, P; Xia, Y L
2015-10-01
Count reponses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated poisson (ZIP) and zero-inflated negative binomial (ZINB) models are widely used for modeling such outcomes. However, as alcohol drinking outcomes such as days of drinkings are counts within a given period, their distributions are bounded above by an upper limit (total days in the period) and thus inherently follow a binomial or zero-inflated binomial (ZIB) distribution, rather than a Poisson or zero-inflated Poisson (ZIP) distribution, in the presence of structural zeros. In this paper, we develop a new semiparametric approach for modeling zero-inflated binomial (ZIB)-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.
Difference of Sums Containing Products of Binomial Coefficients and Their Logarithms
National Research Council Canada - National Science Library
Miller, Allen R; Moskowitz, Ira S
2005-01-01
Properties of the difference of two sums containing products of binomial coefficients and their logarithms which arise in the application of Shannon's information theory to a certain class of covert channels are deduced...
Difference of Sums Containing Products of Binomial Coefficients and their Logarithms
National Research Council Canada - National Science Library
Miller, Allen R; Moskowitz, Ira S
2004-01-01
Properties of the difference of two sums containing products of binomial coefficients and their logarithms which arise in the application of Shannon's information theory to a certain class of covert channels are deduced...
Hung, Tran Loc; Giang, Le Truong
2016-01-01
Using the Stein-Chen method some upper bounds in Poisson approximation for distributions of row-wise triangular arrays of independent negative-binomial distributed random variables are established in this note.
Perbandingan Metode Binomial dan Metode Black-Scholes Dalam Penentuan Harga Opsi
Directory of Open Access Journals (Sweden)
Surya Amami Pramuditya
2016-04-01
Full Text Available ABSTRAKOpsi adalah kontrak antara pemegang dan penulis (buyer (holder dan seller (writer di mana penulis (writer memberikan hak (bukan kewajiban kepada holder untuk membeli atau menjual aset dari writer pada harga tertentu (strike atau latihan harga dan pada waktu tertentu dalam waktu (tanggal kadaluwarsa atau jatuh tempo waktu. Ada beberapa cara untuk menentukan harga opsi, diantaranya adalah Metode Black-Scholes dan Metode Binomial. Metode binomial berasal dari model pergerakan harga saham yang membagi waktu interval [0, T] menjadi n sama panjang. Sedangkan metode Black-Scholes, dimodelkan dengan pergerakan harga saham sebagai suatu proses stokastik. Semakin besar partisi waktu n pada Metode Binomial, maka nilai opsinya akan konvergen ke nilai opsi Metode Black-Scholes.Kata kunci: opsi, Binomial, Black-Scholes.ABSTRACT Option is a contract between the holder and the writer in which the writer gives the right (not the obligation to the holder to buy or sell an asset of a writer at a specified price (the strike or exercise price and at a specified time in the future (expiry date or maturity time. There are several ways to determine the price of options, including the Black-Scholes Method and Binomial Method. Binomial method come from a model of stock price movement that divide time interval [0, T] into n equally long. While the Black Scholes method, the stock price movement is modeled as a stochastic process. More larger the partition of time n in Binomial Method, the value option will converge to the value option in Black-Scholes Method.Key words: Options, Binomial, Black-Scholes
Directory of Open Access Journals (Sweden)
Quentin Noirhomme
2014-01-01
Full Text Available Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain–computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.
Noirhomme, Quentin; Lesenfants, Damien; Gomez, Francisco; Soddu, Andrea; Schrouff, Jessica; Garraux, Gaëtan; Luxen, André; Phillips, Christophe; Laureys, Steven
2014-01-01
Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain-computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.
Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming
2014-01-01
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data is traditionally transformed so that linear mixed model (LMM) based ICC can be estimated. A common transformation used is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate which includes fixed effects using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM even when negative binomial data was logarithm, and square root transformed. A second comparison that targeted a wider range of ICC values showed that the mean of estimated ICC closely approximated the true ICC.
Penalized estimation for competing risks regression with applications to high-dimensional covariates
DEFF Research Database (Denmark)
Ambrogi, Federico; Scheike, Thomas H.
2016-01-01
of competing events. The direct binomial regression model of Scheike and others (2008. Predicting cumulative incidence probability by direct binomial regression. Biometrika 95: (1), 205-220) is reformulated in a penalized framework to possibly fit a sparse regression model. The developed approach is easily...... Research 19: (1), 29-51), the research regarding competing risks is less developed (Binder and others, 2009. Boosting for high-dimensional time-to-event data with competing risks. Bioinformatics 25: (7), 890-896). The aim of this work is to consider how to do penalized regression in the presence...... implementable using existing high-performance software to do penalized regression. Results from simulation studies are presented together with an application to genomic data when the endpoint is progression-free survival. An R function is provided to perform regularized competing risks regression according...
Mulroy, Sara J; Winstein, Carolee J; Kulig, Kornelia; Beneck, George J; Fowler, Eileen G; DeMuth, Sharon K; Sullivan, Katherine J; Brown, David A; Lane, Christianne J
2011-12-01
Each of the 4 randomized clinical trials (RCTs) hosted by the Physical Therapy Clinical Research Network (PTClinResNet) targeted a different disability group (low back disorder in the Muscle-Specific Strength Training Effectiveness After Lumbar Microdiskectomy [MUSSEL] trial, chronic spinal cord injury in the Strengthening and Optimal Movements for Painful Shoulders in Chronic Spinal Cord Injury [STOMPS] trial, adult stroke in the Strength Training Effectiveness Post-Stroke [STEPS] trial, and pediatric cerebral palsy in the Pediatric Endurance and Limb Strengthening [PEDALS] trial for children with spastic diplegic cerebral palsy) and tested the effectiveness of a muscle-specific or functional activity-based intervention on primary outcomes that captured pain (STOMPS, MUSSEL) or locomotor function (STEPS, PEDALS). The focus of these secondary analyses was to determine causal relationships among outcomes across levels of the International Classification of Functioning, Disability and Health (ICF) framework for the 4 RCTs. With the database from PTClinResNet, we used 2 separate secondary statistical approaches-mediation analysis for the MUSSEL and STOMPS trials and regression analysis for the STEPS and PEDALS trials-to test relationships among muscle performance, primary outcomes (pain related and locomotor related), activity and participation measures, and overall quality of life. Predictive models were stronger for the 2 studies with pain-related primary outcomes. Change in muscle performance mediated or predicted reductions in pain for the MUSSEL and STOMPS trials and, to some extent, walking speed for the STEPS trial. Changes in primary outcome variables were significantly related to changes in activity and participation variables for all 4 trials. Improvement in activity and participation outcomes mediated or predicted increases in overall quality of life for the 3 trials with adult populations. Variables included in the statistical models were limited to those
Turi, Christina E; Murch, Susan J
2013-07-09
Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. Spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analysis were performed to identify over and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families but large families (>393 species) were underemphasized by Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analysis identified similar families as over-utilized in the medium (92-392 species) and small (<92 species) classes. The families Apiaceae, Asteraceae, Ericacea, Pinaceae and Salicaceae were identified as significantly over-utilized as ceremonial medicines in medium and large sized families. Analysis of genera within the Apiaceae and Asteraceae suggest that the genus Ligusticum and Artemisia are good candidates for facilitating the discovery of novel psychoactive compounds. The 4 statistical approaches were not consistent in the selection of over-utilization of flora. Residual analysis revealed overall trends that were supported
Directory of Open Access Journals (Sweden)
Tudor DRUGAN
2003-08-01
Full Text Available The aim of the paper was to present the usefulness of the binomial distribution in studying of the contingency tables and the problems of approximation to normality of binomial distribution (the limits, advantages, and disadvantages. The classification of the medical keys parameters reported in medical literature and expressing them using the contingency table units based on their mathematical expressions restrict the discussion of the confidence intervals from 34 parameters to 9 mathematical expressions. The problem of obtaining different information starting with the computed confidence interval for a specified method, information like confidence intervals boundaries, percentages of the experimental errors, the standard deviation of the experimental errors and the deviation relative to significance level was solves through implementation in PHP programming language of original algorithms. The cases of expression, which contain two binomial variables, were separately treated. An original method of computing the confidence interval for the case of two-variable expression was proposed and implemented. The graphical representation of the expression of two binomial variables for which the variation domain of one of the variable depend on the other variable was a real problem because the most of the software used interpolation in graphical representation and the surface maps were quadratic instead of triangular. Based on an original algorithm, a module was implements in PHP in order to represent graphically the triangular surface plots. All the implementation described above was uses in computing the confidence intervals and estimating their performance for binomial distributions sample sizes and variable.
A fast algorithm for computing binomial coefficients modulo powers of two.
Andreica, Mugurel Ionut
2013-01-01
I present a new algorithm for computing binomial coefficients modulo 2N. The proposed method has an O(N3·Multiplication(N)+N4) preprocessing time, after which a binomial coefficient C(P, Q) with 0≤Q≤P≤2N-1 can be computed modulo 2N in O(N2·log(N)·Multiplication(N)) time. Multiplication(N) denotes the time complexity of multiplying two N-bit numbers, which can range from O(N2) to O(N·log(N)·log(log(N))) or better. Thus, the overall time complexity for evaluating M binomial coefficients C(P, Q) modulo 2N with 0≤Q≤P≤2N-1 is O((N3+M·N2·log(N))·Multiplication(N)+N4). After preprocessing, we can actually compute binomial coefficients modulo any 2R with R≤N. For larger values of P and Q, variations of Lucas' theorem must be used first in order to reduce the computation to the evaluation of multiple (O(log(P))) binomial coefficients C(P', Q') (or restricted types of factorials P'!) modulo 2N with 0≤Q'≤P'≤2N-1.
Water-selective excitation of short T2 species with binomial pulses.
Deligianni, Xeni; Bär, Peter; Scheffler, Klaus; Trattnig, Siegfried; Bieri, Oliver
2014-09-01
For imaging of fibrous musculoskeletal components, ultra-short echo time methods are often combined with fat suppression. Due to the increased chemical shift, spectral excitation of water might become a favorable option at ultra-high fields. Thus, this study aims to compare and explore short binomial excitation schemes for spectrally selective imaging of fibrous tissue components with short transverse relaxation time (T2 ). Water selective 1-1-binomial excitation is compared with nonselective imaging using a sub-millisecond spoiled gradient echo technique for in vivo imaging of fibrous tissue at 3T and 7T. Simulations indicate a maximum signal loss from binomial excitation of approximately 30% in the limit of very short T2 (0.1 ms), as compared to nonselective imaging; decreasing rapidly with increasing field strength and increasing T2 , e.g., to 19% at 3T and 10% at 7T for T2 of 1 ms. In agreement with simulations, a binomial phase close to 90° yielded minimum signal loss: approximately 6% at 3T and close to 0% at 7T for menisci, and for ligaments 9% and 13%, respectively. Overall, for imaging of short-lived T2 components, short 1-1 binomial excitation schemes prove to offer marginal signal loss especially at ultra-high fields with overall improved scanning efficiency. Copyright © 2013 Wiley Periodicals, Inc.
Differentiating regressed melanoma from regressed lichenoid keratosis.
Chan, Aegean H; Shulman, Kenneth J; Lee, Bonnie A
2017-04-01
Distinguishing regressed lichen planus-like keratosis (LPLK) from regressed melanoma can be difficult on histopathologic examination, potentially resulting in mismanagement of patients. We aimed to identify histopathologic features by which regressed melanoma can be differentiated from regressed LPLK. Twenty actively inflamed LPLK, 12 LPLK with regression and 15 melanomas with regression were compared and evaluated by hematoxylin and eosin staining as well as Melan-A, microphthalmia transcription factor (MiTF) and cytokeratin (AE1/AE3) immunostaining. (1) A total of 40% of regressed melanomas showed complete or near complete loss of melanocytes within the epidermis with Melan-A and MiTF immunostaining, while 8% of regressed LPLK exhibited this finding. (2) Necrotic keratinocytes were seen in the epidermis in 33% regressed melanomas as opposed to all of the regressed LPLK. (3) A dense infiltrate of melanophages in the papillary dermis was seen in 40% of regressed melanomas, a feature not seen in regressed LPLK. In summary, our findings suggest that a complete or near complete loss of melanocytes within the epidermis strongly favors a regressed melanoma over a regressed LPLK. In addition, necrotic epidermal keratinocytes and the presence of a dense band-like distribution of dermal melanophages can be helpful in differentiating these lesions. © 2016 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
International Nuclear Information System (INIS)
Joshi, A.; Lawande, S.V.
1990-01-01
A systematic study of squeezing obtained from k-photon anharmonic oscillator (with interaction hamiltonian of the form (a † ) k , k ≥ 2) interacting with light whose statistics can be varied from sub-Poissonian to poissonian via binomial state of field and super-Poissonian to poissonian via negative binomial state of field is presented. The authors predict that for all values of k there is a tendency increase in squeezing with increased sub-Poissonian character of the field while the reverse is true with super-Poissonian field. They also present non-classical behavior of the first order coherence function explicitly for k = 2 case (i.e., for two-photon anharmonic oscillator model used for a Kerr-like medium) with variation in the statistics of the input light
Possibility and Challenges of Conversion of Current Virus Species Names to Linnaean Binomials
Energy Technology Data Exchange (ETDEWEB)
Postler, Thomas S.; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sbina; Benkő, Mária; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Rémi; Clegg, Christopher S.; Collins, Peter L.; Juan Carlos, De La Torre; Derisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Dürrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sébastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balázs; Hewson, Roger; Horie, Masayuki; Jiāng, Dàohóng; Kobinger, Gary; Kondo, Hideki; Kropinski, Andrew M.; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady R.; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; PaWeska, Janusz T.; Peters, Clarence J.; Radoshitzky, Sheli R.; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfaçon, Hélène; Salvato, Maria S.; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark D.; Stone, David M.; Takada, Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, Noël; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Viktor E.; Wahl-Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield, Anna E.; Zerbini, F. Murilo; Kuhn, Jens H.
2016-10-22
Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment. [Arenaviridae; binomials; ICTV; International Committee on Taxonomy of Viruses; Mononegavirales; virus nomenclature; virus taxonomy.
Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels
International Nuclear Information System (INIS)
Chhaiba, Hassan; Demni, Nizar; Mouayn, Zouhair
2016-01-01
To each hyperbolic Landau level of the Poincaré disc is attached a generalized negative binomial distribution. In this paper, we compute the moment generating function of this distribution and supply its atomic decomposition as a perturbation of the negative binomial distribution by a finitely supported measure. Using the Mandel parameter, we also discuss the nonclassical nature of the associated coherent states. Next, we derive a Lévy-Khintchine-type representation of its characteristic function when the latter does not vanish and deduce that it is quasi-infinitely divisible except for the lowest hyperbolic Landau level corresponding to the negative binomial distribution. By considering the total variation of the obtained quasi-Lévy measure, we introduce a new infinitely divisible distribution for which we derive the characteristic function.
Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels
Energy Technology Data Exchange (ETDEWEB)
Chhaiba, Hassan, E-mail: chhaiba.hassan@gmail.com [Department of Mathematics, Faculty of Sciences, Ibn Tofail University, P.O. Box 133, Kénitra (Morocco); Demni, Nizar, E-mail: nizar.demni@univ-rennes1.fr [IRMAR, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes Cedex (France); Mouayn, Zouhair, E-mail: mouayn@fstbm.ac.ma [Department of Mathematics, Faculty of Sciences and Technics (M’Ghila), Sultan Moulay Slimane, P.O. Box 523, Béni Mellal (Morocco)
2016-07-15
To each hyperbolic Landau level of the Poincaré disc is attached a generalized negative binomial distribution. In this paper, we compute the moment generating function of this distribution and supply its atomic decomposition as a perturbation of the negative binomial distribution by a finitely supported measure. Using the Mandel parameter, we also discuss the nonclassical nature of the associated coherent states. Next, we derive a Lévy-Khintchine-type representation of its characteristic function when the latter does not vanish and deduce that it is quasi-infinitely divisible except for the lowest hyperbolic Landau level corresponding to the negative binomial distribution. By considering the total variation of the obtained quasi-Lévy measure, we introduce a new infinitely divisible distribution for which we derive the characteristic function.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
Relacionando las distribuciones binomial negativa y logarítmica vía sus series asociadas
Sadinle, Mauricio
2011-01-01
La distribución binomial negativa está asociada a la serie obtenida de derivar la serie logarítmica. Recíprocamente, la distribución logarítmica está asociada a la serie obtenida de integrar la serie asociada a la distribución binomial negativa. El parámetro del número de fallas de la distribución Binomial negativa es el número de derivadas necesarias para obtener la serie binomial negativa de la serie logarítmica. El razonamiento presentado puede emplearse como un método alternativo para pro...
Microbial comparative pan-genomics using binomial mixture models
DEFF Research Database (Denmark)
Ussery, David; Snipen, L; Almøy, T
2009-01-01
The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter...... approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. RESULTS: We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection...... probabilities. Estimated pan-genome sizes range from small (around 2600 gene families) in Buchnera aphidicola to large (around 43000 gene families) in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely...
International Nuclear Information System (INIS)
Liu Tang-Kun; Zhang Kang-Long; Tao Yu; Shan Chuan-Jia; Liu Ji-Bing
2016-01-01
The temporal evolution of the degree of entanglement between two atoms in a system of the binomial optical field interacting with two arbitrary entangled atoms is investigated. The influence of the strength of the dipole–dipole interaction between two atoms, probabilities of the Bernoulli trial, and particle number of the binomial optical field on the temporal evolution of the atomic entanglement are discussed. The result shows that the two atoms are always in the entanglement state. Moreover, if and only if the two atoms are initially in the maximally entangled state, the entanglement evolution is not affected by the parameters, and the degree of entanglement is always kept as 1. (paper)
Parameter estimation of the zero inflated negative binomial beta exponential distribution
Sirichantra, Chutima; Bodhisuwan, Winai
2017-11-01
The zero inflated negative binomial-beta exponential (ZINB-BE) distribution is developed, it is an alternative distribution for the excessive zero counts with overdispersion. The ZINB-BE distribution is a mixture of two distributions which are Bernoulli and negative binomial-beta exponential distributions. In this work, some characteristics of the proposed distribution are presented, such as, mean and variance. The maximum likelihood estimation is applied to parameter estimation of the proposed distribution. Finally some results of Monte Carlo simulation study, it seems to have high-efficiency when the sample size is large.
On extinction time of a generalized endemic chain-binomial model.
Aydogmus, Ozgur
2016-09-01
We considered a chain-binomial epidemic model not conferring immunity after infection. Mean field dynamics of the model has been analyzed and conditions for the existence of a stable endemic equilibrium are determined. The behavior of the chain-binomial process is probabilistically linked to the mean field equation. As a result of this link, we were able to show that the mean extinction time of the epidemic increases at least exponentially as the population size grows. We also present simulation results for the process to validate our analytical findings. Copyright © 2016 Elsevier Inc. All rights reserved.
Binomial confidence intervals for testing non-inferiority or superiority: a practitioner's dilemma.
Pradhan, Vivek; Evans, John C; Banerjee, Tathagata
2016-08-01
In testing for non-inferiority or superiority in a single arm study, the confidence interval of a single binomial proportion is frequently used. A number of such intervals are proposed in the literature and implemented in standard software packages. Unfortunately, use of different intervals leads to conflicting conclusions. Practitioners thus face a serious dilemma in deciding which one to depend on. Is there a way to resolve this dilemma? We address this question by investigating the performances of ten commonly used intervals of a single binomial proportion, in the light of two criteria, viz., coverage and expected length of the interval. © The Author(s) 2013.
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Joint Analysis of Binomial and Continuous Traits with a Recursive Model
DEFF Research Database (Denmark)
Varona, Louis; Sorensen, Daniel
2014-01-01
This work presents a model for the joint analysis of a binomial and a Gaussian trait using a recursive parametrization that leads to a computationally efficient implementation. The model is illustrated in an analysis of mortality and litter size in two breeds of Danish pigs, Landrace and Yorkshir...
Learning Binomial Probability Concepts with Simulation, Random Numbers and a Spreadsheet
Rochowicz, John A., Jr.
2005-01-01
This paper introduces the reader to the concepts of binomial probability and simulation. A spreadsheet is used to illustrate these concepts. Random number generators are great technological tools for demonstrating the concepts of probability. Ideas of approximation, estimation, and mathematical usefulness provide numerous ways of learning…
Using the Binomial Series to Prove the Arithmetic Mean-Geometric Mean Inequality
Persky, Ronald L.
2003-01-01
In 1968, Leon Gerber compared (1 + x)[superscript a] to its kth partial sum as a binomial series. His result is stated and, as an application of this result, a proof of the arithmetic mean-geometric mean inequality is presented.
Justin S. Crotteau; Martin W. Ritchie; J. Morgan. Varner
2014-01-01
Many western USA fire regimes are typified by mixed-severity fire, which compounds the variability inherent to natural regeneration densities in associated forests. Tree regeneration data are often discrete and nonnegative; accordingly, we fit a series of Poisson and negative binomial variation models to conifer seedling counts across four distinct burn severities and...
Topology of unitary groups and the prime orders of binomial coefficients
Duan, HaiBao; Lin, XianZu
2017-09-01
Let $c:SU(n)\\rightarrow PSU(n)=SU(n)/\\mathbb{Z}_{n}$ be the quotient map of the special unitary group $SU(n)$ by its center subgroup $\\mathbb{Z}_{n}$. We determine the induced homomorphism $c^{\\ast}:$ $H^{\\ast}(PSU(n))\\rightarrow H^{\\ast}(SU(n))$ on cohomologies by computing with the prime orders of binomial coefficients
A Bayesian Approach to Functional Mixed Effect Modeling for Longitudinal Data with Binomial Outcomes
Kliethermes, Stephanie; Oleson, Jacob
2014-01-01
Longitudinal growth patterns are routinely seen in medical studies where individual and population growth is followed over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear, quadratic); however, these relationships may not accurately capture growth over time. Functional mixed effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well-developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible and thus estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. PMID:24723495
A mixed-binomial model for Likert-type personality measures.
Allik, Jüri
2014-01-01
Personality measurement is based on the idea that values on an unobservable latent variable determine the distribution of answers on a manifest response scale. Typically, it is assumed in the Item Response Theory (IRT) that latent variables are related to the observed responses through continuous normal or logistic functions, determining the probability with which one of the ordered response alternatives on a Likert-scale item is chosen. Based on an analysis of 1731 self- and other-rated responses on the 240 NEO PI-3 questionnaire items, it was proposed that a viable alternative is a finite number of latent events which are related to manifest responses through a binomial function which has only one parameter-the probability with which a given statement is approved. For the majority of items, the best fit was obtained with a mixed-binomial distribution, which assumes two different subpopulations who endorse items with two different probabilities. It was shown that the fit of the binomial IRT model can be improved by assuming that about 10% of random noise is contained in the answers and by taking into account response biases toward one of the response categories. It was concluded that the binomial response model for the measurement of personality traits may be a workable alternative to the more habitual normal and logistic IRT models.
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model
Kim, Kyung Yong; Lee, Won-Chan
2018-01-01
Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Binomial Coefficients Modulo a Prime--A Visualization Approach to Undergraduate Research
Bardzell, Michael; Poimenidou, Eirini
2011-01-01
In this article we present, as a case study, results of undergraduate research involving binomial coefficients modulo a prime "p." We will discuss how undergraduates were involved in the project, even with a minimal mathematical background beforehand. There are two main avenues of exploration described to discover these binomial…
Computational results on the compound binomial risk model with nonhomogeneous claim occurrences
Tuncel, A.; Tank, F.
2013-01-01
The aim of this paper is to give a recursive formula for non-ruin (survival) probability when the claim occurrences are nonhomogeneous in the compound binomial risk model. We give recursive formulas for non-ruin (survival) probability and for distribution of the total number of claims under the
Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction
Cohen, A. C.
1971-01-01
A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.
Determining order-up-to levels under periodic review for compound binomial (intermittent) demand
Teunter, R. H.; Syntetos, A. A.; Babai, M. Z.
2010-01-01
We propose a new method for determining order-up-to levels for intermittent demand items in a periodic review system. Contrary to existing methods, we exploit the intermittent character of demand by modelling lead time demand as a compound binomial process. in an extensive numerical study using
Negative binomial distribution for multiplicity distributions in e/sup +/e/sup -/ annihilation
International Nuclear Information System (INIS)
Chew, C.K.; Lim, Y.K.
1986-01-01
The authors show that the negative binomial distribution fits excellently the available charged-particle multiplicity distributions of e/sup +/e/sup -/ annihilation into hadrons at three different energies √s = 14, 22 and 34 GeV
Kliethermes, Stephanie; Oleson, Jacob
2014-08-15
Longitudinal growth patterns are routinely seen in medical studies where individual growth and population growth are followed up over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear and quadratic); however, these relationships may not accurately capture growth over time. Functional mixed-effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible, and thus, estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. Copyright © 2014 John Wiley & Sons, Ltd.
Time evolution of negative binomial optical field in a diffusion channel
International Nuclear Information System (INIS)
Liu Tang-Kun; Wu Pan-Pan; Shan Chuan-Jia; Liu Ji-Bing; Fan Hong-Yi
2015-01-01
We find the time evolution law of a negative binomial optical field in a diffusion channel. We reveal that by adjusting the diffusion parameter, the photon number can be controlled. Therefore, the diffusion process can be considered a quantum controlling scheme through photon addition. (paper)
Adaptive multiresolution Hermite-Binomial filters for image edge and texture analysis
Gu, Y.H.; Katsaggelos, A.K.
1994-01-01
A new multiresolution image analysis approach using adaptive Hermite-Binomial filters is presented in this paper. According to the local image structural and textural properties, the analysis filter kernels are made adaptive both in their scales and orders. Applications of such an adaptive filtering
A binomial random sum of present value models in investment analysis
Βουδούρη, Αγγελική; Ντζιαχρήστος, Ευάγγελος
1997-01-01
Stochastic present value models have been widely adopted in financial theory and practice and play a very important role in capital budgeting and profit planning. The purpose of this paper is to introduce a binomial random sum of stochastic present value models and offer an application in investment analysis.
Raw and Central Moments of Binomial Random Variables via Stirling Numbers
Griffiths, Martin
2013-01-01
We consider here the problem of calculating the moments of binomial random variables. It is shown how formulae for both the raw and the central moments of such random variables may be obtained in a recursive manner utilizing Stirling numbers of the first kind. Suggestions are also provided as to how students might be encouraged to explore this…
Fermat’s Little Theorem via Divisibility of Newton’s Binomial
Directory of Open Access Journals (Sweden)
Ziobro Rafał
2015-09-01
Full Text Available Solving equations in integers is an important part of the number theory [29]. In many cases it can be conducted by the factorization of equation’s elements, such as the Newton’s binomial. The article introduces several simple formulas, which may facilitate this process. Some of them are taken from relevant books [28], [14].
NBLDA: negative binomial linear discriminant analysis for RNA-Seq data.
Dong, Kai; Zhao, Hongyu; Tong, Tiejun; Wan, Xiang
2016-09-13
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than or equal to the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated. In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications. We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data
A Mechanistic Beta-Binomial Probability Model for mRNA Sequencing Data.
Smith, Gregory R; Birtwistle, Marc R
2016-01-01
A main application for mRNA sequencing (mRNAseq) is determining lists of differentially-expressed genes (DEGs) between two or more conditions. Several software packages exist to produce DEGs from mRNAseq data, but they typically yield different DEGs, sometimes markedly so. The underlying probability model used to describe mRNAseq data is central to deriving DEGs, and not surprisingly most softwares use different models and assumptions to analyze mRNAseq data. Here, we propose a mechanistic justification to model mRNAseq as a binomial process, with data from technical replicates given by a binomial distribution, and data from biological replicates well-described by a beta-binomial distribution. We demonstrate good agreement of this model with two large datasets. We show that an emergent feature of the beta-binomial distribution, given parameter regimes typical for mRNAseq experiments, is the well-known quadratic polynomial scaling of variance with the mean. The so-called dispersion parameter controls this scaling, and our analysis suggests that the dispersion parameter is a continually decreasing function of the mean, as opposed to current approaches that impose an asymptotic value to the dispersion parameter at moderate mean read counts. We show how this leads to current approaches overestimating variance for moderately to highly expressed genes, which inflates false negative rates. Describing mRNAseq data with a beta-binomial distribution thus may be preferred since its parameters are relatable to the mechanistic underpinnings of the technique and may improve the consistency of DEG analysis across softwares, particularly for moderately to highly expressed genes.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Wagner, Brandie; Riggs, Paula; Mikulich-Gilbertson, Susan
2015-01-01
It is important to correctly understand the associations among addiction to multiple drugs and between co-occurring substance use and psychiatric disorders. Substance-specific outcomes (e.g. number of days used cannabis) have distributional characteristics which range widely depending on the substance and the sample being evaluated. We recommend a four-part strategy for determining the appropriate distribution for modeling substance use data. We demonstrate this strategy by comparing the model fit and resulting inferences from applying four different distributions to model use of substances that range greatly in the prevalence and frequency of their use. Using Timeline Followback (TLFB) data from a previously-published study, we used negative binomial, beta-binomial and their zero-inflated counterparts to model proportion of days during treatment of cannabis, cigarettes, alcohol, and opioid use. The fit for each distribution was evaluated with statistical model selection criteria, visual plots and a comparison of the resulting inferences. We demonstrate the feasibility and utility of modeling each substance individually and show that no single distribution provides the best fit for all substances. Inferences regarding use of each substance and associations with important clinical variables were not consistent across models and differed by substance. Thus, the distribution chosen for modeling substance use must be carefully selected and evaluated because it may impact the resulting conclusions. Furthermore, the common procedure of aggregating use across different substances may not be ideal.
Yi, Jun; Yang, Wenhong; Sun, Wen-Hua; Nomura, Kotohiro; Hada, Masahiko
2017-11-30
The NMR chemical shifts of vanadium ( 51 V) in (imido)vanadium(V) dichloride complexes with imidazolin-2-iminato and imidazolidin-2-iminato ligands were calculated by the density functional theory (DFT) method with GIAO. The calculated 51 V NMR chemical shifts were analyzed by the multiple linear regression (MLR) analysis (MLRA) method with a series of calculated molecular properties. Some of calculated NMR chemical shifts were incorrect using the optimized molecular geometries of the X-ray structures. After the global minimum geometries of all of the molecules were determined, the trend of the observed chemical shifts was well reproduced by the present DFT method. The MLRA method was performed to investigate the correlation between the 51 V NMR chemical shift and the natural charge, band energy gap, and Wiberg bond index of the V═N bond. The 51 V NMR chemical shifts obtained with the present MLR model were well reproduced with a correlation coefficient of 0.97.
Kuss, Oliver; Hoyer, Annika; Solms, Alexander
2014-01-15
There are still challenges when meta-analyzing data from studies on diagnostic accuracy. This is mainly due to the bivariate nature of the response where information on sensitivity and specificity must be summarized while accounting for their correlation within a single trial. In this paper, we propose a new statistical model for the meta-analysis for diagnostic accuracy studies. This model uses beta-binomial distributions for the marginal numbers of true positives and true negatives and links these margins by a bivariate copula distribution. The new model comes with all the features of the current standard model, a bivariate logistic regression model with random effects, but has the additional advantages of a closed likelihood function and a larger flexibility for the correlation structure of sensitivity and specificity. In a simulation study, which compares three copula models and two implementations of the standard model, the Plackett and the Gauss copula do rarely perform worse but frequently better than the standard model. We use an example from a meta-analysis to judge the diagnostic accuracy of telomerase (a urinary tumor marker) for the diagnosis of primary bladder cancer for illustration. Copyright © 2013 John Wiley & Sons, Ltd.
INVESTIGATION OF E-MAIL TRAFFIC BY USING ZERO-INFLATED REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Yılmaz KAYA
2012-06-01
Full Text Available Based on count data obtained with a value of zero may be greater than anticipated. These types of data sets should be used to analyze by regression methods taking into account zero values. Zero- Inflated Poisson (ZIP, Zero-Inflated negative binomial (ZINB, Poisson Hurdle (PH, negative binomial Hurdle (NBH are more common approaches in modeling more zero value possessing dependent variables than expected. In the present study, the e-mail traffic of Yüzüncü Yıl University in 2009 spring semester was investigated. ZIP and ZINB, PH and NBH regression methods were applied on the data set because more zeros counting (78.9% were found in data set than expected. ZINB and NBH regression considered zero dispersion and overdispersion were found to be more accurate results due to overdispersion and zero dispersion in sending e-mail. ZINB is determined to be best model accordingto Vuong statistics and information criteria.
Directory of Open Access Journals (Sweden)
Anke Hüls
2017-05-01
Full Text Available Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance and structural zero measurements on the resistance outcome as well as on the exposure side are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model and (ii to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
Poisson and negative binomial item count techniques for surveys with sensitive question.
Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin
2017-04-01
Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide closed form variance estimate and confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that it yields accurate parameter estimate and confidence interval.
Application of a binomial cusum control chart to monitor one drinking water indicator
Directory of Open Access Journals (Sweden)
Elisa Henning
2014-02-01
Full Text Available The aim of this study is to analyze the use of a binomial cumulative sum chart (CUSUM to monitor the presence of total coliforms, biological indicators of quality of water supplies in water treatment processes. The sample series were monthly taken from a water treatment plant and were analyzed from 2007 to 2009. The statistical treatment of the data was performed using GNU R, and routines were created for the approximation of the upper limit of the binomial CUSUM chart. Furthermore, a comparative study was conducted to investigate whether there is a significant difference in sensitivity between the use of CUSUM and the traditional Shewhart chart, the most commonly used chart in process monitoring. The results obtained demonstrate that this study was essential for making the right choice in selecting a chart for the statistical analysis of this process.
A Bayesian non-inferiority test for two independent binomial proportions.
Kawasaki, Yohei; Miyaoka, Etsuo
2013-01-01
In drug development, non-inferiority tests are often employed to determine the difference between two independent binomial proportions. Many test statistics for non-inferiority are based on the frequentist framework. However, research on non-inferiority in the Bayesian framework is limited. In this paper, we suggest a new Bayesian index τ = P(π₁ > π₂-Δ₀|X₁, X₂), where X₁ and X₂ denote binomial random variables for trials n1 and n₂, and parameters π₁ and π₂ , respectively, and the non-inferiority margin is Δ₀> 0. We show two calculation methods for τ, an approximate method that uses normal approximation and an exact method that uses an exact posterior PDF. We compare the approximate probability with the exact probability for τ. Finally, we present the results of actual clinical trials to show the utility of index τ. Copyright © 2013 John Wiley & Sons, Ltd.
The option to expand a project: its assessment with the binomial options pricing model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
Full Text Available Traditional methods of investment appraisal, like the Net Present Value, are not able to include the value of the operational flexibility of the project. In this paper, real options, and more specifically the option to expand, are assumed to be included in the project information in addition to the expected cash flows. Thus, to calculate the total value of the project, we are going to apply the methodology of the Net Present Value to the different scenarios derived from the existence of the real option to expand. Taking into account the analogy between real and financial options, the value of including an option to expand is explored by using the binomial options pricing model. In this way, estimating the value of the option to expand is a tool which facilitates the control of the uncertainty element implicit in the project. Keywords: Real options, Option to expand, Binomial options pricing model, Investment project appraisal
International Nuclear Information System (INIS)
Wang Huan; Guo Xiuhua; Jia Zhongwei; Li Hongkai; Liang Zhigang; Li Kuncheng; He Qian
2010-01-01
Purpose: To introduce multilevel binomial logistic prediction model-based computer-aided diagnostic (CAD) method of small solitary pulmonary nodules (SPNs) diagnosis by combining patient and image characteristics by textural features of CT image. Materials and methods: Describe fourteen gray level co-occurrence matrix textural features obtained from 2171 benign and malignant small solitary pulmonary nodules, which belongs to 185 patients. Multilevel binomial logistic model is applied to gain these initial insights. Results: Five texture features, including Inertia, Entropy, Correlation, Difference-mean, Sum-Entropy, and age of patients own aggregating character on patient-level, which are statistically different (P < 0.05) between benign and malignant small solitary pulmonary nodules. Conclusion: Some gray level co-occurrence matrix textural features are efficiently descriptive features of CT image of small solitary pulmonary nodules, which can profit diagnosis of earlier period lung cancer if combined patient-level characteristics to some extent.
Modeling and Predistortion of Envelope Tracking Power Amplifiers using a Memory Binomial Model
DEFF Research Database (Denmark)
Tafuri, Felice Francesco; Sira, Daniel; Larsen, Torben
2013-01-01
. The model definition is based on binomial series, hence the name of memory binomial model (MBM). The MBM is here applied to measured data-sets acquired from an ET measurement set-up. When used as a PA model the MBM showed an NMSE (Normalized Mean Squared Error) as low as −40dB and an ACEPR (Adjacent Channel...... Error Power Ratio) below −51 dB. The simulated predistortion results showed that the MBM can improve the compensation of distortion in the adjacent channel of 5.8 dB and 5.7 dB compared to a memory polynomial predistorter (MPPD). The predistortion performance in the time domain showed an NMSE...
Possibility and challenges of conversion of current virus species names to Linnaean binomials
Thomas, Postler; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sina; Benko, Maria; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Remi; Clegg, Christopher S.; Collins, Peter L.; De la Torre, Juan Carlos; DeRisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Durrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sebastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balazs; Hewson, Roger; Horie, Masayuki; Jiang, Daohong; Kobinger, Gary P.; Kondo, Hideki; Kropinski, Andrew; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; Paweska, Janusz T.; Peters, C.J.; Radoshitzky, Sheli; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfacon, Helene; Salvato , Maria; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark; Stone, D.M.; Takada , Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, N.; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Victor E.; Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield , Anna E.; Zerbini, Francisco Murilo; Kuhn, Jens H.
2017-01-01
Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment.
[Application of detecting and taking overdispersion into account in Poisson regression model].
Bouche, G; Lepage, B; Migeot, V; Ingrand, P
2009-08-01
Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact with an illustration based on the data of a study investigating the relationship between use of the Internet to seek health information and number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were performed to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested overdispersion in the Poisson regression model using the ratio of the sum of Pearson residuals over the number of degrees of freedom (chi(2)/df). We then fitted the three models and compared parameter estimation to the estimations given by Poisson regression model. Variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93) and the chi(2)/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of estimates from two variables (using the Internet to seek health information and single parent family) would have changed according to the model retained, with significant levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial) respectively. Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theorical distribution ; in addition this regression is easy to perform with ordinary statistical software packages.
Analyzing hospitalization data: potential limitations of Poisson regression.
Weaver, Colin G; Ravani, Pietro; Oliver, Matthew J; Austin, Peter C; Quinn, Robert R
2015-08-01
Poisson regression is commonly used to analyze hospitalization data when outcomes are expressed as counts (e.g. number of days in hospital). However, data often violate the assumptions on which Poisson regression is based. More appropriate extensions of this model, while available, are rarely used. We compared hospitalization data between 206 patients treated with hemodialysis (HD) and 107 treated with peritoneal dialysis (PD) using Poisson regression and compared results from standard Poisson regression with those obtained using three other approaches for modeling count data: negative binomial (NB) regression, zero-inflated Poisson (ZIP) regression and zero-inflated negative binomial (ZINB) regression. We examined the appropriateness of each model and compared the results obtained with each approach. During a mean 1.9 years of follow-up, 183 of 313 patients (58%) were never hospitalized (indicating an excess of 'zeros'). The data also displayed overdispersion (variance greater than mean), violating another assumption of the Poisson model. Using four criteria, we determined that the NB and ZINB models performed best. According to these two models, patients treated with HD experienced similar hospitalization rates as those receiving PD {NB rate ratio (RR): 1.04 [bootstrapped 95% confidence interval (CI): 0.49-2.20]; ZINB summary RR: 1.21 (bootstrapped 95% CI 0.60-2.46)}. Poisson and ZIP models fit the data poorly and had much larger point estimates than the NB and ZINB models [Poisson RR: 1.93 (bootstrapped 95% CI 0.88-4.23); ZIP summary RR: 1.84 (bootstrapped 95% CI 0.88-3.84)]. We found substantially different results when modeling hospitalization data, depending on the approach used. Our results argue strongly for a sound model selection process and improved reporting around statistical methods used for modeling count data. © The Author 2015. Published by Oxford University Press on behalf of ERA-EDTA. All rights reserved.
Generalized harmonic, cyclotomic, and binomial sums, their polylogarithms and special numbers
Energy Technology Data Exchange (ETDEWEB)
Ablinger, J.; Schneider, C. [Johannes Kepler Univ., Linz (Austria). Research Inst. for Symbolic Computation (RISC); Bluemlein, J. [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany)
2013-10-15
A survey is given on mathematical structures which emerge in multi-loop Feynman diagrams. These are multiply nested sums, and, associated to them by an inverse Mellin transform, specific iterated integrals. Both classes lead to sets of special numbers. Starting with harmonic sums and polylogarithms we discuss recent extensions of these quantities as cyclotomic, generalized (cyclotomic), and binomially weighted sums, associated iterated integrals and special constants and their relations.
Study on Emission Measurement of Vehicle on Road Based on Binomial Logit Model
Aly, Sumarni Hamid; Selintung, Mary; Ramli, Muhammad Isran; Sumi, Tomonori
2011-01-01
This research attempts to evaluate emission measurement of on road vehicle. In this regard, the research develops failure probability model of vehicle emission test for passenger car which utilize binomial logit model. The model focuses on failure of CO and HC emission test for gasoline cars category and Opacity emission test for diesel-fuel cars category as dependent variables, while vehicle age, engine size, brand and type of the cars as independent variables. In order to imp...
Use of the Beta-Binomial Model for Central Statistical Monitoring of Multicenter Clinical Trials
Desmet, Lieven; Venet, David; Doffagne, Erik; Timmermans, Catherine; Legrand, Catherine; Burzykowski, Tomasz; Buyse, Marc
2017-01-01
As part of central statistical monitoring of multicenter clinical trial data, we propose a procedure based on the beta-binomial distribution for the detection of centers with atypical values for the probability of some event. The procedure makes no assumptions about the typical event proportion and uses the event counts from all centers to derive a reference model. The procedure is shown through simulations to have high sensitivity and high specificity if the contamination rate is small and t...
Directory of Open Access Journals (Sweden)
Patricio Peña-Rehbein
Full Text Available This paper describes the frequency and number of Sphyrion laevigatum in the skin of Genypterus blacodes, an important economic resource in Chile. The analysis of a spatial distribution model indicated that the parasites tended to cluster. Variations in the number of parasites per host could be described by a negative binomial distribution. The maximum number of parasites observed per host was two.
Nested (inverse) binomial sums and new iterated integrals for massive Feynman diagrams
International Nuclear Information System (INIS)
Ablinger, Jakob; Schneider, Carsten; Bluemlein, Johannes; Raab, Clemens G.
2014-07-01
Nested sums containing binomial coefficients occur in the computation of massive operatormatrix elements. Their associated iterated integrals lead to alphabets including radicals, for which we determined a suitable basis. We discuss algorithms for converting between sum and integral representations, mainly relying on the Mellin transform. To aid the conversion we worked out dedicated rewrite rules, based on which also some general patterns emerging in the process can be obtained.
Generalized harmonic, cyclotomic, and binomial sums, their polylogarithms and special numbers
International Nuclear Information System (INIS)
Ablinger, J.; Schneider, C.; Bluemlein, J.
2013-10-01
A survey is given on mathematical structures which emerge in multi-loop Feynman diagrams. These are multiply nested sums, and, associated to them by an inverse Mellin transform, specific iterated integrals. Both classes lead to sets of special numbers. Starting with harmonic sums and polylogarithms we discuss recent extensions of these quantities as cyclotomic, generalized (cyclotomic), and binomially weighted sums, associated iterated integrals and special constants and their relations.
Computation of Clebsch-Gordan and Gaunt coefficients using binomial coefficients
International Nuclear Information System (INIS)
Guseinov, I.I.; Oezmen, A.; Atav, Ue
1995-01-01
Using binomial coefficients the Clebsch-Gordan and Gaunt coefficients were calculated for extremely large quantum numbers. The main advantage of this approach is directly calculating these coefficients, instead of using recursion relations. Accuracy of the results is quite high for quantum numbers l 1 , and l 2 up to 100. Despite direct calculation, the CPU times are found comparable with those given in the related literature. 11 refs., 1 fig., 2 tabs
Negative binomial multiplicity distributions, a new empirical law for high energy collisions
International Nuclear Information System (INIS)
Van Hove, L.; Giovannini, A.
1987-01-01
For a variety of high energy hadron production reactions, recent experiments have confirmed the findings of the UA5 Collaboration that charged particle multiplicities in central (pseudo) rapidity intervals and in full phase space obey negative binomial (NB) distributions. The authors discuss the meaning of this new empirical law on the basis of new data and they show that they support the interpretation of the NB distributions in terms of a cascading mechanism of hardron production
Nested (inverse) binomial sums and new iterated integrals for massive Feynman diagrams
Energy Technology Data Exchange (ETDEWEB)
Ablinger, Jakob; Schneider, Carsten [Johannes Kepler Univ., Linz (Austria). Research Inst. for Symbolic Computation (RISC); Bluemlein, Johannes; Raab, Clemens G. [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany)
2014-07-15
Nested sums containing binomial coefficients occur in the computation of massive operatormatrix elements. Their associated iterated integrals lead to alphabets including radicals, for which we determined a suitable basis. We discuss algorithms for converting between sum and integral representations, mainly relying on the Mellin transform. To aid the conversion we worked out dedicated rewrite rules, based on which also some general patterns emerging in the process can be obtained.
Discrimination of numerical proportions: A comparison of binomial and Gaussian models.
Raidvee, Aire; Lember, Jüri; Allik, Jüri
2017-01-01
Observers discriminated the numerical proportion of two sets of elements (N = 9, 13, 33, and 65) that differed either by color or orientation. According to the standard Thurstonian approach, the accuracy of proportion discrimination is determined by irreducible noise in the nervous system that stochastically transforms the number of presented visual elements onto a continuum of psychological states representing numerosity. As an alternative to this customary approach, we propose a Thurstonian-binomial model, which assumes discrete perceptual states, each of which is associated with a certain visual element. It is shown that the probability β with which each visual element can be noticed and registered by the perceptual system can explain data of numerical proportion discrimination at least as well as the continuous Thurstonian-Gaussian model, and better, if the greater parsimony of the Thurstonian-binomial model is taken into account using AIC model selection. We conclude that Gaussian and binomial models represent two different fundamental principles-internal noise vs. using only a fraction of available information-which are both plausible descriptions of visual perception.
Genome-enabled predictions for binomial traits in sugar beet populations.
Biscarini, Filippo; Stevanato, Piergiorgio; Broccanello, Chiara; Stella, Alessandra; Saccomani, Massimo
2014-07-22
Genomic information can be used to predict not only continuous but also categorical (e.g. binomial) traits. Several traits of interest in human medicine and agriculture present a discrete distribution of phenotypes (e.g. disease status). Root vigor in sugar beet (B. vulgaris) is an example of binomial trait of agronomic importance. In this paper, a panel of 192 SNPs (single nucleotide polymorphisms) was used to genotype 124 sugar beet individual plants from 18 lines, and to classify them as showing "high" or "low" root vigor. A threshold model was used to fit the relationship between binomial root vigor and SNP genotypes, through the matrix of genomic relationships between individuals in a genomic BLUP (G-BLUP) approach. From a 5-fold cross-validation scheme, 500 testing subsets were generated. The estimated average cross-validation error rate was 0.000731 (0.073%). Only 9 out of 12326 test observations (500 replicates for an average test set size of 24.65) were misclassified. The estimated prediction accuracy was quite high. Such accurate predictions may be related to the high estimated heritability for root vigor (0.783) and to the few genes with large effect underlying the trait. Despite the sparse SNP panel, there was sufficient within-scaffold LD where SNPs with large effect on root vigor were located to allow for genome-enabled predictions to work.
Tollerup, Kris E; Marcum, Daniel; Wilson, Rob; Godfrey, Larry
2013-08-01
The two-spotted spider mite, Tetranychus urticae Koch, is an economic pest on peppermint [Mentha x piperita (L.), 'Black Mitcham'] grown in California. A sampling plan for T. urticae was developed under Pacific Northwest conditions in the early 1980s and has been used by California growers since approximately 1998. This sampling plan, however, is cumbersome and a poor predictor of T. urticae densities in California. Between June and August, the numbers of immature and adult T. urticae were counted on leaves at three commercial peppermint fields (sites) in 2010 and a single field in 2011. In each of seven locations per site, 45 leaves were sampled, that is, 9 leaves per five stems. Leaf samples were stratified by collecting three leaves from the top, middle, and bottom strata per stem. The on-plant distribution of T. urticae did not significantly differ among the stem strata through the growing season. Binomial and enumerative sampling plans were developed using generic Taylor's power law coefficient values. The best fit of our data for binomial sampling occurred using a tally threshold of T = 0. The optimum number of leaves required for T urticae at the critical density of five mites per leaf was 20 for the binomial and 23 for the enumerative sampling plans, respectively. Sampling models were validated using Resampling for Validation of Sampling Plan Software.
O'Donnell, Katherine M; Thompson, Frank R; Semlitsch, Raymond D
2015-01-01
Detectability of individual animals is highly variable and nearly always binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model's potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3-5 surveys each spring and fall 2010-2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and protocols that maximize species availability and conditional detection probability to increase population parameter estimate reliability.
Analysis of railroad tank car releases using a generalized binomial model.
Liu, Xiang; Hong, Yili
2015-11-01
The United States is experiencing an unprecedented boom in shale oil production, leading to a dramatic growth in petroleum crude oil traffic by rail. In 2014, U.S. railroads carried over 500,000 tank carloads of petroleum crude oil, up from 9500 in 2008 (a 5300% increase). In light of continual growth in crude oil by rail, there is an urgent national need to manage this emerging risk. This need has been underscored in the wake of several recent crude oil release incidents. In contrast to highway transport, which usually involves a tank trailer, a crude oil train can carry a large number of tank cars, having the potential for a large, multiple-tank-car release incident. Previous studies exclusively assumed that railroad tank car releases in the same train accident are mutually independent, thereby estimating the number of tank cars releasing given the total number of tank cars derailed based on a binomial model. This paper specifically accounts for dependent tank car releases within a train accident. We estimate the number of tank cars releasing given the number of tank cars derailed based on a generalized binomial model. The generalized binomial model provides a significantly better description for the empirical tank car accident data through our numerical case study. This research aims to provide a new methodology and new insights regarding the further development of risk management strategies for improving railroad crude oil transportation safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Brenner, Tom; Chen, Johnny; Stait-Gardner, Tim; Zheng, Gang; Matsukawa, Shingo; Price, William S.
2018-03-01
A new family of binomial-like inversion sequences, named jump-and-return sandwiches (JRS), has been developed by inserting a binomial-like sequence into a standard jump-and-return sequence, discovered through use of a stochastic Genetic Algorithm optimisation. Compared to currently used binomial-like inversion sequences (e.g., 3-9-19 and W5), the new sequences afford wider inversion bands and narrower non-inversion bands with an equal number of pulses. As an example, two jump-and-return sandwich 10-pulse sequences achieved 95% inversion at offsets corresponding to 9.4% and 10.3% of the non-inversion band spacing, compared to 14.7% for the binomial-like W5 inversion sequence, i.e., they afforded non-inversion bands about two thirds the width of the W5 non-inversion band.
Bakbergenuly, Ilyas; Morgenthaler, Stephan
2016-01-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group‐level studies or in meta‐analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log‐odds and arcsine transformations of the estimated probability p^, both for single‐group studies and in combining results from several groups or studies in meta‐analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta‐analysis and result in abysmal coverage of the combined effect for large K. We also propose bias‐correction for the arcsine transformation. Our simulations demonstrate that this bias‐correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta‐analyses of prevalence. PMID:27192062
Bakbergenuly, Ilyas; Kulinskaya, Elena; Morgenthaler, Stephan
2016-07-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose bias-correction for the arcsine transformation. Our simulations demonstrate that this bias-correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. © 2016 The Authors. Biometrical Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.
Archer, A.W.; Maples, C.G.
1989-01-01
Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 ?? 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. ?? 1989 International Association for Mathematical Geology.
Lee, JuHee; Park, Chang Gi; Choi, Moonki
2016-05-01
This study was conducted to identify risk factors that influence regular exercise among patients with Parkinson's disease in Korea. Parkinson's disease is prevalent in the elderly, and may lead to a sedentary lifestyle. Exercise can enhance physical and psychological health. However, patients with Parkinson's disease are less likely to exercise than are other populations due to physical disability. A secondary data analysis and cross-sectional descriptive study were conducted. A convenience sample of 106 patients with Parkinson's disease was recruited at an outpatient neurology clinic of a tertiary hospital in Korea. Demographic characteristics, disease-related characteristics (including disease duration and motor symptoms), self-efficacy for exercise, balance, and exercise level were investigated. Negative binomial regression and zero-inflated negative binomial regression for exercise count data were utilized to determine factors involved in exercise. The mean age of participants was 65.85 ± 8.77 years, and the mean duration of Parkinson's disease was 7.23 ± 6.02 years. Most participants indicated that they engaged in regular exercise (80.19%). Approximately half of participants exercised at least 5 days per week for 30 min, as recommended (51.9%). Motor symptoms were a significant predictor of exercise in the count model, and self-efficacy for exercise was a significant predictor of exercise in the zero model. Severity of motor symptoms was related to frequency of exercise. Self-efficacy contributed to the probability of exercise. Symptom management and improvement of self-efficacy for exercise are important to encourage regular exercise in patients with Parkinson's disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
Islam, Mohammad Mafijul; Alam, Morshed; Tariquzaman, Md; Kabir, Mohammad Alamgir; Pervin, Rokhsona; Begum, Munni; Khan, Md Mobarak Hossain
2013-01-08
Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. According to our knowledge, most of the available studies, that addressed the issue of malnutrition among under-five children, considered the categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative of other statistical methods and (ii) to find some predictors of this outcome variable. The data is extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques namely Chi-square test and GPR model. The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance variable namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative of other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh.
Directory of Open Access Journals (Sweden)
Katherine M O'Donnell
Full Text Available Detectability of individual animals is highly variable and nearly always < 1; imperfect detection must be accounted for to reliably estimate population sizes and trends. Hierarchical models can simultaneously estimate abundance and effective detection probability, but there are several different mechanisms that cause variation in detectability. Neglecting temporary emigration can lead to biased population estimates because availability and conditional detection probability are confounded. In this study, we extend previous hierarchical binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model's potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3-5 surveys each spring and fall 2010-2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling, while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling. By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and
Beta-binomial model for meta-analysis of odds ratios.
Bakbergenuly, Ilyas; Kulinskaya, Elena
2017-05-20
In meta-analysis of odds ratios (ORs), heterogeneity between the studies is usually modelled via the additive random effects model (REM). An alternative, multiplicative REM for ORs uses overdispersion. The multiplicative factor in this overdispersion model (ODM) can be interpreted as an intra-class correlation (ICC) parameter. This model naturally arises when the probabilities of an event in one or both arms of a comparative study are themselves beta-distributed, resulting in beta-binomial distributions. We propose two new estimators of the ICC for meta-analysis in this setting. One is based on the inverted Breslow-Day test, and the other on the improved gamma approximation by Kulinskaya and Dollinger (2015, p. 26) to the distribution of Cochran's Q. The performance of these and several other estimators of ICC on bias and coverage is studied by simulation. Additionally, the Mantel-Haenszel approach to estimation of ORs is extended to the beta-binomial model, and we study performance of various ICC estimators when used in the Mantel-Haenszel or the inverse-variance method to combine ORs in meta-analysis. The results of the simulations show that the improved gamma-based estimator of ICC is superior for small sample sizes, and the Breslow-Day-based estimator is the best for n⩾100. The Mantel-Haenszel-based estimator of OR is very biased and is not recommended. The inverse-variance approach is also somewhat biased for ORs≠1, but this bias is not very large in practical settings. Developed methods and R programs, provided in the Web Appendix, make the beta-binomial model a feasible alternative to the standard REM for meta-analysis of ORs. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
Constructing Binomial Trees Via Random Maps for Analysis of Financial Assets
Directory of Open Access Journals (Sweden)
Antonio Airton Carneiro de Freitas
2010-04-01
Full Text Available Random maps can be constructed from a priori knowledge of the financial assets. It is also addressed the reverse problem, i.e. from a function of an empirical stationary probability density function we set up a random map that naturally leads to an implied binomial tree, allowing the adjustment of models, including the ability to incorporate jumps. An applica- tion related to the options market is presented. It is emphasized that the quality of the model to incorporate a priori knowledge of the financial asset may be affected, for example, by the skewed vision of the analyst.
Directory of Open Access Journals (Sweden)
M. Subbiah
2017-01-01
Full Text Available Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment; especially on the issues of competing intervals from frequentist, Bayesian, and Bootstrap approaches. The package written in the free R environment and presented in this paper tries to take care of the issues just highlighted, by pooling a number of widely available and well-performing methods and apporting on them essential variations. A wide range of functions helps users with differing skills to estimate, evaluate, summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
“Micro” Phraseology in Action: A Look at Fixed Binomials
Directory of Open Access Journals (Sweden)
Dušan Gabrovšek
2011-05-01
Full Text Available Multiword items in English are a motley crew, as they are not only numerous but also structurally, semantically, and functionally diverse. The paper offers a fresh look at fixed binomials, an intriguing and unexpectedly heterogeneous phraseological type prototypically consisting of two lexical components with the coordinating conjunction and – less commonly but, or, (neither/ (nor – acting as the connecting element, as e.g. in body and soul, slowly but surely, sooner or later, neither fish nor fowl. In particular, their idiomaticity and lexicographical significance are highlighted, while the cross-linguistic perspective is only outlined.
Use of negative binomial distribution to describe the presence of Anisakis in Thyrsites atun.
Peña-Rehbein, Patricio; De los Ríos-Escalante, Patricio
2012-01-01
Nematodes of the genus Anisakis have marine fishes as intermediate hosts. One of these hosts is Thyrsites atun, an important fishery resource in Chile between 38 and 41° S. This paper describes the frequency and number of Anisakis nematodes in the internal organs of Thyrsites atun. An analysis based on spatial distribution models showed that the parasites tend to be clustered. The variation in the number of parasites per host could be described by the negative binomial distribution. The maximum observed number of parasites was nine parasites per host. The environmental and zoonotic aspects of the study are also discussed.
Binomial tree method for pricing a regime-switching volatility stock loans
Putri, Endah R. M.; Zamani, Muhammad S.; Utomo, Daryono B.
2018-03-01
Binomial model with regime switching may represents the price of stock loan which follows the stochastic process. Stock loan is one of alternative that appeal investors to get the liquidity without selling the stock. The stock loan mechanism resembles that of American call option when someone can exercise any time during the contract period. From the resembles both of mechanism, determination price of stock loan can be interpreted from the model of American call option. The simulation result shows the behavior of the price of stock loan under a regime-switching with respect to various interest rate and maturity.
Directory of Open Access Journals (Sweden)
F.M.O. Borges
2003-12-01
que significou pouca influência da metodologia sobre essa medida. A FDN não mostrou ser melhor preditor de EM do que a FB.One experiment was run with broiler chickens, to obtain prediction equations for metabolizable energy (ME based on feedstuffs chemical analyses, and determined ME of wheat grain and its by-products, using four different methodologies. Seven wheat grain by-products were used in five treatments: wheat grain, wheat germ, white wheat flour, dark wheat flour, wheat bran for human use, wheat bran for animal use and rough wheat bran. Based on chemical analyses of crude fiber (CF, ether extract (EE, crude protein (CP, ash (AS and starch (ST of the feeds and the determined values of apparent energy (MEA, true energy (MEV, apparent corrected energy (MEAn and true energy corrected by nitrogen balance (MEVn in five treatments, prediction equations were obtained using the stepwise procedure. CF showed the best relationship with metabolizable energy values, however, this variable alone was not enough for a good estimate of the energy values (R² below 0.80. When EE and CP were included in the equations, R² increased to 0.90 or higher in most estimates. When the equations were calculated with all treatments, the equation for MEA were less precise and R² decreased. When ME data of the traditional or force-feeding methods were used separately, the precision of the equations increases (R² higher than 0.85. For MEV and MEVn values, the best multiple linear equations included CF, EE and CP (R²>0.90, independently of using all experimental data or separating by methodology. The estimates of MEVn values showed high precision and the linear coefficients (a of the equations were similar for all treatments or methodologies. Therefore, it explains the small influence of the different methodologies on this parameter. NDF was not a better predictor of ME than CF.
A Simulation Investigation of Principal Component Regression.
Allen, David E.
Regression analysis is one of the more common analytic tools used by researchers. However, multicollinearity between the predictor variables can cause problems in using the results of regression analyses. Problems associated with multicollinearity include entanglement of relative influences of variables due to reduced precision of estimation,…
Chang, Yu-Wei; Tsong, Yi; Zhao, Zhigen
2017-01-01
Assessing equivalence or similarity has drawn much attention recently as many drug products have lost or will lose their patents in the next few years, especially certain best-selling biologics. To claim equivalence between the test treatment and the reference treatment when assay sensitivity is well established from historical data, one has to demonstrate both superiority of the test treatment over placebo and equivalence between the test treatment and the reference treatment. Thus, there is urgency for practitioners to derive a practical way to calculate sample size for a three-arm equivalence trial. The primary endpoints of a clinical trial may not always be continuous, but may be discrete. In this paper, the authors derive power function and discuss sample size requirement for a three-arm equivalence trial with Poisson and negative binomial clinical endpoints. In addition, the authors examine the effect of the dispersion parameter on the power and the sample size by varying its coefficient from small to large. In extensive numerical studies, the authors demonstrate that required sample size heavily depends on the dispersion parameter. Therefore, misusing a Poisson model for negative binomial data may easily lose power up to 20%, depending on the value of the dispersion parameter.
Estimation of component failure probability from masked binomial system testing data
International Nuclear Information System (INIS)
Tan Zhibin
2005-01-01
The component failure probability estimates from analysis of binomial system testing data are very useful because they reflect the operational failure probability of components in the field which is similar to the test environment. In practice, this type of analysis is often confounded by the problem of data masking: the status of tested components is unknown. Methods in considering this type of uncertainty are usually computationally intensive and not practical to solve the problem for complex systems. In this paper, we consider masked binomial system testing data and develop a probabilistic model to efficiently estimate component failure probabilities. In the model, all system tests are classified into test categories based on component coverage. Component coverage of test categories is modeled by a bipartite graph. Test category failure probabilities conditional on the status of covered components are defined. An EM algorithm to estimate component failure probabilities is developed based on a simple but powerful concept: equivalent failures and tests. By simulation we not only demonstrate the convergence and accuracy of the algorithm but also show that the probabilistic model is capable of analyzing systems in series, parallel and any other user defined structures. A case study illustrates an application in test case prioritization
Marginal likelihood estimation of negative binomial parameters with applications to RNA-seq data.
León-Novelo, Luis; Fuentes, Claudio; Emerson, Sarah
2017-10-01
RNA-Seq data characteristically exhibits large variances, which need to be appropriately accounted for in any proposed model. We first explore the effects of this variability on the maximum likelihood estimator (MLE) of the dispersion parameter of the negative binomial distribution, and propose instead to use an estimator obtained via maximization of the marginal likelihood in a conjugate Bayesian framework. We show, via simulation studies, that the marginal MLE can better control this variation and produce a more stable and reliable estimator. We then formulate a conjugate Bayesian hierarchical model, and use this new estimator to propose a Bayesian hypothesis test to detect differentially expressed genes in RNA-Seq data. We use numerical studies to show that our much simpler approach is competitive with other negative binomial based procedures, and we use a real data set to illustrate the implementation and flexibility of the procedure. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bayesian analysis of overdispersed chromosome aberration data with the negative binomial model
International Nuclear Information System (INIS)
Brame, R.S.; Groer, P.G.
2002-01-01
The usual assumption of a Poisson model for the number of chromosome aberrations in controlled calibration experiments implies variance equal to the mean. However, it is known that chromosome aberration data from experiments involving high linear energy transfer radiations can be overdispersed, i.e. the variance is greater than the mean. Present methods for dealing with overdispersed chromosome data rely on frequentist statistical techniques. In this paper, the problem of overdispersion is considered from a Bayesian standpoint. The Bayes Factor is used to compare Poisson and negative binomial models for two previously published calibration data sets describing the induction of dicentric chromosome aberrations by high doses of neutrons. Posterior densities for the model parameters, which characterise dose response and overdispersion are calculated and graphed. Calibrative densities are derived for unknown neutron doses from hypothetical radiation accident data to determine the impact of different model assumptions on dose estimates. The main conclusion is that an initial assumption of a negative binomial model is the conservative approach to chromosome dosimetry for high LET radiations. (author)
International Nuclear Information System (INIS)
Shultis, J.K.; Buranapan, W.; Eckhoff, N.D.
1981-12-01
Of considerable importance in the safety analysis of nuclear power plants are methods to estimate the probability of failure-on-demand, p, of a plant component that normally is inactive and that may fail when activated or stressed. Properties of five methods for estimating from failure-on-demand data the parameters of the beta prior distribution in a compound beta-binomial probability model are examined. Simulated failure data generated from a known beta-binomial marginal distribution are used to estimate values of the beta parameters by (1) matching moments of the prior distribution to those of the data, (2) the maximum likelihood method based on the prior distribution, (3) a weighted marginal matching moments method, (4) an unweighted marginal matching moments method, and (5) the maximum likelihood method based on the marginal distribution. For small sample sizes (N = or < 10) with data typical of low failure probability components, it was found that the simple prior matching moments method is often superior (e.g. smallest bias and mean squared error) while for larger sample sizes the marginal maximum likelihood estimators appear to be best
International Nuclear Information System (INIS)
Zhang Yu; Wang Guangyi; Lu Xinmiao; Hu Yongcai; Xu Jiangtao
2016-01-01
The random telegraph signal noise in the pixel source follower MOSFET is the principle component of the noise in the CMOS image sensor under low light. In this paper, the physical and statistical model of the random telegraph signal noise in the pixel source follower based on the binomial distribution is set up. The number of electrons captured or released by the oxide traps in the unit time is described as the random variables which obey the binomial distribution. As a result, the output states and the corresponding probabilities of the first and the second samples of the correlated double sampling circuit are acquired. The standard deviation of the output states after the correlated double sampling circuit can be obtained accordingly. In the simulation section, one hundred thousand samples of the source follower MOSFET have been simulated, and the simulation results show that the proposed model has the similar statistical characteristics with the existing models under the effect of the channel length and the density of the oxide trap. Moreover, the noise histogram of the proposed model has been evaluated at different environmental temperatures. (paper)
Ma, Zhuanglin; Zhang, Honglu; Chien, Steven I-Jy; Wang, Jin; Dong, Chunjiao
2017-01-01
To investigate the relationship between crash frequency and potential influence factors, the accident data for events occurring on a 50km long expressway in China, including 567 crash records (2006-2008), were collected and analyzed. Both the fixed-length and the homogeneous longitudinal grade methods were applied to divide the study expressway section into segments. A negative binomial (NB) model and a random effect negative binomial (RENB) model were developed to predict crash frequency. The parameters of both models were determined using the maximum likelihood (ML) method, and the mixed stepwise procedure was applied to examine the significance of explanatory variables. Three explanatory variables, including longitudinal grade, road width, and ratio of longitudinal grade and curve radius (RGR), were found as significantly affecting crash frequency. The marginal effects of significant explanatory variables to the crash frequency were analyzed. The model performance was determined by the relative prediction error and the cumulative standardized residual. The results show that the RENB model outperforms the NB model. It was also found that the model performance with the fixed-length segment method is superior to that with the homogeneous longitudinal grade segment method. Copyright © 2016. Published by Elsevier Ltd.
Directory of Open Access Journals (Sweden)
David Shilane
2013-01-01
Full Text Available The negative binomial distribution becomes highly skewed under extreme dispersion. Even at moderately large sample sizes, the sample mean exhibits a heavy right tail. The standard normal approximation often does not provide adequate inferences about the data's expected value in this setting. In previous work, we have examined alternative methods of generating confidence intervals for the expected value. These methods were based upon Gamma and Chi Square approximations or tail probability bounds such as Bernstein's inequality. We now propose growth estimators of the negative binomial mean. Under high dispersion, zero values are likely to be overrepresented in the data. A growth estimator constructs a normal-style confidence interval by effectively removing a small, predetermined number of zeros from the data. We propose growth estimators based upon multiplicative adjustments of the sample mean and direct removal of zeros from the sample. These methods do not require estimating the nuisance dispersion parameter. We will demonstrate that the growth estimators' confidence intervals provide improved coverage over a wide range of parameter values and asymptotically converge to the sample mean. Interestingly, the proposed methods succeed despite adding both bias and variance to the normal approximation.
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Multilingual speaker age recognition: regression analyses on the Lwazi corpus
CSIR Research Space (South Africa)
Feld, M
2009-12-01
Full Text Available Multilinguality represents an area of significant opportunities for automatic speech-processing systems: whereas multilingual societies are commonplace, the majority of speechprocessing systems are developed with a single language in mind. As a step...
Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia
2017-08-31
As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high throughput sequencing techniques, such as, MeRIP-Seq, transcriptome-wide RNA methylation profile is now available in the form of count-based data, with which it is often of interests to study the dynamics at epitranscriptomic layer. However, the sample size of RNA methylation experiment is usually very small due to its costs; and additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as DRME model based on a statistical test covering the IP samples only with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in the way, the input control samples are also properly taken care of. In addition, different from DRME approach, which relies only the input control sample only for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. And the QNB model is also applicable to other datasets related RNA modifications, including but not limited to RNA bisulfite sequencing, m 1 A-Seq, Par-CLIP, RIP-Seq, etc.
The effect of the negative binomial distribution on the line-width of the micromaser cavity field
International Nuclear Information System (INIS)
Kremid, A. M.
2009-01-01
The influence of negative binomial distribution (NBD) on the line-width of the negative binomial distribution (NBD) on the line-width of the micromaser is considered. The threshold of the micromaser is shifted towards higher values of the pumping parameter q. Moreover the line-width exhibits sharp dips 'resonances' when the cavity temperature reduces to a very low value. These dips are very clear evidence for the occurrence of the so-called trapping states regime in the micromaser. This statistics prevents the appearance of these trapping states, namely by increasing the negative binomial parameter q these dips wash out and the line-width becomes more broadening. For small values of the parameter q the line-width at large values of q randomly oscillates around its transition line. As q becomes large this oscillatory behavior occurs at rarely values of q. (author)
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Multicollinearity and Regression Analysis
Daoud, Jamal I.
2017-12-01
In regression analysis it is obvious to have a correlation between the response and predictor(s), but having correlation among predictors is something undesired. The number of predictors included in the regression model depends on many factors among which, historical data, experience, etc. At the end selection of most important predictors is something objective due to the researcher. Multicollinearity is a phenomena when two or more predictors are correlated, if this happens, the standard error of the coefficients will increase [8]. Increased standard errors means that the coefficients for some or all independent variables may be found to be significantly different from In other words, by overinflating the standard errors, multicollinearity makes some variables statistically insignificant when they should be significant. In this paper we focus on the multicollinearity, reasons and consequences on the reliability of the regression model.
A distribución binomial como ferramenta na resolución de problemas de xenética
Ron Pedreira, Antonio Miguel de; Martínez, Ana María
1999-01-01
La distribución binomial presenta una amplia gama de campos de aplicación debido a que en situaciones cotidianas, se presenta con elevada frecuencia algún tipo de situación basada en dos hechos diferentes, alternativos, excluyentes y con probablilidades que suman 1 (cien por cien), es decir, el hecho cierto. En cuanto a la genética, pueden encontrarse supuestos que permitan la aplicación de la distribución binomial en ámbitos de la genética molecular, mendeliana, cuantitativa y genética de po...
Analysis of dental caries using generalized linear and count regression models
Directory of Open Access Journals (Sweden)
Javali M. Phil
2013-11-01
Full Text Available Generalized linear models (GLM are generalization of linear regression models, which allow fitting regression models to response data in all the sciences especially medical and dental sciences that follow a general exponential family. These are flexible and widely used class of such models that can accommodate response variables. Count data are frequently characterized by overdispersion and excess zeros. Zero-inflated count models provide a parsimonious yet powerful way to model this type of situation. Such models assume that the data are a mixture of two separate data generation processes: one generates only zeros, and the other is either a Poisson or a negative binomial data-generating process. Zero inflated count regression models such as the zero-inflated Poisson (ZIP, zero-inflated negative binomial (ZINB regression models have been used to handle dental caries count data with many zeros. We present an evaluation framework to the suitability of applying the GLM, Poisson, NB, ZIP and ZINB to dental caries data set where the count data may exhibit evidence of many zeros and over-dispersion. Estimation of the model parameters using the method of maximum likelihood is provided. Based on the Vuong test statistic and the goodness of fit measure for dental caries data, the NB and ZINB regression models perform better than other count regression models.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...... functionals. The software presented here is implemented in the riskRegression package....
The multi-class binomial failure rate model for the treatment of common-cause failures
International Nuclear Information System (INIS)
Hauptmanns, U.
1995-01-01
The impact of common cause failures (CCF) on PSA results for NPPs is in sharp contrast with the limited quality which can be achieved in their assessment. This is due to the dearth of observations and cannot be remedied in the short run. Therefore the methods employed for calculating failure rates should be devised such as to make the best use of the few available observations on CCF. The Multi-Class Binomial Failure Rate (MCBFR) Model achieves this by assigning observed failures to different classes according to their technical characteristics and applying the BFR formalism to each of these. The results are hence determined by a superposition of BFR type expressions for each class, each of them with its own coupling factor. The model thus obtained flexibly reproduces the dependence of CCF rates on failure multiplicity insinuated by the observed failure multiplicities. This is demonstrated by evaluating CCFs observed for combined impulse pilot valves in German NPPs. (orig.) [de
Xie, Wen-Jie; Han, Rui-Qi; Jiang, Zhi-Qiang; Wei, Lijian; Zhou, Wei-Xing
2017-08-01
Complex network is not only a powerful tool for the analysis of complex system, but also a promising way to analyze time series. The algorithm of horizontal visibility graph (HVG) maps time series into graphs, whose degree distributions are numerically and analytically investigated for certain time series. We derive the degree distributions of HVGs through an iterative construction process of HVGs. The degree distributions of the HVG and the directed HVG for random series are derived to be exponential, which confirms the analytical results from other methods. We also obtained the analytical expressions of degree distributions of HVGs and in-degree and out-degree distributions of directed HVGs transformed from multifractal binomial measures, which agree excellently with numerical simulations.
Temporary disaster debris management site identification using binomial cluster analysis and GIS.
Grzeda, Stanislaw; Mazzuchi, Thomas A; Sarkani, Shahram
2014-04-01
An essential component of disaster planning and preparation is the identification and selection of temporary disaster debris management sites (DMS). However, since DMS identification is a complex process involving numerous variable constraints, many regional, county and municipal jurisdictions initiate this process during the post-disaster response and recovery phases, typically a period of severely stressed resources. Hence, a pre-disaster approach in identifying the most likely sites based on the number of locational constraints would significantly contribute to disaster debris management planning. As disasters vary in their nature, location and extent, an effective approach must facilitate scalability, flexibility and adaptability to variable local requirements, while also being generalisable to other regions and geographical extents. This study demonstrates the use of binomial cluster analysis in potential DMS identification in a case study conducted in Hamilton County, Indiana. © 2014 The Author(s). Disasters © Overseas Development Institute, 2014.
A comparison of LMC and SDL complexity measures on binomial distributions
Piqueira, José Roberto C.
2016-02-01
The concept of complexity has been widely discussed in the last forty years, with a lot of thinking contributions coming from all areas of the human knowledge, including Philosophy, Linguistics, History, Biology, Physics, Chemistry and many others, with mathematicians trying to give a rigorous view of it. In this sense, thermodynamics meets information theory and, by using the entropy definition, López-Ruiz, Mancini and Calbet proposed a definition for complexity that is referred as LMC measure. Shiner, Davison and Landsberg, by slightly changing the LMC definition, proposed the SDL measure and the both, LMC and SDL, are satisfactory to measure complexity for a lot of problems. Here, SDL and LMC measures are applied to the case of a binomial probability distribution, trying to clarify how the length of the data set implies complexity and how the success probability of the repeated trials determines how complex the whole set is.
Directory of Open Access Journals (Sweden)
Xiong Wang
2013-01-01
Full Text Available Based on characteristics of the nonlife joint-stock insurance company, this paper presents a compound binomial risk model that randomizes the premium income on unit time and sets the threshold for paying dividends to shareholders. In this model, the insurance company obtains the insurance policy in unit time with probability and pays dividends to shareholders with probability when the surplus is no less than . We then derive the recursive formulas of the expected discounted penalty function and the asymptotic estimate for it. And we will derive the recursive formulas and asymptotic estimates for the ruin probability and the distribution function of the deficit at ruin. The numerical examples have been shown to illustrate the accuracy of the asymptotic estimations.
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an
Seber, George A F
2012-01-01
Concise, mathematically clear, and comprehensive treatment of the subject.* Expanded coverage of diagnostics and methods of model fitting.* Requires no specialized knowledge beyond a good grasp of matrix algebra and some acquaintance with straight-line regression and simple analysis of variance models.* More than 200 problems throughout the book plus outline solutions for the exercises.* This revision has been extensively class-tested.
Ritz, Christian; Parmigiani, Giovanni
2009-01-01
R is a rapidly evolving lingua franca of graphical display and statistical analysis of experiments from the applied sciences. This book provides a coherent treatment of nonlinear regression with R by means of examples from a diversity of applied sciences such as biology, chemistry, engineering, medicine and toxicology.
Bayesian ARTMAP for regression.
Sasu, L M; Andonie, R
2013-10-01
Bayesian ARTMAP (BA) is a recently introduced neural architecture which uses a combination of Fuzzy ARTMAP competitive learning and Bayesian learning. Training is generally performed online, in a single-epoch. During training, BA creates input data clusters as Gaussian categories, and also infers the conditional probabilities between input patterns and categories, and between categories and classes. During prediction, BA uses Bayesian posterior probability estimation. So far, BA was used only for classification. The goal of this paper is to analyze the efficiency of BA for regression problems. Our contributions are: (i) we generalize the BA algorithm using the clustering functionality of both ART modules, and name it BA for Regression (BAR); (ii) we prove that BAR is a universal approximator with the best approximation property. In other words, BAR approximates arbitrarily well any continuous function (universal approximation) and, for every given continuous function, there is one in the set of BAR approximators situated at minimum distance (best approximation); (iii) we experimentally compare the online trained BAR with several neural models, on the following standard regression benchmarks: CPU Computer Hardware, Boston Housing, Wisconsin Breast Cancer, and Communities and Crime. Our results show that BAR is an appropriate tool for regression tasks, both for theoretical and practical reasons. Copyright © 2013 Elsevier Ltd. All rights reserved.
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...... with the proposed explicit noise-model extension....
and Multinomial Logistic Regression
African Journals Online (AJOL)
This work presented the results of an experimental comparison of two models: Multinomial Logistic Regression (MLR) and Artificial Neural Network (ANN) for classifying students based on their academic performance. The predictive accuracy for each model was measured by their average Classification Correct Rate (CCR).
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Pradhan, Vivek; Saha, Krishna K; Banerjee, Tathagata; Evans, John C
2014-07-30
Inference on the difference between two binomial proportions in the paired binomial setting is often an important problem in many biomedical investigations. Tang et al. (2010, Statistics in Medicine) discussed six methods to construct confidence intervals (henceforth, we abbreviate it as CI) for the difference between two proportions in paired binomial setting using method of variance estimates recovery. In this article, we propose weighted profile likelihood-based CIs for the difference between proportions of a paired binomial distribution. However, instead of the usual likelihood, we use weighted likelihood that is essentially making adjustments to the cell frequencies of a 2 × 2 table in the spirit of Agresti and Min (2005, Statistics in Medicine). We then conduct numerical studies to compare the performances of the proposed CIs with that of Tang et al. and Agresti and Min in terms of coverage probabilities and expected lengths. Our numerical study clearly indicates that the weighted profile likelihood-based intervals and Jeffreys interval (cf. Tang et al.) are superior in terms of achieving the nominal level, and in terms of expected lengths, they are competitive. Finally, we illustrate the use of the proposed CIs with real-life examples. Copyright © 2014 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Jianwei Zhou
2014-01-01
Full Text Available The explicit formulae of spectral norms for circulant-type matrices are investigated; the matrices are circulant matrix, skew-circulant matrix, and g-circulant matrix, respectively. The entries are products of binomial coefficients with harmonic numbers. Explicit identities for these spectral norms are obtained. Employing these approaches, some numerical tests are listed to verify the results.
An algorithm for sequential tail value at risk for path-independent payoffs in a binomial tree
Roorda, Berend
2010-01-01
We present an algorithm that determines Sequential Tail Value at Risk (STVaR) for path-independent payoffs in a binomial tree. STVaR is a dynamic version of Tail-Value-at-Risk (TVaR) characterized by the property that risk levels at any moment must be in the range of risk levels later on. The
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
KELEŞ, Taliha; ALTUN, Murat
2016-01-01
Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR) and the comparison of classical linear regression (CLR) and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it wa...
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
Lara, Jesus R; Hoddle, Mark S
2015-08-01
Oligonychus perseae Tuttle, Baker, & Abatiello is a foliar pest of 'Hass' avocados [Persea americana Miller (Lauraceae)]. The recommended action threshold is 50-100 motile mites per leaf, but this count range and other ecological factors associated with O. perseae infestations limit the application of enumerative sampling plans in the field. Consequently, a comprehensive modeling approach was implemented to compare the practical application of various binomial sampling models for decision-making of O. perseae in California. An initial set of sequential binomial sampling models were developed using three mean-proportion modeling techniques (i.e., Taylor's power law, maximum likelihood, and an empirical model) in combination with two-leaf infestation tally thresholds of either one or two mites. Model performance was evaluated using a robust mite count database consisting of >20,000 Hass avocado leaves infested with varying densities of O. perseae and collected from multiple locations. Operating characteristic and average sample number results for sequential binomial models were used as the basis to develop and validate a standardized fixed-size binomial sampling model with guidelines on sample tree and leaf selection within blocks of avocado trees. This final validated model requires a leaf sampling cost of 30 leaves and takes into account the spatial dynamics of O. perseae to make reliable mite density classifications for a 50-mite action threshold. Recommendations for implementing this fixed-size binomial sampling plan to assess densities of O. perseae in commercial California avocado orchards are discussed. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Steganalysis using logistic regression
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-art 686-dimensional SPAM feature set, in three image sets.
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface......-product we obtain fast access to the baseline hazards (compared to survival::basehaz()) and predictions of survival probabilities, their confidence intervals and confidence bands. Confidence intervals and confidence bands are based on point-wise asymptotic expansions of the corresponding statistical...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Energy Technology Data Exchange (ETDEWEB)
Schlattmann, Peter [University Hospital of Friedrich-Schiller University Jena, Department of Medical Statistics, Informatics and Documentation, Jena (Germany); Schuetz, Georg M. [Freie Universitaet Berlin, Charite, Medical School, Department of Radiology, Humboldt-Universitaet zu Berlin, Berlin (Germany); Dewey, Marc [Freie Universitaet Berlin, Charite, Medical School, Department of Radiology, Humboldt-Universitaet zu Berlin, Berlin (Germany); Charite, Institut fuer Radiologie, Berlin (Germany)
2011-09-15
To evaluate the impact of coronary artery disease (CAD) prevalence on the predictive values of coronary CT angiography. We performed a meta-regression based on a generalised linear mixed model using the binomial distribution and a logit link to analyse the influence of the prevalence of CAD in published studies on the per-patient negative and positive predictive values of CT in comparison to conventional coronary angiography as the reference standard. A prevalence range in which the negative predictive value was higher than 90%, while at the same time the positive predictive value was higher than 70% was considered appropriate. The summary negative and positive predictive values of coronary CT angiography were 93.7% (95% confidence interval [CI] 92.8-94.5%) and 87.5% (95% CI, 86.5-88.5%), respectively. With 95% confidence, negative and positive predictive values higher than 90% and 70% were available with CT for a CAD prevalence of 18-63%. CT systems with >16 detector rows met these requirements for the positive (P < 0.01) and negative (P < 0.05) predictive values in a significantly broader range than systems with {<=}16 detector rows. It is reasonable to perform coronary CT angiography as a rule-out test in patients with a low-to-intermediate likelihood of disease. (orig.)
International Nuclear Information System (INIS)
Schlattmann, Peter; Schuetz, Georg M.; Dewey, Marc
2011-01-01
To evaluate the impact of coronary artery disease (CAD) prevalence on the predictive values of coronary CT angiography. We performed a meta-regression based on a generalised linear mixed model using the binomial distribution and a logit link to analyse the influence of the prevalence of CAD in published studies on the per-patient negative and positive predictive values of CT in comparison to conventional coronary angiography as the reference standard. A prevalence range in which the negative predictive value was higher than 90%, while at the same time the positive predictive value was higher than 70% was considered appropriate. The summary negative and positive predictive values of coronary CT angiography were 93.7% (95% confidence interval [CI] 92.8-94.5%) and 87.5% (95% CI, 86.5-88.5%), respectively. With 95% confidence, negative and positive predictive values higher than 90% and 70% were available with CT for a CAD prevalence of 18-63%. CT systems with >16 detector rows met these requirements for the positive (P < 0.01) and negative (P < 0.05) predictive values in a significantly broader range than systems with ≤16 detector rows. It is reasonable to perform coronary CT angiography as a rule-out test in patients with a low-to-intermediate likelihood of disease. (orig.)
Directory of Open Access Journals (Sweden)
Branka Remenarić
2018-01-01
Full Text Available In July 2014, the International Accounting Standards Board (IASB published International Financial Reporting Standard 9 Financial Instruments (IFRS 9. This standard introduces an expected credit loss (ECL impairment model that applies to financial instruments, including trade and lease receivables. IFRS 9 applies to annual periods beginning on or after 1 January 2018 in the European Union member states. While the main reason for amending the current model was to require major banks to recognize losses in advance of a credit event occurring, this new model also applies to all receivables, including trade receivables, lease receivables, related party loan receivables in non-financial sector entities. The new impairment model is intended to result in earlier recognition of credit losses. The previous model described in International Accounting Standard 39 Financial instruments (IAS 39 was based on incurred losses. One of the major questions now is what models to use to predict expected credit losses in non-financial sector entities. The purpose of this paper is to research the application of the current impairment model, the extent to which the current impairment model can be modified to satisfy new impairment model requirements and the applicability of the binomial model for measuring expected credit losses from accounts receivable.
Directory of Open Access Journals (Sweden)
Andrei ACHIMAŞ CADARIU
2004-08-01
Full Text Available Assessments of a controlled clinical trial suppose to interpret some key parameters as the controlled event rate, experimental event date, relative risk, absolute risk reduction, relative risk reduction, number needed to treat when the effect of the treatment are dichotomous variables. Defined as the difference in the event rate between treatment and control groups, the absolute risk reduction is the parameter that allowed computing the number needed to treat. The absolute risk reduction is compute when the experimental treatment reduces the risk for an undesirable outcome/event. In medical literature when the absolute risk reduction is report with its confidence intervals, the method used is the asymptotic one, even if it is well know that may be inadequate. The aim of this paper is to introduce and assess nine methods of computing confidence intervals for absolute risk reduction and absolute risk reduction – like function.Computer implementations of the methods use the PHP language. Methods comparison uses the experimental errors, the standard deviations, and the deviation relative to the imposed significance level for specified sample sizes. Six methods of computing confidence intervals for absolute risk reduction and absolute risk reduction-like functions were assessed using random binomial variables and random sample sizes.The experiments shows that the ADAC, and ADAC1 methods obtains the best overall performance of computing confidence intervals for absolute risk reduction.
Forecasting asthma-related hospital admissions in London using negative binomial models.
Soyiri, Ireneous N; Reidpath, Daniel D; Sarran, Christophe
2013-05-01
Health forecasting can improve health service provision and individual patient outcomes. Environmental factors are known to impact chronic respiratory conditions such as asthma, but little is known about the extent to which these factors can be used for forecasting. Using weather, air quality and hospital asthma admissions, in London (2005-2006), two related negative binomial models were developed and compared with a naive seasonal model. In the first approach, predictive forecasting models were fitted with 7-day averages of each potential predictor, and then a subsequent multivariable model is constructed. In the second strategy, an exhaustive search of the best fitting models between possible combinations of lags (0-14 days) of all the environmental effects on asthma admission was conducted. Three models were considered: a base model (seasonal effects), contrasted with a 7-day average model and a selected lags model (weather and air quality effects). Season is the best predictor of asthma admissions. The 7-day average and seasonal models were trivial to implement. The selected lags model was computationally intensive, but of no real value over much more easily implemented models. Seasonal factors can predict daily hospital asthma admissions in London, and there is a little evidence that additional weather and air quality information would add to forecast accuracy.
The binomial work-health in the transit of Curitiba city.
Tokars, Eunice; Moro, Antonio Renato Pereira; Cruz, Roberto Moraes
2012-01-01
The working activity in traffic of the big cities complex interacts with the environment is often in unsafe and unhealthy imbalance favoring the binomial work - health. The aim of this paper was to analyze the relationship between work and health of taxi drivers in Curitiba, Brazil. This cross-sectional observational study with 206 individuals used a questionnaire on the organization's profile and perception of the environment and direct observation of work. It was found that the majority are male, aged between 26 and 49 years and has a high school degree. They are sedentary, like making a journey from 8 to 12 hours. They consider a stressful profession, related low back pain and are concerned about safety and accidents. 40% are smokers and consume alcoholic drink and 65% do not have or do not use devices of comfort. Risk factors present in the daily taxi constraints cause physical, cognitive and organizational and can affect your performance. It is concluded that the taxi drivers must change the unhealthy lifestyle, requiring a more efficient management of government authorities for this work is healthy and safe for all involved.
Generazio, Edward R.
2011-01-01
The capability of an inspection system is established by applications of various methodologies to determine the probability of detection (POD). One accepted metric of an adequate inspection system is that for a minimum flaw size and all greater flaw sizes, there is 0.90 probability of detection with 95% confidence (90/95 POD). Directed design of experiments for probability of detection (DOEPOD) has been developed to provide an efficient and accurate methodology that yields estimates of POD and confidence bounds for both Hit-Miss or signal amplitude testing, where signal amplitudes are reduced to Hit-Miss by using a signal threshold Directed DOEPOD uses a nonparametric approach for the analysis or inspection data that does require any assumptions about the particular functional form of a POD function. The DOEPOD procedure identifies, for a given sample set whether or not the minimum requirement of 0.90 probability of detection with 95% confidence is demonstrated for a minimum flaw size and for all greater flaw sizes (90/95 POD). The DOEPOD procedures are sequentially executed in order to minimize the number of samples needed to demonstrate that there is a 90/95 POD lower confidence bound at a given flaw size and that the POD is monotonic for flaw sizes exceeding that 90/95 POD flaw size. The conservativeness of the DOEPOD methodology results is discussed. Validated guidelines for binomial estimation of POD for fracture critical inspection are established.
Design of Ultra-Wideband Tapered Slot Antenna by Using Binomial Transformer with Corrugation
Chareonsiri, Yosita; Thaiwirot, Wanwisa; Akkaraekthalin, Prayoot
2017-05-01
In this paper, the tapered slot antenna (TSA) with corrugation is proposed for UWB applications. The multi-section binomial transformer is used to design taper profile of the proposed TSA that does not involve using time consuming optimization. A step-by-step procedure for synthesis of the step impedance values related with step slot widths of taper profile is presented. The smooth taper can be achieved by fitting the smoothing curve to the entire step slot. The design of TSA based on this method yields results with a quite flat gain and wide impedance bandwidth covering UWB spectrum from 3.1 GHz to 10.6 GHz. To further improve the radiation characteristics, the corrugation is added on the both edges of the proposed TSA. The effects of different corrugation shapes on the improvement of antenna gain and front-to-back ratio (F-to-B ratio) are investigated. To demonstrate the validity of the design, the prototypes of TSA without and with corrugation are fabricated and measured. The results show good agreement between simulation and measurement.
Hilpert, Markus; Rasmuson, Anna; Johnson, William
2017-04-01
Transport of colloids in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their down-gradient translation relative to colloids in bulk fluid. Near surface fluid domain colloids may re-enter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via strong primary minimum interactions, or they may move along a grain-to-grain contact to the near surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization onto grain surfaces. Colloid movement is described by a sequence of trials in a series of unit cells, and the binomial distribution is used to calculate the probabilities of each possible sequence. Pore-scale simulations provide mechanistically-determined likelihoods and timescales associated with the above pore-scale colloid mass transfer processes, whereas the network-scale model employs pore and grain topology to determine probabilities of transfer from up-gradient bulk and near-surface fluid domains to down-gradient bulk and near-surface fluid domains. Inter-grain transport of colloids in the near surface fluid domain can cause extended tailing.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy....... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes.......This paper examines the relationship between foreign aid and growth in real GDP per capita as it emerges from simple augmentations of popular cross country growth specifications. It is shown that aid in all likelihood increases the growth rate, and this result is not conditional on ‘good’ policy...
Naznin, Farhana; Currie, Graham; Logan, David; Sarvi, Majid
2016-07-01
Safety is a key concern in the design, operation and development of light rail systems including trams or streetcars as they impose crash risks on road users in terms of crash frequency and severity. The aim of this study is to identify key traffic, transit and route factors that influence tram-involved crash frequencies along tram route sections in Melbourne. A random effects negative binomial (RENB) regression model was developed to analyze crash frequency data obtained from Yarra Trams, the tram operator in Melbourne. The RENB modelling approach can account for spatial and temporal variations within observation groups in panel count data structures by assuming that group specific effects are randomly distributed across locations. The results identify many significant factors effecting tram-involved crash frequency including tram service frequency (2.71), tram stop spacing (-0.42), tram route section length (0.31), tram signal priority (-0.25), general traffic volume (0.18), tram lane priority (-0.15) and ratio of platform tram stops (-0.09). Findings provide useful insights on route section level tram-involved crashes in an urban tram or streetcar operating environment. The method described represents a useful planning tool for transit agencies hoping to improve safety performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Measurement Error in Education and Growth Regressions
Portela, M.; Teulings, C.N.; Alessie, R.
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...
Predicting company growth using logistic regression and neural networks
Directory of Open Access Journals (Sweden)
Marijana Zekić-Sušac
2016-12-01
Full Text Available The paper aims to establish an efficient model for predicting company growth by leveraging the strengths of logistic regression and neural networks. A real dataset of Croatian companies was used which described the relevant industry sector, financial ratios, income, and assets in the input space, with a dependent binomial variable indicating whether a company had high-growth if it had annualized growth in assets by more than 20% a year over a three-year period. Due to a large number of input variables, factor analysis was performed in the pre -processing stage in order to extract the most important input components. Building an efficient model with a high classification rate and explanatory ability required application of two data mining methods: logistic regression as a parametric and neural networks as a non -parametric method. The methods were tested on the models with and without variable reduction. The classification accuracy of the models was compared using statistical tests and ROC curves. The results showed that neural networks produce a significantly higher classification accuracy in the model when incorporating all available variables. The paper further discusses the advantages and disadvantages of both approaches, i.e. logistic regression and neural networks in modelling company growth. The suggested model is potentially of benefit to investors and economic policy makers as it provides support for recognizing companies with growth potential, especially during times of economic downturn.
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Directory of Open Access Journals (Sweden)
Giselle Garcia-Sepulveda
2017-06-01
Full Text Available This paper describes the frequency and number of Trifur tortuosus in the skin of Merluccius gayi, an important economic resource in Chile. Analysis of a spatial distribution model indicated that the parasites tended to cluster. Variations in the number of parasites per host can be described by a negative binomial distribution. The maximum number of parasites observed per host was one, similar patterns was described for other parasites in Chilean marine fishes.
Criterios sobre el uso de la distribución normal para aproximar la distribución binomial
Ortiz Pinilla, Jorge; Castro, Amparo; Neira, Tito; Torres, Pedro; Castañeda, Javier
2012-01-01
Las dos reglas empíricas más conocidas para aceptar la aproximación normal de la distribución binomial carecen de regularidad en el control del margen de error cometido al utilizar la aproximación normal. Se propone un criterio y algunas fórmulas para controlarlo cerca de algunos valores escogidos para el error máximo.
Hilpert, Markus; Rasmuson, Anna; Johnson, William P.
2017-07-01
Colloid transport in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their downgradient translation relative to colloids in bulk fluid. Near-surface fluid domain colloids may reenter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via primary minimum interactions, or they may move along a grain-to-grain contact to the near-surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization. Colloid movement is described by a Markov chain, i.e., a sequence of trials in a 1-D network of unit cells, which contain a pore and a grain. Using combinatorial analysis, which utilizes the binomial coefficient, we derive the residence time distribution, i.e., an inventory of the discrete colloid travel times through the network and of their probabilities to occur. To parameterize the network model, we performed mechanistic pore-scale simulations in a single unit cell that determined the likelihoods and timescales associated with the above colloid mass transfer processes. We found that intergrain transport of colloids in the near-surface fluid domain can cause extended tailing, which has traditionally been attributed to hydrodynamic dispersion emanating from flow tortuosity of solute trajectories.
Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry
2017-05-01
The United States's Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to have comparable Type I and Type II error rates as the current fixed-sample binomial test. Policymakers might consider efficient alternatives such as SPRT to current procedure. Copyright © 2017 Elsevier Ltd. All rights reserved.
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.
International Nuclear Information System (INIS)
Ghosh, D.; Deb, A.; Haldar, P.K.; Sahoo, S.R.; Maity, D.
2004-01-01
This work studies the validity of the negative binomial distribution in the multiplicity distribution of charged secondaries in 16 O and 32 S interactions with AgBr at 60 GeV/c per nucleon and 200 GeV/c per nucleon, respectively. The validity of negative binomial distribution (NBD) is studied in different azimuthal phase spaces. It is observed that the data can be well parameterized in terms of the NBD law for different azimuthal phase spaces. (authors)
DEFF Research Database (Denmark)
Tan, Qihua; Bathum, L; Christiansen, L
2003-01-01
In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...... the age-dependent or antagonistic pleiotropic effects. The models are applied to HFE genotype data to assess the effects on human longevity by different alleles and to detect if an age-dependent effect exists. Application has shown that these methods can serve as useful tools in searching for important...
Directory of Open Access Journals (Sweden)
Germán Lobos
2008-12-01
Full Text Available The main objective of this research was to identify the determining factors of the use of public instruments to manage risk in the Chilean wine industry. A binomial logistic regression model was proposed. Based on a survey of 104 viticulture and winemaking companies, a database was constructed between January and October 2007. The model was fitted using maximum likelihood estimation. The variables that turned out to be statistically significant were: risk of the wine price, availability of external consultancy and number of permanent workers. From the Public Management point of view, the main conclusion suggests that the use of public instruments could be increased if viticulturists and winemakers had more external counseling.
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Regression in autistic spectrum disorders.
Stefanatos, Gerry A
2008-12-01
A significant proportion of children diagnosed with Autistic Spectrum Disorder experience a developmental regression characterized by a loss of previously-acquired skills. This may involve a loss of speech or social responsitivity, but often entails both. This paper critically reviews the phenomena of regression in autistic spectrum disorders, highlighting the characteristics of regression, age of onset, temporal course, and long-term outcome. Important considerations for diagnosis are discussed and multiple etiological factors currently hypothesized to underlie the phenomenon are reviewed. It is argued that regressive autistic spectrum disorders can be conceptualized on a spectrum with other regressive disorders that may share common pathophysiological features. The implications of this viewpoint are discussed.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
Advanced statistics: linear regression, part I: simple linear regression.
Marill, Keith A
2004-01-01
Simple linear regression is a mathematical technique used to model the relationship between a single independent predictor variable and a single dependent outcome variable. In this, the first of a two-part series exploring concepts in linear regression analysis, the four fundamental assumptions and the mechanics of simple linear regression are reviewed. The most common technique used to derive the regression line, the method of least squares, is described. The reader will be acquainted with other important concepts in simple linear regression, including: variable transformations, dummy variables, relationship to inference testing, and leverage. Simplified clinical examples with small datasets and graphic models are used to illustrate the points. This will provide a foundation for the second article in this series: a discussion of multiple linear regression, in which there are multiple predictor variables.
Modelling infant mortality rate in Central Java, Indonesia use generalized poisson regression method
Prahutama, Alan; Sudarno
2018-05-01
The infant mortality rate is the number of deaths under one year of age occurring among the live births in a given geographical area during a given year, per 1,000 live births occurring among the population of the given geographical area during the same year. This problem needs to be addressed because it is an important element of a country’s economic development. High infant mortality rate will disrupt the stability of a country as it relates to the sustainability of the population in the country. One of regression model that can be used to analyze the relationship between dependent variable Y in the form of discrete data and independent variable X is Poisson regression model. Recently The regression modeling used for data with dependent variable is discrete, among others, poisson regression, negative binomial regression and generalized poisson regression. In this research, generalized poisson regression modeling gives better AIC value than poisson regression. The most significant variable is the Number of health facilities (X1), while the variable that gives the most influence to infant mortality rate is the average breastfeeding (X9).
Directory of Open Access Journals (Sweden)
Gisela Lundberg
2008-08-01
Full Text Available Amplification of the oncogene MYCN in double minutes (DMs is a common finding in neuroblastoma (NB. Because DMs lack centromeric sequences it has been unclear how NB cells retain and amplify extrachromosomal MYCN copies during tumour development.We show that MYCN-carrying DMs in NB cells translocate from the nuclear interior to the periphery of the condensing chromatin at transition from interphase to prophase and are preferentially located adjacent to the telomere repeat sequences of the chromosomes throughout cell division. However, DM segregation was not affected by disruption of the telosome nucleoprotein complex and DMs readily migrated from human to murine chromatin in human/mouse cell hybrids, indicating that they do not bind to specific positional elements in human chromosomes. Scoring DM copy-numbers in ana/telophase cells revealed that DM segregation could be closely approximated by a binomial random distribution. Colony-forming assay demonstrated a strong growth-advantage for NB cells with high DM (MYCN copy-numbers, compared to NB cells with lower copy-numbers. In fact, the overall distribution of DMs in growing NB cell populations could be readily reproduced by a mathematical model assuming binomial segregation at cell division combined with a proliferative advantage for cells with high DM copy-numbers.Binomial segregation at cell division explains the high degree of MYCN copy-number variability in NB. Our findings also provide a proof-of-principle for oncogene amplification through creation of genetic diversity by random events followed by Darwinian selection.
Directory of Open Access Journals (Sweden)
Paweł Mielcarz
2007-06-01
Full Text Available The article presents a case study of valuation of real options included in a investment project. The main goal of the article is to present the calculation and methodological issues of application the methodology for real option valuation. In order to do it there are used the binomial model and Market Asset Declaimer methodology. The project presented in the article concerns the introduction of radio station to a new market. It includes two valuable real options: to abandon the project and to expand.
International Nuclear Information System (INIS)
Ghosh, D.; Mukhopadhyay, A.; Ghosh, A.; Roy, J.
1989-01-01
This letter presents new data on the multiplicity distribution of charged secondaries in 24 Mg interactions with AgBr at 4.5 GeV/c per nucleon. The validity of the negative binomial distribution (NBD) is studied. It is observed that the data can be well parametrized in terms of the NBD law for the whole phase space and also for different pseudo-rapidity bins. A comparison of different parameters with those in the case of h-h interactions reveals some interesting results, the implications of which are discussed. (orig.)
Directory of Open Access Journals (Sweden)
Robert Kurniawan
2017-11-01
Full Text Available AbstractThe incidence rates of DHF in Jakarta in 2010 to 2014 are always higher than that of the national rates. Therefore, this study aims to find the effect of weather parameter on DHF cases. Weather is chosen because it can be observed daily and can be predicted so that it can be used as earlydetection in estimating the number of DHF cases. Data use includes DHF cases which is collected daily and weather data including lowest and highest temperatures and rainfall. Data analysis used is zero-truncated negative binomial analysis at 10% significance level. Based on the periodic dataof selected variables from January 1st 2015 until May 31st 2015, the study revealed that weather factors consisting of highest temperature, lowest temperature, and rainfall rate were significant enough to predict the number of DHF patients in DKI Jakarta. The three variables had positiveeffects in influencing the number of DHF patients in the same period. However, the weather factors cannot be controlled by humans, so that appropriate preventions are required whenever weather’s predictions indicate the increasing number of DHF cases in DKI Jakarta.Keywords: Dengue Hemorrhagic Fever, zero truncated negative binomial, early warning.AbstrakAngka kesakitan DBD pada tahun 2010 hingga 2014 selalu lebih tinggi dibandingkan dengan angka kesakitan DBD nasional. Oleh karena itu, penelitian ini bertujuan untuk mencari pengaruh faktor cuaca terhadap kasus DBD. Faktor cuaca dipilih karena dapat diamati setiap harinya dan dapat diprediksi sehingga dapat dijadikan deteksi dini dalam perkiraan jumlah penderita DBD. Data yang digunakan dalam penelitian ini adalah data jumlah penderita DBD di DKI Jakarta per hari dan data cuaca yang meliputi suhu terendah, suhu tertinggi dan curah hujan. Untuk mengetahui pengaruh faktor cuaca tersebut terhadap jumlah penderita DBD di DKI Jakarta digunakan metode analisis zero-truncated negative binomial. Berdasarkan data periode 1 Januari 2015 hingga
Non-binomial distribution of palladium Kα Ln X-ray satellites emitted after excitation by 16O ions
International Nuclear Information System (INIS)
Rymuza, P.; Sujkowski, Z.; Carlen, M.; Dousse, J.C.; Gasser, M.; Kern, J.; Perny, B.; Rheme, C.
1988-02-01
The palladium K α L n X-ray satellites spectrum emitted after excitation by 5.4 MeV/u 16 O ions has been measured. The distribution of the satellites yield is found to be significantly narrower than the binomial one. The deviations can be accounted for by assuming that the L-shell ionization is due to two uncorrelated processes: the direct ionization by impact and the electron capture to the K-shell of the projectile. 12 refs., 1 fig., 1 tab. (author)
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power......An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Logistic regression applied to natural hazards: rare event logistic regression with replications
Directory of Open Access Journals (Sweden)
M. Guns
2012-06-01
Full Text Available Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Logistic regression applied to natural hazards: rare event logistic regression with replications
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
Panel Smooth Transition Regression Models
DEFF Research Database (Denmark)
González, Andrés; Terasvirta, Timo; Dijk, Dick van
We introduce the panel smooth transition regression model. This new model is intended for characterizing heterogeneous panels, allowing the regression coefficients to vary both across individuals and over time. Specifically, heterogeneity is allowed for by assuming that these coefficients are bou...
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Testing discontinuities in nonparametric regression
Dai, Wenlin; Zhou, Yuejin; Tong, Tiejun
2017-01-01
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Jakaitiene, Audrone; Avino, Mariano; Guarracino, Mario Rosario
2017-04-01
Against diminishing costs, next-generation sequencing (NGS) still remains expensive for studies with a large number of individuals. As cost saving, sequencing genome of pools containing multiple samples might be used. Currently, there are many software available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and data analyzed, indicating that all software have space for improvement. We use beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with ability being specific up to two patients affected by neuromuscular disorders (NMD). We assessed the results comparing with The Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying beta-binomial model is accurate in predicting rare mutations at 0.01 fraction. Finally, we explored the concordance of mutations between the model and software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
International Nuclear Information System (INIS)
Leng Ling; Zhang Tianyi; Kleinman, Lawrence; Zhu Wei
2007-01-01
Regression analysis, especially the ordinary least squares method which assumes that errors are confined to the dependent variable, has seen a fair share of its applications in aerosol science. The ordinary least squares approach, however, could be problematic due to the fact that atmospheric data often does not lend itself to calling one variable independent and the other dependent. Errors often exist for both measurements. In this work, we examine two regression approaches available to accommodate this situation. They are orthogonal regression and geometric mean regression. Comparisons are made theoretically as well as numerically through an aerosol study examining whether the ratio of organic aerosol to CO would change with age
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated funds examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Regression to Causality : Regression-style presentation influences causal attribution
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
of equivalent results presented as either regression models or as a test of two sample means. Our experiment shows that the subjects who were presented with results as estimates from a regression model were more inclined to interpret these results causally. Our experiment implies that scholars using regression...... models – one of the primary vehicles for analyzing statistical results in political science – encourage causal interpretation. Specifically, we demonstrate that presenting observational results in a regression model, rather than as a simple comparison of means, makes causal interpretation of the results...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
Augmenting Data with Published Results in Bayesian Linear Regression
de Leeuw, Christiaan; Klugkist, Irene
2012-01-01
In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…
Predicting Word Reading Ability: A Quantile Regression Study
McIlraith, Autumn L.
2018-01-01
Predictors of early word reading are well established. However, it is unclear if these predictors hold for readers across a range of word reading abilities. This study used quantile regression to investigate predictive relationships at different points in the distribution of word reading. Quantile regression analyses used preschool and…
Advanced statistics: linear regression, part II: multiple linear regression.
Marill, Keith A
2004-01-01
The applications of simple linear regression in medical research are limited, because in most situations, there are multiple relevant predictor variables. Univariate statistical techniques such as simple linear regression use a single predictor variable, and they often may be mathematically correct but clinically misleading. Multiple linear regression is a mathematical technique used to model the relationship between multiple independent predictor variables and a single dependent outcome variable. It is used in medical research to model observational data, as well as in diagnostic and therapeutic studies in which the outcome is dependent on more than one factor. Although the technique generally is limited to data that can be expressed with a linear function, it benefits from a well-developed mathematical framework that yields unique solutions and exact confidence intervals for regression coefficients. Building on Part I of this series, this article acquaints the reader with some of the important concepts in multiple regression analysis. These include multicollinearity, interaction effects, and an expansion of the discussion of inference testing, leverage, and variable transformations to multivariate models. Examples from the first article in this series are expanded on using a primarily graphic, rather than mathematical, approach. The importance of the relationships among the predictor variables and the dependence of the multivariate model coefficients on the choice of these variables are stressed. Finally, concepts in regression model building are discussed.
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression sytem. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Quantile Regression With Measurement Error
Wei, Ying; Carroll, Raymond J.
2009-01-01
. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a
Examples of mixed-effects modeling with crossed random effects and with binomial data
Quené, H.; van den Bergh, H.
2008-01-01
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
Shirazi, Mohammadali; Dhavala, Soma Sekhar; Lord, Dominique; Geedipally, Srinivas Reddy
2017-10-01
Safety analysts usually use post-modeling methods, such as the Goodness-of-Fit statistics or the Likelihood Ratio Test, to decide between two or more competitive distributions or models. Such metrics require all competitive distributions to be fitted to the data before any comparisons can be accomplished. Given the continuous growth in introducing new statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, in addition to all theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuitions into why a specific distribution (or model) is preferred over another (Goodness-of-Logic). This paper ponders into these issues by proposing a methodology to design heuristics for Model Selection based on the characteristics of data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools: (1) Monte-Carlo Simulations and (2) Machine Learning Classifiers, to design easy heuristics to predict the label of the 'most-likely-true' distribution for analyzing data. The proposed methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two distributions, given a set of prescribed summary statistics of data. The proposed heuristics were successfully compared against classical tests for several real or observed datasets. Not only they are easy to use and do not need any post-modeling inputs, but also, using these heuristics, the analyst can attain useful information about why the NB-L is preferred over the NB - or vice versa- when modeling data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Patricio Peña-Rehbein
2012-03-01
Full Text Available Nematodes of the genus Anisakis have marine fishes as intermediate hosts. One of these hosts is Thyrsites atun, an important fishery resource in Chile between 38 and 41° S. This paper describes the frequency and number of Anisakis nematodes in the internal organs of Thyrsites atun. An analysis based on spatial distribution models showed that the parasites tend to be clustered. The variation in the number of parasites per host could be described by the negative binomial distribution. The maximum observed number of parasites was nine parasites per host. The environmental and zoonotic aspects of the study are also discussed.Nematóides do gênero Anisakis têm nos peixes marinhos seus hospedeiros intermediários. Um desses hospedeiros é Thyrsites atun, um importante recurso pesqueiro no Chile entre 38 e 41° S. Este artigo descreve a freqüência e o número de nematóides Anisakis nos órgãos internos de Thyrsites atun. Uma análise baseada em modelos de distribuição espacial demonstrou que os parasitos tendem a ficar agrupados. A variação numérica de parasitas por hospedeiro pôde ser descrita por distribuição binomial negativa. O número máximo observado de parasitas por hospedeiro foi nove. Os aspectos ambientais e zoonóticos desse estudo também serão discutidos.
Genomic breeding value estimation using nonparametric additive regression models
Directory of Open Access Journals (Sweden)
Solberg Trygve
2009-01-01
Full Text Available Abstract Genomic selection refers to the use of genomewide dense markers for breeding value estimation and subsequently for selection. The main challenge of genomic breeding value estimation is the estimation of many effects from a limited number of observations. Bayesian methods have been proposed to successfully cope with these challenges. As an alternative class of models, non- and semiparametric models were recently introduced. The present study investigated the ability of nonparametric additive regression models to predict genomic breeding values. The genotypes were modelled for each marker or pair of flanking markers (i.e. the predictors separately. The nonparametric functions for the predictors were estimated simultaneously using additive model theory, applying a binomial kernel. The optimal degree of smoothing was determined by bootstrapping. A mutation-drift-balance simulation was carried out. The breeding values of the last generation (genotyped was predicted using data from the next last generation (genotyped and phenotyped. The results show moderate to high accuracies of the predicted breeding values. A determination of predictor specific degree of smoothing increased the accuracy.
What do you do when the binomial cannot value real options? The LSM model
Directory of Open Access Journals (Sweden)
S. Alonso
2014-12-01
Full Text Available The Least-Squares Monte Carlo model (LSM model has emerged as the derivative valuation technique with the greatest impact in current practice. As with other options valuation models, the LSM algorithm was initially posited in the field of financial derivatives and its extension to the realm of real options requires considering certain questions which might hinder understanding of the algorithm and which the present paper seeks to address. The implementation of the LSM model combines Monte Carlo simulation, dynamic programming and statistical regression in a flexible procedure suitable for application to valuing nearly all types of corporate investments. The goal of this paper is to show how the LSM algorithm is applied in the context of a corporate investment, thus contributing to the understanding of the principles of its operation.
Silva, Ivair R
2018-01-15
Type I error probability spending functions are commonly used for designing sequential analysis of binomial data in clinical trials, but it is also quickly emerging for near-continuous sequential analysis of post-market drug and vaccine safety surveillance. It is well known that, for clinical trials, when the null hypothesis is not rejected, it is still important to minimize the sample size. Unlike in post-market drug and vaccine safety surveillance, that is not important. In post-market safety surveillance, specially when the surveillance involves identification of potential signals, the meaningful statistical performance measure to be minimized is the expected sample size when the null hypothesis is rejected. The present paper shows that, instead of the convex Type I error spending shape conventionally used in clinical trials, a concave shape is more indicated for post-market drug and vaccine safety surveillance. This is shown for both, continuous and group sequential analysis. Copyright © 2017 John Wiley & Sons, Ltd.
Tang, Yongqiang
2017-05-25
We derive the sample size formulae for comparing two negative binomial rates based on both the relative and absolute rate difference metrics in noninferiority and equivalence trials with unequal follow-up times, and establish an approximate relationship between the sample sizes required for the treatment comparison based on the two treatment effect metrics. The proposed method allows the dispersion parameter to vary by treatment groups. The accuracy of these methods is assessed by simulations. It is demonstrated that ignoring the between-subject variation in the follow-up time by setting the follow-up time for all individuals to be the mean follow-up time may greatly underestimate the required size, resulting in underpowered studies. Methods are provided for back-calculating the dispersion parameter based on the published summary results.
Directory of Open Access Journals (Sweden)
Federica Giardina
Full Text Available The Research Center for Human Development in Dakar (CRDH with the technical assistance of ICF Macro and the National Malaria Control Programme (NMCP conducted in 2008/2009 the Senegal Malaria Indicator Survey (SMIS, the first nationally representative household survey collecting parasitological data and malaria-related indicators. In this paper, we present spatially explicit parasitaemia risk estimates and number of infected children below 5 years. Geostatistical Zero-Inflated Binomial models (ZIB were developed to take into account the large number of zero-prevalence survey locations (70% in the data. Bayesian variable selection methods were incorporated within a geostatistical framework in order to choose the best set of environmental and climatic covariates associated with the parasitaemia risk. Model validation confirmed that the ZIB model had a better predictive ability than the standard Binomial analogue. Markov chain Monte Carlo (MCMC methods were used for inference. Several insecticide treated nets (ITN coverage indicators were calculated to assess the effectiveness of interventions. After adjusting for climatic and socio-economic factors, the presence of at least one ITN per every two household members and living in urban areas reduced the odds of parasitaemia by 86% and 81% respectively. Posterior estimates of the ORs related to the wealth index show a decreasing trend with the quintiles. Infection odds appear to be increasing with age. The population-adjusted prevalence ranges from 0.12% in Thillé-Boubacar to 13.1% in Dabo. Tambacounda has the highest population-adjusted predicted prevalence (8.08% whereas the region with the highest estimated number of infected children under the age of 5 years is Kolda (13940. The contemporary map and estimates of malaria burden identify the priority areas for future control interventions and provide baseline information for monitoring and evaluation. Zero-Inflated formulations are more appropriate
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...... becomes clear from the analytical chapters is the emergence of the new cross-ideological alliance of The New Regressive Left. This emerging coalition between Shia Muslims, religious minorities, parts of the Arab Left, secular cultural producers, and the remnants of the political,strategic resistance...
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available The stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In present study, we presented the Matlab program of stepwise regression.
Correlation and simple linear regression.
Zou, Kelly H; Tuncali, Kemal; Silverman, Stuart G
2003-06-01
In this tutorial article, the concepts of correlation and regression are reviewed and demonstrated. The authors review and compare two correlation coefficients, the Pearson correlation coefficient and the Spearman rho, for measuring linear and nonlinear relationships between two continuous variables. In the case of measuring the linear relationship between a predictor and an outcome variable, simple linear regression analysis is conducted. These statistical concepts are illustrated by using a data set from published literature to assess a computed tomography-guided interventional technique. These statistical methods are important for exploring the relationships between variables and can be applied to many radiologic studies.
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighed sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure in a computer programme which was used to unfold test spectra obtained by mixing some spectra, from a library of arbitrary chosen spectra, and adding a noise component. (U.K.)
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Kernel regression with functional response
Ferraty, Frédéric; Laksaci, Ali; Tadj, Amel; Vieu, Philippe
2011-01-01
We consider kernel regression estimate when both the response variable and the explanatory one are functional. The rates of uniform almost complete convergence are stated as function of the small ball probability of the predictor and as function of the entropy of the set on which uniformity is obtained.
DEFF Research Database (Denmark)
Vilsen, Søren B.; Tvedebrink, Torben; Mogensen, Helle Smidt
2015-01-01
We present a model fitting the distribution of non-systematic errors in STR second generation sequencing, SGS, analysis. The model fits the distribution of non-systematic errors, i.e. the noise, using a one-inflated, zero-truncated, negative binomial model. The model is a two component model...
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary, and worse may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on, the parametric model, absence of extreme observations, homoscedasticity, and independency of the errors, remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects focusing on the normality assumption is often unnecessary, does not guarantee valid results, and worse may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Multivariate and semiparametric kernel regression
Härdle, Wolfgang; Müller, Marlene
1997-01-01
The paper gives an introduction to theory and application of multivariate and semiparametric kernel smoothing. Multivariate nonparametric density estimation is an often used pilot tool for examining the structure of data. Regression smoothing helps in investigating the association between covariates and responses. We concentrate on kernel smoothing using local polynomial fitting which includes the Nadaraya-Watson estimator. Some theory on the asymptotic behavior and bandwidth selection is pro...
Regression algorithm for emotion detection
Berthelon , Franck; Sander , Peter
2013-01-01
International audience; We present here two components of a computational system for emotion detection. PEMs (Personalized Emotion Maps) store links between bodily expressions and emotion values, and are individually calibrated to capture each person's emotion profile. They are an implementation based on aspects of Scherer's theoretical complex system model of emotion~\\cite{scherer00, scherer09}. We also present a regression algorithm that determines a person's emotional feeling from sensor m...
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Energy Technology Data Exchange (ETDEWEB)
Hernandez I, S.; Ortiz C, E.; Chavez M, C., E-mail: lunitza@gmail.com [UNAM, Facultad de Ingenieria, Circuito Interior, Ciudad Universitaria, 04510 Mexico D. F. (Mexico)
2011-11-15
At the present time, is an unquestionable fact that the nuclear electrical energy is a topic of vital importance, no more because eliminates the dependence of the hydrocarbons and is friendly with the environment, but because is also a sure and reliable energy source, and represents a viable alternative before the claims in the growing demand of electricity in Mexico. Before this panorama, was intended several scenarios to elevate the capacity of electric generation of nuclear origin with a variable participation. One of the contemplated scenarios is represented by the expansion project of the nuclear power plant Laguna Verde through the addition of a third reactor that serves as detonator of an integral program that proposes the installation of more nuclear reactors in the country. Before this possible scenario, the Federal Commission of Electricity like responsible organism of supplying energy to the population should have tools that offer it the flexibility to be adapted to the possible changes that will be presented along the project and also gives a value to the risk to future. The methodology denominated Real Options, Binomial model was proposed as an evaluation tool that allows to quantify the value of the expansion proposal, demonstrating the feasibility of the project through a periodic visualization of their evolution, all with the objective of supplying a financial analysis that serves as base and justification before the evident apogee of the nuclear energy that will be presented in future years. (Author)
Polylinear regression analysis in radiochemistry
International Nuclear Information System (INIS)
Kopyrin, A.A.; Terent'eva, T.N.; Khramov, N.N.
1995-01-01
A number of radiochemical problems have been formulated in the framework of polylinear regression analysis, which permits the use of conventional mathematical methods for their solution. The authors have considered features of the use of polylinear regression analysis for estimating the contributions of various sources to the atmospheric pollution, for studying irradiated nuclear fuel, for estimating concentrations from spectral data, for measuring neutron fields of a nuclear reactor, for estimating crystal lattice parameters from X-ray diffraction patterns, for interpreting data of X-ray fluorescence analysis, for estimating complex formation constants, and for analyzing results of radiometric measurements. The problem of estimating the target parameters can be incorrect at certain properties of the system under study. The authors showed the possibility of regularization by adding a fictitious set of data open-quotes obtainedclose quotes from the orthogonal design. To estimate only a part of the parameters under consideration, the authors used incomplete rank models. In this case, it is necessary to take into account the possibility of confounding estimates. An algorithm for evaluating the degree of confounding is presented which is realized using standard software or regression analysis
Directory of Open Access Journals (Sweden)
Goovaerts Pierre
2011-12-01
Full Text Available Abstract Background Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods Time series (1981-2007 of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated
Goovaerts, Pierre; Xiao, Hong
2011-12-05
Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a
Rusli, Rusdi; Haque, Md Mazharul; King, Mark; Voon, Wong Shaw
2017-05-01
Mountainous highways generally associate with complex driving environment because of constrained road geometries, limited cross-section elements, inappropriate roadside features, and adverse weather conditions. As a result, single-vehicle (SV) crashes are overrepresented along mountainous roads, particularly in developing countries, but little attention is known about the roadway geometric, traffic and weather factors contributing to these SV crashes. As such, the main objective of the present study is to investigate SV crashes using detailed data obtained from a rigorous site survey and existing databases. The final dataset included a total of 56 variables representing road geometries including horizontal and vertical alignment, traffic characteristics, real-time weather condition, cross-sectional elements, roadside features, and spatial characteristics. To account for structured heterogeneities resulting from multiple observations within a site and other unobserved heterogeneities, the study applied a random parameters negative binomial model. Results suggest that rainfall during the crash is positively associated with SV crashes, but real-time visibility is negatively associated. The presence of a road shoulder, particularly a bitumen shoulder or wider shoulders, along mountainous highways is associated with less SV crashes. While speeding along downgrade slopes increases the likelihood of SV crashes, proper delineation decreases the likelihood. Findings of this study have significant implications for designing safer highways in mountainous areas, particularly in the context of a developing country. Copyright © 2017 Elsevier Ltd. All rights reserved.
Chen, Johnny; Zheng, Gang; Price, William S
2017-02-01
A new 8-pulse Phase Modulated binomial-like selective inversion pulse sequence, dubbed '8PM', was developed by optimizing the nutation and phase angles of the constituent radio-frequency pulses so that the inversion profile resembled a target profile. Suppression profiles were obtained for both the 8PM and W5 based excitation sculpting sequences with equal inter-pulse delays. Significant distortions were observed in both profiles because of the offset effect of the radio frequency pulses. These distortions were successfully reduced by adjusting the inter-pulse delays. With adjusted inter-pulse delays, the 8PM and W5 based excitation sculpting sequences were tested on an aqueous lysozyme solution. The 8 PM based sequence provided higher suppression selectivity than the W5 based sequence. Two-dimensional nuclear Overhauser effect spectroscopy experiments were also performed on the lysozyme sample with 8PM and W5 based water signal suppression. The 8PM based suppression provided a spectrum with significantly increased (~ doubled) cross-peak intensity around the suppressed water resonance compared to the W5 based suppression. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Assefa, Enyew; Tadesse, Mekonnen
2017-08-01
The major causes for poor health in developing countries are inadequate access and under-use of modern health care services. The objective of this study was to identify and examine factors related to the use of antenatal care services using the 2011 Ethiopia Demographic and Health Survey data. The number of antenatal care visits during the last pregnancy by mothers aged 15 to 49 years (n = 7,737) was analyzed. More than 55% of the mothers did not use antenatal care (ANC) services, while more than 22% of the women used antenatal care services less than four times. More than half of the women (52%) who had access to health services had at least four antenatal care visits. The zero-inflated negative binomial model was found to be more appropriate for analyzing the data. Place of residence, age of mothers, woman's educational level, employment status, mass media exposure, religion, and access to health services were significantly associated with the use of antenatal care services. Accordingly, there should be progress toward a health-education program that enables more women to utilize ANC services, with the program targeting women in rural areas, uneducated women, and mothers with higher birth orders through appropriate media.
Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu
2013-01-04
Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
International Nuclear Information System (INIS)
Hernandez I, S.; Ortiz C, E.; Chavez M, C.
2011-11-01
At the present time, is an unquestionable fact that the nuclear electrical energy is a topic of vital importance, no more because eliminates the dependence of the hydrocarbons and is friendly with the environment, but because is also a sure and reliable energy source, and represents a viable alternative before the claims in the growing demand of electricity in Mexico. Before this panorama, was intended several scenarios to elevate the capacity of electric generation of nuclear origin with a variable participation. One of the contemplated scenarios is represented by the expansion project of the nuclear power plant Laguna Verde through the addition of a third reactor that serves as detonator of an integral program that proposes the installation of more nuclear reactors in the country. Before this possible scenario, the Federal Commission of Electricity like responsible organism of supplying energy to the population should have tools that offer it the flexibility to be adapted to the possible changes that will be presented along the project and also gives a value to the risk to future. The methodology denominated Real Options, Binomial model was proposed as an evaluation tool that allows to quantify the value of the expansion proposal, demonstrating the feasibility of the project through a periodic visualization of their evolution, all with the objective of supplying a financial analysis that serves as base and justification before the evident apogee of the nuclear energy that will be presented in future years. (Author)
Spontaneous regression of pulmonary bullae
International Nuclear Information System (INIS)
Satoh, H.; Ishikawa, H.; Ohtsuka, M.; Sekizawa, K.
2002-01-01
The natural history of pulmonary bullae is often characterized by gradual, progressive enlargement. Spontaneous regression of bullae is, however, very rare. We report a case in which complete resolution of pulmonary bullae in the left upper lung occurred spontaneously. The management of pulmonary bullae is occasionally made difficult because of gradual progressive enlargement associated with abnormal pulmonary function. Some patients have multiple bulla in both lungs and/or have a history of pulmonary emphysema. Others have a giant bulla without emphysematous change in the lungs. Our present case had treated lung cancer with no evidence of local recurrence. He had no emphysematous change in lung function test and had no complaints, although the high resolution CT scan shows evidence of underlying minimal changes of emphysema. Ortin and Gurney presented three cases of spontaneous reduction in size of bulla. Interestingly, one of them had a marked decrease in the size of a bulla in association with thickening of the wall of the bulla, which was observed in our patient. This case we describe is of interest, not only because of the rarity with which regression of pulmonary bulla has been reported in the literature, but also because of the spontaneous improvements in the radiological picture in the absence of overt infection or tumor. Copyright (2002) Blackwell Science Pty Ltd
Quantum algorithm for linear regression
Wang, Guoming
2017-07-01
We present a quantum algorithm for fitting a linear regression model to a given data set using the least-squares approach. Differently from previous algorithms which yield a quantum state encoding the optimal parameters, our algorithm outputs these numbers in the classical form. So by running it once, one completely determines the fitted model and then can use it to make predictions on new data at little cost. Moreover, our algorithm works in the standard oracle model, and can handle data sets with nonsparse design matrices. It runs in time poly( log2(N ) ,d ,κ ,1 /ɛ ) , where N is the size of the data set, d is the number of adjustable parameters, κ is the condition number of the design matrix, and ɛ is the desired precision in the output. We also show that the polynomial dependence on d and κ is necessary. Thus, our algorithm cannot be significantly improved. Furthermore, we also give a quantum algorithm that estimates the quality of the least-squares fit (without computing its parameters explicitly). This algorithm runs faster than the one for finding this fit, and can be used to check whether the given data set qualifies for linear regression in the first place.
Interpretation of commonly used statistical regression models.
Kasza, Jessica; Wolfe, Rory
2014-01-01
A review of some regression models commonly used in respiratory health applications is provided in this article. Simple linear regression, multiple linear regression, logistic regression and ordinal logistic regression are considered. The focus of this article is on the interpretation of the regression coefficients of each model, which are illustrated through the application of these models to a respiratory health research study. © 2013 The Authors. Respirology © 2013 Asian Pacific Society of Respirology.
Keith, Timothy Z
2014-01-01
Multiple Regression and Beyond offers a conceptually oriented introduction to multiple regression (MR) analysis and structural equation modeling (SEM), along with analyses that flow naturally from those methods. By focusing on the concepts and purposes of MR and related methods, rather than the derivation and calculation of formulae, this book introduces material to students more clearly, and in a less threatening way. In addition to illuminating content necessary for coursework, the accessibility of this approach means students are more likely to be able to conduct research using MR or SEM--and more likely to use the methods wisely. Covers both MR and SEM, while explaining their relevance to one another Also includes path analysis, confirmatory factor analysis, and latent growth modeling Figures and tables throughout provide examples and illustrate key concepts and techniques For additional resources, please visit: http://tzkeith.com/.
Directory of Open Access Journals (Sweden)
Gastón Silverio Milanesi
2013-01-01
Full Text Available El trabajo propone un modelo de valoración de opciones reales con base en el modelo binomial utilizando la transformación de Edgeworth (Rubinstein, 1998 para incorporar momentos estocásticos de orden superior, especialmente para ciertos tipos de organizaciones, como empresas de base tecnológica, donde no se dispone de cartera de activos financieros gemelos, comparables de mercado y procesos estocásticos no gaussianos. Primero, se presenta el desarrollo formal del modelo, luego su aplicación sobre la valuación de spin-off tecnológico universitario, sensibilizando asimetría-curtosis y exponiendo el impacto en el valor del proyecto. Finalmente, se concluye sobre limitaciones y ventajas de la propuesta de valoración que resume la simplicidad del modelo binomial e incorporando momentos de orden superior en subyacentes con procesos no normales.
Directory of Open Access Journals (Sweden)
Gastón Silverio Milanesi
2013-07-01
Full Text Available El trabajo propone un modelo de valoración de opciones reales con base en el modelo binomial utilizando la transformación de Edgeworth (Rubinstein, 1998 para incorporar momentos estocásticos de orden supe- rior, especialmente para ciertos tipos de organizaciones, como empresas de base tecnológica, donde no se dispone de cartera de activos financieros gemelos, comparables de mercado y procesos estocásticos no gaussianos. Primero, se presenta el desarrollo formal del modelo, luego su aplicación sobre la valuación de spin-off tecnológico universitario, sensibilizando asimetría-curtosis y exponiendo el impacto en el valor del proyecto. Finalmente, se concluye sobre limitaciones y ventajas de la propuesta de valoración que resume la simplicidad del modelo binomial e incorporando momentos de orden superior en subyacentes con pro- cesos no normales.
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... shrinks the coefficient of each observation in the estimated functions; thus, it is widely used for minimizing influence of outliers. We propose to additionally add weights to the slack variables in the constraints (CF‐weights) and call the combination of weights the doubly weighted SVR. We illustrate...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Cohen, Joel E; Poulin, Robert; Lagrue, Clément
2017-01-03
The spatial distribution of individuals of any species is a basic concern of ecology. The spatial distribution of parasites matters to control and conservation of parasites that affect human and nonhuman populations. This paper develops a quantitative theory to predict the spatial distribution of parasites based on the distribution of parasites in hosts and the spatial distribution of hosts. Four models are tested against observations of metazoan hosts and their parasites in littoral zones of four lakes in Otago, New Zealand. These models differ in two dichotomous assumptions, constituting a 2 × 2 theoretical design. One assumption specifies whether the variance function of the number of parasites per host individual is described by Taylor's law (TL) or the negative binomial distribution (NBD). The other assumption specifies whether the numbers of parasite individuals within each host in a square meter of habitat are independent or perfectly correlated among host individuals. We find empirically that the variance-mean relationship of the numbers of parasites per square meter is very well described by TL but is not well described by NBD. Two models that posit perfect correlation of the parasite loads of hosts in a square meter of habitat approximate observations much better than two models that posit independence of parasite loads of hosts in a square meter, regardless of whether the variance-mean relationship of parasites per host individual obeys TL or NBD. We infer that high local interhost correlations in parasite load strongly influence the spatial distribution of parasites. Local hotspots could influence control and conservation of parasites.
Barriga-Rivera, Alejandro; Moya, María José; Lopez-Alonso, Manuel
2016-11-01
The evaluation of symptom association between gastroesophageal reflux and cardiorespiratory events in preterm infants remains unclear. This paper describes a conservative approach to decision-making of anti-reflux surgery through symptom association analysis. Forty-three neonates with potentially reflux-related cardiorespiratory symptoms underwent synchronized esophageal impedance-pH and cardiorespiratory monitoring. Three indices were considered to evaluate symptom association, the symptom index (SI), the symptom sensitivity index (SSI) and the symptom association probability (SAP). A conservative strategy was adopted regarding the decision of anti-reflux surgery, and therefore, patients were scheduled for laparoscopic Nissen fundoplication if the three indices showed a positive assessment of symptom association. Retrospectively, these indices and the binomial symptom index (BSI) were contrasted against the decision of anti-reflux surgery using different windows of association. Thirteen patients showed positive symptom association but only two underwent anti-reflux surgery. The SI and the SSI showed an increasing trend with the width of the window of association. The SAP was affected randomly by slightly altering the windowing parameters. The BSI showed the best performance with the two-minute window (κ =0.78) CONCLUSIONS: The pathology under study is known to improve with maturity. However, the severity of cardiorespiratory symptoms may threaten the neonate's life and therefore, in some occasions, invasive treatments must be considered to protect life. The BSI provides a good prediction of a combination of positive SI, SSI and SAP, which may improve clinical decisions. However, further clinical studies are required to prove the BSI as an optimal predictor of clinical outcomes. Copyright © 2015 Asociación Española de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.
Hilpert, Markus; Johnson, William P.
2018-01-01
We used a recently developed simple mathematical network model to upscale pore-scale colloid transport information determined under unfavorable attachment conditions. Classical log-linear and nonmonotonic retention profiles, both well-reported under favorable and unfavorable attachment conditions, respectively, emerged from our upscaling. The primary attribute of the network is colloid transfer between bulk pore fluid, the near-surface fluid domain (NSFD), and attachment (treated as irreversible). The network model accounts for colloid transfer to the NSFD of downgradient grains and for reentrainment to bulk pore fluid via diffusion or via expulsion at rear flow stagnation zones (RFSZs). The model describes colloid transport by a sequence of random trials in a one-dimensional (1-D) network of Happel cells, which contain a grain and a pore. Using combinatorial analysis that capitalizes on the binomial coefficient, we derived from the pore-scale information the theoretical residence time distribution of colloids in the network. The transition from log-linear to nonmonotonic retention profiles occurs when the conditions underlying classical filtration theory are not fulfilled, i.e., when an NSFD colloid population is maintained. Then, nonmonotonic retention profiles result potentially both for attached and NSFD colloids. The concentration maxima shift downgradient depending on specific parameter choice. The concentration maxima were also shown to shift downgradient temporally (with continued elution) under conditions where attachment is negligible, explaining experimentally observed downgradient transport of retained concentration maxima of adhesion-deficient bacteria. For the case of zero reentrainment, we develop closed-form, analytical expressions for the shape, and the maximum of the colloid retention profile.
Alekseeva, N P; Alekseev, A O; Vakhtin, Iu B; Kravtsov, V Iu; Kuzovatov, S N; Skorikova, T I
2008-01-01
Distributions of nuclear morphology anomalies in transplantable rabdomiosarcoma RA-23 cell populations were investigated under effect of ionizing radiation from 0 to 45 Gy. Internuclear bridges, nuclear protrusions and dumbbell-shaped nuclei were accepted for morphological anomalies. Empirical distributions of the number of anomalies per 100 nuclei were used. The adequate model of reentrant binomial distribution has been found. The sum of binomial random variables with binomial number of summands has such distribution. Averages of these random variables were named, accordingly, internal and external average reentrant components. Their maximum likelihood estimations were received. Statistical properties of these estimations were investigated by means of statistical modeling. It has been received that at equally significant correlation between the radiation dose and the average of nuclear anomalies in cell populations after two-three cellular cycles from the moment of irradiation in vivo the irradiation doze significantly correlates with internal average reentrant component, and in remote descendants of cell transplants irradiated in vitro - with external one.
Credit Scoring Problem Based on Regression Analysis
Khassawneh, Bashar Suhil Jad Allah
2014-01-01
ABSTRACT: This thesis provides an explanatory introduction to the regression models of data mining and contains basic definitions of key terms in the linear, multiple and logistic regression models. Meanwhile, the aim of this study is to illustrate fitting models for the credit scoring problem using simple linear, multiple linear and logistic regression models and also to analyze the found model functions by statistical tools. Keywords: Data mining, linear regression, logistic regression....
Variable selection and model choice in geoadditive regression models.
Kneib, Thomas; Hothorn, Torsten; Tutz, Gerhard
2009-06-01
Model choice and variable selection are issues of major concern in practical regression analyses, arising in many biometric applications such as habitat suitability analyses, where the aim is to identify the influence of potentially many environmental conditions on certain species. We describe regression models for breeding bird communities that facilitate both model choice and variable selection, by a boosting algorithm that works within a class of geoadditive regression models comprising spatial effects, nonparametric effects of continuous covariates, interaction surfaces, and varying coefficients. The major modeling components are penalized splines and their bivariate tensor product extensions. All smooth model terms are represented as the sum of a parametric component and a smooth component with one degree of freedom to obtain a fair comparison between the model terms. A generic representation of the geoadditive model allows us to devise a general boosting algorithm that automatically performs model choice and variable selection.
An Original Stepwise Multilevel Logistic Regression Analysis of Discriminatory Accuracy
DEFF Research Database (Denmark)
Merlo, Juan; Wagner, Philippe; Ghith, Nermin
2016-01-01
BACKGROUND AND AIM: Many multilevel logistic regression analyses of "neighbourhood and health" focus on interpreting measures of associations (e.g., odds ratio, OR). In contrast, multilevel analysis of variance is rarely considered. We propose an original stepwise analytical approach that disting...
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance
Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim
2012-01-01
Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…
Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models
Directory of Open Access Journals (Sweden)
Emel Çankaya
2017-06-01
Full Text Available In many ecological applications, the absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in high number of zero counts or abundance. Usual practice for modelling such data is regression modelling of log(abundance+1 and it is well know that resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated regression (ZIP and ZINB, Hurdle-Poisson (HP and Hurdle-Negative Binomial (HNB amongst others are widely preferred to the classical regression models. Due to the fact that mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is therefore to examine the performances of these four models in determination of the significant biotic and abiotic factors on the occurrences of Nematopsis legeri parasite harming the existence of Mediterranean mussels (Mytilus galloprovincialis L.. The data collected from the three coastal regions of Sinop city in Turkey showed more than 50% of parasite counts on the average are zero-valued and model comparisons were based on information criterion. The results showed that the probability of the occurrence of this parasite is here best formulated by ZINB or HNB models and influential factors of models were found to be correspondent with ecological differences of the regions.