Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
Application of Negative Binomial Regression for Assessing Public ...
African Journals Online (AJOL)
Because the variance was nearly two times greater than the mean, the negative binomial regression model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumed that the mean and variance are the same. The level of education and race were found
Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni
2017-12-01
Tetanus Neonatorum is an infectious disease that can be prevented by immunization. The number of Tetanus Neonatorum cases in East Java Province is the highest in Indonesia until 2015. Tetanus Neonatorum data contain over dispersion and big enough proportion of zero-inflation. Negative Binomial (NB) regression is an alternative method when over dispersion happens in Poisson regression. However, the data containing over dispersion and zero-inflation are more appropriately analyzed by using Zero-Inflated Negative Binomial (ZINB) regression. The purpose of this study are: (1) to model Tetanus Neonatorum cases in East Java Province with 71.05 percent proportion of zero-inflation by using NB and ZINB regression, (2) to obtain the best model. The result of this study indicates that ZINB is better than NB regression with smaller AIC.
Predicting Cumulative Incidence Probability by Direct Binomial Regression
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard......Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard...
Estimation of adjusted rate differences using additive negative binomial regression.
Donoghoe, Mark W; Marschner, Ian C
2016-08-15
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Dynamic prediction of cumulative incidence functions by direct binomial regression.
Grand, Mia K; de Witte, Theo J M; Putter, Hein
2018-03-25
In recent years there have been a series of advances in the field of dynamic prediction. Among those is the development of methods for dynamic prediction of the cumulative incidence function in a competing risk setting. These models enable the predictions to be updated as time progresses and more information becomes available, for example when a patient comes back for a follow-up visit after completing a year of treatment, the risk of death, and adverse events may have changed since treatment initiation. One approach to model the cumulative incidence function in competing risks is by direct binomial regression, where right censoring of the event times is handled by inverse probability of censoring weights. We extend the approach by combining it with landmarking to enable dynamic prediction of the cumulative incidence function. The proposed models are very flexible, as they allow the covariates to have complex time-varying effects, and we illustrate how to investigate possible time-varying structures using Wald tests. The models are fitted using generalized estimating equations. The method is applied to bone marrow transplant data and the performance is investigated in a simulation study. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of prevalence ratio ( PR ) by using bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea in their infants by using bayesian log-binomial regression model in Openbugs software. The results showed that caregivers' recognition of infant' s risk signs of diarrhea was associated significantly with a 13% increase of medical care-seeking. Meanwhile, we compared the differences in PR 's point estimation and its interval estimation of medical care-seeking prevalence to caregivers' recognition of risk signs of diarrhea and convergence of three models (model 1: not adjusting for the covariates; model 2: adjusting for duration of caregivers' education, model 3: adjusting for distance between village and township and child month-age based on model 2) between bayesian log-binomial regression model and conventional log-binomial regression model. The results showed that all three bayesian log-binomial regression models were convergence and the estimated PRs were 1.130(95 %CI : 1.005-1.265), 1.128(95 %CI : 1.001-1.264) and 1.132(95 %CI : 1.004-1.267), respectively. Conventional log-binomial regression model 1 and model 2 were convergence and their PRs were 1.130(95 % CI : 1.055-1.206) and 1.126(95 % CI : 1.051-1.203), respectively, but the model 3 was misconvergence, so COPY method was used to estimate PR , which was 1.125 (95 %CI : 1.051-1.200). In addition, the point estimation and interval estimation of PRs from three bayesian log-binomial regression models differed slightly from those of PRs from conventional log-binomial regression model, but they had a good consistency in estimating PR . Therefore, bayesian log-binomial regression model can effectively estimate PR with less misconvergence and have more advantages in application compared with conventional log-binomial regression model.
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To Eexplore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries used to fitted modified Poisson regression and negative binomial regression model. The risk factors incurring the increase of unintentional injury frequency for juvenile students was explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might be the results of higher injury frequencies. On a tendency of clustered frequency data on injury event, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R(2) to from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
Censored Hurdle Negative Binomial Regression (Case Study: Neonatorum Tetanus Case in Indonesia)
Yuli Rusdiana, Riza; Zain, Ismaini; Wulan Purnami, Santi
2017-06-01
Hurdle negative binomial model regression is a method that can be used for discreate dependent variable, excess zero and under- and overdispersion. It uses two parts approach. The first part estimates zero elements from dependent variable is zero hurdle model and the second part estimates not zero elements (non-negative integer) from dependent variable is called truncated negative binomial models. The discrete dependent variable in such cases is censored for some values. The type of censor that will be studied in this research is right censored. This study aims to obtain the parameter estimator hurdle negative binomial regression for right censored dependent variable. In the assessment of parameter estimation methods used Maximum Likelihood Estimator (MLE). Hurdle negative binomial model regression for right censored dependent variable is applied on the number of neonatorum tetanus cases in Indonesia. The type data is count data which contains zero values in some observations and other variety value. This study also aims to obtain the parameter estimator and test statistic censored hurdle negative binomial model. Based on the regression results, the factors that influence neonatorum tetanus case in Indonesia is the percentage of baby health care coverage and neonatal visits.
Marginalized zero-inflated negative binomial regression with application to dental caries.
Preisser, John S; Das, Kalyan; Long, D Leann; Divaris, Kimon
2016-05-10
The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared with marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. Copyright © 2015 John Wiley & Sons, Ltd.
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Zero inflated Poisson and negative binomial regression models: application in education.
Salehi, Masoud; Roudbari, Masoud
2015-01-01
The number of failed courses and semesters in students are indicators of their performance. These amounts have zero inflated (ZI) distributions. Using ZI Poisson and negative binomial distributions we can model these count data to find the associated factors and estimate the parameters. This study aims at to investigate the important factors related to the educational performance of students. This cross-sectional study performed in 2008-2009 at Iran University of Medical Sciences (IUMS) with a population of almost 6000 students, 670 students selected using stratified random sampling. The educational and demographical data were collected using the University records. The study design was approved at IUMS and the students' data kept confidential. The descriptive statistics and ZI Poisson and negative binomial regressions were used to analyze the data. The data were analyzed using STATA. In the number of failed semesters, Poisson and negative binomial distributions with ZI, students' total average and quota system had the most roles. For the number of failed courses, total average, and being in undergraduate or master levels had the most effect in both models. In all models the total average have the most effect on the number of failed courses or semesters. The next important factor is quota system in failed semester and undergraduate and master levels in failed courses. Therefore, average has an important inverse effect on the numbers of failed courses and semester.
Lu, Hai-Xia; Wong, May Chun Mei; Lo, Edward Chin Man; McGrath, Colman
2013-08-19
Limited information on oral health status for young adults aged 18 year-olds is known, and no available data exists in Hong Kong. The aims of this study were to investigate the oral health status and its risk indicators among young adults in Hong Kong using negative binomial regression. A survey was conducted in a representative sample of Hong Kong young adults aged 18 years. Clinical examinations were taken to assess oral health status using DMFT index and Community Periodontal Index (CPI) according to WHO criteria. Negative binomial regressions for DMFT score and the number of sextants with healthy gums were performed to identify the risk indicators of oral health status. A total of 324 young adults were examined. Prevalence of dental caries experience among the subjects was 59% and the overall mean DMFT score was 1.4. Most subjects (95%) had a score of 2 as their highest CPI score. Negative binomial regression analyses revealed that subjects who had a dental visit within 3 years had significantly higher DMFT scores (IRR = 1.68, p < 0.001). Subjects who brushed their teeth more frequently (IRR = 1.93, p < 0.001) and those with better dental knowledge (IRR = 1.09, p = 0.002) had significantly more sextants with healthy gums. Dental caries experience of the young adults aged 18 years in Hong Kong was not high but their periodontal condition was unsatisfactory. Their oral health status was related to their dental visit behavior, oral hygiene habit, and oral health knowledge.
Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo
2017-09-01
Generalized Linear Models (GLM) with negative binomial distribution for errors, have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR is a system that deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in the observations on road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to take note of the overdispersion of the data: the first examines the constant overdispersion for all the traffic zones and the second includes the variable for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza/Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated by using the frequency of injury crashes as a dependent variable and the results showed that GWPR and GWNBR achieved a better performance than GLM for the average residuals and likelihood as well as reducing the spatial autocorrelation of the residuals, and the GWNBR model was more able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
Majumdar, Arunabha; Witte, John S; Ghosh, Saurabh
2015-12-01
Binary phenotypes commonly arise due to multiple underlying quantitative precursors and genetic variants may impact multiple traits in a pleiotropic manner. Hence, simultaneously analyzing such correlated traits may be more powerful than analyzing individual traits. Various genotype-level methods, e.g., MultiPhen (O'Reilly et al. []), have been developed to identify genetic factors underlying a multivariate phenotype. For univariate phenotypes, the usefulness and applicability of allele-level tests have been investigated. The test of allele frequency difference among cases and controls is commonly used for mapping case-control association. However, allelic methods for multivariate association mapping have not been studied much. In this article, we explore two allelic tests of multivariate association: one using a Binomial regression model based on inverted regression of genotype on phenotype (Binomial regression-based Association of Multivariate Phenotypes [BAMP]), and the other employing the Mahalanobis distance between two sample means of the multivariate phenotype vector for two alleles at a single-nucleotide polymorphism (Distance-based Association of Multivariate Phenotypes [DAMP]). These methods can incorporate both discrete and continuous phenotypes. Some theoretical properties for BAMP are studied. Using simulations, the power of the methods for detecting multivariate association is compared with the genotype-level test MultiPhen's. The allelic tests yield marginally higher power than MultiPhen for multivariate phenotypes. For one/two binary traits under recessive mode of inheritance, allelic tests are found to be substantially more powerful. All three tests are applied to two different real data and the results offer some support for the simulation study. We propose a hybrid approach for testing multivariate association that implements MultiPhen when Hardy-Weinberg Equilibrium (HWE) is violated and BAMP otherwise, because the allelic approaches assume HWE
Martina, R; Kay, R; van Maanen, R; Ridder, A
2015-01-01
Clinical studies in overactive bladder have traditionally used analysis of covariance or nonparametric methods to analyse the number of incontinence episodes and other count data. It is known that if the underlying distributional assumptions of a particular parametric method do not hold, an alternative parametric method may be more efficient than a nonparametric one, which makes no assumptions regarding the underlying distribution of the data. Therefore, there are advantages in using methods based on the Poisson distribution or extensions of that method, which incorporate specific features that provide a modelling framework for count data. One challenge with count data is overdispersion, but methods are available that can account for this through the introduction of random effect terms in the modelling, and it is this modelling framework that leads to the negative binomial distribution. These models can also provide clinicians with a clearer and more appropriate interpretation of treatment effects in terms of rate ratios. In this paper, the previously used parametric and non-parametric approaches are contrasted with those based on Poisson regression and various extensions in trials evaluating solifenacin and mirabegron in patients with overactive bladder. In these applications, negative binomial models are seen to fit the data well. Copyright © 2014 John Wiley & Sons, Ltd.
Liu, Xiang; Saat, M Rapik; Qin, Xiao; Barkan, Christopher P L
2013-10-01
Derailments are the most common type of freight-train accidents in the United States. Derailments cause damage to infrastructure and rolling stock, disrupt services, and may cause casualties and harm the environment. Accordingly, derailment analysis and prevention has long been a high priority in the rail industry and government. Despite the low probability of a train derailment, the potential for severe consequences justify the need to better understand the factors influencing train derailment severity. In this paper, a zero-truncated negative binomial (ZTNB) regression model is developed to estimate the conditional mean of train derailment severity. Recognizing that the mean is not the only statistic describing data distribution, a quantile regression (QR) model is also developed to estimate derailment severity at different quantiles. The two regression models together provide a better understanding of train derailment severity distribution. Results of this work can be used to estimate train derailment severity under various operational conditions and by different accident causes. This research is intended to provide insights regarding development of cost-efficient train safety policies. Copyright © 2013 Elsevier Ltd. All rights reserved.
Robust inference in the negative binomial regression model with an application to falls data.
Aeberhard, William H; Cantoni, Eva; Heritier, Stephane
2014-12-01
A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well-known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinsons disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
Directory of Open Access Journals (Sweden)
Xuedong Yan
2012-01-01
Full Text Available In this study, the traffic crash rate, total crash frequency, and injury and fatal crash frequency were taken into consideration for distinguishing between rural and urban road segment safety. The GIS-based crash data during four and half years in Pikes Peak Area, US were applied for the analyses. The comparative statistical results show that the crash rates in rural segments are consistently lower than urban segments. Further, the regression results based on Zero-Inflated Negative Binomial (ZINB regression models indicate that the urban areas have a higher crash risk in terms of both total crash frequency and injury and fatal crash frequency, compared to rural areas. Additionally, it is found that crash frequencies increase as traffic volume and segment length increase, though the higher traffic volume lower the likelihood of severe crash occurrence; compared to 2-lane roads, the 4-lane roads have lower crash frequencies but have a higher probability of severe crash occurrence; and better road facilities with higher free flow speed can benefit from high standard design feature thus resulting in a lower total crash frequency, but they cannot mitigate the severe crash risk.
Salmerón, Diego; Cano, Juan A; Chirlaque, María D
2015-08-30
In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations are inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
Moghimbeigi, Abbas
2015-05-07
Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regression may be useful for QTL mapping of count traits. Added genetic variables to the negative binomial part equation, may also affect extra zero data. In this study, to overcome these challenges, I apply two-part ZINB model. The EM algorithm with Newton-Raphson method in the M-step uses for estimating parameters. An application of the two-part ZINB model for QTL mapping is considered to detect associations between the formation of gallstone and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.
Najera-Zuloaga, Josu; Lee, Dae-Jin; Arostegui, Inmaculada
2017-01-01
Health-related quality of life has become an increasingly important indicator of health status in clinical trials and epidemiological research. Moreover, the study of the relationship of health-related quality of life with patients and disease characteristics has become one of the primary aims of many health-related quality of life studies. Health-related quality of life scores are usually assumed to be distributed as binomial random variables and often highly skewed. The use of the beta-binomial distribution in the regression context has been proposed to model such data; however, the beta-binomial regression has been performed by means of two different approaches in the literature: (i) beta-binomial distribution with a logistic link; and (ii) hierarchical generalized linear models. None of the existing literature in the analysis of health-related quality of life survey data has performed a comparison of both approaches in terms of adequacy and regression parameter interpretation context. This paper is motivated by the analysis of a real data application of health-related quality of life outcomes in patients with Chronic Obstructive Pulmonary Disease, where the use of both approaches yields to contradictory results in terms of covariate effects significance and consequently the interpretation of the most relevant factors in health-related quality of life. We present an explanation of the results in both methodologies through a simulation study and address the need to apply the proper approach in the analysis of health-related quality of life survey data for practitioners, providing an R package.
Hierarchical regression for analyses of multiple outcomes.
Richardson, David B; Hamra, Ghassan B; MacLehose, Richard F; Cole, Stephen R; Chu, Haitao
2015-09-01
In cohort mortality studies, there often is interest in associations between an exposure of primary interest and mortality due to a range of different causes. A standard approach to such analyses involves fitting a separate regression model for each type of outcome. However, the statistical precision of some estimated associations may be poor because of sparse data. In this paper, we describe a hierarchical regression model for estimation of parameters describing outcome-specific relative rate functions and associated credible intervals. The proposed model uses background stratification to provide flexible control for the outcome-specific associations of potential confounders, and it employs a hierarchical "shrinkage" approach to stabilize estimates of an exposure's associations with mortality due to different causes of death. The approach is illustrated in analyses of cancer mortality in 2 cohorts: a cohort of dioxin-exposed US chemical workers and a cohort of radiation-exposed Japanese atomic bomb survivors. Compared with standard regression estimates of associations, hierarchical regression yielded estimates with improved precision that tended to have less extreme values. The hierarchical regression approach also allowed the fitting of models with effect-measure modification. The proposed hierarchical approach can yield estimates of association that are more precise than conventional estimates when one wishes to estimate associations with multiple outcomes. © The Author 2015. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Buonaccorsi, John; Prochenka, Agnieszka; Thoresen, Magne; Ploski, Rafal
2016-09-30
Motivated by a genetic application, this paper addresses the problem of fitting regression models when the predictor is a proportion measured with error. While the problem of dealing with additive measurement error in fitting regression models has been extensively studied, the problem where the additive error is of a binomial nature has not been addressed. The measurement errors here are heteroscedastic for two reasons; dependence on the underlying true value and changing sampling effort over observations. While some of the previously developed methods for treating additive measurement error with heteroscedasticity can be used in this setting, other methods need modification. A new version of simulation extrapolation is developed, and we also explore a variation on the standard regression calibration method that uses a beta-binomial model based on the fact that the true value is a proportion. Although most of the methods introduced here can be used for fitting non-linear models, this paper will focus primarily on their use in fitting a linear model. While previous work has focused mainly on estimation of the coefficients, we will, with motivation from our example, also examine estimation of the variance around the regression line. In addressing these problems, we also discuss the appropriate manner in which to bootstrap for both inferences and bias assessment. The various methods are compared via simulation, and the results are illustrated using our motivating data, for which the goal is to relate the methylation rate of a blood sample to the age of the individual providing the sample. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Tang, Yongqiang
2015-01-01
A sample size formula is derived for negative binomial regression for the analysis of recurrent events, in which subjects can have unequal follow-up time. We obtain sharp lower and upper bounds on the required size, which is easy to compute. The upper bound is generally only slightly larger than the required size, and hence can be used to approximate the sample size. The lower and upper size bounds can be decomposed into two terms. The first term relies on the mean number of events in each group, and the second term depends on two factors that measure, respectively, the extent of between-subject variability in event rates, and follow-up time. Simulation studies are conducted to assess the performance of the proposed method. An application of our formulae to a multiple sclerosis trial is provided.
Ali, Asad; Zaidi, Farrah; Fatima, Syeda Hira; Adnan, Muhammad; Ullah, Saleem
2018-03-24
In this study, we propose to develop a geostatistical computational framework to model the distribution of rat bite infestation of epidemic proportion in Peshawar valley, Pakistan. Two species Rattus norvegicus and Rattus rattus are suspected to spread the infestation. The framework combines strengths of maximum entropy algorithm and binomial kriging with logistic regression to spatially model the distribution of infestation and to determine the individual role of environmental predictors in modeling the distribution trends. Our results demonstrate the significance of a number of social and environmental factors in rat infestations such as (I) high human population density; (II) greater dispersal ability of rodents due to the availability of better connectivity routes such as roads, and (III) temperature and precipitation influencing rodent fecundity and life cycle.
Directory of Open Access Journals (Sweden)
Manu Batra
2016-01-01
Full Text Available Context: Dental caries among children has been described as a pandemic disease with a multifactorial nature. Various sociodemographic factors and oral hygiene practices are commonly tested for their influence on dental caries. In recent years, a recent statistical model that allows for covariate adjustment has been developed and is commonly referred zero-inflated negative binomial (ZINB models. Aim: To compare the fit of the two models, the conventional linear regression (LR model and ZINB model to assess the risk factors associated with dental caries. Materials and Methods: A cross-sectional survey was conducted on 1138 12-year-old school children in Moradabad Town, Uttar Pradesh during months of February-August 2014. Selected participants were interviewed using a questionnaire. Dental caries was assessed by recording decayed, missing, or filled teeth (DMFT index. Statistical Analysis Used: To assess the risk factor associated with dental caries in children, two approaches have been applied - LR model and ZINB model. Results: The prevalence of caries-free subjects was 24.1%, and mean DMFT was 3.4 ± 1.8. In LR model, all the variables were statistically significant. Whereas in ZINB model, negative binomial part showed place of residence, father′s education level, tooth brushing frequency, and dental visit statistically significant implying that the degree of being caries-free (DMFT = 0 increases for group of children who are living in urban, whose father is university pass out, who brushes twice a day and if have ever visited a dentist. Conclusion: The current study report that the LR model is a poorly fitted model and may lead to spurious conclusions whereas ZINB model has shown better goodness of fit (Akaike information criterion values - LR: 3.94; ZINB: 2.39 and can be preferred if high variance and number of an excess of zeroes are present.
Batra, Manu; Shah, Aasim Farooq; Rajput, Prashant; Shah, Ishrat Aasim
2016-01-01
Dental caries among children has been described as a pandemic disease with a multifactorial nature. Various sociodemographic factors and oral hygiene practices are commonly tested for their influence on dental caries. In recent years, a recent statistical model that allows for covariate adjustment has been developed and is commonly referred zero-inflated negative binomial (ZINB) models. To compare the fit of the two models, the conventional linear regression (LR) model and ZINB model to assess the risk factors associated with dental caries. A cross-sectional survey was conducted on 1138 12-year-old school children in Moradabad Town, Uttar Pradesh during months of February-August 2014. Selected participants were interviewed using a questionnaire. Dental caries was assessed by recording decayed, missing, or filled teeth (DMFT) index. To assess the risk factor associated with dental caries in children, two approaches have been applied - LR model and ZINB model. The prevalence of caries-free subjects was 24.1%, and mean DMFT was 3.4 ± 1.8. In LR model, all the variables were statistically significant. Whereas in ZINB model, negative binomial part showed place of residence, father's education level, tooth brushing frequency, and dental visit statistically significant implying that the degree of being caries-free (DMFT = 0) increases for group of children who are living in urban, whose father is university pass out, who brushes twice a day and if have ever visited a dentist. The current study report that the LR model is a poorly fitted model and may lead to spurious conclusions whereas ZINB model has shown better goodness of fit (Akaike information criterion values - LR: 3.94; ZINB: 2.39) and can be preferred if high variance and number of an excess of zeroes are present.
2014-01-01
Background Whole-genome bisulfite sequencing currently provides the highest-precision view of the epigenome, with quantitative information about populations of cells down to single nucleotide resolution. Several studies have demonstrated the value of this precision: meaningful features that correlate strongly with biological functions can be found associated with only a few CpG sites. Understanding the role of DNA methylation, and more broadly the role of DNA accessibility, requires that methylation differences between populations of cells are identified with extreme precision and in complex experimental designs. Results In this work we investigated the use of beta-binomial regression as a general approach for modeling whole-genome bisulfite data to identify differentially methylated sites and genomic intervals. Conclusions The regression-based analysis can handle medium- and large-scale experiments where it becomes critical to accurately model variation in methylation levels between replicates and account for influence of various experimental factors like cell types or batch effects. PMID:24962134
Regression og geometrisk data analyse (2. del)
DEFF Research Database (Denmark)
Brinkkjær, Ulf
2010-01-01
Artiklen søger at vise, hvordan regressionsanalyse og geometrisk data analyse kan integreres. Det er interessant, fordi disse metoder ofte opstilles som modsætninger f.eks. som en modsætning mellem beskrivende og forklarende metoder. Artiklens første del bragtes i Praktiske Grunde 3-4 / 2007....
An, Qingyu; Wu, Jun; Fan, Xuesong; Pan, Liyang; Sun, Wei
2016-01-01
The hand foot and mouth disease (HFMD) is a human syndrome caused by intestinal viruses like that coxsackie A virus 16, enterovirus 71 and easily developed into outbreak in kindergarten and school. Scientifically and accurately early detection of the start time of HFMD epidemic is a key principle in planning of control measures and minimizing the impact of HFMD. The objective of this study was to establish a reliable early detection model for start timing of hand foot mouth disease epidemic in Dalian and to evaluate the performance of model by analyzing the sensitivity in detectability. The negative binomial regression model was used to estimate the weekly baseline case number of HFMD and identified the optimal alerting threshold between tested difference threshold values during the epidemic and non-epidemic year. Circular distribution method was used to calculate the gold standard of start timing of HFMD epidemic. From 2009 to 2014, a total of 62022 HFMD cases were reported (36879 males and 25143 females) in Dalian, Liaoning Province, China, including 15 fatal cases. The median age of the patients was 3 years. The incidence rate of epidemic year ranged from 137.54 per 100,000 population to 231.44 per 100,000population, the incidence rate of non-epidemic year was lower than 112 per 100,000 population. The negative binomial regression model with AIC value 147.28 was finally selected to construct the baseline level. The threshold value was 100 for the epidemic year and 50 for the non- epidemic year had the highest sensitivity(100%) both in retrospective and prospective early warning and the detection time-consuming was 2 weeks before the actual starting of HFMD epidemic. The negative binomial regression model could early warning the start of a HFMD epidemic with good sensitivity and appropriate detection time in Dalian.
Najaf, Pooya; Duddu, Venkata R; Pulugurtha, Srinivas S
2018-03-01
Machine learning (ML) techniques have higher prediction accuracy compared to conventional statistical methods for crash frequency modelling. However, their black-box nature limits the interpretability. The objective of this research is to combine both ML and statistical methods to develop hybrid link-level crash frequency models with high predictability and interpretability. For this purpose, M5' model trees method (M5') is introduced and applied to classify the crash data and then calibrate a model for each homogenous class. The data for 1134 and 345 randomly selected links on urban arterials in the city of Charlotte, North Carolina was used to develop and validate models, respectively. The outputs from the hybrid approach are compared with the outputs from cluster-based negative binomial regression (NBR) and general NBR models. Findings indicate that M5' has high predictability and is very reliable to interpret the role of different attributes on crash frequency compared to other developed models.
Statistical power analyses using G*Power 3.1: tests for correlation and regression analyses.
Faul, Franz; Erdfelder, Edgar; Buchner, Axel; Lang, Albert-Georg
2009-11-01
G*Power is a free power analysis program for a variety of statistical tests. We present extensions and improvements of the version introduced by Faul, Erdfelder, Lang, and Buchner (2007) in the domain of correlation and regression analyses. In the new version, we have added procedures to analyze the power of tests based on (1) single-sample tetrachoric correlations, (2) comparisons of dependent correlations, (3) bivariate linear regression, (4) multiple linear regression based on the random predictor model, (5) logistic regression, and (6) Poisson regression. We describe these new features and provide a brief introduction to their scope and handling.
Directory of Open Access Journals (Sweden)
Glauco H.S. Mendes
2013-09-01
Full Text Available Critical success factors in new product development (NPD in the Brazilian small and medium enterprises (SMEs are identified and analyzed. Critical success factors are best practices that can be used to improve NPD management and performance in a company. However, the traditional method for identifying these factors is survey methods. Subsequently, the collected data are reduced through traditional multivariate analysis. The objective of this work is to develop a logistic regression model for predicting the success or failure of the new product development. This model allows for an evaluation and prioritization of resource commitments. The results will be helpful for guiding management actions, as one way to improve NPD performance in those industries.
2014-01-01
Background Large-scale public health interventions with rapid scale-up are increasingly being implemented worldwide. Such implementation allows for a large target population to be reached in a short period of time. But when the time comes to investigate the effectiveness of these interventions, the rapid scale-up creates several methodological challenges, such as the lack of baseline data and the absence of control groups. One example of such an intervention is Avahan, the India HIV/AIDS initiative of the Bill & Melinda Gates Foundation. One question of interest is the effect of Avahan on condom use by female sex workers with their clients. By retrospectively reconstructing condom use and sex work history from survey data, it is possible to estimate how condom use rates evolve over time. However formal inference about how this rate changes at a given point in calendar time remains challenging. Methods We propose a new statistical procedure based on a mixture of binomial regression and Cox regression. We compare this new method to an existing approach based on generalized estimating equations through simulations and application to Indian data. Results Both methods are unbiased, but the proposed method is more powerful than the existing method, especially when initial condom use is high. When applied to the Indian data, the new method mostly agrees with the existing method, but seems to have corrected some implausible results of the latter in a few districts. We also show how the new method can be used to analyze the data of all districts combined. Conclusions The use of both methods can be recommended for exploratory data analysis. However for formal statistical inference, the new method has better power. PMID:24397563
Kondo, Yumi; Zhao, Yinshan; Petkau, John
2015-06-15
We develop a new modeling approach to enhance a recently proposed method to detect increases of contrast-enhancing lesions (CELs) on repeated magnetic resonance imaging, which have been used as an indicator for potential adverse events in multiple sclerosis clinical trials. The method signals patients with unusual increases in CEL activity by estimating the probability of observing CEL counts as large as those observed on a patient's recent scans conditional on the patient's CEL counts on previous scans. This conditional probability index (CPI), computed based on a mixed-effect negative binomial regression model, can vary substantially depending on the choice of distribution for the patient-specific random effects. Therefore, we relax this parametric assumption to model the random effects with an infinite mixture of beta distributions, using the Dirichlet process, which effectively allows any form of distribution. To our knowledge, no previous literature considers a mixed-effect regression for longitudinal count variables where the random effect is modeled with a Dirichlet process mixture. As our inference is in the Bayesian framework, we adopt a meta-analytic approach to develop an informative prior based on previous clinical trials. This is particularly helpful at the early stages of trials when less data are available. Our enhanced method is illustrated with CEL data from 10 previous multiple sclerosis clinical trials. Our simulation study shows that our procedure estimates the CPI more accurately than parametric alternatives when the patient-specific random effect distribution is misspecified and that an informative prior improves the accuracy of the CPI estimates. Copyright © 2015 John Wiley & Sons, Ltd.
Villalobos-Rodelo, Juan J; Medina-Solís, Carlo E; Verdugo-Barraza, Lourdes; Islas-Granillo, Horacio; García-Jau, Rosa A; Escoffié-Ramírez, Mauricio; Maupomé, Gerardo
2013-01-01
Dental caries is one of the most common chronic childhood diseases worldwide. In Mexico it is a public health problem. To identify variables associated with caries occurrence (non-reversible and reversible lesions) in a sample of Mexican schoolchildren. We performed a cross-sectional study in 640 schoolchildren of 11 and 12 years of age. The dependent variable was the D 1+2 MFT index, comprising reversible and irreversible carious lesions (dental caries) according to the Pitts D 1 /D 2 classification. Clinical examinations were performed by trained and standardized examiners. Using structured questionnaires we collected socio-demographic, socio-economic and health-related oral behaviors. Negative binomial regression was used for the analysis. The D 1+2 MFT index was 5.68±3.47. The schoolchildren characteristics associated with an increase in the expected average rate of dental caries were: being female (27.1%), having 12 years of age (23.2%), consuming larger amounts of sugar (13.9%), having mediocre (31.3%) and poor/very poor oral hygiene (62.3%). Conversely, when the family owned a car the expected mean D 1+2 MFT decreased 13.5%. When dental caries occurrence (about 6 decayed teeth) is estimated taking into consideration not only cavities (lesions in need of restorative dental treatment) but also incipient carious lesions, the character of this disease as a common clinical problem and as a public health problem are further emphasized. Results revealed the need to establish preventive and curative strategies in the sample.
Analysis of hypoglycemic events using negative binomial models.
Luo, Junxiang; Qu, Yongming
2013-01-01
Negative binomial regression is a standard model to analyze hypoglycemic events in diabetes clinical trials. Adjusting for baseline covariates could potentially increase the estimation efficiency of negative binomial regression. However, adjusting for covariates raises concerns about model misspecification, in which the negative binomial regression is not robust because of its requirement for strong model assumptions. In some literature, it was suggested to correct the standard error of the maximum likelihood estimator through introducing overdispersion, which can be estimated by the Deviance or Pearson Chi-square. We proposed to conduct the negative binomial regression using Sandwich estimation to calculate the covariance matrix of the parameter estimates together with Pearson overdispersion correction (denoted by NBSP). In this research, we compared several commonly used negative binomial model options with our proposed NBSP. Simulations and real data analyses showed that NBSP is the most robust to model misspecification, and the estimation efficiency will be improved by adjusting for baseline hypoglycemia. Copyright © 2013 John Wiley & Sons, Ltd.
Multiple Regression Analyses in Clinical Child and Adolescent Psychology
Jaccard, James; Guilamo-Ramos, Vincent; Johansson, Margaret; Bouris, Alida
2006-01-01
A major form of data analysis in clinical child and adolescent psychology is multiple regression. This article reviews issues in the application of such methods in light of the research designs typical of this field. Issues addressed include controlling covariates, evaluation of predictor relevance, comparing predictors, analysis of moderation,…
Statistical and regression analyses of detected extrasolar systems
Czech Academy of Sciences Publication Activity Database
Pintr, Pavel; Peřinová, V.; Lukš, A.; Pathak, A.
2013-01-01
Roč. 75, č. 1 (2013), s. 37-45 ISSN 0032-0633 Institutional support: RVO:61389021 Keywords : Exoplanets * Kepler candidates * Regression analysis Subject RIV: BN - Astronomy, Celestial Mechanics, Astrophysics Impact factor: 1.630, year: 2013 http://www.sciencedirect.com/science/article/pii/S0032063312003066
Wei, Feng; Lovegrove, Gordon
2013-12-01
Today, North American governments are more willing to consider compact neighborhoods with increased use of sustainable transportation modes. Bicycling, one of the most effective modes for short trips with distances less than 5km is being encouraged. However, as vulnerable road users (VRUs), cyclists are more likely to be injured when involved in collisions. In order to create a safe road environment for them, evaluating cyclists' road safety at a macro level in a proactive way is necessary. In this paper, different generalized linear regression methods for collision prediction model (CPM) development are reviewed and previous studies on micro-level and macro-level bicycle-related CPMs are summarized. On the basis of insights gained in the exploration stage, this paper also reports on efforts to develop negative binomial models for bicycle-auto collisions at a community-based, macro-level. Data came from the Central Okanagan Regional District (CORD), of British Columbia, Canada. The model results revealed two types of statistical associations between collisions and each explanatory variable: (1) An increase in bicycle-auto collisions is associated with an increase in total lane kilometers (TLKM), bicycle lane kilometers (BLKM), bus stops (BS), traffic signals (SIG), intersection density (INTD), and arterial-local intersection percentage (IALP). (2) A decrease in bicycle collisions was found to be associated with an increase in the number of drive commuters (DRIVE), and in the percentage of drive commuters (DRP). These results support our hypothesis that in North America, with its current low levels of bicycle use (<4%), we can initially expect to see an increase in bicycle collisions as cycle mode share increases. However, as bicycle mode share increases beyond some unknown 'critical' level, our hypothesis also predicts a net safety improvement. To test this hypothesis and to further explore the statistical relationships between bicycle mode split and overall road
Analysing inequalities in Germany a structured additive distributional regression approach
Silbersdorff, Alexander
2017-01-01
This book seeks new perspectives on the growing inequalities that our societies face, putting forward Structured Additive Distributional Regression as a means of statistical analysis that circumvents the common problem of analytical reduction to simple point estimators. This new approach allows the observed discrepancy between the individuals’ realities and the abstract representation of those realities to be explicitly taken into consideration using the arithmetic mean alone. In turn, the method is applied to the question of economic inequality in Germany.
DEFF Research Database (Denmark)
Elmasry, Amr; Jensen, Claus; Katajainen, Jyrki
2017-01-01
the (total) number of elements stored in the data structure(s) prior to the operation. As the resulting data structure consists of two components that are different variants of binomial heaps, we call it a bipartite binomial heap. Compared to its counterpart, a multipartite binomial heap, the new structure...
Chain binomial models and binomial autoregressive processes.
Weiss, Christian H; Pollett, Philip K
2012-09-01
We establish a connection between a class of chain-binomial models of use in ecology and epidemiology and binomial autoregressive (AR) processes. New results are obtained for the latter, including expressions for the lag-conditional distribution and related quantities. We focus on two types of chain-binomial model, extinction-colonization and colonization-extinction models, and present two approaches to parameter estimation. The asymptotic distributions of the resulting estimators are studied, as well as their finite-sample performance, and we give an application to real data. A connection is made with standard AR models, which also has implications for parameter estimation. © 2011, The International Biometric Society.
Tripepi, Giovanni; Jager, Kitty J.; Stel, Vianda S.; Dekker, Friedo W.; Zoccali, Carmine
2011-01-01
Because of some limitations of stratification methods, epidemiologists frequently use multiple linear and logistic regression analyses to address specific epidemiological questions. If the dependent variable is a continuous one (for example, systolic pressure and serum creatinine), the researcher
Standardized binomial models for risk or prevalence ratios and differences.
Richardson, David B; Kinlaw, Alan C; MacLehose, Richard F; Cole, Stephen R
2015-10-01
Epidemiologists often analyse binary outcomes in cohort and cross-sectional studies using multivariable logistic regression models, yielding estimates of adjusted odds ratios. It is widely known that the odds ratio closely approximates the risk or prevalence ratio when the outcome is rare, and it does not do so when the outcome is common. Consequently, investigators may decide to directly estimate the risk or prevalence ratio using a log binomial regression model. We describe the use of a marginal structural binomial regression model to estimate standardized risk or prevalence ratios and differences. We illustrate the proposed approach using data from a cohort study of coronary heart disease status in Evans County, Georgia, USA. The approach reduces problems with model convergence typical of log binomial regression by shifting all explanatory variables except the exposures of primary interest from the linear predictor of the outcome regression model to a model for the standardization weights. The approach also facilitates evaluation of departures from additivity in the joint effects of two exposures. Epidemiologists should consider reporting standardized risk or prevalence ratios and differences in cohort and cross-sectional studies. These are readily-obtained using the SAS, Stata and R statistical software packages. The proposed approach estimates the exposure effect in the total population. © The Author 2015; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.
Khan, Iftekhar; Morris, Stephen
2014-11-12
The performance of the Beta Binomial (BB) model is compared with several existing models for mapping the EORTC QLQ-C30 (QLQ-C30) on to the EQ-5D-3L using data from lung cancer trials. Data from 2 separate non small cell lung cancer clinical trials (TOPICAL and SOCCAR) are used to develop and validate the BB model. Comparisons with Linear, TOBIT, Quantile, Quadratic and CLAD models are carried out. The mean prediction error, R(2), proportion predicted outside the valid range, clinical interpretation of coefficients, model fit and estimation of Quality Adjusted Life Years (QALY) are reported and compared. Monte-Carlo simulation is also used. The Beta-Binomial regression model performed 'best' among all models. For TOPICAL and SOCCAR trials, respectively, residual mean square error (RMSE) was 0.09 and 0.11; R(2) was 0.75 and 0.71; observed vs. predicted means were 0.612 vs. 0.608 and 0.750 vs. 0.749. Mean difference in QALY's (observed vs. predicted) were 0.051 vs. 0.053 and 0.164 vs. 0.162 for TOPICAL and SOCCAR respectively. Models tested on independent data show simulated 95% confidence from the BB model containing the observed mean more often (77% and 59% for TOPICAL and SOCCAR respectively) compared to the other models. All algorithms over-predict at poorer health states but the BB model was relatively better, particularly for the SOCCAR data. The BB model may offer superior predictive properties amongst mapping algorithms considered and may be more useful when predicting EQ-5D-3L at poorer health states. We recommend the algorithm derived from the TOPICAL data due to better predictive properties and less uncertainty.
Kelly, Adrian B; Haynes, Michele A; Marlatt, G Alan
2008-05-01
Tobacco use is prevalent in adolescents and understanding factors that contribute to smoking uptake remains a critical public health priority. While there is now good support for the role of implicit (preconscious) cognitive processing in accounting for changes in drug use, these models have not been applied to tobacco use. Longitudinal analysis of smoking data presents unique problems, because these data are usually highly positively skewed (with excess zeros) and render conventional statistical tools (e.g., OLS regression) largely inappropriate. This study advanced the application of implicit memory theory to adolescent smoking by adopting statistical methods that do not rely on assumptions of normality, and produce robust estimates from data with correlated observations. The study examined the longitudinal association of implicit tobacco-related memory associations (TMAs) and smoking in 114 Australian high school students. Participants completed TMA tasks and behavioural checklists designed to obscure the tobacco-related focus of the study. Results showed that the TMA-smoking association remained significant when accounting for within-subject variability, and TMAs at Time 1 were modestly associated with smoking at Time 2 after accounting for within subject variability. Students with stronger preconscious smoking-related associations appear to be at greater risk of smoking. Strategies that target implicit TMAs may be an effective early intervention or prevention tool. The statistical method will be of use in future research on adolescent smoking, and for research on other behavioural distributions that are highly positively skewed.
Scott, Neil W; Fayers, Peter M; Aaronson, Neil K; Bottomley, Andrew; de Graeff, Alexander; Groenvold, Mogens; Gundy, Chad; Koller, Michael; Petersen, Morten A; Sprangers, Mirjam A G
2010-08-04
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application. A review of logistic regression DIF analyses in HRQoL was undertaken. Methodological articles from other fields and using other DIF methods were also included if considered relevant. There are many competing approaches for the conduct of DIF analyses and many criteria for determining what constitutes significant DIF. DIF in short scales, as commonly found in HRQL instruments, may be more difficult to interpret. Qualitative methods may aid interpretation of such DIF analyses. A number of methodological choices must be made when applying logistic regression for DIF analyses, and many of these affect the results. We provide recommendations based on reviewing the current evidence. Although the focus is on logistic regression, many of our results should be applicable to DIF analyses in general. There is a need for more empirical and theoretical work in this area.
USE OF THE SIMPLE LINEAR REGRESSION MODEL IN MACRO-ECONOMICAL ANALYSES
Directory of Open Access Journals (Sweden)
Constantin ANGHELACHE
2011-10-01
Full Text Available The article presents the fundamental aspects of the linear regression, as a toolbox which can be used in macroeconomic analyses. The article describes the estimation of the parameters, the statistical tests used, the homoscesasticity and heteroskedasticity. The use of econometrics instrument in macroeconomics is an important factor that guarantees the quality of the models, analyses, results and possible interpretation that can be drawn at this level.
Random regression analyses using B-splines to model growth of Australian Angus cattle
Directory of Open Access Journals (Sweden)
Meyer Karin
2005-09-01
Full Text Available Abstract Regression on the basis function of B-splines has been advocated as an alternative to orthogonal polynomials in random regression analyses. Basic theory of splines in mixed model analyses is reviewed, and estimates from analyses of weights of Australian Angus cattle from birth to 820 days of age are presented. Data comprised 84 533 records on 20 731 animals in 43 herds, with a high proportion of animals with 4 or more weights recorded. Changes in weights with age were modelled through B-splines of age at recording. A total of thirteen analyses, considering different combinations of linear, quadratic and cubic B-splines and up to six knots, were carried out. Results showed good agreement for all ages with many records, but fluctuated where data were sparse. On the whole, analyses using B-splines appeared more robust against "end-of-range" problems and yielded more consistent and accurate estimates of the first eigenfunctions than previous, polynomial analyses. A model fitting quadratic B-splines, with knots at 0, 200, 400, 600 and 821 days and a total of 91 covariance components, appeared to be a good compromise between detailedness of the model, number of parameters to be estimated, plausibility of results, and fit, measured as residual mean square error.
Lee, Bao-Shiang; Krishnanchettiar, Sangeeth; Lateef, Syed Salman; Li, Chun-Ting; Gupta, Shalini
2006-07-01
The phenomenon of protein randomly associating among themselves during MALDI-TOF mass spectrometric measurements is manifest on all the proteins tested here. The magnitude of this random association seems to be protein-dependent. However, a detailed mathematical analysis of this process has not been reported so far. Here, binomial and multinomial equations are used to analyze the relative populations of multimer ions formed by random protein association during MALDI-TOF mass spectrometric measurements. Hemoglobin A (which consists of two [alpha]-globins and two [beta]-globins) and biotinylated insulin (which contains intact, singly biotinylated, and doubly biotinylated insulin) are used as the test cases for two- and three-component protein systems, respectively. MALDI-TOF spectra are acquired using standard MALDI-TOF techniques and equipment. The binomial distribution matches the relative populations of multimer ions of Hb A perfectly. For biotinylated insulin sample, taking lesser relative populations for doubly biotinylated insulin and intact insulin compared with singly biotinylated insulin into account, the relative populations of multimer ions of the biotinylated insulin confirms the prediction of multinomial equation. The pairs of unrelated proteins such as myoglobin, avidin, and lysozyme show lesser intensities for heteromultimers than for homomultimers, indicating weaker propensities to associate between different proteins during MALDI-TOF mass spectrometric measurements. Contrary to the suggestion that the multimer ions are formed in the solution phase prior to MALDI-TOF mass spectrometric measurement through multistage sequential reactions of the aggregation of protein molecules, we postulate that multimer ions of proteins are formed after the protein molecules have been vaporized into the gas phase through the assistance of the laser and matrix.
Analysing count data of Butterflies communities in Jasin, Melaka: A Poisson regression analysis
Afiqah Muhamad Jamil, Siti; Asrul Affendi Abdullah, M.; Kek, Sie Long; Nor, Maria Elena; Mohamed, Maryati; Ismail, Norradihah
2017-09-01
Counting outcomes normally have remaining values highly skewed toward the right as they are often characterized by large values of zeros. The data of butterfly communities, had been taken from Jasin, Melaka and consists of 131 number of subject visits in Jasin, Melaka. In this paper, considering the count data of butterfly communities, an analysis is considered Poisson regression analysis as it is assumed to be an alternative way on better suited to the counting process. This research paper is about analysing count data from zero observation ecological inference of butterfly communities in Jasin, Melaka by using Poisson regression analysis. The software for Poisson regression is readily available and it is becoming more widely used in many field of research and the data was analysed by using SAS software. The purpose of analysis comprised the framework of identifying the concerns. Besides, by using Poisson regression analysis, the study determines the fitness of data for accessing the reliability on using the count data. The finding indicates that the highest and lowest number of subject comes from the third family (Nymphalidae) family and fifth (Hesperidae) family and the Poisson distribution seems to fit the zero values.
Longitudinal beta-binomial modeling using GEE for overdispersed binomial data.
Wu, Hongqian; Zhang, Ying; Long, Jeffrey D
2017-03-15
Longitudinal binomial data are frequently generated from multiple questionnaires and assessments in various scientific settings for which the binomial data are often overdispersed. The standard generalized linear mixed effects model may result in severe underestimation of standard errors of estimated regression parameters in such cases and hence potentially bias the statistical inference. In this paper, we propose a longitudinal beta-binomial model for overdispersed binomial data and estimate the regression parameters under a probit model using the generalized estimating equation method. A hybrid algorithm of the Fisher scoring and the method of moments is implemented for computing the method. Extensive simulation studies are conducted to justify the validity of the proposed method. Finally, the proposed method is applied to analyze functional impairment in subjects who are at risk of Huntington disease from a multisite observational study of prodromal Huntington disease. Copyright © 2016 John Wiley & Sons, Ltd. Copyright © 2016 John Wiley & Sons, Ltd.
Negative binomial multiplicity distribution from binomial cluster production
International Nuclear Information System (INIS)
Iso, C.; Mori, K.
1990-01-01
Two-step interpretation of negative binomial multiplicity distribution as a compound of binomial cluster production and negative binomial like cluster decay distribution is proposed. In this model we can expect the average multiplicity for the cluster production increases with increasing energy, different from a compound Poisson-Logarithmic distribution. (orig.)
Binomial collisions and near collisions
Blokhuis, Aart; Brouwer, Andries; de Weger, Benne
2017-01-01
We describe efficient algorithms to search for cases in which binomial coefficients are equal or almost equal, give a conjecturally complete list of all cases where two binomial coefficients differ by 1, and give some identities for binomial coefficients that seem to be new.
Statistical inference involving binomial and negative binomial parameters.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2009-05-01
Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.
Karami, K; Zerehdaran, S; Barzanooni, B; Lotfi, E
2017-12-01
1. The aim of the present study was to estimate genetic parameters for average egg weight (EW) and egg number (EN) at different ages in Japanese quail using multi-trait random regression (MTRR) models. 2. A total of 8534 records from 900 quail, hatched between 2014 and 2015, were used in the study. Average weekly egg weights and egg numbers were measured from second until sixth week of egg production. 3. Nine random regression models were compared to identify the best order of the Legendre polynomials (LP). The most optimal model was identified by the Bayesian Information Criterion. A model with second order of LP for fixed effects, second order of LP for additive genetic effects and third order of LP for permanent environmental effects (MTRR23) was found to be the best. 4. According to the MTRR23 model, direct heritability for EW increased from 0.26 in the second week to 0.53 in the sixth week of egg production, whereas the ratio of permanent environment to phenotypic variance decreased from 0.48 to 0.1. Direct heritability for EN was low, whereas the ratio of permanent environment to phenotypic variance decreased from 0.57 to 0.15 during the production period. 5. For each trait, estimated genetic correlations among weeks of egg production were high (from 0.85 to 0.98). Genetic correlations between EW and EN were low and negative for the first two weeks, but they were low and positive for the rest of the egg production period. 6. In conclusion, random regression models can be used effectively for analysing egg production traits in Japanese quail. Response to selection for increased egg weight would be higher at older ages because of its higher heritability and such a breeding program would have no negative genetic impact on egg production.
Directory of Open Access Journals (Sweden)
Esther Leushuis
2016-12-01
Full Text Available Background: Standardization of the semen analysis may improve reproducibility. We assessed variability between laboratories in semen analyses and evaluated whether a transformation using Z scores and regression statistics was able to reduce this variability. Materials and Methods: We performed a retrospective cohort study. We calculated between-laboratory coefficients of variation (CVB for sperm concentration and for morphology. Subsequently, we standardized the semen analysis results by calculating laboratory specific Z scores, and by using regression. We used analysis of variance for four semen parameters to assess systematic differences between laboratories before and after the transformations, both in the circulation samples and in the samples obtained in the prospective cohort study in the Netherlands between January 2002 and February 2004. Results: The mean CVB was 7% for sperm concentration (range 3 to 13% and 32% for sperm morphology (range 18 to 51%. The differences between the laboratories were statistically significant for all semen parameters (all P<0.001. Standardization using Z scores did not reduce the differences in semen analysis results between the laboratories (all P<0.001. Conclusion: There exists large between-laboratory variability for sperm morphology and small, but statistically significant, between-laboratory variation for sperm concentration. Standardization using Z scores does not eliminate between-laboratory variability.
CUMBIN - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CUMBIN, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), can be used independently of one another. CUMBIN can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CUMBIN calculates the probability that a system of n components has at least k operating if the probability that any one operating is p and the components are independent. Equivalently, this is the reliability of a k-out-of-n system having independent components with common reliability p. CUMBIN can evaluate the incomplete beta distribution for two positive integer arguments. CUMBIN can also evaluate the cumulative F distribution and the negative binomial distribution, and can determine the sample size in a test design. CUMBIN is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. The CUMBIN program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMBIN was developed in 1988.
International Nuclear Information System (INIS)
Bhowmik, K.R.; Islam, S.
2016-01-01
Logistic regression (LR) analysis is the most common statistical methodology to find out the determinants of childhood mortality. However, the significant predictors cannot be ranked according to their influence on the response variable. Multiple classification (MC) analysis can be applied to identify the significant predictors with a priority index which helps to rank the predictors. The main objective of the study is to find the socio-demographic determinants of childhood mortality at neonatal, post-neonatal, and post-infant period by fitting LR model as well as to rank those through MC analysis. The study is conducted using the data of Bangladesh Demographic and Health Survey 2007 where birth and death information of children were collected from their mothers. Three dichotomous response variables are constructed from children age at death to fit the LR and MC models. Socio-economic and demographic variables significantly associated with the response variables separately are considered in LR and MC analyses. Both the LR and MC models identified the same significant predictors for specific childhood mortality. For both the neonatal and child mortality, biological factors of children, regional settings, and parents socio-economic status are found as 1st, 2nd, and 3rd significant groups of predictors respectively. Mother education and household environment are detected as major significant predictors of post-neonatal mortality. This study shows that MC analysis with or without LR analysis can be applied to detect determinants with rank which help the policy makers taking initiatives on a priority basis. (author)
John W. Edwards; Susan C. Loeb; David C. Guynn
1994-01-01
Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...
CROSSER - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CROSSER, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), can be used independently of one another. CROSSER can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CROSSER calculates the point at which the reliability of a k-out-of-n system equals the common reliability of the n components. It is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The CROSSER program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CROSSER was developed in 1988.
Karakawa, Ayako; Murata, Hiroshi; Hirasawa, Hiroyo; Mayama, Chihiro; Asaoka, Ryo
2013-01-01
To compare the performance of newly proposed point-wise linear regression (PLR) with the binomial test (binomial PLR) against mean deviation (MD) trend analysis and permutation analyses of PLR (PoPLR), in detecting global visual field (VF) progression in glaucoma. 15 VFs (Humphrey Field Analyzer, SITA standard, 24-2) were collected from 96 eyes of 59 open angle glaucoma patients (6.0 ± 1.5 [mean ± standard deviation] years). Using the total deviation of each point on the 2(nd) to 16(th) VFs (VF2-16), linear regression analysis was carried out. The numbers of VF test points with a significant trend at various probability levels (pbinomial test (one-side). A VF series was defined as "significant" if the median p-value from the binomial test was binomial PLR method (0.14 to 0.86) was significantly higher than MD trend analysis (0.04 to 0.89) and PoPLR (0.09 to 0.93). The PIS of the proposed method (0.0 to 0.17) was significantly lower than the MD approach (0.0 to 0.67) and PoPLR (0.07 to 0.33). The PBNS of the three approaches were not significantly different. The binomial BLR method gives more consistent results than MD trend analysis and PoPLR, hence it will be helpful as a tool to 'flag' possible VF deterioration.
de Schryver, Tom; Eisinga, R.N.
2011-01-01
The key question in research on dismissals of head coaches in sports clubs is not whether they should happen but when they will happen. This paper applies piecewise linear regression to advance our understanding of the timing of head coach dismissals. Essentially, the regression sacrifices degrees
Distinguishing between Binomial, Hypergeometric and Negative Binomial Distributions
Wroughton, Jacqueline; Cole, Tarah
2013-01-01
Recognizing the differences between three discrete distributions (Binomial, Hypergeometric and Negative Binomial) can be challenging for students. We present an activity designed to help students differentiate among these distributions. In addition, we present assessment results in the form of pre- and post-tests that were designed to assess the…
The number of subjects per variable required in linear regression analyses.
Austin, Peter C; Steyerberg, Ewout W
2015-06-01
To determine the number of independent variables that can be included in a linear regression model. We used a series of Monte Carlo simulations to examine the impact of the number of subjects per variable (SPV) on the accuracy of estimated regression coefficients and standard errors, on the empirical coverage of estimated confidence intervals, and on the accuracy of the estimated R(2) of the fitted model. A minimum of approximately two SPV tended to result in estimation of regression coefficients with relative bias of less than 10%. Furthermore, with this minimum number of SPV, the standard errors of the regression coefficients were accurately estimated and estimated confidence intervals had approximately the advertised coverage rates. A much higher number of SPV were necessary to minimize bias in estimating the model R(2), although adjusted R(2) estimates behaved well. The bias in estimating the model R(2) statistic was inversely proportional to the magnitude of the proportion of variation explained by the population regression model. Linear regression models require only two SPV for adequate estimation of regression coefficients, standard errors, and confidence intervals. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
NEWTONP - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, NEWTONP, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), can be used independently of one another. NEWTONP can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. NEWTONP calculates the probably p required to yield a given system reliability V for a k-out-of-n system. It can also be used to determine the Clopper-Pearson confidence limits (either one-sided or two-sided) for the parameter p of a Bernoulli distribution. NEWTONP can determine Bayesian probability limits for a proportion (if the beta prior has positive integer parameters). It can determine the percentiles of incomplete beta distributions with positive integer parameters. It can also determine the percentiles of F distributions and the midian plotting positions in probability plotting. NEWTONP is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. NEWTONP is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The NEWTONP program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. NEWTONP was developed in 1988.
Binomial Rings: Axiomatisation, Transfer and Classification
Xantcha, Qimh Richey
2011-01-01
Hall's binomial rings, rings with binomial coefficients, are given an axiomatisation and proved identical to the numerical rings studied by Ekedahl. The Binomial Transfer Principle is established, enabling combinatorial proofs of algebraical identities. The finitely generated binomial rings are completely classified. An application to modules over binomial rings is given.
On a Fractional Binomial Process
Cahoy, Dexter O.; Polito, Federico
2012-02-01
The classical binomial process has been studied by Jakeman (J. Phys. A 23:2815-2825, 1990) (and the references therein) and has been used to characterize a series of radiation states in quantum optics. In particular, he studied a classical birth-death process where the chance of birth is proportional to the difference between a larger fixed number and the number of individuals present. It is shown that at large times, an equilibrium is reached which follows a binomial process. In this paper, the classical binomial process is generalized using the techniques of fractional calculus and is called the fractional binomial process. The fractional binomial process is shown to preserve the binomial limit at large times while expanding the class of models that include non-binomial fluctuations (non-Markovian) at regular and small times. As a direct consequence, the generality of the fractional binomial model makes the proposed model more desirable than its classical counterpart in describing real physical processes. More statistical properties are also derived.
Analysing conjoint data with OLS and PLS regression: a case study with wine.
Jaeger, Sara R; Mielby, Line H; Heymann, Hildegarde; Jia, Yilin; Frøst, Michael B
2013-12-01
This paper presents a case study with wine where two statistical methods for the analysis of rating-based conjoint analysis data were applied. Traditionally, ordinary least squares (OLS) regression is used to estimate the relative importance of the experimental factors and the part-worth utilities of factor levels. Partial least squares (PLS) regression, which is a popular tool in sensory and consumer science, can also be used for the analysis of interval-level conjoint data. Using conjoint analysis, purchase intentions for Californian red and white wine were obtained from a convenience sample of young US adults (n ≈ 250). OLS and PLS regression uncovered the same systematic patterns in the data: negative utility associated with more expensive wine, and positive utility associated with famous wine regions. While OLS regression provided more accessible top-line results, an advantage of PLS regression was the graphical format of results. This provided easy insight to individual differences in the importance attached to the factors driving purchase intention. OLS and PLS regression can complement each other in the analysis of interval-level conjoint data. Dual analysis can help to ensure that the right insights are drawn from the study and communicated to internal/external clients. It may also facilitate communication within project teams. © 2013 Society of Chemical Industry.
Directory of Open Access Journals (Sweden)
Giuliano de Oliveira Freitas
2013-10-01
Full Text Available PURPOSE: To determine linear regression models between Alpins descriptive indices and Thibos astigmatic power vectors (APV, assessing the validity and strength of such correlations. METHODS: This case series prospectively assessed 62 eyes of 31 consecutive cataract patients with preoperative corneal astigmatism between 0.75 and 2.50 diopters in both eyes. Patients were randomly assorted among two phacoemulsification groups: one assigned to receive AcrySof®Toric intraocular lens (IOL in both eyes and another assigned to have AcrySof Natural IOL associated with limbal relaxing incisions, also in both eyes. All patients were reevaluated postoperatively at 6 months, when refractive astigmatism analysis was performed using both Alpins and Thibos methods. The ratio between Thibos postoperative APV and preoperative APV (APVratio and its linear regression to Alpins percentage of success of astigmatic surgery, percentage of astigmatism corrected and percentage of astigmatism reduction at the intended axis were assessed. RESULTS: Significant negative correlation between the ratio of post- and preoperative Thibos APVratio and Alpins percentage of success (%Success was found (Spearman's ρ=-0.93; linear regression is given by the following equation: %Success = (-APVratio + 1.00x100. CONCLUSION: The linear regression we found between APVratio and %Success permits a validated mathematical inference concerning the overall success of astigmatic surgery.
Li, Spencer D.
2011-01-01
Mediation analysis in child and adolescent development research is possible using large secondary data sets. This article provides an overview of two statistical methods commonly used to test mediated effects in secondary analysis: multiple regression and structural equation modeling (SEM). Two empirical studies are presented to illustrate the…
Multiple regression analyses in artificial-grammar learning: the importance of control groups.
Lotz, Anja; Kinder, Annette; Lachnit, Harald
2009-03-01
In artificial-grammar learning, it is crucial to ensure that above-chance performance in the test stage is due to learning in the training stage but not due to judgemental biases. Here we argue that multiple regression analysis can be successfully combined with the use of control groups to assess whether participants were able to transfer knowledge acquired during training when making judgements about test stimuli. We compared the regression weights of judgements in a transfer condition (training and test strings were constructed by the same grammar but with different letters) with those in a control condition. Predictors were identical in both conditions-judgements of control participants were treated as if they were based on knowledge gained in a standard training stage. The results of this experiment as well as reanalyses of a former study support the usefulness of our approach.
DEFF Research Database (Denmark)
Scott, Neil W; Fayers, Peter M; Aaronson, Neil K
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....
DEFF Research Database (Denmark)
Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise ...... when testing for DIF in HRQoL instruments. We focus on logistic regression methods, which are often used because of their efficiency, simplicity and ease of application....
Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.
Directory of Open Access Journals (Sweden)
David S Boukal
Full Text Available Temperature drives development in insects and other ectotherms because their metabolic rate and growth depends directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI in insects, appears to be widespread and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
Analyses of Developmental Rate Isomorphy in Ectotherms: Introducing the Dirichlet Regression.
Boukal, David S; Ditrich, Tomáš; Kutcherov, Dmitry; Sroka, Pavel; Dudová, Pavla; Papáček, Miroslav
2015-01-01
Temperature drives development in insects and other ectotherms because their metabolic rate and growth depends directly on thermal conditions. However, relative durations of successive ontogenetic stages often remain nearly constant across a substantial range of temperatures. This pattern, termed 'developmental rate isomorphy' (DRI) in insects, appears to be widespread and reported departures from DRI are generally very small. We show that these conclusions may be due to the caveats hidden in the statistical methods currently used to study DRI. Because the DRI concept is inherently based on proportional data, we propose that Dirichlet regression applied to individual-level data is an appropriate statistical method to critically assess DRI. As a case study we analyze data on five aquatic and four terrestrial insect species. We find that results obtained by Dirichlet regression are consistent with DRI violation in at least eight of the studied species, although standard analysis detects significant departure from DRI in only four of them. Moreover, the departures from DRI detected by Dirichlet regression are consistently much larger than previously reported. The proposed framework can also be used to infer whether observed departures from DRI reflect life history adaptations to size- or stage-dependent effects of varying temperature. Our results indicate that the concept of DRI in insects and other ectotherms should be critically re-evaluated and put in a wider context, including the concept of 'equiproportional development' developed for copepods.
The Binomial Distribution in Shooting
Chalikias, Miltiadis S.
2009-01-01
The binomial distribution is used to predict the winner of the 49th International Shooting Sport Federation World Championship in double trap shooting held in 2006 in Zagreb, Croatia. The outcome of the competition was definitely unexpected.
Accommodating linkage disequilibrium in genetic-association analyses via ridge regression.
Malo, Nathalie; Libiger, Ondrej; Schork, Nicholas J
2008-02-01
Large-scale genetic-association studies that take advantage of an extremely dense set of genetic markers have begun to produce very compelling statistical associations between multiple makers exhibiting strong linkage disequilibrium (LD) in a single genomic region and a phenotype of interest. However, the ultimate biological or "functional" significance of these multiple associations has been difficult to discern. In fact, the LD relationships between not only the markers found to be associated with the phenotype but also potential functionally or causally relevant genetic variations that reside near those markers have been exploited in such studies. Unfortunately, LD, especially strong LD, between variations at neighboring loci can make it difficult to distinguish the functionally relevant variations from nonfunctional variations. Although there are (rare) situations in which it is impossible to determine the independent phenotypic effects of variations in LD, there are strategies for accommodating LD between variations at different loci, and they can be used to tease out their independent effects on a phenotype. These strategies make it possible to differentiate potentially causative from noncausative variations. We describe one such approach involving ridge regression. We showcase the method by using both simulated and real data. Our results suggest that ridge regression and related techniques have the potential to distinguish causative from noncausative variations in association studies.
Negative binomial properties and clan structure in multiplicity distributions
International Nuclear Information System (INIS)
Giovannini, A.; Van Hove, L.
1988-01-01
We review the negative binomial properties measured recently for many multiplicity distributions of high energy hadronic, semi-leptonic reactions in selected rapidity intervals. We analyse them in terms of the ''clan'' structure which can be defined for any negative binomial distribution. By comparing reactions we exhibit a number of regularities for the average number N-bar of clans and the average charged multiplicity (n-bar) c per clan. 22 refs., 6 figs. (author)
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements......-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...... in the Fibrinogen Studies Collaboration to assess the relationship between usual levels of plasma fibrinogen and the risk of coronary heart disease, allowing for measurement error in plasma fibrinogen and several confounders Udgivelsesdato: 2009/3/30...
DEFF Research Database (Denmark)
Tybjærg-Hansen, Anne
2009-01-01
Within-person variability in measured values of multiple risk factors can bias their associations with disease. The multivariate regression calibration (RC) approach can correct for such measurement error and has been applied to studies in which true values or independent repeat measurements...... of the risk factors are observed on a subsample. We extend the multivariate RC techniques to a meta-analysis framework where multiple studies provide independent repeat measurements and information on disease outcome. We consider the cases where some or all studies have repeat measurements, and compare study......-specific, averaged and empirical Bayes estimates of RC parameters. Additionally, we allow for binary covariates (e.g. smoking status) and for uncertainty and time trends in the measurement error corrections. Our methods are illustrated using a subset of individual participant data from prospective long-term studies...
Grégoire, G.
2014-12-01
The logistic regression originally is intended to explain the relationship between the probability of an event and a set of covariables. The model's coefficients can be interpreted via the odds and odds ratio, which are presented in introduction of the chapter. The observations are possibly got individually, then we speak of binary logistic regression. When they are grouped, the logistic regression is said binomial. In our presentation we mainly focus on the binary case. For statistical inference the main tool is the maximum likelihood methodology: we present the Wald, Rao and likelihoods ratio results and their use to compare nested models. The problems we intend to deal with are essentially the same as in multiple linear regression: testing global effect, individual effect, selection of variables to build a model, measure of the fitness of the model, prediction of new values… . The methods are demonstrated on data sets using R. Finally we briefly consider the binomial case and the situation where we are interested in several events, that is the polytomous (multinomial) logistic regression and the particular case of ordinal logistic regression.
PENERAPAN REGRESI BINOMIAL NEGATIF UNTUK MENGATASI OVERDISPERSI PADA REGRESI POISSON
Directory of Open Access Journals (Sweden)
PUTU SUSAN PRADAWATI
2013-09-01
Full Text Available Poisson regression was used to analyze the count data which Poisson distributed. Poisson regression analysis requires state equidispersion, in which the mean value of the response variable is equal to the value of the variance. However, there are deviations in which the value of the response variable variance is greater than the mean. This is called overdispersion. If overdispersion happens and Poisson Regression analysis is being used, then underestimated standard errors will be obtained. Negative Binomial Regression can handle overdispersion because it contains a dispersion parameter. From the simulation data which experienced overdispersion in the Poisson Regression model it was found that the Negative Binomial Regression was better than the Poisson Regression model.
Tutorial on Using Regression Models with Count Outcomes Using R
Directory of Open Access Journals (Sweden)
A. Alexander Beaujean
2016-02-01
Full Text Available Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares either with or without transforming the count variables. In either case, using typical regression for count data can produce parameter estimates that are biased, thus diminishing any inferences made from such data. As count-variable regression models are seldom taught in training programs, we present a tutorial to help educational researchers use such methods in their own research. We demonstrate analyzing and interpreting count data using Poisson, negative binomial, zero-inflated Poisson, and zero-inflated negative binomial regression models. The count regression methods are introduced through an example using the number of times students skipped class. The data for this example are freely available and the R syntax used run the example analyses are included in the Appendix.
[Using log-binomial model for estimating the prevalence ratio].
Ye, Rong; Gao, Yan-hui; Yang, Yi; Chen, Yue
2010-05-01
To estimate the prevalence ratios, using a log-binomial model with or without continuous covariates. Prevalence ratios for individuals' attitude towards smoking-ban legislation associated with smoking status, estimated by using a log-binomial model were compared with odds ratios estimated by logistic regression model. In the log-binomial modeling, maximum likelihood method was used when there were no continuous covariates and COPY approach was used if the model did not converge, for example due to the existence of continuous covariates. We examined the association between individuals' attitude towards smoking-ban legislation and smoking status in men and women. Prevalence ratio and odds ratio estimation provided similar results for the association in women since smoking was not common. In men however, the odds ratio estimates were markedly larger than the prevalence ratios due to a higher prevalence of outcome. The log-binomial model did not converge when age was included as a continuous covariate and COPY method was used to deal with the situation. All analysis was performed by SAS. Prevalence ratio seemed to better measure the association than odds ratio when prevalence is high. SAS programs were provided to calculate the prevalence ratios with or without continuous covariates in the log-binomial regression analysis.
Integer Solutions of Binomial Coefficients
Gilbertson, Nicholas J.
2016-01-01
A good formula is like a good story, rich in description, powerful in communication, and eye-opening to readers. The formula presented in this article for determining the coefficients of the binomial expansion of (x + y)n is one such "good read." The beauty of this formula is in its simplicity--both describing a quantitative situation…
Smoothness in Binomial Edge Ideals
Directory of Open Access Journals (Sweden)
Hamid Damadi
2016-06-01
Full Text Available In this paper we study some geometric properties of the algebraic set associated to the binomial edge ideal of a graph. We study the singularity and smoothness of the algebraic set associated to the binomial edge ideal of a graph. Some of these algebraic sets are irreducible and some of them are reducible. If every irreducible component of the algebraic set is smooth we call the graph an edge smooth graph, otherwise it is called an edge singular graph. We show that complete graphs are edge smooth and introduce two conditions such that the graph G is edge singular if and only if it satisfies these conditions. Then, it is shown that cycles and most of trees are edge singular. In addition, it is proved that complete bipartite graphs are edge smooth.
Seyoum, Awoke; Ndlovu, Principal; Zewotir, Temesgen
2016-01-01
CD4 cells are a type of white blood cells that plays a significant role in protecting humans from infectious diseases. Lack of information on associated factors on CD4 cell count reduction is an obstacle for improvement of cells in HIV positive adults. Therefore, the main objective of this study was to investigate baseline factors that could affect initial CD4 cell count change after highly active antiretroviral therapy had been given to adult patients in North West Ethiopia. A retrospective cross-sectional study was conducted among 792 HIV positive adult patients who already started antiretroviral therapy for 1 month of therapy. A Chi square test of association was used to assess of predictor covariates on the variable of interest. Data was secondary source and modeled using generalized linear models, especially Quasi-Poisson regression. The patients' CD4 cell count changed within a month ranged from 0 to 109 cells/mm 3 with a mean of 15.9 cells/mm 3 and standard deviation 18.44 cells/mm 3 . The first month CD4 cell count change was significantly affected by poor adherence to highly active antiretroviral therapy (aRR = 0.506, P value = 2e -16 ), fair adherence (aRR = 0.592, P value = 0.0120), initial CD4 cell count (aRR = 1.0212, P value = 1.54e -15 ), low household income (aRR = 0.63, P value = 0.671e -14 ), middle income (aRR = 0.74, P value = 0.629e -12 ), patients without cell phone (aRR = 0.67, P value = 0.615e -16 ), WHO stage 2 (aRR = 0.91, P value = 0.0078), WHO stage 3 (aRR = 0.91, P value = 0.0058), WHO stage 4 (0876, P value = 0.0214), age (aRR = 0.987, P value = 0.000) and weight (aRR = 1.0216, P value = 3.98e -14 ). Adherence to antiretroviral therapy, initial CD4 cell count, household income, WHO stages, age, weight and owner of cell phone played a major role for the variation of CD4 cell count in our data. Hence, we recommend a close follow-up of patients to adhere the prescribed medication for
Chromosome aberration analysis based on a beta-binomial distribution
International Nuclear Information System (INIS)
Otake, Masanori; Prentice, R.L.
1983-10-01
Analyses carried out here generalized on earlier studies of chromosomal aberrations in the populations of Hiroshima and Nagasaki, by allowing extra-binomial variation in aberrant cell counts corresponding to within-subject correlations in cell aberrations. Strong within-subject correlations were detected with corresponding standard errors for the average number of aberrant cells that were often substantially larger than was previously assumed. The extra-binomial variation is accomodated in the analysis in the present report, as described in the section on dose-response models, by using a beta-binomial (B-B) variance structure. It is emphasized that we have generally satisfactory agreement between the observed and the B-B fitted frequencies by city-dose category. The chromosomal aberration data considered here are not extensive enough to allow a precise discrimination between competing dose-response models. A quadratic gamma ray and linear neutron model, however, most closely fits the chromosome data. (author)
Application of Negative Binomial Regression for Assessing Public ...
African Journals Online (AJOL)
This study investigated the social or demographic factors associated with public awareness of health warnings on the .... television, and in cinemas, newspapers, magazines, billboards, posters in shops and pamphlets. ... demographic and socio-economic factors affecting respondents' awareness of health warnings on the ...
Zero-truncated negative binomial - Erlang distribution
Bodhisuwan, Winai; Pudprommarat, Chookait; Bodhisuwan, Rujira; Saothayanun, Luckhana
2017-11-01
The zero-truncated negative binomial-Erlang distribution is introduced. It is developed from negative binomial-Erlang distribution. In this work, the probability mass function is derived and some properties are included. The parameters of the zero-truncated negative binomial-Erlang distribution are estimated by using the maximum likelihood estimation. Finally, the proposed distribution is applied to real data, the number of methamphetamine in the Bangkok, Thailand. Based on the results, it shows that the zero-truncated negative binomial-Erlang distribution provided a better fit than the zero-truncated Poisson, zero-truncated negative binomial, zero-truncated generalized negative-binomial and zero-truncated Poisson-Lindley distributions for this data.
International Nuclear Information System (INIS)
Arneodo, M.; Ferrero, M.I.; Peroni, C.; Bee, C.P.; Bird, I.; Coughlan, J.; Sloan, T.; Braun, H.; Brueck, H.; Drees, J.; Edwards, A.; Krueger, J.; Montgomery, H.E.; Peschel, H.; Pietrzyk, U.; Poetsch, M.; Schneider, A.; Dreyer, T.; Ernst, T.; Haas, J.; Kabuss, E.M.; Landgraf, U.; Mohr, W.; Rith, K.; Schlagboehmer, A.; Schroeder, T.; Stier, H.E.; Wallucks, W.
1987-01-01
The multiplicity distributions of charged hadrons produced in the deep inelastic muon-proton scattering at 280 GeV are analysed in various rapidity intervals, as a function of the total hadronic centre of mass energy W ranging from 4-20 GeV. Multiplicity distributions for the backward and forward hemispheres are also analysed separately. The data can be well parameterized by binomial distributions, extending their range of applicability to the case of lepton-proton scattering. The energy and the rapidity dependence of the parameters is presented and a smooth transition from the binomial distribution via Poissonian to the ordinary binomial is observed. (orig.)
System-Reliability Cumulative-Binomial Program
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, NEWTONP, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), used independently of one another. Program finds probability required to yield given system reliability. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
A class of orthogonal nonrecursive binomial filters.
Haddad, R. A.
1971-01-01
The time- and frequency-domain properties of the orthogonal binomial sequences are presented. It is shown that these sequences, or digital filters based on them, can be generated using adders and delay elements only. The frequency-domain behavior of these nonrecursive binomial filters suggests a number of applications as low-pass Gaussian filters or as inexpensive bandpass filters.
Common-Reliability Cumulative-Binomial Program
Scheuer, Ernest, M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CROSSER, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), used independently of one another. Point of equality between reliability of system and common reliability of components found. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
Problems on Divisibility of Binomial Coefficients
Osler, Thomas J.; Smoak, James
2004-01-01
Twelve unusual problems involving divisibility of the binomial coefficients are represented in this article. The problems are listed in "The Problems" section. All twelve problems have short solutions which are listed in "The Solutions" section. These problems could be assigned to students in any course in which the binomial theorem and Pascal's…
Directory of Open Access Journals (Sweden)
Michaela Fakiola
2010-12-01
Full Text Available Genome wide linkage studies (GWLS have provided evidence for loci controlling visceral leishmaniasis on Chromosomes 1p22, 6q27, 22q12 in Sudan and 6q27, 9p21, 17q11-q21 in Brazil. Genome wide studies from the major focus of disease in India have not previously been reported.We undertook a GWLS in India in which a primary ∼10 cM (515 microsatellites scan was carried out in 58 multicase pedigrees (74 nuclear families; 176 affected, 353 total individuals and replication sought in 79 pedigrees (102 nuclear families; 218 affected, 473 total individuals. The primary scan provided evidence (≥2 adjacent markers allele-sharing LOD≥0.59; nominal P≤0.05 for linkage on Chromosomes 2, 5, 6, 7, 8, 10, 11, 20 and X, with peaks at 6p25.3-p24.3 and 8p23.1-p21.3 contributed to largely by 31 Hindu families and at Xq21.1-q26.1 by 27 Muslim families. Refined mapping confirmed linkage across all primary scan families at 2q12.2-q14.1 and 11q13.2-q23.3, but only 11q13.2-q23.3 replicated (combined LOD = 1.59; P = 0.0034. Linkage at 6p25.3-p24.3 and 8p23.1-p21.3, and at Xq21.1-q26.1, was confirmed by refined mapping for primary Hindu and Muslim families, respectively, but only Xq21.1-q26.1 replicated across all Muslim families (combined LOD 1.49; P = 0.0045. STRUCTURE and SMARTPCA did not identify population genetic substructure related to religious group. Classification and regression tree, and spatial interpolation, analyses confirm geographical heterogeneity for linkages at 6p25.3-p24.3, 8p23.1-p21.3 and Xq21.1-q26.1, with specific clusters of families contributing LOD scores of 2.13 (P = 0.0009, 1.75 (P = 0.002 and 1.84 (P = 0.001, respectively.GWLS has identified novel loci that show geographical heterogeneity in their influence on susceptibility to VL in India.
Directory of Open Access Journals (Sweden)
Željko V. Račić
2010-12-01
Full Text Available This paper aims to present the specifics of the application of multiple linear regression model. The economic (financial crisis is analyzed in terms of gross domestic product which is in a function of the foreign trade balance (on one hand and the credit cards, i.e. indebtedness of the population on this basis (on the other hand, in the USA (from 1999. to 2008. We used the extended application model which shows how the analyst should run the whole development process of regression model. This process began with simple statistical features and the application of regression procedures, and ended with residual analysis, intended for the study of compatibility of data and model settings. This paper also analyzes the values of some standard statistics used in the selection of appropriate regression model. Testing of the model is carried out with the use of the Statistics PASW 17 program.
Correlated binomial models and correlation structures
International Nuclear Information System (INIS)
Hisakado, Masato; Kitsukawa, Kenji; Mori, Shintaro
2006-01-01
We discuss a general method to construct correlated binomial distributions by imposing several consistent relations on the joint probability function. We obtain self-consistency relations for the conditional correlations and conditional probabilities. The beta-binomial distribution is derived by a strong symmetric assumption on the conditional correlations. Our derivation clarifies the 'correlation' structure of the beta-binomial distribution. It is also possible to study the correlation structures of other probability distributions of exchangeable (homogeneous) correlated Bernoulli random variables. We study some distribution functions and discuss their behaviours in terms of their correlation structures
International Nuclear Information System (INIS)
Memon, Abdul Ghafoor; Memon, Rizwan Ahmed; Harijan, Khanji; Uqaili, Mohammad Aslam
2015-01-01
Highlights: • Thermo-environmental and exergoeconomic models of a combined cycle power plant are defined. • Effects of various operating parameters on performance, CO 2 emissions and costs are deliberated. • Multiple polynomial regression models are developed. • For various operating conditions, optimal operating parameters are determined. - Abstract: A combined cycle power plant is analyzed through thermo-environmental, exergoeconomic and statistical methods. The plant is first modeled and parametrically studied to deliberate the effects of various operating parameters on the thermo-environmental quantities, like net power output, energy efficiency, exergy efficiency and CO 2 emissions. These quantities are then correlated with operating parameters through multiple polynomial regression analysis. Moreover, exergoeconomic analysis is performed to look into the impact of operating parameters on fuel cost, capital cost and exergy destruction cost. The optimal operating parameters are then determined using the Nelder-Mead simplex method by defining two objective functions, namely exergy efficiency (maximized) and total cost (minimized). According to the parametric analysis, the operating parameters impart significant effects on the performance and cost rates. The regression models are appearing to be a good estimator of the response variables since appended with satisfactory R 2 values. The optimization results exhibit that the exergy efficiency is increased and cost rates are decreased by selecting the best trade-off values at different power output conditions
Newton Binomial Formulas in Schubert Calculus
Cordovez, Jorge; Gatto, Letterio; Santiago, Taise
2008-01-01
We prove Newton's binomial formulas for Schubert Calculus to determine numbers of base point free linear series on the projective line with prescribed ramification divisor supported at given distinct points.
Calculating Cumulative Binomial-Distribution Probabilities
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CUMBIN, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), used independently of one another. Reliabilities and availabilities of k-out-of-n systems analyzed. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Used for calculations of reliability and availability. Program written in C.
Log-binomial models: exploring failed convergence.
Williamson, Tyler; Eliasziw, Misha; Fick, Gordon Hilton
2013-12-13
Relative risk is a summary metric that is commonly used in epidemiological investigations. Increasingly, epidemiologists are using log-binomial models to study the impact of a set of predictor variables on a single binary outcome, as they naturally offer relative risks. However, standard statistical software may report failed convergence when attempting to fit log-binomial models in certain settings. The methods that have been proposed in the literature for dealing with failed convergence use approximate solutions to avoid the issue. This research looks directly at the log-likelihood function for the simplest log-binomial model where failed convergence has been observed, a model with a single linear predictor with three levels. The possible causes of failed convergence are explored and potential solutions are presented for some cases. Among the principal causes is a failure of the fitting algorithm to converge despite the log-likelihood function having a single finite maximum. Despite these limitations, log-binomial models are a viable option for epidemiologists wishing to describe the relationship between a set of predictors and a binary outcome where relative risk is the desired summary measure. Epidemiologists are encouraged to continue to use log-binomial models and advocate for improvements to the fitting algorithms to promote the widespread use of log-binomial models.
The Validation of a Beta-Binomial Model for Overdispersed Binomial Data.
Kim, Jongphil; Lee, Ji-Hyun
2017-01-01
The beta-binomial model has been widely used as an analytically tractable alternative that captures the overdispersion of an intra-correlated, binomial random variable, X . However, the model validation for X has been rarely investigated. As a beta-binomial mass function takes on a few different shapes, the model validation is examined for each of the classified shapes in this paper. Further, the mean square error (MSE) is illustrated for each shape by the maximum likelihood estimator (MLE) based on a beta-binomial model approach and the method of moments estimator (MME) in order to gauge when and how much the MLE is biased.
Laszlo, Sarah; Federmeier, Kara D.
2010-01-01
Linking print with meaning tends to be divided into subprocesses, such as recognition of an input's lexical entry and subsequent access of semantics. However, recent results suggest that the set of semantic features activated by an input is broader than implied by a view wherein access serially follows recognition. EEG was collected from participants who viewed items varying in number and frequency of both orthographic neighbors and lexical associates. Regression analysis of single item ERPs replicated past findings, showing that N400 amplitudes are greater for items with more neighbors, and further revealed that N400 amplitudes increase for items with more lexical associates and with higher frequency neighbors or associates. Together, the data suggest that in the N400 time window semantic features of items broadly related to inputs are active, consistent with models in which semantic access takes place in parallel with stimulus recognition. PMID:20624252
Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.
Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao
2016-01-15
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using currents methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.
Microbial comparative pan-genomics using binomial mixture models
DEFF Research Database (Denmark)
Ussery, David; Snipen, L; Almøy, T
2009-01-01
The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter...... occurring genes in the population. CONCLUSION: Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes....
Qi, Danyi; Roe, Brian E
2016-01-01
We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.
Directory of Open Access Journals (Sweden)
Danyi Qi
Full Text Available We estimate models of consumer food waste awareness and attitudes using responses from a national survey of U.S. residents. Our models are interpreted through the lens of several theories that describe how pro-social behaviors relate to awareness, attitudes and opinions. Our analysis of patterns among respondents' food waste attitudes yields a model with three principal components: one that represents perceived practical benefits households may lose if food waste were reduced, one that represents the guilt associated with food waste, and one that represents whether households feel they could be doing more to reduce food waste. We find our respondents express significant agreement that some perceived practical benefits are ascribed to throwing away uneaten food, e.g., nearly 70% of respondents agree that throwing away food after the package date has passed reduces the odds of foodborne illness, while nearly 60% agree that some food waste is necessary to ensure meals taste fresh. We identify that these attitudinal responses significantly load onto a single principal component that may represent a key attitudinal construct useful for policy guidance. Further, multivariate regression analysis reveals a significant positive association between the strength of this component and household income, suggesting that higher income households most strongly agree with statements that link throwing away uneaten food to perceived private benefits.
Directory of Open Access Journals (Sweden)
Abdelfattah M. Selim
2018-03-01
Full Text Available Aim: The present cross-sectional study was conducted to determine the seroprevalence and potential risk factors associated with Bovine viral diarrhea virus (BVDV disease in cattle and buffaloes in Egypt, to model the potential risk factors associated with the disease using logistic regression (LR models, and to fit the best predictive model for the current data. Materials and Methods: A total of 740 blood samples were collected within November 2012-March 2013 from animals aged between 6 months and 3 years. The potential risk factors studied were species, age, sex, and herd location. All serum samples were examined with indirect ELIZA test for antibody detection. Data were analyzed with different statistical approaches such as Chi-square test, odds ratios (OR, univariable, and multivariable LR models. Results: Results revealed a non-significant association between being seropositive with BVDV and all risk factors, except for species of animal. Seroprevalence percentages were 40% and 23% for cattle and buffaloes, respectively. OR for all categories were close to one with the highest OR for cattle relative to buffaloes, which was 2.237. Likelihood ratio tests showed a significant drop of the -2LL from univariable LR to multivariable LR models. Conclusion: There was an evidence of high seroprevalence of BVDV among cattle as compared with buffaloes with the possibility of infection in different age groups of animals. In addition, multivariable LR model was proved to provide more information for association and prediction purposes relative to univariable LR models and Chi-square tests if we have more than one predictor.
Simulation on Poisson and negative binomial models of count road accident modeling
Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.
2016-11-01
Accident count data have often been shown to have overdispersion. On the other hand, the data might contain zero count (excess zeros). The simulation study was conducted to create a scenarios which an accident happen in T-junction with the assumption the dependent variables of generated data follows certain distribution namely Poisson and negative binomial distribution with different sample size of n=30 to n=500. The study objective was accomplished by fitting Poisson regression, negative binomial regression and Hurdle negative binomial model to the simulated data. The model validation was compared and the simulation result shows for each different sample size, not all model fit the data nicely even though the data generated from its own distribution especially when the sample size is larger. Furthermore, the larger sample size indicates that more zeros accident count in the dataset.
Misyura, Maksym; Sukhai, Mahadeo A; Kulasignam, Vathany; Zhang, Tong; Kamel-Reid, Suzanne; Stockley, Tracy L
2018-02-01
A standard approach in test evaluation is to compare results of the assay in validation to results from previously validated methods. For quantitative molecular diagnostic assays, comparison of test values is often performed using simple linear regression and the coefficient of determination (R 2 ), using R 2 as the primary metric of assay agreement. However, the use of R 2 alone does not adequately quantify constant or proportional errors required for optimal test evaluation. More extensive statistical approaches, such as Bland-Altman and expanded interpretation of linear regression methods, can be used to more thoroughly compare data from quantitative molecular assays. We present the application of Bland-Altman and linear regression statistical methods to evaluate quantitative outputs from next-generation sequencing assays (NGS). NGS-derived data sets from assay validation experiments were used to demonstrate the utility of the statistical methods. Both Bland-Altman and linear regression were able to detect the presence and magnitude of constant and proportional error in quantitative values of NGS data. Deming linear regression was used in the context of assay comparison studies, while simple linear regression was used to analyse serial dilution data. Bland-Altman statistical approach was also adapted to quantify assay accuracy, including constant and proportional errors, and precision where theoretical and empirical values were known. The complementary application of the statistical methods described in this manuscript enables more extensive evaluation of performance characteristics of quantitative molecular assays, prior to implementation in the clinical molecular laboratory. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Binomial distribution for the charge asymmetry parameter
International Nuclear Information System (INIS)
Chou, T.T.; Yang, C.N.
1984-01-01
It is suggested that for high energy collisions the distribution with respect to the charge asymmetry z = nsub(F) - nsub(B) is binomial, where nsub(F) and nsub(B) are the forward and backward charge multiplicities. (orig.)
Binomial test models and item difficulty
van der Linden, Willem J.
1979-01-01
In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally
Adaptive estimation of binomial probabilities under misclassification
Albers, Willem/Wim; Veldman, H.J.
1984-01-01
If misclassification occurs the standard binomial estimator is usually seriously biased. It is known that an improvement can be achieved by using more than one observer in classifying the sample elements. Here it will be investigated which number of observers is optimal given the total number of
Binomial vs poisson statistics in radiation studies
International Nuclear Information System (INIS)
Foster, J.; Kouris, K.; Spyrou, N.M.; Matthews, I.P.; Welsh National School of Medicine, Cardiff
1983-01-01
The processes of radioactive decay, decay and growth of radioactive species in a radioactive chain, prompt emission(s) from nuclear reactions, conventional activation and cyclic activation are discussed with respect to their underlying statistical density function. By considering the transformation(s) that each nucleus may undergo it is shown that all these processes are fundamentally binomial. Formally, when the number of experiments N is large and the probability of success p is close to zero, the binomial is closely approximated by the Poisson density function. In radiation and nuclear physics, N is always large: each experiment can be conceived of as the observation of the fate of each of the N nuclei initially present. Whether p, the probability that a given nucleus undergoes a prescribed transformation, is close to zero depends on the process and nuclide(s) concerned. Hence, although a binomial description is always valid, the Poisson approximation is not always adequate. Therefore further clarification is provided as to when the binomial distribution must be used in the statistical treatment of detected events. (orig.)
The Normal Distribution From Binomial to Normal
Indian Academy of Sciences (India)
Home; Journals; Resonance – Journal of Science Education; Volume 2; Issue 6. The Normal Distribution From Binomial to Normal. S Ramasubramanian. Series Article Volume 2 Issue 6 June 1997 pp 15-24. Fulltext. Click here to view fulltext PDF. Permanent link: http://www.ias.ac.in/article/fulltext/reso/002/06/0015-0024 ...
Adaptive bayesian analysis for binomial proportions
CSIR Research Space (South Africa)
Das, Sonali
2008-10-01
Full Text Available The authors consider the problem of statistical inference of binomial proportions for non-matched, correlated samples, under the Bayesian framework. Such inference can arise when the same group is observed at a different number of times with the aim...
Dyer, Betsey D.; Kahn, Michael J.; LeBlanc, Mark D.
2008-01-01
Classification and regression tree (CART) analysis was applied to genome-wide tetranucleotide frequencies (genomic signatures) of 195 archaea and bacteria. Although genomic signatures have typically been used to classify evolutionary divergence, in this study, convergent evolution was the focus. Temperature optima for most of the organisms examined could be distinguished by CART analyses of tetranucleotide frequencies. This suggests that pervasive (nonlinear) qualities of genomes may reflect certain environmental conditions (such as temperature) in which those genomes evolved. The predominant use of GAGA and AGGA as the discriminating tetramers in CART models suggests that purine-loading and codon biases of thermophiles may explain some of the results. PMID:19054742
Expansion around half-integer values, binomial sums, and inverse binomial sums
International Nuclear Information System (INIS)
Weinzierl, Stefan
2004-01-01
I consider the expansion of transcendental functions in a small parameter around rational numbers. This includes in particular the expansion around half-integer values. I present algorithms which are suitable for an implementation within a symbolic computer algebra system. The method is an extension of the technique of nested sums. The algorithms allow in addition the evaluation of binomial sums, inverse binomial sums and generalizations thereof
Tomography of binomial states of the radiation field
Bazrafkan, MR; Man'ko, [No Value
2004-01-01
The symplectic, optical, and photon-number tomographic symbols of binomial states of the radiation field are studied. Explicit relations for all tomograms of the binomial states are obtained. Two measures for nonclassical properties of these states are discussed.
Pooling overdispersed binomial data to estimate event rate.
Young-Xu, Yinong; Chan, K Arnold
2008-08-19
The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
Pooling overdispersed binomial data to estimate event rate
Directory of Open Access Journals (Sweden)
Chan K Arnold
2008-08-01
Full Text Available Abstract Background The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. Methods We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. Results In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. Conclusion The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
International Nuclear Information System (INIS)
Amitabh, J.; Vaccaro, J.A.; Hill, K.E.
1998-01-01
We study the recently defined number-phase Wigner function S NP (n,θ) for a single-mode field considered to be in binomial and negative binomial states. These states interpolate between Fock and coherent states and coherent and quasi thermal states, respectively, and thus provide a set of states with properties ranging from uncertain phase and sharp photon number to sharp phase and uncertain photon number. The distribution function S NP (n,θ) gives a graphical representation of the complimentary nature of the number and phase properties of these states. We highlight important differences between Wigner's quasi probability function, which is associated with the position and momentum observables, and S NP (n,θ), which is associated directly with the photon number and phase observables. We also discuss the number-phase entropic uncertainty relation for the binomial and negative binomial states and we show that negative binomial states give a lower phase entropy than states which minimize the phase variance
Hutton, Eileen K; Simioni, Julia C; Thabane, Lehana
2017-08-01
Among women with a fetus with a non-cephalic presentation, external cephalic version (ECV) has been shown to reduce the rate of breech presentation at birth and cesarean birth. Compared with ECV at term, beginning ECV prior to 37 weeks' gestation decreases the number of infants in a non-cephalic presentation at birth. The purpose of this secondary analysis was to investigate factors associated with a successful ECV procedure and to present this in a clinically useful format. Data were collected as part of the Early ECV Pilot and Early ECV2 Trials, which randomized 1776 women with a fetus in breech presentation to either early ECV (34-36 weeks' gestation) or delayed ECV (at or after 37 weeks). The outcome of interest was successful ECV, defined as the fetus being in a cephalic presentation immediately following the procedure, as well as at the time of birth. The importance of several factors in predicting successful ECV was investigated using two statistical methods: logistic regression and classification and regression tree (CART) analyses. Among nulliparas, non-engagement of the presenting part and an easily palpable fetal head were independently associated with success. Among multiparas, non-engagement of the presenting part, gestation less than 37 weeks and an easily palpable fetal head were found to be independent predictors of success. These findings were consistent with results of the CART analyses. Regardless of parity, descent of the presenting part was the most discriminating factor in predicting successful ECV and cephalic presentation at birth. © 2017 Nordic Federation of Societies of Obstetrics and Gynecology.
Tran, Phoebe; Waller, Lance
2015-01-01
Lyme disease has been the subject of many studies due to increasing incidence rates year after year and the severe complications that can arise in later stages of the disease. Negative binomial models have been used to model Lyme disease in the past with some success. However, there has been little focus on the reliability and consistency of these models when they are used to study Lyme disease at multiple spatial scales. This study seeks to explore how sensitive/consistent negative binomial models are when they are used to study Lyme disease at different spatial scales (at the regional and sub-regional levels). The study area includes the thirteen states in the Northeastern United States with the highest Lyme disease incidence during the 2002-2006 period. Lyme disease incidence at county level for the period of 2002-2006 was linked with several previously identified key landscape and climatic variables in a negative binomial regression model for the Northeastern region and two smaller sub-regions (the New England sub-region and the Mid-Atlantic sub-region). This study found that negative binomial models, indeed, were sensitive/inconsistent when used at different spatial scales. We discuss various plausible explanations for such behavior of negative binomial models. Further investigation of the inconsistency and sensitivity of negative binomial models when used at different spatial scales is important for not only future Lyme disease studies and Lyme disease risk assessment/management but any study that requires use of this model type in a spatial context. Copyright © 2014 Elsevier Inc. All rights reserved.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. This clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods is available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions; namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; plus, we describe results from the application of these methods on data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
On Using Selection Procedures with Binomial Models.
1983-10-01
eds.), Shinko Tsusho Co. Ltd., Tokyo, Japan , pp. 501-533. Gupta, S. S. and Sobel, M. (1960). Selecting a subset containing the best of several...IA_____3_6r__I____ *TITLE food A$ieweI L TYPE of 09PORT 6 PERIOD COVERED ON USING SELECTION PROCEDURES WITH BINOMIAL MODELS Technical 6. PeSPRFeauS1 ONG. REPORT...ontoedis stoc toeSI. to Ei.,..,t&* toemR.,. 14. SUPPOLEMENTARY MOCTES 19. Rey WORDS (Coatiou. 40 ow.oa* edo if Necesary and #do""&a by block number
Bayesian analysis of a correlated binomial model
Diniz, Carlos A. R.; Tutia, Marcelo H.; Leite, Jose G.
2010-01-01
In this paper a Bayesian approach is applied to the correlated binomial model, CB(n, p, ρ), proposed by Luceño (Comput. Statist. Data Anal. 20 (1995) 511–520). The data augmentation scheme is used in order to overcome the complexity of the mixture likelihood. MCMC methods, including Gibbs sampling and Metropolis within Gibbs, are applied to estimate the posterior marginal for the probability of success p and for the correlation coefficient ρ. The sensitivity of the posterior is studied taking...
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O.
2017-03-01
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlights the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and chemometric approach to PAH (anthracene) and chiral-PAH analogue derivatives (1-(9-anthryl)-2,2,2-triflouroethanol (TFE)) analyses are reported. The binding constants (Kb), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated Kb and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric of TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81 × 10- 7 M for anthracene and 3.48 × 10- 8 M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a
Peluso, Marco E M; Munnia, Armelle; Ceppi, Marcello
2014-11-05
Exposures to bisphenol-A, a weak estrogenic chemical, largely used for the production of plastic containers, can affect the rodent behaviour. Thus, we examined the relationships between bisphenol-A and the anxiety-like behaviour, spatial skills, and aggressiveness, in 12 toxicity studies of rodent offspring from females orally exposed to bisphenol-A, while pregnant and/or lactating, by median and linear splines analyses. Subsequently, the meta-regression analysis was applied to quantify the behavioural changes. U-shaped, inverted U-shaped and J-shaped dose-response curves were found to describe the relationships between bisphenol-A with the behavioural outcomes. The occurrence of anxiogenic-like effects and spatial skill changes displayed U-shaped and inverted U-shaped curves, respectively, providing examples of effects that are observed at low-doses. Conversely, a J-dose-response relationship was observed for aggressiveness. When the proportion of rodents expressing certain traits or the time that they employed to manifest an attitude was analysed, the meta-regression indicated that a borderline significant increment of anxiogenic-like effects was present at low-doses regardless of sexes (β)=-0.8%, 95% C.I. -1.7/0.1, P=0.076, at ≤120 μg bisphenol-A. Whereas, only bisphenol-A-males exhibited a significant inhibition of spatial skills (β)=0.7%, 95% C.I. 0.2/1.2, P=0.004, at ≤100 μg/day. A significant increment of aggressiveness was observed in both the sexes (β)=67.9,C.I. 3.4, 172.5, P=0.038, at >4.0 μg. Then, bisphenol-A treatments significantly abrogated spatial learning and ability in males (P<0.001 vs. females). Overall, our study showed that developmental exposures to low-doses of bisphenol-A, e.g. ≤120 μg/day, were associated to behavioural aberrations in offspring. Copyright © 2014. Published by Elsevier Ireland Ltd.
DEFF Research Database (Denmark)
Sharma, Tarang; Gøtzsche, Peter C.; Kuss, Oliver
2017-01-01
Carlo (MCMC), and a beta-binomial regression model. Results The estimates for the four outcomes did not change substantially across the different methods; the Yusuf-Peto method underestimated the treatment harm and overestimated its precision, especially when the estimated odds ratio deviated greatly...... from 1. For example, the odds ratio for suicidality for children and adolescents was 2.39 (95% confidence interval = 1.32–4.33), using the Yusuf-Peto method but increased to 2.64 (1.33–5.26) using conditional logistic regression, to 2.69 (1.19–6.09) using beta-binomial, to 2.73 (1.37–5.42) using......, sensitivity analyses should be performed using different methods instead of the Yusuf-Peto approach, in particular the beta-binomial method, which was shown to be superior through a simulation study....
Fossati, Andrea; Widiger, Thomas A; Borroni, Serena; Maffei, Cesare; Somma, Antonella
2017-06-01
To extend the evidence on the reliability and construct validity of the Five-Factor Model Rating Form (FFMRF) in its self-report version, two independent samples of Italian participants, which were composed of 510 adolescent high school students and 457 community-dwelling adults, respectively, were administered the FFMRF in its Italian translation. Adolescent participants were also administered the Italian translation of the Borderline Personality Features Scale for Children-11 (BPFSC-11), whereas adult participants were administered the Italian translation of the Triarchic Psychopathy Measure (TriPM). Cronbach α values were consistent with previous findings; in both samples, average interitem r values indicated acceptable internal consistency for all FFMRF scales. A multidimensional graded item response theory model indicated that the majority of FFMRF items had adequate discrimination parameters; information indices supported the reliability of the FFMRF scales. Both categorical (i.e., item-level) and scale-level regression analyses suggested that the FFMRF scores may predict a nonnegligible amount of variance in the BPFSC-11 total score in adolescent participants, and in the TriPM scale scores in adult participants.
Johansen, Mette; Bahrt, Henriette; Altman, Roy D; Bartels, Else M; Juhl, Carsten B; Bliddal, Henning; Lund, Hans; Christensen, Robin
2016-08-01
The aim was to identify factors explaining inconsistent observations concerning the efficacy of intra-articular hyaluronic acid compared to intra-articular sham/control, or non-intervention control, in patients with symptomatic osteoarthritis, based on randomized clinical trials (RCTs). A systematic review and meta-regression analyses of available randomized trials were conducted. The outcome, pain, was assessed according to a pre-specified hierarchy of potentially available outcomes. Hedges׳s standardized mean difference [SMD (95% CI)] served as effect size. REstricted Maximum Likelihood (REML) mixed-effects models were used to combine study results, and heterogeneity was calculated and interpreted as Tau-squared and I-squared, respectively. Overall, 99 studies (14,804 patients) met the inclusion criteria: Of these, only 71 studies (72%), including 85 comparisons (11,216 patients), had adequate data available for inclusion in the primary meta-analysis. Overall, compared with placebo, intra-articular hyaluronic acid reduced pain with an effect size of -0.39 [-0.47 to -0.31; P hyaluronic acid. Based on available trial data, intra-articular hyaluronic acid showed a better effect than intra-articular saline on pain reduction in osteoarthritis. Publication bias and the risk of selective outcome reporting suggest only small clinical effect compared to saline. Copyright © 2016 Elsevier Inc. All rights reserved.
Zero inflated negative binomial-Sushila distribution and its application
Yamrubboon, Darika; Thongteeraparp, Ampai; Bodhisuwan, Winai; Jampachaisri, Katechan
2017-11-01
A new zero inflated distribution is proposed in this work, namely the zero inflated negative binomial-Sushila distribution. The new distribution which is a mixture of the Bernoulli and negative binomial-Sushila distributions is an alternative distribution for the excessive zero counts and over-dispersion. Some characteristics of the proposed distribution are derived including probability mass function, mean and variance. The parameter estimation of the zero inflated negative binomial-Sushila distribution is also implemented by maximum likelihood method. In application, the proposed distribution can provide a better fit than traditional distributions: zero inflated Poisson and zero inflated negative binomial distributions.
Samdal, Gro Beate; Eide, Geir Egil; Barth, Tom; Williams, Geoffrey; Meland, Eivind
2017-03-28
This systematic review aims to explain the heterogeneity in results of interventions to promote physical activity and healthy eating for overweight and obese adults, by exploring the differential effects of behaviour change techniques (BCTs) and other intervention characteristics. The inclusion criteria specified RCTs with ≥ 12 weeks' duration, from January 2007 to October 2014, for adults (mean age ≥ 40 years, mean BMI ≥ 30). Primary outcomes were measures of healthy diet or physical activity. Two reviewers rated study quality, coded the BCTs, and collected outcome results at short (≤6 months) and long term (≥12 months). Meta-analyses and meta-regressions were used to estimate effect sizes (ES), heterogeneity indices (I 2 ) and regression coefficients. We included 48 studies containing a total of 82 outcome reports. The 32 long term reports had an overall ES = 0.24 with 95% confidence interval (CI): 0.15 to 0.33 and I 2 = 59.4%. The 50 short term reports had an ES = 0.37 with 95% CI: 0.26 to 0.48, and I 2 = 71.3%. The number of BCTs unique to the intervention group, and the BCTs goal setting and self-monitoring of behaviour predicted the effect at short and long term. The total number of BCTs in both intervention arms and using the BCTs goal setting of outcome, feedback on outcome of behaviour, implementing graded tasks, and adding objects to the environment, e.g. using a step counter, significantly predicted the effect at long term. Setting a goal for change; and the presence of reporting bias independently explained 58.8% of inter-study variation at short term. Autonomy supportive and person-centred methods as in Motivational Interviewing, the BCTs goal setting of behaviour, and receiving feedback on the outcome of behaviour, explained all of the between study variations in effects at long term. There are similarities, but also differences in effective BCTs promoting change in healthy eating and physical activity and
Extending the Binomial Checkpointing Technique for Resilience
Energy Technology Data Exchange (ETDEWEB)
Walther, Andrea; Narayanan, Sri Hari Krishna
2016-10-10
In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, re- quired, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional with the operation count of the underlying function, e.g., if algo- rithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulations and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We de- scribe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding imple- mentation and discuss numerical results.
Forward selection two sample binomial test
Wong, Kam-Fai; Wong, Weng-Kee; Lin, Miao-Shan
2016-01-01
Fisher’s exact test (FET) is a conditional method that is frequently used to analyze data in a 2 × 2 table for small samples. This test is conservative and attempts have been made to modify the test to make it less conservative. For example, Crans and Shuster (2008) proposed adding more points in the rejection region to make the test more powerful. We provide another way to modify the test to make it less conservative by using two independent binomial distributions as the reference distribution for the test statistic. We compare our new test with several methods and show that our test has advantages over existing methods in terms of control of the type 1 and type 2 errors. We reanalyze results from an oncology trial using our proposed method and our software which is freely available to the reader. PMID:27335577
Fractional Sums and Differences with Binomial Coefficients
Directory of Open Access Journals (Sweden)
Thabet Abdeljawad
2013-01-01
Full Text Available In fractional calculus, there are two approaches to obtain fractional derivatives. The first approach is by iterating the integral and then defining a fractional order by using Cauchy formula to obtain Riemann fractional integrals and derivatives. The second approach is by iterating the derivative and then defining a fractional order by making use of the binomial theorem to obtain Grünwald-Letnikov fractional derivatives. In this paper we formulate the delta and nabla discrete versions for left and right fractional integrals and derivatives representing the second approach. Then, we use the discrete version of the Q-operator and some discrete fractional dual identities to prove that the presented fractional differences and sums coincide with the discrete Riemann ones describing the first approach.
Discovering Binomial Identities with PascGaloisJE
Evans, Tyler J.
2008-01-01
We describe exercises in which students use PascGaloisJE to formulate conjectures about certain binomial identities which hold when the binomial coefficients are interpreted as elements in the cyclic group Z[subscript p] of integers modulo a prime integer "p". In addition to having an appealing visual component, these exercises are open-ended and…
Wigner Function of Density Operator for Negative Binomial Distribution
International Nuclear Information System (INIS)
Xu Xinglei; Li Hongqi
2008-01-01
By using the technique of integration within an ordered product (IWOP) of operator we derive Wigner function of density operator for negative binomial distribution of radiation field in the mixed state case, then we derive the Wigner function of squeezed number state, which yields negative binomial distribution by virtue of the entangled state representation and the entangled Wigner operator
Penggunaan Model Binomial Pada Penentuan Harga Opsi Saham Karyawan
Directory of Open Access Journals (Sweden)
Dara Puspita Anggraeni
2015-11-01
Full Text Available Binomial Model for Valuing Employee Stock Options. Employee Stock Options (ESO differ from standard exchange-traded options. The three main differences in a valuation model for employee stock options : Vesting Period, Exit Rate and Non-Transferability. In this thesis, the model for valuing employee stock options discussed. This model are implement with a generalized binomial model.
International Nuclear Information System (INIS)
Valor, Alma; Alfonso, Lester; Caleyo, Francisco; Vidal, Julio; Perez-Baruch, Eloy; Hallen, José M.
2015-01-01
Highlights: • Observed external-corrosion defects in underground pipelines revealed a tendency to cluster. • The Poisson distribution is unable to fit extensive count data for these type of defects. • In contrast, the negative binomial distribution provides a suitable count model for them. • Two spatial stochastic processes lead to the negative binomial distribution for defect counts. • They are the Gamma-Poisson mixed process and the compound Poisson process. • A Rogeŕs process also arises as a plausible temporal stochastic process leading to corrosion defect clustering and to negative binomially distributed defect counts. - Abstract: The spatial distribution of external corrosion defects in buried pipelines is usually described as a Poisson process, which leads to corrosion defects being randomly distributed along the pipeline. However, in real operating conditions, the spatial distribution of defects considerably departs from Poisson statistics due to the aggregation of defects in groups or clusters. In this work, the statistical analysis of real corrosion data from underground pipelines operating in southern Mexico leads to conclude that the negative binomial distribution provides a better description for defect counts. The origin of this distribution from several processes is discussed. The analysed processes are: mixed Gamma-Poisson, compound Poisson and Roger’s processes. The physical reasons behind them are discussed for the specific case of soil corrosion.
Simons, Monique; de Vet, Emely; Chinapaw, Mai Jm; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-04-04
Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games-active games-seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a correlate of both active and non-active gaming
de Vet, Emely; Chinapaw, Mai JM; de Boer, Michiel; Seidell, Jacob C; Brug, Johannes
2014-01-01
Background Playing video games contributes substantially to sedentary behavior in youth. A new generation of video games—active games—seems to be a promising alternative to sedentary games to promote physical activity and reduce sedentary behavior. At this time, little is known about correlates of active and non-active gaming among adolescents. Objective The objective of this study was to examine potential personal, social, and game-related correlates of both active and non-active gaming in adolescents. Methods A survey assessing game behavior and potential personal, social, and game-related correlates was conducted among adolescents (12-16 years, N=353) recruited via schools. Multivariable, multilevel logistic regression analyses, adjusted for demographics (age, sex and educational level of adolescents), were conducted to examine personal, social, and game-related correlates of active gaming ≥1 hour per week (h/wk) and non-active gaming >7 h/wk. Results Active gaming ≥1 h/wk was significantly associated with a more positive attitude toward active gaming (OR 5.3, CI 2.4-11.8; Pgames (OR 0.30, CI 0.1-0.6; P=.002), a higher score on habit strength regarding gaming (OR 1.9, CI 1.2-3.2; P=.008) and having brothers/sisters (OR 6.7, CI 2.6-17.1; Pgame engagement (OR 0.95, CI 0.91-0.997; P=.04). Non-active gaming >7 h/wk was significantly associated with a more positive attitude toward non-active gaming (OR 2.6, CI 1.1-6.3; P=.035), a stronger habit regarding gaming (OR 3.0, CI 1.7-5.3; P7 h/wk. Active gaming is most strongly (negatively) associated with attitude with respect to non-active games, followed by observed active game behavior of brothers and sisters and attitude with respect to active gaming (positive associations). On the other hand, non-active gaming is most strongly associated with observed non-active game behavior of friends, habit strength regarding gaming and attitude toward non-active gaming (positive associations). Habit strength was a
L.T. Burgers (Laura); F.T. van de Wetering (Fleur); J.L. Severens (Hans); W.K. Redekop (Ken)
2016-01-01
textabstractBackground: Systematic reviews of cost-effectiveness analyses summarize results and describe study characteristics. Variability in the study results is often explained qualitatively or based on sensitivity analyses of individual studies. However, variability due to input parameters and
Microbial comparative pan-genomics using binomial mixture models
Directory of Open Access Journals (Sweden)
Ussery David W
2009-08-01
Full Text Available Abstract Background The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families in Buchnera aphidicola to large (around 43000 gene families in Escherichia coli. Results for Echerichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find is always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.
Validity of the negative binomial distribution in particle production
International Nuclear Information System (INIS)
Cugnon, J.; Harouna, O.
1987-01-01
Some aspects of the clan picture for particle production in nuclear and in high-energy processes are examined. In particular, it is shown that the requirement of having logarithmic distribution for the number of particles within a clan in order to generate a negative binomial should not be taken strictly. Large departures are allowed without distorting too much the negative binomial. The question of the undetected particles is also studied. It is shown that, under reasonable circumstances, the latter do not affect the negative binomial character of the multiplicity distribution
Alexeeff, Stacey E; Schwartz, Joel; Kloog, Itai; Chudnovsky, Alexandra; Koutrakis, Petros; Coull, Brent A
2015-01-01
Many epidemiological studies use predicted air pollution exposures as surrogates for true air pollution levels. These predicted exposures contain exposure measurement error, yet simulation studies have typically found negligible bias in resulting health effect estimates. However, previous studies typically assumed a statistical spatial model for air pollution exposure, which may be oversimplified. We address this shortcoming by assuming a realistic, complex exposure surface derived from fine-scale (1 km × 1 km) remote-sensing satellite data. Using simulation, we evaluate the accuracy of epidemiological health effect estimates in linear and logistic regression when using spatial air pollution predictions from kriging and land use regression models. We examined chronic (long-term) and acute (short-term) exposure to air pollution. Results varied substantially across different scenarios. Exposure models with low out-of-sample R(2) yielded severe biases in the health effect estimates of some models, ranging from 60% upward bias to 70% downward bias. One land use regression exposure model with >0.9 out-of-sample R(2) yielded upward biases up to 13% for acute health effect estimates. Almost all models drastically underestimated the SEs. Land use regression models performed better in chronic effect simulations. These results can help researchers when interpreting health effect estimates in these types of studies.
Comparison of multinomial and binomial proportion methods for analysis of multinomial count data.
Galyean, M L; Wester, D B
2010-10-01
Simulation methods were used to generate 1,000 experiments, each with 3 treatments and 10 experimental units/treatment, in completely randomized (CRD) and randomized complete block designs. Data were counts in 3 ordered or 4 nominal categories from multinomial distributions. For the 3-category analyses, category probabilities were 0.6, 0.3, and 0.1, respectively, for 2 of the treatments, and 0.5, 0.35, and 0.15 for the third treatment. In the 4-category analysis (CRD only), probabilities were 0.3, 0.3, 0.2, and 0.2 for treatments 1 and 2 vs. 0.4, 0.4, 0.1, and 0.1 for treatment 3. The 3-category data were analyzed with generalized linear mixed models as an ordered multinomial distribution with a cumulative logit link or by regrouping the data (e.g., counts in 1 category/sum of counts in all categories), followed by analysis of single categories as binomial proportions. Similarly, the 4-category data were analyzed as a nominal multinomial distribution with a glogit link or by grouping data as binomial proportions. For the 3-category CRD analyses, empirically determined type I error rates based on pair-wise comparisons (F- and Wald chi(2) tests) did not differ between multinomial and individual binomial category analyses with 10 (P = 0.38 to 0.60) or 50 (P = 0.19 to 0.67) sampling units/experimental unit. When analyzed as binomial proportions, power estimates varied among categories, with analysis of the category with the greatest counts yielding power similar to the multinomial analysis. Agreement between methods (percentage of experiments with the same results for the overall test for treatment effects) varied considerably among categories analyzed and sampling unit scenarios for the 3-category CRD analyses. Power (F-test) was 24.3, 49.1, 66.9, 83.5, 86.8, and 99.7% for 10, 20, 30, 40, 50, and 100 sampling units/experimental unit for the 3-category multinomial CRD analyses. Results with randomized complete block design simulations were similar to those with the CRD
Statistical Inference for a Class of Multivariate Negative Binomial Distributions
DEFF Research Database (Denmark)
Rubak, Ege H.; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called -permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...
Application of binomial-edited CPMG to shale characterization.
Washburn, Kathryn E; Birdwell, Justin E
2014-09-01
Unconventional shale resources may contain a significant amount of hydrogen in organic solids such as kerogen, but it is not possible to directly detect these solids with many NMR systems. Binomial-edited pulse sequences capitalize on magnetization transfer between solids, semi-solids, and liquids to provide an indirect method of detecting solid organic materials in shales. When the organic solids can be directly measured, binomial-editing helps distinguish between different phases. We applied a binomial-edited CPMG pulse sequence to a range of natural and experimentally-altered shale samples. The most substantial signal loss is seen in shales rich in organic solids while fluids associated with inorganic pores seem essentially unaffected. This suggests that binomial-editing is a potential method for determining fluid locations, solid organic content, and kerogen-bitumen discrimination. Copyright © 2014 Elsevier Inc. All rights reserved.
Multifractal structure of multiplicity distributions and negative binomials
International Nuclear Information System (INIS)
Malik, S.; Delhi, Univ.
1997-01-01
The paper presents experimental results of the multifractal structure analysis in proton-emulsion interactions at 800 GeV. The multiplicity moments have a power law dependence on the mean multiplicity in varying bin sizes of pseudorapidity. The values of generalised dimensions are calculated from the slope value. The multifractal characteristics are also examined in the light of negative binomials. The observed multiplicity moments and those derived from the negative-binomial fits agree well with each other. Also the values of D q , both observed and derived from the negative-binomial fits not only decrease with q typifying multifractality but also agree well each other showing consistency with the negative-binomial form
Gaussian Process Regression Model in Spatial Logistic Regression
Sofro, A.; Oktaviarina, A.
2018-01-01
Spatial analysis has developed very quickly in the last decade. One of the favorite approaches is based on the neighbourhood of the region. Unfortunately, there are some limitations such as difficulty in prediction. Therefore, we offer Gaussian process regression (GPR) to accommodate the issue. In this paper, we will focus on spatial modeling with GPR for binomial data with logit link function. The performance of the model will be investigated. We will discuss the inference of how to estimate the parameters and hyper-parameters and to predict as well. Furthermore, simulation studies will be explained in the last section.
Entanglement of Generalized Two-Mode Binomial States and Teleportation
International Nuclear Information System (INIS)
Wang Dongmei; Yu Youhong
2009-01-01
The entanglement of the generalized two-mode binomial states in the phase damping channel is studied by making use of the relative entropy of the entanglement. It is shown that the factors of q and p play the crucial roles in control the relative entropy of the entanglement. Furthermore, we propose a scheme of teleporting an unknown state via the generalized two-mode binomial states, and calculate the mean fidelity of the scheme. (general)
Detecting non-binomial sex allocation when developmental mortality operates.
Wilkinson, Richard D; Kapranas, Apostolos; Hardy, Ian C W
2016-11-07
Optimal sex allocation theory is one of the most intricately developed areas of evolutionary ecology. Under a range of conditions, particularly under population sub-division, selection favours sex being allocated to offspring non-randomly, generating non-binomial variances of offspring group sex ratios. Detecting non-binomial sex allocation is complicated by stochastic developmental mortality, as offspring sex can often only be identified on maturity with the sex of non-maturing offspring remaining unknown. We show that current approaches for detecting non-binomiality have limited ability to detect non-binomial sex allocation when developmental mortality has occurred. We present a new procedure using an explicit model of sex allocation and mortality and develop a Bayesian model selection approach (available as an R package). We use the double and multiplicative binomial distributions to model over- and under-dispersed sex allocation and show how to calculate Bayes factors for comparing these alternative models to the null hypothesis of binomial sex allocation. The ability to detect non-binomial sex allocation is greatly increased, particularly in cases where mortality is common. The use of Bayesian methods allows for the quantification of the evidence in favour of each hypothesis, and our modelling approach provides an improved descriptive capability over existing approaches. We use a simulation study to demonstrate substantial improvements in power for detecting non-binomial sex allocation in situations where current methods fail, and we illustrate the approach in real scenarios using empirically obtained datasets on the sexual composition of groups of gregarious parasitoid wasps. Copyright © 2016 Elsevier Ltd. All rights reserved.
Negative Binomial Distribution and the multiplicity moments at the LHC
International Nuclear Information System (INIS)
Praszalowicz, Michal
2011-01-01
In this work we show that the latest LHC data on multiplicity moments C 2 -C 5 are well described by a two-step model in the form of a convolution of the Poisson distribution with energy-dependent source function. For the source function we take Γ Negative Binomial Distribution. No unexpected behavior of Negative Binomial Distribution parameter k is found. We give also predictions for the higher energies of 10 and 14 TeV.
On some binomial [Formula: see text]-difference sequence spaces.
Meng, Jian; Song, Meimei
2017-01-01
In this paper, we introduce the binomial sequence spaces [Formula: see text], [Formula: see text] and [Formula: see text] by combining the binomial transformation and difference operator. We prove the BK -property and some inclusion relations. Furthermore, we obtain Schauder bases and compute the α -, β - and γ -duals of these sequence spaces. Finally, we characterize matrix transformations on the sequence space [Formula: see text].
Negative-binomial multiplicity distributions in the interaction of light ions with 12C at 4.2 GeV/c
International Nuclear Information System (INIS)
Tucholski, A.; Bogdanowicz, J.; Moroz, Z.; Wojtkowska, J.
1989-01-01
Multiplicity distribution of single-charged particles in the interaction of p, d, α and 12 C projectiles with C target nuclei at 4.2 GeV/c were analysed in terms of the negative binomial distribution. The experimental data obtained by the Dubna Propane Bubble Chamber Group were used. It is shown that the experimental distributions are satisfactorily described by the negative-binomial distribution. Values of the parameters of these distributions are discussed. (orig.)
Confidence limits for parameters of Poisson and binomial distributions
International Nuclear Information System (INIS)
Arnett, L.M.
1976-04-01
The confidence limits for the frequency in a Poisson process and for the proportion of successes in a binomial process were calculated and tabulated for the situations in which the observed values of the frequency or proportion and an a priori distribution of these parameters are available. Methods are used that produce limits with exactly the stated confidence levels. The confidence interval [a,b] is calculated so that Pr [a less than or equal to lambda less than or equal to b c,μ], where c is the observed value of the parameter, and μ is the a priori hypothesis of the distribution of this parameter. A Bayesian type analysis is used. The intervals calculated are narrower and appreciably different from results, known to be conservative, that are often used in problems of this type. Pearson and Hartley recognized the characteristics of their methods and contemplated that exact methods could someday be used. The calculation of the exact intervals requires involved numerical analyses readily implemented only on digital computers not available to Pearson and Hartley. A Monte Carlo experiment was conducted to verify a selected interval from those calculated. This numerical experiment confirmed the results of the analytical methods and the prediction of Pearson and Hartley that their published tables give conservative results
Nazarzadeh, Milad; Bidel, Zeinab; Mosavi Jarahi, Alireza; Esmaeelpour, Keihan; Menati, Walieh; Shakeri, Ali Asghar; Menati, Rostam; Kikhavani, Sattar; Saki, Kourosh
2015-09-01
Cannabis is the most widely used substance in the world. This study aimed to estimate the prevalence of cannabis lifetime use (CLU) in high school and college students of Iran and also to determine factors related to changes in prevalence. A systematic review of literature on cannabis use in Iran was conducted according to MOOSE guideline. Domestic scientific databases, PubMed/Medline, ISI Web of Knowledge, and Google Scholar, relevant reference lists, and relevant journals were searched up to April, 2014. Prevalences were calculated using the variance stabilizing double arcsine transformation and confidence intervals (CIs) estimated using the Wilson method. Heterogeneity was assessed by Cochran's Q statistic and I(2) index and causes of heterogeneity were evaluated using meta-regression model. In electronic database search, 4,000 citations were retrieved, producing a total of 33 studies. CLU was reported with a random effects pooled prevalence of 4.0% (95% CI = 3.0% to 5.0%). In subgroups of high school and college students, prevalences were 5.0% (95% CI = 3.0% to -7.0%) and 2.0% (95% CI = 2.0% to -3.0%), respectively. Meta-regression model indicated that prevalence is higher in college students (β = 0.089, p students are lower than industrialized countries. In addition, gender, level of education, and methods of sampling are highly associated with changes in the prevalence of CLU across provinces. © The Author(s) 2014.
Extra-binomial variation approach for analysis of pooled DNA sequencing data
Wallace, Chris
2012-01-01
Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: chris.wallace@cimr.cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
Wang, Hui; Sui, Weiguo; Xue, Wen; Wu, Junyong; Chen, Jiejing; Dai, Yong
2014-09-01
Immunoglobulin A nephropathy (IgAN) is a complex trait regulated by the interaction among multiple physiologic regulatory systems and probably involving numerous genes, which leads to inconsistent findings in genetic studies. One possibility of failure to replicate some single-locus results is that the underlying genetics of IgAN nephropathy is based on multiple genes with minor effects. To learn the association between 23 single nucleotide polymorphisms (SNPs) in 14 genes predisposing to chronic glomerular diseases and IgAN in Han males, the 23 SNPs genotypes of 21 Han males were detected and analyzed with a BaiO gene chip, and their associations were analyzed with univariate analysis and multiple linear regression analysis. Analysis showed that CTLA4 rs231726 and CR2 rs1048971 revealed a significant association with IgAN. These findings support the multi-gene nature of the etiology of IgAN and propose a potential gene-gene interactive model for future studies.
Duda, David P.; Minnis, Patrick
2009-01-01
Previous studies have shown that probabilistic forecasting may be a useful method for predicting persistent contrail formation. A probabilistic forecast to accurately predict contrail formation over the contiguous United States (CONUS) is created by using meteorological data based on hourly meteorological analyses from the Advanced Regional Prediction System (ARPS) and from the Rapid Update Cycle (RUC) as well as GOES water vapor channel measurements, combined with surface and satellite observations of contrails. Two groups of logistic models were created. The first group of models (SURFACE models) is based on surface-based contrail observations supplemented with satellite observations of contrail occurrence. The second group of models (OUTBREAK models) is derived from a selected subgroup of satellite-based observations of widespread persistent contrails. The mean accuracies for both the SURFACE and OUTBREAK models typically exceeded 75 percent when based on the RUC or ARPS analysis data, but decreased when the logistic models were derived from ARPS forecast data.
Hadronic multiplicity distributions: the negative binomial and its alternatives
International Nuclear Information System (INIS)
Carruthers, P.
1986-01-01
We review properties of the negative binomial distribution, along with its many possible statistical or dynamical origins. Considering the relation of the multiplicity distribution to the density matrix for Boson systems, we re-introduce the partially coherent laser distribution, which allows for coherent as well as incoherent hadronic emission from the k fundamental cells, and provides equally good phenomenological fits to existing data. The broadening of non-single diffractive hadron-hadron distributions can be equally well due to the decrease of coherent with increasing energy as to the large (and rapidly decreasing) values of k deduced from negative binomial fits. Similarly the narrowness of e + -e - multiplicity distribution is due to nearly coherent (therefore nearly Poissonian) emission from a small number of jets, in contrast to the negative binomial with enormous values of k. 31 refs
Sample size calculation for comparing two negative binomial rates.
Zhu, Haiyuan; Lakkis, Hassan
2014-02-10
Negative binomial model has been increasingly used to model the count data in recent clinical trials. It is frequently chosen over Poisson model in cases of overdispersed count data that are commonly seen in clinical trials. One of the challenges of applying negative binomial model in clinical trial design is the sample size estimation. In practice, simulation methods have been frequently used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on different approaches to estimate the variance under null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate dispersion parameter and exposure time. The performance of the formula with each variation is assessed using simulations. Copyright © 2013 John Wiley & Sons, Ltd.
Hadronic multiplicity distributions: the negative binomial and its alternatives
International Nuclear Information System (INIS)
Carruthers, P.
1986-01-01
We review properties of the negative binomial distribution, along with its many possible statistical or dynamical origins. Considering the relation of the multiplicity distribution to the density matrix for boson systems, we re-introduce the partially coherent laser distribution, which allows for coherent as well as incoherent hadronic emission from the k fundamental cells, and provides equally good phenomenological fits to existing data. The broadening of non-single diffractive hadron-hadron distributions can be equally well due to the decrease of coherence with increasing energy as to the large (and rapidly decreasing) values of k deduced from negative binomial fits. Similarly the narrowness of e + -e - multiplicity distribution is due to nearly coherent (therefore nearly Poissonian) emission from a small number of jets, in contrast to the negative binomial with enormous values of k. 31 refs
The k-Binomial Transforms and the Hankel Transform
Spivey, Michael Z.; Steil, Laura L.
2006-01-01
We give a new proof of the invariance of the Hankel transform under the binomial transform of a sequence. Our method of proof leads to three variations of the binomial transform; we call these the k-binomial transforms. We give a simple means of constructing these transforms via a triangle of numbers. We show how the exponential generating function of a sequence changes after our transforms are applied, and we use this to prove that several sequences in the On-Line Encyclopedia of Integer Sequences are related via our transforms. In the process, we prove three conjectures in the OEIS. Addressing a question of Layman, we then show that the Hankel transform of a sequence is invariant under one of our transforms, and we show how the Hankel transform changes after the other two transforms are applied. Finally, we use these results to determine the Hankel transforms of several integer sequences.
Duda, David P.; Minnis, Patrick
2009-01-01
Straightforward application of the Schmidt-Appleman contrail formation criteria to diagnose persistent contrail occurrence from numerical weather prediction data is hindered by significant bias errors in the upper tropospheric humidity. Logistic models of contrail occurrence have been proposed to overcome this problem, but basic questions remain about how random measurement error may affect their accuracy. A set of 5000 synthetic contrail observations is created to study the effects of random error in these probabilistic models. The simulated observations are based on distributions of temperature, humidity, and vertical velocity derived from Advanced Regional Prediction System (ARPS) weather analyses. The logistic models created from the simulated observations were evaluated using two common statistical measures of model accuracy, the percent correct (PC) and the Hanssen-Kuipers discriminant (HKD). To convert the probabilistic results of the logistic models into a dichotomous yes/no choice suitable for the statistical measures, two critical probability thresholds are considered. The HKD scores are higher when the climatological frequency of contrail occurrence is used as the critical threshold, while the PC scores are higher when the critical probability threshold is 0.5. For both thresholds, typical random errors in temperature, relative humidity, and vertical velocity are found to be small enough to allow for accurate logistic models of contrail occurrence. The accuracy of the models developed from synthetic data is over 85 percent for both the prediction of contrail occurrence and non-occurrence, although in practice, larger errors would be anticipated.
Differential susceptibility among reef-building coral species can lead to community shifts and loss of diversity as a result of temperature-induced mass bleaching events. However, the influence of the local environment on species-specific bleaching susceptibilities has not been ...
Estimation of Log-Linear-Binomial Distribution with Applications
Directory of Open Access Journals (Sweden)
Elsayed Ali Habib
2010-01-01
Full Text Available Log-linear-binomial distribution was introduced for describing the behavior of the sum of dependent Bernoulli random variables. The distribution is a generalization of binomial distribution that allows construction of a broad class of distributions. In this paper, we consider the problem of estimating the two parameters of log-linearbinomial distribution by moment and maximum likelihood methods. The distribution is used to fit genetic data and to obtain the sampling distribution of the sign test under dependence among trials.
Interpretations and implications of negative binomial distributions of multiparticle productions
International Nuclear Information System (INIS)
Arisawa, Tetsuo
2006-01-01
The number of particles produced in high energy experiments is approximated by a negative binomial distribution. Deriving a representation of the distribution from a stochastic equation, conditions for the process to satisfy the distribution are clarified. Based on them, it is proposed that multiparticle production consists of spontaneous and induced production. The rate of the induced production is proportional to the number of existing particles. The ratio of the two production rates remains constant during the process. The ''NBD space'' is also defined where the number of particles produced in its subspaces follows negative binomial distributions with different parameters
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
Metaprop: a Stata command to perform meta-analysis of binomial data.
Nyaga, Victoria N; Arbyn, Marc; Aerts, Marc
2014-01-01
Meta-analyses have become an essential tool in synthesizing evidence on clinical and epidemiological questions derived from a multitude of similar studies assessing the particular issue. Appropriate and accessible statistical software is needed to produce the summary statistic of interest. Metaprop is a statistical program implemented to perform meta-analyses of proportions in Stata. It builds further on the existing Stata procedure metan which is typically used to pool effects (risk ratios, odds ratios, differences of risks or means) but which is also used to pool proportions. Metaprop implements procedures which are specific to binomial data and allows computation of exact binomial and score test-based confidence intervals. It provides appropriate methods for dealing with proportions close to or at the margins where the normal approximation procedures often break down, by use of the binomial distribution to model the within-study variability or by allowing Freeman-Tukey double arcsine transformation to stabilize the variances. Metaprop was applied on two published meta-analyses: 1) prevalence of HPV-infection in women with a Pap smear showing ASC-US; 2) cure rate after treatment for cervical precancer using cold coagulation. The first meta-analysis showed a pooled HPV-prevalence of 43% (95% CI: 38%-48%). In the second meta-analysis, the pooled percentage of cured women was 94% (95% CI: 86%-97%). By using metaprop, no studies with 0% or 100% proportions were excluded from the meta-analysis. Furthermore, study specific and pooled confidence intervals always were within admissible values, contrary to the original publication, where metan was used.
Generation of the reciprocal-binomial state for optical fields
International Nuclear Information System (INIS)
Valverde, C.; Avelar, A.T.; Baseia, B.; Malbouisson, J.M.C.
2003-01-01
We compare the efficiencies of two interesting schemes to generate truncated states of the light field in running modes, namely the 'quantum scissors' and the 'beam-splitter array' schemes. The latter is applied to create the reciprocal-binomial state as a travelling wave, required to implement recent experimental proposals of phase-distribution determination and of quantum lithography
Improved binomial charts for monitoring high-quality processes
Albers, Willem/Wim
2009-01-01
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Statistical inference for a class of multivariate negative binomial distributions
DEFF Research Database (Denmark)
Rubak, Ege Holger; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...
Improved binomial charts for high-quality processes
Albers, Willem/Wim
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Estimating negative binomial parameters from occurrence data with detection times.
Hwang, Wen-Han; Huggins, Richard; Stoklosa, Jakub
2016-11-01
The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or through modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to the area of the region, that both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (, Am. Nat. 176, 96-98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, the effect can be simply adjusted for provided that the mean and variance of misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Calculation of generalized secant integral using binomial coefficients
International Nuclear Information System (INIS)
Guseinov, I.I.; Mamedov, B.A.
2004-01-01
A single series expansion relation is derived for the generalized secant (GS) integral in terms of binomial coefficients, exponential integrals and incomplete gamma functions. The convergence of the series is tested by the concrete cases of parameters. The formulas given in this study for the evaluation of GS integral show good rate of convergence and numerical stability
Selecting Tools to Model Integer and Binomial Multiplication
Pratt, Sarah Smitherman; Eddy, Colleen M.
2017-01-01
Mathematics teachers frequently provide concrete manipulatives to students during instruction; however, the rationale for using certain manipulatives in conjunction with concepts may not be explored. This article focuses on area models that are currently used in classrooms to provide concrete examples of integer and binomial multiplication. The…
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
Currency lookback options and observation frequency: A binomial approach
T.H.F. Cheuk; A.C.F. Vorst (Ton)
1997-01-01
textabstractIn the last decade, interest in exotic options has been growing, especially in the over-the-counter currency market. In this paper we consider Iookback currency options, which are path-dependent. We show that a one-state variable binomial model for currency Iookback options can
Stochastic analysis of complex reaction networks using binomial moment equations.
Barzel, Baruch; Biham, Ofer
2012-09-01
The stochastic analysis of complex reaction networks is a difficult problem because the number of microscopic states in such systems increases exponentially with the number of reactive species. Direct integration of the master equation is thus infeasible and is most often replaced by Monte Carlo simulations. While Monte Carlo simulations are a highly effective tool, equation-based formulations are more amenable to analytical treatment and may provide deeper insight into the dynamics of the network. Here, we present a highly efficient equation-based method for the analysis of stochastic reaction networks. The method is based on the recently introduced binomial moment equations [Barzel and Biham, Phys. Rev. Lett. 106, 150602 (2011)]. The binomial moments are linear combinations of the ordinary moments of the probability distribution function of the population sizes of the interacting species. They capture the essential combinatorics of the reaction processes reflecting their stoichiometric structure. This leads to a simple and transparent form of the equations, and allows a highly efficient and surprisingly simple truncation scheme. Unlike ordinary moment equations, in which the inclusion of high order moments is prohibitively complicated, the binomial moment equations can be easily constructed up to any desired order. The result is a set of equations that enables the stochastic analysis of complex reaction networks under a broad range of conditions. The number of equations is dramatically reduced from the exponential proliferation of the master equation to a polynomial (and often quadratic) dependence on the number of reactive species in the binomial moment equations. The aim of this paper is twofold: to present a complete derivation of the binomial moment equations; to demonstrate the applicability of the moment equations for a representative set of example networks, in which stochastic effects play an important role.
Logistic regression for dichotomized counts.
Preisser, John S; Das, Kalyan; Benecha, Habtamu; Stamm, John W
2016-12-01
Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren. © The Author(s) 2014.
Olive, David J
2017-01-01
This text covers both multiple linear regression and some experimental design models. The text uses the response plot to visualize the model and to detect outliers, does not assume that the error distribution has a known parametric distribution, develops prediction intervals that work when the error distribution is unknown, suggests bootstrap hypothesis tests that may be useful for inference after variable selection, and develops prediction regions and large sample theory for the multivariate linear regression model that has m response variables. A relationship between multivariate prediction regions and confidence regions provides a simple way to bootstrap confidence regions. These confidence regions often provide a practical method for testing hypotheses. There is also a chapter on generalized linear models and generalized additive models. There are many R functions to produce response and residual plots, to simulate prediction intervals and hypothesis tests, to detect outliers, and to choose response trans...
Directory of Open Access Journals (Sweden)
Igor K. Kochanenko
2013-01-01
Full Text Available Procedures of construction of curve regress by criterion of the least fractals, i.e. the greatest probability of the sums of degrees of the least deviations measured intensity from their modelling values are proved. The exponent is defined as fractal dimension of a time number. The difference of results of a well-founded method and a method of the least squares is quantitatively estimated.
Bhamidipati, Ravi Kanth; Syed, Muzeeb; Mullangi, Ramesh; Srinivas, Nuggehally
2018-02-01
1. Dalbavancin, a lipoglycopeptide, is approved for treating gram-positive bacterial infections. Area under plasma concentration versus time curve (AUC inf ) of dalbavancin is a key parameter and AUC inf /MIC ratio is a critical pharmacodynamic marker. 2. Using end of intravenous infusion concentration (i.e. C max ) C max versus AUC inf relationship for dalbavancin was established by regression analyses (i.e. linear, log-log, log-linear and power models) using 21 pairs of subject data. 3. The predictions of the AUC inf were performed using published C max data by application of regression equations. The quotient of observed/predicted values rendered fold difference. The mean absolute error (MAE)/root mean square error (RMSE) and correlation coefficient (r) were used in the assessment. 4. MAE and RMSE values for the various models were comparable. The C max versus AUC inf exhibited excellent correlation (r > 0.9488). The internal data evaluation showed narrow confinement (0.84-1.14-fold difference) with a RMSE regression models, a single time point strategy of using C max (i.e. end of 30-min infusion) is amenable as a prospective tool for predicting AUC inf of dalbavancin in patients.
Generalization of Binomial Coefficients to Numbers on the Nodes of Graphs
Khmelnitskaya, A.; van der Laan, G.; Talman, Dolf
2016-01-01
The triangular array of binomial coefficients, or Pascal's triangle, is formed by starting with an apex of 1. Every row of Pascal's triangle can be seen as a line-graph, to each node of which the corresponding binomial coefficient is assigned. We show that the binomial coefficient of a node is equal
Generalization of binomial coefficients to numbers on the nodes of graphs
Khmelnitskaya, Anna Borisovna; van der Laan, Gerard; Talman, Dolf
The triangular array of binomial coefficients, or Pascal's triangle, is formed by starting with an apex of 1. Every row of Pascal's triangle can be seen as a line-graph, to each node of which the corresponding binomial coefficient is assigned. We show that the binomial coefficient of a node is equal
Determination of finite-difference weights using scaled binomial windows
Chu, Chunlei
2012-05-01
The finite-difference method evaluates a derivative through a weighted summation of function values from neighboring grid nodes. Conventional finite-difference weights can be calculated either from Taylor series expansions or by Lagrange interpolation polynomials. The finite-difference method can be interpreted as a truncated convolutional counterpart of the pseudospectral method in the space domain. For this reason, we also can derive finite-difference operators by truncating the convolution series of the pseudospectral method. Various truncation windows can be employed for this purpose and they result in finite-difference operators with different dispersion properties. We found that there exists two families of scaled binomial windows that can be used to derive conventional finite-difference operators analytically. With a minor change, these scaled binomial windows can also be used to derive optimized finite-difference operators with enhanced dispersion properties. © 2012 Society of Exploration Geophysicists.
Exact Group Sequential Methods for Estimating a Binomial Proportion
Directory of Open Access Journals (Sweden)
Zhengjia Chen
2013-01-01
Full Text Available We first review existing sequential methods for estimating a binomial proportion. Afterward, we propose a new family of group sequential sampling schemes for estimating a binomial proportion with prescribed margin of error and confidence level. In particular, we establish the uniform controllability of coverage probability and the asymptotic optimality for such a family of sampling schemes. Our theoretical results establish the possibility that the parameters of this family of sampling schemes can be determined so that the prescribed level of confidence is guaranteed with little waste of samples. Analytic bounds for the cumulative distribution functions and expectations of sample numbers are derived. Moreover, we discuss the inherent connection of various sampling schemes. Numerical issues are addressed for improving the accuracy and efficiency of computation. Computational experiments are conducted for comparing sampling schemes. Illustrative examples are given for applications in clinical trials.
Fat suppression in MR imaging with binomial pulse sequences
International Nuclear Information System (INIS)
Baudovin, C.J.; Bryant, D.J.; Bydder, G.M.; Young, I.R.
1989-01-01
This paper reports on a study to develop pulse sequences allowing suppression of fat signal on MR images without eliminating signal from other tissues with short T1. They have developed such a technique involving selective excitation of protons in water, based on a binomial pulse sequence. Imaging is performed at 0.15 T. Careful shimming is performed to maximize separation of fat and water peaks. A spin-echo 1,500/80 sequence is used, employing 90 degrees pulse with transit frequency optimized for water with null excitation of 20 H offset, followed by a section-selective 180 degrees pulse. With use of the binomial sequence for imagining, reduction in fat signal is seen on images of the pelvis and legs of volunteers. Patient studies show dramatic improvement in visualization of prostatic carcinoma compared with standard sequences
On pricing futures options on random binomial tree
International Nuclear Information System (INIS)
Bayram, Kamola; Ganikhodjaev, Nasir
2013-01-01
The discrete-time approach to real option valuation has typically been implemented in the finance literature using a binomial tree framework. Instead we develop a new model by randomizing the environment and call such model a random binomial tree. Whereas the usual model has only one environment (u, d) where the price of underlying asset can move by u times up and d times down, and pair (u, d) is constant over the life of the underlying asset, in our new model the underlying security is moving in two environments namely (u 1 , d 1 ) and (u 2 , d 2 ). Thus we obtain two volatilities σ 1 and σ 2 . This new approach enables calculations reflecting the real market since it consider the two states of market normal and extra ordinal. In this paper we define and study Futures options for such models.
Zero inflated negative binomial-generalized exponential distributionand its applications
Directory of Open Access Journals (Sweden)
Sirinapa Aryuyuen
2014-08-01
Full Text Available In this paper, we propose a new zero inflated distribution, namely, the zero inflated negative binomial-generalized exponential (ZINB-GE distribution. The new distribution is used for count data with extra zeros and is an alternative for data analysis with over-dispersed count data. Some characteristics of the distribution are given, such as mean, variance, skewness, and kurtosis. Parameter estimation of the ZINB-GE distribution uses maximum likelihood estimation (MLE method. Simulated and observed data are employed to examine this distribution. The results show that the MLE method seems to have high efficiency for large sample sizes. Moreover, the mean square error of parameter estimation is increased when the zero proportion is higher. For the real data sets, this new zero inflated distribution provides a better fit than the zero inflated Poisson and zero inflated negative binomial distributions.
Data analysis using the Binomial Failure Rate common cause model
International Nuclear Information System (INIS)
Atwood, C.L.
1983-09-01
This report explains how to use the Binomial Failure Rate (BFR) method to estimate common cause failure rates. The entire method is described, beginning with the conceptual model, and covering practical issues of data preparation, treatment of variation in the failure rates, Bayesian estimation of the quantities of interest, checking the model assumptions for lack of fit to the data, and the ultimate application of the answers
e+-e- hadronic multiplicity distributions: negative binomial or Poisson
International Nuclear Information System (INIS)
Carruthers, P.; Shih, C.C.
1986-01-01
On the basis of fits to the multiplicity distributions for variable rapidity windows and the forward backward correlation for the 2 jet subset of e + e - data it is impossible to distinguish between a global negative binomial and its generalization, the partially coherent distribution. It is suggested that intensity interferometry, especially the Bose-Einstein correlation, gives information which will discriminate among dynamical models. 16 refs
Hits per trial: Basic analysis of binomial data
International Nuclear Information System (INIS)
Atwood, C.L.
1994-09-01
This report presents simple statistical methods for analyzing binomial data, such as the number of failures in some number of demands. It gives point estimates, confidence intervals, and Bayesian intervals for the failure probability. It shows how to compare subsets of the data, both graphically and by statistical tests, and how to look for trends in time. It presents a compound model when the failure probability varies randomly. Examples and SAS programs are given
Hits per trial: Basic analysis of binomial data
Energy Technology Data Exchange (ETDEWEB)
Atwood, C.L.
1994-09-01
This report presents simple statistical methods for analyzing binomial data, such as the number of failures in some number of demands. It gives point estimates, confidence intervals, and Bayesian intervals for the failure probability. It shows how to compare subsets of the data, both graphically and by statistical tests, and how to look for trends in time. It presents a compound model when the failure probability varies randomly. Examples and SAS programs are given.
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
S. Mori; K. Kitsukawa; M. Hisakado
2006-01-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution h...
Abstract knowledge versus direct experience in processing of binomial expressions.
Morgan, Emily; Levy, Roger
2016-12-01
We ask whether word order preferences for binomial expressions of the form A and B (e.g. bread and butter) are driven by abstract linguistic knowledge of ordering constraints referencing the semantic, phonological, and lexical properties of the constituent words, or by prior direct experience with the specific items in questions. Using forced-choice and self-paced reading tasks, we demonstrate that online processing of never-before-seen binomials is influenced by abstract knowledge of ordering constraints, which we estimate with a probabilistic model. In contrast, online processing of highly frequent binomials is primarily driven by direct experience, which we estimate from corpus frequency counts. We propose a trade-off wherein processing of novel expressions relies upon abstract knowledge, while reliance upon direct experience increases with increased exposure to an expression. Our findings support theories of language processing in which both compositional generation and direct, holistic reuse of multi-word expressions play crucial roles. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
A new approach to analyse longitudinal epidemiological data with an excess of zeros.
Spriensma, Alette S; Hajos, Tibor R S; de Boer, Michiel R; Heymans, Martijn W; Twisk, Jos W R
2013-02-20
Within longitudinal epidemiological research, 'count' outcome variables with an excess of zeros frequently occur. Although these outcomes are frequently analysed with a linear mixed model, or a Poisson mixed model, a two-part mixed model would be better in analysing outcome variables with an excess of zeros. Therefore, objective of this paper was to introduce the relatively 'new' method of two-part joint regression modelling in longitudinal data analysis for outcome variables with an excess of zeros, and to compare the performance of this method to current approaches. Within an observational longitudinal dataset, we compared three techniques; two 'standard' approaches (a linear mixed model, and a Poisson mixed model), and a two-part joint mixed model (a binomial/Poisson mixed distribution model), including random intercepts and random slopes. Model fit indicators, and differences between predicted and observed values were used for comparisons. The analyses were performed with STATA using the GLLAMM procedure. Regarding the random intercept models, the two-part joint mixed model (binomial/Poisson) performed best. Adding random slopes for time to the models changed the sign of the regression coefficient for both the Poisson mixed model and the two-part joint mixed model (binomial/Poisson) and resulted into a much better fit. This paper showed that a two-part joint mixed model is a more appropriate method to analyse longitudinal data with an excess of zeros compared to a linear mixed model and a Poisson mixed model. However, in a model with random slopes for time a Poisson mixed model also performed remarkably well.
Time series regression model for infectious disease and weather.
Imai, Chisato; Armstrong, Ben; Chalabi, Zaid; Mangtani, Punam; Hashizume, Masahiro
2015-10-01
Time series regression has been developed and long used to evaluate the short-term associations of air pollution and weather with mortality or morbidity of non-infectious diseases. The application of the regression approaches from this tradition to infectious diseases, however, is less well explored and raises some new issues. We discuss and present potential solutions for five issues often arising in such analyses: changes in immune population, strong autocorrelations, a wide range of plausible lag structures and association patterns, seasonality adjustments, and large overdispersion. The potential approaches are illustrated with datasets of cholera cases and rainfall from Bangladesh and influenza and temperature in Tokyo. Though this article focuses on the application of the traditional time series regression to infectious diseases and weather factors, we also briefly introduce alternative approaches, including mathematical modeling, wavelet analysis, and autoregressive integrated moving average (ARIMA) models. Modifications proposed to standard time series regression practice include using sums of past cases as proxies for the immune population, and using the logarithm of lagged disease counts to control autocorrelation due to true contagion, both of which are motivated from "susceptible-infectious-recovered" (SIR) models. The complexity of lag structures and association patterns can often be informed by biological mechanisms and explored by using distributed lag non-linear models. For overdispersed models, alternative distribution models such as quasi-Poisson and negative binomial should be considered. Time series regression can be used to investigate dependence of infectious diseases on weather, but may need modifying to allow for features specific to this context. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Hyungsuk Tak
2017-06-01
Full Text Available Rgbp is an R package that provides estimates and verifiable confidence intervals for random effects in two-level conjugate hierarchical models for overdispersed Gaussian, Poisson, and binomial data. Rgbp models aggregate data from k independent groups summarized by observed sufficient statistics for each random effect, such as sample means, possibly with covariates. Rgbp uses approximate Bayesian machinery with unique improper priors for the hyper-parameters, which leads to good repeated sampling coverage properties for random effects. A special feature of Rgbp is an option that generates synthetic data sets to check whether the interval estimates for random effects actually meet the nominal confidence levels. Additionally, Rgbp provides inference statistics for the hyper-parameters, e.g., regression coefficients.
Directory of Open Access Journals (Sweden)
Lioba Simon-Schuhmacher
2015-11-01
Full Text Available The essay describes the use of the Night-Death binomial and tracks its evolution from the eighteenth century to Expressionism across Great Britain, Germany, Spain, and Austria, at the hand of poems such as Edward Young’s Night Thoughts (1745, Novalis’s Hymnen an die Nacht, (1800, José Blanco White’s sonnet “Night and Death” (1828, and Georg Trakl’s “Verwandlung des Bösen” (1914. Romanticism brought along a preference for the nocturnal: night, moonlight, shades and shadows, mist and mystery, somber mood, morbidity, and death, as opposed to the Enlightenment’s predilection for day, light, clarity, and life. The essay analyses how poets from different national contexts and ages employ images and symbols of the night to create an association with death. It furthermore shows how, with varying attitudes and results, they manage to convert this binomial into a poetic ploy.
Harrison, Xavier A
2015-01-01
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed poorly when models contained Binomial data, but that they do not perform well in all circumstances and researchers should take care to verify the robustness of parameter estimates of OLRE models.
Statistical properties of nonlinear intermediate states: binomial state
Energy Technology Data Exchange (ETDEWEB)
Abdalla, M Sebawe [Mathematics Department, College of Science, King Saud University, PO Box 2455, Riyadh 11451 (Saudi Arabia); Obada, A-S F [Department Mathematics, Faculty of Science, Al-Azhar University, Nasr City 11884, Cairo (Egypt); Darwish, M [Department of Physics, Faculty of Education, Suez Canal University, Al-Arish (Egypt)
2005-12-01
In the present paper we introduce a nonlinear binomial state (the state which interpolates between the nonlinear coherent and number states). The main investigation concentrates on the statistical properties for such a state where we consider the squeezing phenomenon by examining the variation in the quadrature variances for both normal and amplitude-squared squeezing. Examinations for the quasi-probability distribution functions (W-Wigner and Q-functions) are also given for both diagonal and off diagonal terms. The quadrature distribution and the phase distribution as well as the phase variances are discussed. Moreover, we give in detail a generation scheme for such state.
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
Mori, Shintaro; Kitsukawa, Kenji; Hisakado, Masato
2008-11-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution has singular correlation structures, reflecting the credit market implications. We point out two possible origins of the singular behavior.
Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M
2018-01-01
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period of time. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DEFF Research Database (Denmark)
Azarang, Leyla; Scheike, Thomas; de Uña-Álvarez, Jacobo
2017-01-01
In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness–death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score...
Assessing the Option to Abandon an Investment Project by the Binomial Options Pricing Model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
2016-01-01
Full Text Available Usually, traditional methods for investment project appraisal such as the net present value (hereinafter NPV do not incorporate in their values the operational flexibility offered by including a real option included in the project. In this paper, real options, and more specifically the option to abandon, are analysed as a complement to cash flow sequence which quantifies the project. In this way, by considering the existing analogy with financial options, a mathematical expression is derived by using the binomial options pricing model. This methodology provides the value of the option to abandon the project within one, two, and in general n periods. Therefore, this paper aims to be a useful tool in determining the value of the option to abandon according to its residual value, thus making easier the control of the uncertainty element within the project.
Sauzet, Odile; Peacock, Janet L
2017-07-20
The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins) and maybe of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes but very little is known about their reliability when only a limited number of small clusters are present. Using simulated data based on a dataset of preterm infants we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models and generalised estimating equations were compared. The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and will provide similar estimates to logistic regression. The method which seems to provide the best balance between estimation of the standard error and the parameter for any percentage of twins is the generalised estimating equations. This study has shown that the number of covariates or the level two variance do not necessarily affect the performance of the various methods used to analyse datasets containing twins but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
Directory of Open Access Journals (Sweden)
Odile Sauzet
2017-07-01
Full Text Available Abstract Background The analysis of perinatal outcomes often involves datasets with some multiple births. These are datasets mostly formed of independent observations and a limited number of clusters of size two (twins and maybe of size three or more. This non-independence needs to be accounted for in the statistical analysis. Using simulated data based on a dataset of preterm infants we have previously investigated the performance of several approaches to the analysis of continuous outcomes in the presence of some clusters of size two. Mixed models have been developed for binomial outcomes but very little is known about their reliability when only a limited number of small clusters are present. Methods Using simulated data based on a dataset of preterm infants we investigated the performance of several approaches to the analysis of binomial outcomes in the presence of some clusters of size two. Logistic models, several methods of estimation for the logistic random intercept models and generalised estimating equations were compared. Results The presence of even a small percentage of twins means that a logistic regression model will underestimate all parameters but a logistic random intercept model fails to estimate the correlation between siblings if the percentage of twins is too small and will provide similar estimates to logistic regression. The method which seems to provide the best balance between estimation of the standard error and the parameter for any percentage of twins is the generalised estimating equations. Conclusions This study has shown that the number of covariates or the level two variance do not necessarily affect the performance of the various methods used to analyse datasets containing twins but when the percentage of small clusters is too small, mixed models cannot capture the dependence between siblings.
Negative binomial models for abundance estimation of multiple closed populations
Boyce, Mark S.; MacKenzie, Darry I.; Manly, Bryan F.J.; Haroldson, Mark A.; Moody, David W.
2001-01-01
Counts of uniquely identified individuals in a population offer opportunities to estimate abundance. However, for various reasons such counts may be burdened by heterogeneity in the probability of being detected. Theoretical arguments and empirical evidence demonstrate that the negative binomial distribution (NBD) is a useful characterization for counts from biological populations with heterogeneity. We propose a method that focuses on estimating multiple populations by simultaneously using a suite of models derived from the NBD. We used this approach to estimate the number of female grizzly bears (Ursus arctos) with cubs-of-the-year in the Yellowstone ecosystem, for each year, 1986-1998. Akaike's Information Criteria (AIC) indicated that a negative binomial model with a constant level of heterogeneity across all years was best for characterizing the sighting frequencies of female grizzly bears. A lack-of-fit test indicated the model adequately described the collected data. Bootstrap techniques were used to estimate standard errors and 95% confidence intervals. We provide a Monte Carlo technique, which confirms that the Yellowstone ecosystem grizzly bear population increased during the period 1986-1998.
Low reheating temperatures in monomial and binomial inflationary models
International Nuclear Information System (INIS)
Rehagen, Thomas; Gelmini, Graciela B.
2015-01-01
We investigate the allowed range of reheating temperature values in light of the Planck 2015 results and the recent joint analysis of Cosmic Microwave Background (CMB) data from the BICEP2/Keck Array and Planck experiments, using monomial and binomial inflationary potentials. While the well studied ϕ 2 inflationary potential is no longer favored by current CMB data, as well as ϕ p with p>2, a ϕ 1 potential and canonical reheating (w re =0) provide a good fit to the CMB measurements. In this last case, we find that the Planck 2015 68% confidence limit upper bound on the spectral index, n s , implies an upper bound on the reheating temperature of T re ≲6×10 10 GeV, and excludes instantaneous reheating. The low reheating temperatures allowed by this model open the possibility that dark matter could be produced during the reheating period instead of when the Universe is radiation dominated, which could lead to very different predictions for the relic density and momentum distribution of WIMPs, sterile neutrinos, and axions. We also study binomial inflationary potentials and show the effects of a small departure from a ϕ 1 potential. We find that as a subdominant ϕ 2 term in the potential increases, first instantaneous reheating becomes allowed, and then the lowest possible reheating temperature of T re =4 MeV is excluded by the Planck 2015 68% confidence limit
Confidence Intervals for Asbestos Fiber Counts: Approximate Negative Binomial Distribution.
Bartley, David; Slaven, James; Harper, Martin
2017-03-01
The negative binomial distribution is adopted for analyzing asbestos fiber counts so as to account for both the sampling errors in capturing only a finite number of fibers and the inevitable human variation in identifying and counting sampled fibers. A simple approximation to this distribution is developed for the derivation of quantiles and approximate confidence limits. The success of the approximation depends critically on the use of Stirling's expansion to sufficient order, on exact normalization of the approximating distribution, on reasonable perturbation of quantities from the normal distribution, and on accurately approximating sums by inverse-trapezoidal integration. Accuracy of the approximation developed is checked through simulation and also by comparison to traditional approximate confidence intervals in the specific case that the negative binomial distribution approaches the Poisson distribution. The resulting statistics are shown to relate directly to early research into the accuracy of asbestos sampling and analysis. Uncertainty in estimating mean asbestos fiber concentrations given only a single count is derived. Decision limits (limits of detection) and detection limits are considered for controlling false-positive and false-negative detection assertions and are compared to traditional limits computed assuming normal distributions. Published by Oxford University Press on behalf of the British Occupational Hygiene Society 2017.
Turi, Christina E; Murch, Susan J
2013-07-09
Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. Spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analysis were performed to identify over and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families but large families (>393 species) were underemphasized by Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analysis identified similar families as over-utilized in the medium (92-392 species) and small (Binomial analysis when separated into small, medium and large families. The Bayesian, Binomial and IDM approaches identified different genera as potentially important. Species belonging to the genus Artemisia and Ligusticum were most consistently identified and may be valuable in future studies of the ethnopharmacology. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Analysing recurrent events in exercise science and sports medicine
African Journals Online (AJOL)
Appropriate statistical techniques include generalised estimating equations, survival analysis (Cox proportional hazards regression with robust variance estimation) and regression for count outcomes data (Poisson or negative binomial models with robust variance estimation).13. Statistical software packages such as SAS, ...
Generazio, Edward R.
2014-01-01
Unknown risks are introduced into failure critical systems when probability of detection (POD) capabilities are accepted without a complete understanding of the statistical method applied and the interpretation of the statistical results. The presence of this risk in the nondestructive evaluation (NDE) community is revealed in common statements about POD. These statements are often interpreted in a variety of ways and therefore, the very existence of the statements identifies the need for a more comprehensive understanding of POD methodologies. Statistical methodologies have data requirements to be met, procedures to be followed, and requirements for validation or demonstration of adequacy of the POD estimates. Risks are further enhanced due to the wide range of statistical methodologies used for determining the POD capability. Receiver/Relative Operating Characteristics (ROC) Display, simple binomial, logistic regression, and Bayes' rule POD methodologies are widely used in determining POD capability. This work focuses on Hit-Miss data to reveal the framework of the interrelationships between Receiver/Relative Operating Characteristics Display, simple binomial, logistic regression, and Bayes' Rule methodologies as they are applied to POD. Knowledge of these interrelationships leads to an intuitive and global understanding of the statistical data, procedural and validation requirements for establishing credible POD estimates.
Covering Resilience: A Recent Development for Binomial Checkpointing
Energy Technology Data Exchange (ETDEWEB)
Walther, Andrea; Narayanan, Sri Hari Krishna
2016-09-12
In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, required, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional with the operation count of the underlying function, e.g., if algorithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulations and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We describe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding implementation and discuss first numerical results.
A Bayesian equivalency test for two independent binomial proportions.
Kawasaki, Yohei; Shimokawa, Asanao; Yamada, Hiroshi; Miyaoka, Etsuo
2016-01-01
In clinical trials, it is often necessary to perform an equivalence study. The equivalence study requires actively denoting equivalence between two different drugs or treatments. Since it is not possible to assert equivalence that is not rejected by a superiority test, statistical methods known as equivalency tests have been suggested. These methods for equivalency tests are based on the frequency framework; however, there are few such methods in the Bayesian framework. Hence, this article proposes a new index that suggests the equivalency of binomial proportions, which is constructed based on the Bayesian framework. In this study, we provide two methods for calculating the index and compare the probabilities that have been calculated by these two calculation methods. Moreover, we apply this index to the results of actual clinical trials to demonstrate the utility of the index.
Negative binomial mixed models for analyzing microbiome count data.
Zhang, Xinyan; Mallick, Himel; Tang, Zaixiang; Zhang, Lei; Cui, Xiangqin; Benson, Andrew K; Yi, Nengjun
2017-01-03
Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to the well-known properties of microbiome count measurements, for example, varied total sequence reads across samples, over-dispersion and zero-inflation, microbiome studies usually collect samples with hierarchical structures, which introduce correlation among the samples and thus further complicate the analysis and interpretation of microbiome count data. In this article, we propose negative binomial mixed models (NBMMs) for detecting the association between the microbiome and host environmental/clinical factors for correlated microbiome count data. Although having not dealt with zero-inflation, the proposed mixed-effects models account for correlation among the samples by incorporating random effects into the commonly used fixed-effects negative binomial model, and can efficiently handle over-dispersion and varying total reads. We have developed a flexible and efficient IWLS (Iterative Weighted Least Squares) algorithm to fit the proposed NBMMs by taking advantage of the standard procedure for fitting the linear mixed models. We evaluate and demonstrate the proposed method via extensive simulation studies and the application to mouse gut microbiome data. The results show that the proposed method has desirable properties and outperform the previously used methods in terms of both empirical power and Type I error. The method has been incorporated into the freely available R package BhGLM ( http://www.ssg.uab.edu/bhglm/ and http://github.com/abbyyan3/BhGLM ), providing a useful tool for analyzing microbiome data.
Lea, Amanda J.
2015-01-01
Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596
Irwin, Brian J.; Wagner, Tyler; Bence, James R.; Kepler, Megan V.; Liu, Weihai; Hayes, Daniel B.
2013-01-01
Partitioning total variability into its component temporal and spatial sources is a powerful way to better understand time series and elucidate trends. The data available for such analyses of fish and other populations are usually nonnegative integer counts of the number of organisms, often dominated by many low values with few observations of relatively high abundance. These characteristics are not well approximated by the Gaussian distribution. We present a detailed description of a negative binomial mixed-model framework that can be used to model count data and quantify temporal and spatial variability. We applied these models to data from four fishery-independent surveys of Walleyes Sander vitreus across the Great Lakes basin. Specifically, we fitted models to gill-net catches from Wisconsin waters of Lake Superior; Oneida Lake, New York; Saginaw Bay in Lake Huron, Michigan; and Ohio waters of Lake Erie. These long-term monitoring surveys varied in overall sampling intensity, the total catch of Walleyes, and the proportion of zero catches. Parameter estimation included the negative binomial scaling parameter, and we quantified the random effects as the variations among gill-net sampling sites, the variations among sampled years, and site × year interactions. This framework (i.e., the application of a mixed model appropriate for count data in a variance-partitioning context) represents a flexible approach that has implications for monitoring programs (e.g., trend detection) and for examining the potential of individual variance components to serve as response metrics to large-scale anthropogenic perturbations or ecological changes.
Applications of some discrete regression models for count data
Directory of Open Access Journals (Sweden)
B. M. Golam Kibria
2006-01-01
Full Text Available In this paper we have considered several regression models to fit the count data that encounter in the field of Biometrical, Environmental, Social Sciences and Transportation Engineering. We have fitted Poisson (PO, Negative Binomial (NB, Zero-Inflated Poisson (ZIP and Zero-Inflated Negative Binomial (ZINB regression models to run-off-road (ROR crash data which collected on arterial roads in south region (rural of Florida State. To compare the performance of these models, we analyzed data with moderate to high percentage of zero counts. Because the variances were almost three times greater than the means, it appeared that both NB and ZINB models performed better than PO and ZIP models for the zero inflated and over dispersed count data.
Mulroy, Sara J; Winstein, Carolee J; Kulig, Kornelia; Beneck, George J; Fowler, Eileen G; DeMuth, Sharon K; Sullivan, Katherine J; Brown, David A; Lane, Christianne J
2011-12-01
Each of the 4 randomized clinical trials (RCTs) hosted by the Physical Therapy Clinical Research Network (PTClinResNet) targeted a different disability group (low back disorder in the Muscle-Specific Strength Training Effectiveness After Lumbar Microdiskectomy [MUSSEL] trial, chronic spinal cord injury in the Strengthening and Optimal Movements for Painful Shoulders in Chronic Spinal Cord Injury [STOMPS] trial, adult stroke in the Strength Training Effectiveness Post-Stroke [STEPS] trial, and pediatric cerebral palsy in the Pediatric Endurance and Limb Strengthening [PEDALS] trial for children with spastic diplegic cerebral palsy) and tested the effectiveness of a muscle-specific or functional activity-based intervention on primary outcomes that captured pain (STOMPS, MUSSEL) or locomotor function (STEPS, PEDALS). The focus of these secondary analyses was to determine causal relationships among outcomes across levels of the International Classification of Functioning, Disability and Health (ICF) framework for the 4 RCTs. With the database from PTClinResNet, we used 2 separate secondary statistical approaches-mediation analysis for the MUSSEL and STOMPS trials and regression analysis for the STEPS and PEDALS trials-to test relationships among muscle performance, primary outcomes (pain related and locomotor related), activity and participation measures, and overall quality of life. Predictive models were stronger for the 2 studies with pain-related primary outcomes. Change in muscle performance mediated or predicted reductions in pain for the MUSSEL and STOMPS trials and, to some extent, walking speed for the STEPS trial. Changes in primary outcome variables were significantly related to changes in activity and participation variables for all 4 trials. Improvement in activity and participation outcomes mediated or predicted increases in overall quality of life for the 3 trials with adult populations. Variables included in the statistical models were limited to those
Revealing Word Order: Using Serial Position in Binomials to Predict Properties of the Speaker
Iliev, Rumen; Smirnova, Anastasia
2016-01-01
Three studies test the link between word order in binomials and psychological and demographic characteristics of a speaker. While linguists have already suggested that psychological, cultural and societal factors are important in choosing word order in binomials, the vast majority of relevant research was focused on general factors and on broadly…
Modeling and Predistortion of Envelope Tracking Power Amplifiers using a Memory Binomial Model
DEFF Research Database (Denmark)
Tafuri, Felice Francesco; Sira, Daniel; Larsen, Torben
2013-01-01
. The model definition is based on binomial series, hence the name of memory binomial model (MBM). The MBM is here applied to measured data-sets acquired from an ET measurement set-up. When used as a PA model the MBM showed an NMSE (Normalized Mean Squared Error) as low as −40dB and an ACEPR (Adjacent Channel...
Adjusted Wald Confidence Interval for a Difference of Binomial Proportions Based on Paired Data
Bonett, Douglas G.; Price, Robert M.
2012-01-01
Adjusted Wald intervals for binomial proportions in one-sample and two-sample designs have been shown to perform about as well as the best available methods. The adjusted Wald intervals are easy to compute and have been incorporated into introductory statistics courses. An adjusted Wald interval for paired binomial proportions is proposed here and…
Some normed binomial difference sequence spaces related to the [Formula: see text] spaces.
Song, Meimei; Meng, Jian
2017-01-01
The aim of this paper is to introduce the normed binomial sequence spaces [Formula: see text] by combining the binomial transformation and difference operator, where [Formula: see text]. We prove that these spaces are linearly isomorphic to the spaces [Formula: see text] and [Formula: see text], respectively. Furthermore, we compute Schauder bases and the α -, β - and γ -duals of these sequence spaces.
An efficient binomial model-based measure for sequence comparison and its application.
Liu, Xiaoqing; Dai, Qi; Li, Lihua; He, Zerong
2011-04-01
Sequence comparison is one of the major tasks in bioinformatics, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations. There are several similarity/dissimilarity measures for sequence comparison, but challenges remains. This paper presented a binomial model-based measure to analyze biological sequences. With help of a random indicator, the occurrence of a word at any position of sequence can be regarded as a random Bernoulli variable, and the distribution of a sum of the word occurrence is well known to be a binomial one. By using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested by extensive experiments including classification of HEV genotypes and phylogenetic analysis, and further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed measure based on binomial model is more efficient.
Pedrini, D. T.; Pedrini, Bonnie C.
Regression, another mechanism studied by Sigmund Freud, has had much research, e.g., hypnotic regression, frustration regression, schizophrenic regression, and infra-human-animal regression (often directly related to fixation). Many investigators worked with hypnotic age regression, which has a long history, going back to Russian reflexologists.…
On the revival of the negative binomial distribution in multiparticle production
International Nuclear Information System (INIS)
Ekspong, G.
1990-01-01
This paper is based on published and some unpublished material pertaining to the revival of interest in and success of applying the negative binomial distribution to multiparticle production since 1983. After a historically oriented introduction going farther back in time, the main part of the paper is devoted to an unpublished derivation of the negative binomial distribution based on empirical observations of forward-backward multiplicity correlations. Some physical processes leading to the negative binomial distribution are mentioned and some comments made on published criticisms
Better Autologistic Regression
Directory of Open Access Journals (Sweden)
Mark A. Wolters
2017-11-01
Full Text Available Autologistic regression is an important probability model for dichotomous random variables observed along with covariate information. It has been used in various fields for analyzing binary data possessing spatial or network structure. The model can be viewed as an extension of the autologistic model (also known as the Ising model, quadratic exponential binary distribution, or Boltzmann machine to include covariates. It can also be viewed as an extension of logistic regression to handle responses that are not independent. Not all authors use exactly the same form of the autologistic regression model. Variations of the model differ in two respects. First, the variable coding—the two numbers used to represent the two possible states of the variables—might differ. Common coding choices are (zero, one and (minus one, plus one. Second, the model might appear in either of two algebraic forms: a standard form, or a recently proposed centered form. Little attention has been paid to the effect of these differences, and the literature shows ambiguity about their importance. It is shown here that changes to either coding or centering in fact produce distinct, non-nested probability models. Theoretical results, numerical studies, and analysis of an ecological data set all show that the differences among the models can be large and practically significant. Understanding the nature of the differences and making appropriate modeling choices can lead to significantly improved autologistic regression analyses. The results strongly suggest that the standard model with plus/minus coding, which we call the symmetric autologistic model, is the most natural choice among the autologistic variants.
Steinbuch, Luc; Brus, Dick J.; Heuvelink, Gerard B.M.
2018-01-01
One of the first soil forming processes in marine and fluviatile clay soils is ripening, the irreversible change of physical and chemical soil properties, especially consistency, under influence of air. We used Bayesian binomial logistic regression (BBLR) to update the map showing unripened
Measured PET Data Characterization with the Negative Binomial Distribution Model.
Santarelli, Maria Filomena; Positano, Vincenzo; Landini, Luigi
2017-01-01
Accurate statistical model of PET measurements is a prerequisite for a correct image reconstruction when using statistical image reconstruction algorithms, or when pre-filtering operations must be performed. Although radioactive decay follows a Poisson distribution, deviation from Poisson statistics occurs on projection data prior to reconstruction due to physical effects, measurement errors, correction of scatter and random coincidences. Modelling projection data can aid in understanding the statistical nature of the data in order to develop efficient processing methods and to reduce noise. This paper outlines the statistical behaviour of measured emission data evaluating the goodness of fit of the negative binomial (NB) distribution model to PET data for a wide range of emission activity values. An NB distribution model is characterized by the mean of the data and the dispersion parameter α that describes the deviation from Poisson statistics. Monte Carlo simulations were performed to evaluate: (a) the performances of the dispersion parameter α estimator, (b) the goodness of fit of the NB model for a wide range of activity values. We focused on the effect produced by correction for random and scatter events in the projection (sinogram) domain, due to their importance in quantitative analysis of PET data. The analysis developed herein allowed us to assess the accuracy of the NB distribution model to fit corrected sinogram data, and to evaluate the sensitivity of the dispersion parameter α to quantify deviation from Poisson statistics. By the sinogram ROI-based analysis, it was demonstrated that deviation on the measured data from Poisson statistics can be quantitatively characterized by the dispersion parameter α, in any noise conditions and corrections.
Hung, Tran Loc; Giang, Le Truong
2016-01-01
Using the Stein-Chen method some upper bounds in Poisson approximation for distributions of row-wise triangular arrays of independent negative-binomial distributed random variables are established in this note.
Distribution-free Inference of Zero-inated Binomial Data for Longitudinal Studies.
He, H; Wang, W J; Hu, J; Gallop, R; Crits-Christoph, P; Xia, Y L
2015-10-01
Count reponses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated poisson (ZIP) and zero-inflated negative binomial (ZINB) models are widely used for modeling such outcomes. However, as alcohol drinking outcomes such as days of drinkings are counts within a given period, their distributions are bounded above by an upper limit (total days in the period) and thus inherently follow a binomial or zero-inflated binomial (ZIB) distribution, rather than a Poisson or zero-inflated Poisson (ZIP) distribution, in the presence of structural zeros. In this paper, we develop a new semiparametric approach for modeling zero-inflated binomial (ZIB)-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.
Difference of Sums Containing Products of Binomial Coefficients and Their Logarithms
National Research Council Canada - National Science Library
Miller, Allen R; Moskowitz, Ira S
2005-01-01
Properties of the difference of two sums containing products of binomial coefficients and their logarithms which arise in the application of Shannon's information theory to a certain class of covert channels are deduced...
Difference of Sums Containing Products of Binomial Coefficients and their Logarithms
National Research Council Canada - National Science Library
Miller, Allen R; Moskowitz, Ira S
2004-01-01
Properties of the difference of two sums containing products of binomial coefficients and their logarithms which arise in the application of Shannon's information theory to a certain class of covert channels are deduced...
Zheng, Gang; Torres, Allan M.; Price, William S.
2008-09-01
Two phase-modulated binomial-like π pulses have been developed by simultaneously optimizing pulse durations and phases. In combination with excitation sculpting, both of the new binomial-like sequences outperform the well-known 3-9-19 sequence in selectivity and inversion width. The new sequences provide similar selectivity and inversion width to the W5 sequence but with significantly shorter sequence durations. When used in PGSTE-WATERGATE, they afford highly selective solvent suppression in diffusion experiments.
Perbandingan Metode Binomial dan Metode Black-Scholes Dalam Penentuan Harga Opsi
Directory of Open Access Journals (Sweden)
Surya Amami Pramuditya
2016-04-01
Full Text Available ABSTRAKOpsi adalah kontrak antara pemegang dan penulis (buyer (holder dan seller (writer di mana penulis (writer memberikan hak (bukan kewajiban kepada holder untuk membeli atau menjual aset dari writer pada harga tertentu (strike atau latihan harga dan pada waktu tertentu dalam waktu (tanggal kadaluwarsa atau jatuh tempo waktu. Ada beberapa cara untuk menentukan harga opsi, diantaranya adalah Metode Black-Scholes dan Metode Binomial. Metode binomial berasal dari model pergerakan harga saham yang membagi waktu interval [0, T] menjadi n sama panjang. Sedangkan metode Black-Scholes, dimodelkan dengan pergerakan harga saham sebagai suatu proses stokastik. Semakin besar partisi waktu n pada Metode Binomial, maka nilai opsinya akan konvergen ke nilai opsi Metode Black-Scholes.Kata kunci: opsi, Binomial, Black-Scholes.ABSTRACT Option is a contract between the holder and the writer in which the writer gives the right (not the obligation to the holder to buy or sell an asset of a writer at a specified price (the strike or exercise price and at a specified time in the future (expiry date or maturity time. There are several ways to determine the price of options, including the Black-Scholes Method and Binomial Method. Binomial method come from a model of stock price movement that divide time interval [0, T] into n equally long. While the Black Scholes method, the stock price movement is modeled as a stochastic process. More larger the partition of time n in Binomial Method, the value option will converge to the value option in Black-Scholes Method.Key words: Options, Binomial, Black-Scholes
Model-based Quantile Regression for Discrete Data
Padellini, Tullia
2018-04-10
Quantile regression is a class of methods voted to the modelling of conditional quantiles. In a Bayesian framework quantile regression has typically been carried out exploiting the Asymmetric Laplace Distribution as a working likelihood. Despite the fact that this leads to a proper posterior for the regression coefficients, the resulting posterior variance is however affected by an unidentifiable parameter, hence any inferential procedure beside point estimation is unreliable. We propose a model-based approach for quantile regression that considers quantiles of the generating distribution directly, and thus allows for a proper uncertainty quantification. We then create a link between quantile regression and generalised linear models by mapping the quantiles to the parameter of the response variable, and we exploit it to fit the model with R-INLA. We extend it also in the case of discrete responses, where there is no 1-to-1 relationship between quantiles and distribution\\'s parameter, by introducing continuous generalisations of the most common discrete variables (Poisson, Binomial and Negative Binomial) to be exploited in the fitting.
Directory of Open Access Journals (Sweden)
Quentin Noirhomme
2014-01-01
Full Text Available Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain–computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.
Noirhomme, Quentin; Lesenfants, Damien; Gomez, Francisco; Soddu, Andrea; Schrouff, Jessica; Garraux, Gaëtan; Luxen, André; Phillips, Christophe; Laureys, Steven
2014-01-01
Multivariate classification is used in neuroimaging studies to infer brain activation or in medical applications to infer diagnosis. Their results are often assessed through either a binomial or a permutation test. Here, we simulated classification results of generated random data to assess the influence of the cross-validation scheme on the significance of results. Distributions built from classification of random data with cross-validation did not follow the binomial distribution. The binomial test is therefore not adapted. On the contrary, the permutation test was unaffected by the cross-validation scheme. The influence of the cross-validation was further illustrated on real-data from a brain-computer interface experiment in patients with disorders of consciousness and from an fMRI study on patients with Parkinson disease. Three out of 16 patients with disorders of consciousness had significant accuracy on binomial testing, but only one showed significant accuracy using permutation testing. In the fMRI experiment, the mental imagery of gait could discriminate significantly between idiopathic Parkinson's disease patients and healthy subjects according to the permutation test but not according to the binomial test. Hence, binomial testing could lead to biased estimation of significance and false positive or negative results. In our view, permutation testing is thus recommended for clinical application of classification with cross-validation.
Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming
2014-01-01
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data is traditionally transformed so that linear mixed model (LMM) based ICC can be estimated. A common transformation used is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate which includes fixed effects using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM even when negative binomial data was logarithm, and square root transformed. A second comparison that targeted a wider range of ICC values showed that the mean of estimated ICC closely approximated the true ICC.
Walton, Joseph M.; And Others
1978-01-01
Ridge regression is an approach to the problem of large standard errors of regression estimates of intercorrelated regressors. The effect of ridge regression on the estimated squared multiple correlation coefficient is discussed and illustrated. (JKS)
Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-05
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). While, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, the ternary mixtures of DRO, CAF, and PAR and CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. While for the development of the CWT-PLS model a calibration set was prepared by means of an orthogonal experimental design and their absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
DEFF Research Database (Denmark)
Johansen, Søren
2008-01-01
The reduced rank regression model is a multivariate regression model with a coefficient matrix with reduced rank. The reduced rank regression algorithm is an estimation procedure, which estimates the reduced rank regression model. It is related to canonical correlations and involves calculating...
Regression analysis with categorized regression calibrated exposure: some interesting findings
Directory of Open Access Journals (Sweden)
Hjartåker Anette
2006-07-01
Full Text Available Abstract Background Regression calibration as a method for handling measurement error is becoming increasingly well-known and used in epidemiologic research. However, the standard version of the method is not appropriate for exposure analyzed on a categorical (e.g. quintile scale, an approach commonly used in epidemiologic studies. A tempting solution could then be to use the predicted continuous exposure obtained through the regression calibration method and treat it as an approximation to the true exposure, that is, include the categorized calibrated exposure in the main regression analysis. Methods We use semi-analytical calculations and simulations to evaluate the performance of the proposed approach compared to the naive approach of not correcting for measurement error, in situations where analyses are performed on quintile scale and when incorporating the original scale into the categorical variables, respectively. We also present analyses of real data, containing measures of folate intake and depression, from the Norwegian Women and Cancer study (NOWAC. Results In cases where extra information is available through replicated measurements and not validation data, regression calibration does not maintain important qualities of the true exposure distribution, thus estimates of variance and percentiles can be severely biased. We show that the outlined approach maintains much, in some cases all, of the misclassification found in the observed exposure. For that reason, regression analysis with the corrected variable included on a categorical scale is still biased. In some cases the corrected estimates are analytically equal to those obtained by the naive approach. Regression calibration is however vastly superior to the naive method when applying the medians of each category in the analysis. Conclusion Regression calibration in its most well-known form is not appropriate for measurement error correction when the exposure is analyzed on a
Geedipally, Srinivas Reddy; Lord, Dominique; Dhavala, Soma Sekhar
2012-03-01
There has been a considerable amount of work devoted by transportation safety analysts to the development and application of new and innovative models for analyzing crash data. One important characteristic about crash data that has been documented in the literature is related to datasets that contained a large amount of zeros and a long or heavy tail (which creates highly dispersed data). For such datasets, the number of sites where no crash is observed is so large that traditional distributions and regression models, such as the Poisson and Poisson-gamma or negative binomial (NB) models cannot be used efficiently. To overcome this problem, the NB-Lindley (NB-L) distribution has recently been introduced for analyzing count data that are characterized by excess zeros. The objective of this paper is to document the application of a NB generalized linear model with Lindley mixed effects (NB-L GLM) for analyzing traffic crash data. The study objective was accomplished using simulated and observed datasets. The simulated dataset was used to show the general performance of the model. The model was then applied to two datasets based on observed data. One of the dataset was characterized by a large amount of zeros. The NB-L GLM was compared with the NB and zero-inflated models. Overall, the research study shows that the NB-L GLM not only offers superior performance over the NB and zero-inflated models when datasets are characterized by a large number of zeros and a long tail, but also when the crash dataset is highly dispersed. Published by Elsevier Ltd.
Sharma, Tarang; Gøtzsche, Peter C; Kuss, Oliver
2017-11-01
The aim of the study was to identify the validity of effect estimates for serious rare adverse events in clinical study reports of antidepressants trials, across different meta-analysis methods. Four serious rare adverse events (all-cause mortality, suicidality, aggressive behavior, and akathisia) were meta-analyzed using different methods. The Yusuf-Peto odds ratio ignores studies with no events and was compared with the alternative approaches of generalized linear mixed models (GLMMs), conditional logistic regression, a Bayesian approach using Markov Chain Monte Carlo (MCMC), and a beta-binomial regression model. The estimates for the four outcomes did not change substantially across the different methods; the Yusuf-Peto method underestimated the treatment harm and overestimated its precision, especially when the estimated odds ratio deviated greatly from 1. For example, the odds ratio for suicidality for children and adolescents was 2.39 (95% confidence interval = 1.32-4.33), using the Yusuf-Peto method but increased to 2.64 (1.33-5.26) using conditional logistic regression, to 2.69 (1.19-6.09) using beta-binomial, to 2.73 (1.37-5.42) using the GLMM, and finally to 2.87 (1.42-5.98) using the MCMC approach. The method used for meta-analysis of rare events data influences the estimates obtained, and the exclusion of double-zero event studies can give misleading results. To ensure reduction of bias and erroneous inferences, sensitivity analyses should be performed using different methods instead of the Yusuf-Peto approach, in particular the beta-binomial method, which was shown to be superior through a simulation study. Copyright © 2017 Elsevier Inc. All rights reserved.
Yi, Jun; Yang, Wenhong; Sun, Wen-Hua; Nomura, Kotohiro; Hada, Masahiko
2017-11-30
The NMR chemical shifts of vanadium ( 51 V) in (imido)vanadium(V) dichloride complexes with imidazolin-2-iminato and imidazolidin-2-iminato ligands were calculated by the density functional theory (DFT) method with GIAO. The calculated 51 V NMR chemical shifts were analyzed by the multiple linear regression (MLR) analysis (MLRA) method with a series of calculated molecular properties. Some of calculated NMR chemical shifts were incorrect using the optimized molecular geometries of the X-ray structures. After the global minimum geometries of all of the molecules were determined, the trend of the observed chemical shifts was well reproduced by the present DFT method. The MLRA method was performed to investigate the correlation between the 51 V NMR chemical shift and the natural charge, band energy gap, and Wiberg bond index of the V═N bond. The 51 V NMR chemical shifts obtained with the present MLR model were well reproduced with a correlation coefficient of 0.97.
Directory of Open Access Journals (Sweden)
Tudor DRUGAN
2003-08-01
Full Text Available The aim of the paper was to present the usefulness of the binomial distribution in studying of the contingency tables and the problems of approximation to normality of binomial distribution (the limits, advantages, and disadvantages. The classification of the medical keys parameters reported in medical literature and expressing them using the contingency table units based on their mathematical expressions restrict the discussion of the confidence intervals from 34 parameters to 9 mathematical expressions. The problem of obtaining different information starting with the computed confidence interval for a specified method, information like confidence intervals boundaries, percentages of the experimental errors, the standard deviation of the experimental errors and the deviation relative to significance level was solves through implementation in PHP programming language of original algorithms. The cases of expression, which contain two binomial variables, were separately treated. An original method of computing the confidence interval for the case of two-variable expression was proposed and implemented. The graphical representation of the expression of two binomial variables for which the variation domain of one of the variable depend on the other variable was a real problem because the most of the software used interpolation in graphical representation and the surface maps were quadratic instead of triangular. Based on an original algorithm, a module was implements in PHP in order to represent graphically the triangular surface plots. All the implementation described above was uses in computing the confidence intervals and estimating their performance for binomial distributions sample sizes and variable.
A fast algorithm for computing binomial coefficients modulo powers of two.
Andreica, Mugurel Ionut
2013-01-01
I present a new algorithm for computing binomial coefficients modulo 2N. The proposed method has an O(N3·Multiplication(N)+N4) preprocessing time, after which a binomial coefficient C(P, Q) with 0≤Q≤P≤2N-1 can be computed modulo 2N in O(N2·log(N)·Multiplication(N)) time. Multiplication(N) denotes the time complexity of multiplying two N-bit numbers, which can range from O(N2) to O(N·log(N)·log(log(N))) or better. Thus, the overall time complexity for evaluating M binomial coefficients C(P, Q) modulo 2N with 0≤Q≤P≤2N-1 is O((N3+M·N2·log(N))·Multiplication(N)+N4). After preprocessing, we can actually compute binomial coefficients modulo any 2R with R≤N. For larger values of P and Q, variations of Lucas' theorem must be used first in order to reduce the computation to the evaluation of multiple (O(log(P))) binomial coefficients C(P', Q') (or restricted types of factorials P'!) modulo 2N with 0≤Q'≤P'≤2N-1.
Water-selective excitation of short T2 species with binomial pulses.
Deligianni, Xeni; Bär, Peter; Scheffler, Klaus; Trattnig, Siegfried; Bieri, Oliver
2014-09-01
For imaging of fibrous musculoskeletal components, ultra-short echo time methods are often combined with fat suppression. Due to the increased chemical shift, spectral excitation of water might become a favorable option at ultra-high fields. Thus, this study aims to compare and explore short binomial excitation schemes for spectrally selective imaging of fibrous tissue components with short transverse relaxation time (T2 ). Water selective 1-1-binomial excitation is compared with nonselective imaging using a sub-millisecond spoiled gradient echo technique for in vivo imaging of fibrous tissue at 3T and 7T. Simulations indicate a maximum signal loss from binomial excitation of approximately 30% in the limit of very short T2 (0.1 ms), as compared to nonselective imaging; decreasing rapidly with increasing field strength and increasing T2 , e.g., to 19% at 3T and 10% at 7T for T2 of 1 ms. In agreement with simulations, a binomial phase close to 90° yielded minimum signal loss: approximately 6% at 3T and close to 0% at 7T for menisci, and for ligaments 9% and 13%, respectively. Overall, for imaging of short-lived T2 components, short 1-1 binomial excitation schemes prove to offer marginal signal loss especially at ultra-high fields with overall improved scanning efficiency. Copyright © 2013 Wiley Periodicals, Inc.
Directory of Open Access Journals (Sweden)
Anke Hüls
2017-05-01
Full Text Available Antimicrobial resistance in livestock is a matter of general concern. To develop hygiene measures and methods for resistance prevention and control, epidemiological studies on a population level are needed to detect factors associated with antimicrobial resistance in livestock holdings. In general, regression models are used to describe these relationships between environmental factors and resistance outcome. Besides the study design, the correlation structures of the different outcomes of antibiotic resistance and structural zero measurements on the resistance outcome as well as on the exposure side are challenges for the epidemiological model building process. The use of appropriate regression models that acknowledge these complexities is essential to assure valid epidemiological interpretations. The aims of this paper are (i to explain the model building process comparing several competing models for count data (negative binomial model, quasi-Poisson model, zero-inflated model, and hurdle model and (ii to compare these models using data from a cross-sectional study on antibiotic resistance in animal husbandry. These goals are essential to evaluate which model is most suitable to identify potential prevention measures. The dataset used as an example in our analyses was generated initially to study the prevalence and associated factors for the appearance of cefotaxime-resistant Escherichia coli in 48 German fattening pig farms. For each farm, the outcome was the count of samples with resistant bacteria. There was almost no overdispersion and only moderate evidence of excess zeros in the data. Our analyses show that it is essential to evaluate regression models in studies analyzing the relationship between environmental factors and antibiotic resistances in livestock. After model comparison based on evaluation of model predictions, Akaike information criterion, and Pearson residuals, here the hurdle model was judged to be the most appropriate
INVESTIGATION OF E-MAIL TRAFFIC BY USING ZERO-INFLATED REGRESSION MODELS
Directory of Open Access Journals (Sweden)
Yılmaz KAYA
2012-06-01
Full Text Available Based on count data obtained with a value of zero may be greater than anticipated. These types of data sets should be used to analyze by regression methods taking into account zero values. Zero- Inflated Poisson (ZIP, Zero-Inflated negative binomial (ZINB, Poisson Hurdle (PH, negative binomial Hurdle (NBH are more common approaches in modeling more zero value possessing dependent variables than expected. In the present study, the e-mail traffic of Yüzüncü Yıl University in 2009 spring semester was investigated. ZIP and ZINB, PH and NBH regression methods were applied on the data set because more zeros counting (78.9% were found in data set than expected. ZINB and NBH regression considered zero dispersion and overdispersion were found to be more accurate results due to overdispersion and zero dispersion in sending e-mail. ZINB is determined to be best model accordingto Vuong statistics and information criteria.
Regression analysis by example
Chatterjee, Samprit
2012-01-01
Praise for the Fourth Edition: ""This book is . . . an excellent source of examples for regression analysis. It has been and still is readily readable and understandable."" -Journal of the American Statistical Association Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. Regression Analysis by Example, Fifth Edition has been expanded
Nonparametric modal regression
Chen, Yen-Chi; Genovese, Christopher R.; Tibshirani, Ryan J.; Wasserman, Larry
2016-01-01
Modal regression estimates the local modes of the distribution of $Y$ given $X=x$, instead of the mean, as in the usual regression sense, and can hence reveal important structure missed by usual regression methods. We study a simple nonparametric method for modal regression, based on a kernel density estimate (KDE) of the joint distribution of $Y$ and $X$. We derive asymptotic error bounds for this method, and propose techniques for constructing confidence sets and prediction sets. The latter...
International Nuclear Information System (INIS)
Joshi, A.; Lawande, S.V.
1990-01-01
A systematic study of squeezing obtained from k-photon anharmonic oscillator (with interaction hamiltonian of the form (a † ) k , k ≥ 2) interacting with light whose statistics can be varied from sub-Poissonian to poissonian via binomial state of field and super-Poissonian to poissonian via negative binomial state of field is presented. The authors predict that for all values of k there is a tendency increase in squeezing with increased sub-Poissonian character of the field while the reverse is true with super-Poissonian field. They also present non-classical behavior of the first order coherence function explicitly for k = 2 case (i.e., for two-photon anharmonic oscillator model used for a Kerr-like medium) with variation in the statistics of the input light
Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels
Energy Technology Data Exchange (ETDEWEB)
Chhaiba, Hassan, E-mail: chhaiba.hassan@gmail.com [Department of Mathematics, Faculty of Sciences, Ibn Tofail University, P.O. Box 133, Kénitra (Morocco); Demni, Nizar, E-mail: nizar.demni@univ-rennes1.fr [IRMAR, Université de Rennes 1, Campus de Beaulieu, 35042 Rennes Cedex (France); Mouayn, Zouhair, E-mail: mouayn@fstbm.ac.ma [Department of Mathematics, Faculty of Sciences and Technics (M’Ghila), Sultan Moulay Slimane, P.O. Box 523, Béni Mellal (Morocco)
2016-07-15
To each hyperbolic Landau level of the Poincaré disc is attached a generalized negative binomial distribution. In this paper, we compute the moment generating function of this distribution and supply its atomic decomposition as a perturbation of the negative binomial distribution by a finitely supported measure. Using the Mandel parameter, we also discuss the nonclassical nature of the associated coherent states. Next, we derive a Lévy-Khintchine-type representation of its characteristic function when the latter does not vanish and deduce that it is quasi-infinitely divisible except for the lowest hyperbolic Landau level corresponding to the negative binomial distribution. By considering the total variation of the obtained quasi-Lévy measure, we introduce a new infinitely divisible distribution for which we derive the characteristic function.
Possibility and Challenges of Conversion of Current Virus Species Names to Linnaean Binomials
Energy Technology Data Exchange (ETDEWEB)
Postler, Thomas S.; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sbina; Benkő, Mária; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Rémi; Clegg, Christopher S.; Collins, Peter L.; Juan Carlos, De La Torre; Derisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Dürrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sébastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balázs; Hewson, Roger; Horie, Masayuki; Jiāng, Dàohóng; Kobinger, Gary; Kondo, Hideki; Kropinski, Andrew M.; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady R.; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; PaWeska, Janusz T.; Peters, Clarence J.; Radoshitzky, Sheli R.; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfaçon, Hélène; Salvato, Maria S.; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark D.; Stone, David M.; Takada, Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, Noël; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Viktor E.; Wahl-Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield, Anna E.; Zerbini, F. Murilo; Kuhn, Jens H.
2016-10-22
Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment. [Arenaviridae; binomials; ICTV; International Committee on Taxonomy of Viruses; Mononegavirales; virus nomenclature; virus taxonomy.
Analysis of generalized negative binomial distributions attached to hyperbolic Landau levels
International Nuclear Information System (INIS)
Chhaiba, Hassan; Demni, Nizar; Mouayn, Zouhair
2016-01-01
To each hyperbolic Landau level of the Poincaré disc is attached a generalized negative binomial distribution. In this paper, we compute the moment generating function of this distribution and supply its atomic decomposition as a perturbation of the negative binomial distribution by a finitely supported measure. Using the Mandel parameter, we also discuss the nonclassical nature of the associated coherent states. Next, we derive a Lévy-Khintchine-type representation of its characteristic function when the latter does not vanish and deduce that it is quasi-infinitely divisible except for the lowest hyperbolic Landau level corresponding to the negative binomial distribution. By considering the total variation of the obtained quasi-Lévy measure, we introduce a new infinitely divisible distribution for which we derive the characteristic function.
[Application of detecting and taking overdispersion into account in Poisson regression model].
Bouche, G; Lepage, B; Migeot, V; Ingrand, P
2009-08-01
Researchers often use the Poisson regression model to analyze count data. Overdispersion can occur when a Poisson regression model is used, resulting in an underestimation of variance of the regression model parameters. Our objective was to take overdispersion into account and assess its impact with an illustration based on the data of a study investigating the relationship between use of the Internet to seek health information and number of primary care consultations. Three methods, overdispersed Poisson, a robust estimator, and negative binomial regression, were performed to take overdispersion into account in explaining variation in the number (Y) of primary care consultations. We tested overdispersion in the Poisson regression model using the ratio of the sum of Pearson residuals over the number of degrees of freedom (chi(2)/df). We then fitted the three models and compared parameter estimation to the estimations given by Poisson regression model. Variance of the number of primary care consultations (Var[Y]=21.03) was greater than the mean (E[Y]=5.93) and the chi(2)/df ratio was 3.26, which confirmed overdispersion. Standard errors of the parameters varied greatly between the Poisson regression model and the three other regression models. Interpretation of estimates from two variables (using the Internet to seek health information and single parent family) would have changed according to the model retained, with significant levels of 0.06 and 0.002 (Poisson), 0.29 and 0.09 (overdispersed Poisson), 0.29 and 0.13 (use of a robust estimator) and 0.45 and 0.13 (negative binomial) respectively. Different methods exist to solve the problem of underestimating variance in the Poisson regression model when overdispersion is present. The negative binomial regression model seems to be particularly accurate because of its theorical distribution ; in addition this regression is easy to perform with ordinary statistical software packages.
Flexible survival regression modelling
DEFF Research Database (Denmark)
Cortese, Giuliana; Scheike, Thomas H; Martinussen, Torben
2009-01-01
Regression analysis of survival data, and more generally event history data, is typically based on Cox's regression model. We here review some recent methodology, focusing on the limitations of Cox's regression model. The key limitation is that the model is not well suited to represent time-varyi...
Logistic regression for circular data
Al-Daffaie, Kadhem; Khan, Shahjahan
2017-05-01
This paper considers the relationship between a binary response and a circular predictor. It develops the logistic regression model by employing the linear-circular regression approach. The maximum likelihood method is used to estimate the parameters. The Newton-Raphson numerical method is used to find the estimated values of the parameters. A data set from weather records of Toowoomba city is analysed by the proposed methods. Moreover, a simulation study is considered. The R software is used for all computations and simulations.
International Nuclear Information System (INIS)
Liu Tang-Kun; Zhang Kang-Long; Tao Yu; Shan Chuan-Jia; Liu Ji-Bing
2016-01-01
The temporal evolution of the degree of entanglement between two atoms in a system of the binomial optical field interacting with two arbitrary entangled atoms is investigated. The influence of the strength of the dipole–dipole interaction between two atoms, probabilities of the Bernoulli trial, and particle number of the binomial optical field on the temporal evolution of the atomic entanglement are discussed. The result shows that the two atoms are always in the entanglement state. Moreover, if and only if the two atoms are initially in the maximally entangled state, the entanglement evolution is not affected by the parameters, and the degree of entanglement is always kept as 1. (paper)
Parameter estimation of the zero inflated negative binomial beta exponential distribution
Sirichantra, Chutima; Bodhisuwan, Winai
2017-11-01
The zero inflated negative binomial-beta exponential (ZINB-BE) distribution is developed, it is an alternative distribution for the excessive zero counts with overdispersion. The ZINB-BE distribution is a mixture of two distributions which are Bernoulli and negative binomial-beta exponential distributions. In this work, some characteristics of the proposed distribution are presented, such as, mean and variance. The maximum likelihood estimation is applied to parameter estimation of the proposed distribution. Finally some results of Monte Carlo simulation study, it seems to have high-efficiency when the sample size is large.
On extinction time of a generalized endemic chain-binomial model.
Aydogmus, Ozgur
2016-09-01
We considered a chain-binomial epidemic model not conferring immunity after infection. Mean field dynamics of the model has been analyzed and conditions for the existence of a stable endemic equilibrium are determined. The behavior of the chain-binomial process is probabilistically linked to the mean field equation. As a result of this link, we were able to show that the mean extinction time of the epidemic increases at least exponentially as the population size grows. We also present simulation results for the process to validate our analytical findings. Copyright © 2016 Elsevier Inc. All rights reserved.
Binomial confidence intervals for testing non-inferiority or superiority: a practitioner's dilemma.
Pradhan, Vivek; Evans, John C; Banerjee, Tathagata
2016-08-01
In testing for non-inferiority or superiority in a single arm study, the confidence interval of a single binomial proportion is frequently used. A number of such intervals are proposed in the literature and implemented in standard software packages. Unfortunately, use of different intervals leads to conflicting conclusions. Practitioners thus face a serious dilemma in deciding which one to depend on. Is there a way to resolve this dilemma? We address this question by investigating the performances of ten commonly used intervals of a single binomial proportion, in the light of two criteria, viz., coverage and expected length of the interval. © The Author(s) 2013.
DEFF Research Database (Denmark)
Fitzenberger, Bernd; Wilke, Ralf Andreas
2015-01-01
if the mean regression model does not. We provide a short informal introduction into the principle of quantile regression which includes an illustrative application from empirical labor market research. This is followed by briefly sketching the underlying statistical model for linear quantile regression based......Quantile regression is emerging as a popular statistical approach, which complements the estimation of conditional mean models. While the latter only focuses on one aspect of the conditional distribution of the dependent variable, the mean, quantile regression provides more detailed insights...... by modeling conditional quantiles. Quantile regression can therefore detect whether the partial effect of a regressor on the conditional quantiles is the same for all quantiles or differs across quantiles. Quantile regression can provide evidence for a statistical relationship between two variables even...
Introduction to regression graphics
Cook, R Dennis
2009-01-01
Covers the use of dynamic and interactive computer graphics in linear regression analysis, focusing on analytical graphics. Features new techniques like plot rotation. The authors have composed their own regression code, using Xlisp-Stat language called R-code, which is a nearly complete system for linear regression analysis and can be utilized as the main computer program in a linear regression course. The accompanying disks, for both Macintosh and Windows computers, contain the R-code and Xlisp-Stat. An Instructor's Manual presenting detailed solutions to all the problems in the book is ava
Alternative Methods of Regression
Birkes, David
2011-01-01
Of related interest. Nonlinear Regression Analysis and its Applications Douglas M. Bates and Donald G. Watts ".an extraordinary presentation of concepts and methods concerning the use and analysis of nonlinear regression models.highly recommend[ed].for anyone needing to use and/or understand issues concerning the analysis of nonlinear regression models." --Technometrics This book provides a balance between theory and practice supported by extensive displays of instructive geometrical constructs. Numerous in-depth case studies illustrate the use of nonlinear regression analysis--with all data s
A binomial random sum of present value models in investment analysis
Βουδούρη, Αγγελική; Ντζιαχρήστος, Ευάγγελος
1997-01-01
Stochastic present value models have been widely adopted in financial theory and practice and play a very important role in capital budgeting and profit planning. The purpose of this paper is to introduce a binomial random sum of stochastic present value models and offer an application in investment analysis.
Justin S. Crotteau; Martin W. Ritchie; J. Morgan. Varner
2014-01-01
Many western USA fire regimes are typified by mixed-severity fire, which compounds the variability inherent to natural regeneration densities in associated forests. Tree regeneration data are often discrete and nonnegative; accordingly, we fit a series of Poisson and negative binomial variation models to conifer seedling counts across four distinct burn severities and...
Topology of unitary groups and the prime orders of binomial coefficients
Duan, HaiBao; Lin, XianZu
2017-09-01
Let $c:SU(n)\\rightarrow PSU(n)=SU(n)/\\mathbb{Z}_{n}$ be the quotient map of the special unitary group $SU(n)$ by its center subgroup $\\mathbb{Z}_{n}$. We determine the induced homomorphism $c^{\\ast}:$ $H^{\\ast}(PSU(n))\\rightarrow H^{\\ast}(SU(n))$ on cohomologies by computing with the prime orders of binomial coefficients
A Bayesian Approach to Functional Mixed Effect Modeling for Longitudinal Data with Binomial Outcomes
Kliethermes, Stephanie; Oleson, Jacob
2014-01-01
Longitudinal growth patterns are routinely seen in medical studies where individual and population growth is followed over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear, quadratic); however, these relationships may not accurately capture growth over time. Functional mixed effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well-developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible and thus estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. PMID:24723495
A mixed-binomial model for Likert-type personality measures.
Allik, Jüri
2014-01-01
Personality measurement is based on the idea that values on an unobservable latent variable determine the distribution of answers on a manifest response scale. Typically, it is assumed in the Item Response Theory (IRT) that latent variables are related to the observed responses through continuous normal or logistic functions, determining the probability with which one of the ordered response alternatives on a Likert-scale item is chosen. Based on an analysis of 1731 self- and other-rated responses on the 240 NEO PI-3 questionnaire items, it was proposed that a viable alternative is a finite number of latent events which are related to manifest responses through a binomial function which has only one parameter-the probability with which a given statement is approved. For the majority of items, the best fit was obtained with a mixed-binomial distribution, which assumes two different subpopulations who endorse items with two different probabilities. It was shown that the fit of the binomial IRT model can be improved by assuming that about 10% of random noise is contained in the answers and by taking into account response biases toward one of the response categories. It was concluded that the binomial response model for the measurement of personality traits may be a workable alternative to the more habitual normal and logistic IRT models.
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model
Kim, Kyung Yong; Lee, Won-Chan
2018-01-01
Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Binomial Coefficients Modulo a Prime--A Visualization Approach to Undergraduate Research
Bardzell, Michael; Poimenidou, Eirini
2011-01-01
In this article we present, as a case study, results of undergraduate research involving binomial coefficients modulo a prime "p." We will discuss how undergraduates were involved in the project, even with a minimal mathematical background beforehand. There are two main avenues of exploration described to discover these binomial…
Computational results on the compound binomial risk model with nonhomogeneous claim occurrences
Tuncel, A.; Tank, F.
2013-01-01
The aim of this paper is to give a recursive formula for non-ruin (survival) probability when the claim occurrences are nonhomogeneous in the compound binomial risk model. We give recursive formulas for non-ruin (survival) probability and for distribution of the total number of claims under the
Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction
Cohen, A. C.
1971-01-01
A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.
Determining order-up-to levels under periodic review for compound binomial (intermittent) demand
Teunter, R. H.; Syntetos, A. A.; Babai, M. Z.
2010-01-01
We propose a new method for determining order-up-to levels for intermittent demand items in a periodic review system. Contrary to existing methods, we exploit the intermittent character of demand by modelling lead time demand as a compound binomial process. in an extensive numerical study using
Negative binomial distribution for multiplicity distributions in e/sup +/e/sup -/ annihilation
International Nuclear Information System (INIS)
Chew, C.K.; Lim, Y.K.
1986-01-01
The authors show that the negative binomial distribution fits excellently the available charged-particle multiplicity distributions of e/sup +/e/sup -/ annihilation into hadrons at three different energies √s = 14, 22 and 34 GeV
Kliethermes, Stephanie; Oleson, Jacob
2014-08-15
Longitudinal growth patterns are routinely seen in medical studies where individual growth and population growth are followed up over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear and quadratic); however, these relationships may not accurately capture growth over time. Functional mixed-effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible, and thus, estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. Copyright © 2014 John Wiley & Sons, Ltd.
Time evolution of negative binomial optical field in a diffusion channel
International Nuclear Information System (INIS)
Liu Tang-Kun; Wu Pan-Pan; Shan Chuan-Jia; Liu Ji-Bing; Fan Hong-Yi
2015-01-01
We find the time evolution law of a negative binomial optical field in a diffusion channel. We reveal that by adjusting the diffusion parameter, the photon number can be controlled. Therefore, the diffusion process can be considered a quantum controlling scheme through photon addition. (paper)
Joint Analysis of Binomial and Continuous Traits with a Recursive Model
DEFF Research Database (Denmark)
Varona, Louis; Sorensen, Daniel
2014-01-01
This work presents a model for the joint analysis of a binomial and a Gaussian trait using a recursive parametrization that leads to a computationally efficient implementation. The model is illustrated in an analysis of mortality and litter size in two breeds of Danish pigs, Landrace and Yorkshir...
International Nuclear Information System (INIS)
Lo Franco, R.; Compagno, G.; Messina, A.; Napoli, A.
2007-01-01
We introduce the N-photon quantum superposition of two orthogonal generalized binomial states of an electromagnetic field. We then propose, using resonant atom-cavity interactions, nonconditional schemes to generate and reveal such a quantum superposition for the two-photon case in a single-mode high-Q cavity. We finally discuss the implementation of the proposed schemes
Learning Binomial Probability Concepts with Simulation, Random Numbers and a Spreadsheet
Rochowicz, John A., Jr.
2005-01-01
This paper introduces the reader to the concepts of binomial probability and simulation. A spreadsheet is used to illustrate these concepts. Random number generators are great technological tools for demonstrating the concepts of probability. Ideas of approximation, estimation, and mathematical usefulness provide numerous ways of learning…
Raw and Central Moments of Binomial Random Variables via Stirling Numbers
Griffiths, Martin
2013-01-01
We consider here the problem of calculating the moments of binomial random variables. It is shown how formulae for both the raw and the central moments of such random variables may be obtained in a recursive manner utilizing Stirling numbers of the first kind. Suggestions are also provided as to how students might be encouraged to explore this…
NBLDA: negative binomial linear discriminant analysis for RNA-Seq data.
Dong, Kai; Zhao, Hongyu; Tong, Tiejun; Wan, Xiang
2016-09-13
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than or equal to the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated. In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications. We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data
A Mechanistic Beta-Binomial Probability Model for mRNA Sequencing Data.
Smith, Gregory R; Birtwistle, Marc R
2016-01-01
A main application for mRNA sequencing (mRNAseq) is determining lists of differentially-expressed genes (DEGs) between two or more conditions. Several software packages exist to produce DEGs from mRNAseq data, but they typically yield different DEGs, sometimes markedly so. The underlying probability model used to describe mRNAseq data is central to deriving DEGs, and not surprisingly most softwares use different models and assumptions to analyze mRNAseq data. Here, we propose a mechanistic justification to model mRNAseq as a binomial process, with data from technical replicates given by a binomial distribution, and data from biological replicates well-described by a beta-binomial distribution. We demonstrate good agreement of this model with two large datasets. We show that an emergent feature of the beta-binomial distribution, given parameter regimes typical for mRNAseq experiments, is the well-known quadratic polynomial scaling of variance with the mean. The so-called dispersion parameter controls this scaling, and our analysis suggests that the dispersion parameter is a continually decreasing function of the mean, as opposed to current approaches that impose an asymptotic value to the dispersion parameter at moderate mean read counts. We show how this leads to current approaches overestimating variance for moderately to highly expressed genes, which inflates false negative rates. Describing mRNAseq data with a beta-binomial distribution thus may be preferred since its parameters are relatable to the mechanistic underpinnings of the technique and may improve the consistency of DEG analysis across softwares, particularly for moderately to highly expressed genes.
P.M.C. de Boer (Paul); C.M. Hafner (Christian)
2005-01-01
textabstractWe argue in this paper that general ridge (GR) regression implies no major complication compared with simple ridge regression. We introduce a generalization of an explicit GR estimator derived by Hemmerle and by Teekens and de Boer and show that this estimator, which is more
Islam, Mohammad Mafijul; Alam, Morshed; Tariquzaman, Md; Kabir, Mohammad Alamgir; Pervin, Rokhsona; Begum, Munni; Khan, Md Mobarak Hossain
2013-01-08
Malnutrition is one of the principal causes of child mortality in developing countries including Bangladesh. According to our knowledge, most of the available studies, that addressed the issue of malnutrition among under-five children, considered the categorical (dichotomous/polychotomous) outcome variables and applied logistic regression (binary/multinomial) to find their predictors. In this study malnutrition variable (i.e. outcome) is defined as the number of under-five malnourished children in a family, which is a non-negative count variable. The purposes of the study are (i) to demonstrate the applicability of the generalized Poisson regression (GPR) model as an alternative of other statistical methods and (ii) to find some predictors of this outcome variable. The data is extracted from the Bangladesh Demographic and Health Survey (BDHS) 2007. Briefly, this survey employs a nationally representative sample which is based on a two-stage stratified sample of households. A total of 4,460 under-five children is analysed using various statistical techniques namely Chi-square test and GPR model. The GPR model (as compared to the standard Poisson regression and negative Binomial regression) is found to be justified to study the above-mentioned outcome variable because of its under-dispersion (variance variable namely mother's education, father's education, wealth index, sanitation status, source of drinking water, and total number of children ever born to a woman. Consistencies of our findings in light of many other studies suggest that the GPR model is an ideal alternative of other statistical models to analyse the number of under-five malnourished children in a family. Strategies based on significant predictors may improve the nutritional status of children in Bangladesh.
DEFF Research Database (Denmark)
Tan, Qihua; Bathum, L; Christiansen, L
2003-01-01
In this paper, we apply logistic regression models to measure genetic association with human survival for highly polymorphic and pleiotropic genes. By modelling genotype frequency as a function of age, we introduce a logistic regression model with polytomous responses to handle the polymorphic...... situation. Genotype and allele-based parameterization can be used to investigate the modes of gene action and to reduce the number of parameters, so that the power is increased while the amount of multiple testing minimized. A binomial logistic regression model with fractional polynomials is used to capture...
Directory of Open Access Journals (Sweden)
F.M.O. Borges
2003-12-01
que significou pouca influência da metodologia sobre essa medida. A FDN não mostrou ser melhor preditor de EM do que a FB.One experiment was run with broiler chickens, to obtain prediction equations for metabolizable energy (ME based on feedstuffs chemical analyses, and determined ME of wheat grain and its by-products, using four different methodologies. Seven wheat grain by-products were used in five treatments: wheat grain, wheat germ, white wheat flour, dark wheat flour, wheat bran for human use, wheat bran for animal use and rough wheat bran. Based on chemical analyses of crude fiber (CF, ether extract (EE, crude protein (CP, ash (AS and starch (ST of the feeds and the determined values of apparent energy (MEA, true energy (MEV, apparent corrected energy (MEAn and true energy corrected by nitrogen balance (MEVn in five treatments, prediction equations were obtained using the stepwise procedure. CF showed the best relationship with metabolizable energy values, however, this variable alone was not enough for a good estimate of the energy values (R² below 0.80. When EE and CP were included in the equations, R² increased to 0.90 or higher in most estimates. When the equations were calculated with all treatments, the equation for MEA were less precise and R² decreased. When ME data of the traditional or force-feeding methods were used separately, the precision of the equations increases (R² higher than 0.85. For MEV and MEVn values, the best multiple linear equations included CF, EE and CP (R²>0.90, independently of using all experimental data or separating by methodology. The estimates of MEVn values showed high precision and the linear coefficients (a of the equations were similar for all treatments or methodologies. Therefore, it explains the small influence of the different methodologies on this parameter. NDF was not a better predictor of ME than CF.
Obtaining adjusted prevalence ratios from logistic regression models in cross-sectional studies.
Bastos, Leonardo Soares; Oliveira, Raquel de Vasconcellos Carvalhaes de; Velasque, Luciane de Souza
2015-03-01
In the last decades, the use of the epidemiological prevalence ratio (PR) instead of the odds ratio has been debated as a measure of association in cross-sectional studies. This article addresses the main difficulties in the use of statistical models for the calculation of PR: convergence problems, availability of tools and inappropriate assumptions. We implement the direct approach to estimate the PR from binary regression models based on two methods proposed by Wilcosky & Chambless and compare with different methods. We used three examples and compared the crude and adjusted estimate of PR, with the estimates obtained by use of log-binomial, Poisson regression and the prevalence odds ratio (POR). PRs obtained from the direct approach resulted in values close enough to those obtained by log-binomial and Poisson, while the POR overestimated the PR. The model implemented here showed the following advantages: no numerical instability; assumes adequate probability distribution and, is available through the R statistical package.
Multilingual speaker age recognition: regression analyses on the Lwazi corpus
CSIR Research Space (South Africa)
Feld, M
2009-12-01
Full Text Available towards improved understanding of multilingual speech processing, the current contribution investigates how an important para-linguistic aspect of speech, namely speaker age, depends on the language spoken. In particular, the authors study how certain...
Weisberg, Sanford
2013-01-01
Praise for the Third Edition ""...this is an excellent book which could easily be used as a course text...""-International Statistical Institute The Fourth Edition of Applied Linear Regression provides a thorough update of the basic theory and methodology of linear regression modeling. Demonstrating the practical applications of linear regression analysis techniques, the Fourth Edition uses interesting, real-world exercises and examples. Stressing central concepts such as model building, understanding parameters, assessing fit and reliability, and drawing conclusions, the new edition illus
Hosmer, David W; Sturdivant, Rodney X
2013-01-01
A new edition of the definitive guide to logistic regression modeling for health science and other applications This thoroughly expanded Third Edition provides an easily accessible introduction to the logistic regression (LR) model and highlights the power of this model by examining the relationship between a dichotomous outcome and a set of covariables. Applied Logistic Regression, Third Edition emphasizes applications in the health sciences and handpicks topics that best suit the use of modern statistical software. The book provides readers with state-of-
Grégoire, G.
2014-12-01
This chapter deals with the multiple linear regression. That is we investigate the situation where the mean of a variable depends linearly on a set of covariables. The noise is supposed to be gaussian. We develop the least squared method to get the parameter estimators and estimates of their precisions. This leads to design confidence intervals, prediction intervals, global tests, individual tests and more generally tests of submodels defined by linear constraints. Methods for model's choice and variables selection, measures of the quality of the fit, residuals study, diagnostic methods are presented. Finally identification of departures from the model's assumptions and the way to deal with these problems are addressed. A real data set is used to illustrate the methodology with software R. Note that this chapter is intended to serve as a guide for other regression methods, like logistic regression or AFT models and Cox regression.
Kuss, Oliver; Hoyer, Annika; Solms, Alexander
2014-01-15
There are still challenges when meta-analyzing data from studies on diagnostic accuracy. This is mainly due to the bivariate nature of the response where information on sensitivity and specificity must be summarized while accounting for their correlation within a single trial. In this paper, we propose a new statistical model for the meta-analysis for diagnostic accuracy studies. This model uses beta-binomial distributions for the marginal numbers of true positives and true negatives and links these margins by a bivariate copula distribution. The new model comes with all the features of the current standard model, a bivariate logistic regression model with random effects, but has the additional advantages of a closed likelihood function and a larger flexibility for the correlation structure of sensitivity and specificity. In a simulation study, which compares three copula models and two implementations of the standard model, the Plackett and the Gauss copula do rarely perform worse but frequently better than the standard model. We use an example from a meta-analysis to judge the diagnostic accuracy of telomerase (a urinary tumor marker) for the diagnosis of primary bladder cancer for illustration. Copyright © 2013 John Wiley & Sons, Ltd.
Glyph: Symbolic Regression Tools
Quade, Markus; Gout, Julien; Abel, Markus
2018-01-01
We present Glyph - a Python package for genetic programming based symbolic regression. Glyph is designed for usage let by numerical simulations let by real world experiments. For experimentalists, glyph-remote provides a separation of tasks: a ZeroMQ interface splits the genetic programming optimization task from the evaluation of an experimental (or numerical) run. Glyph can be accessed at http://github.com/ambrosys/glyph . Domain experts are be able to employ symbolic regression in their ex...
Pansharpening via sparse regression
Tang, Songze; Xiao, Liang; Liu, Pengfei; Huang, Lili; Zhou, Nan; Xu, Yang
2017-09-01
Pansharpening is an effective way to enhance the spatial resolution of a multispectral (MS) image by fusing it with a provided panchromatic image. Instead of restricting the coding coefficients of low-resolution (LR) and high-resolution (HR) images to be equal, we propose a pansharpening approach via sparse regression in which the relationship between sparse coefficients of HR and LR MS images is modeled by ridge regression and elastic-net regression simultaneously learning the corresponding dictionaries. The compact dictionaries are learned based on the sampled patch pairs from the high- and low-resolution images, which can greatly characterize the structural information of the LR MS and HR MS images. Later, taking the complex relationship between the coding coefficients of LR MS and HR MS images into account, the ridge regression is used to characterize the relationship of intrapatches. The elastic-net regression is employed to describe the relationship of interpatches. Thus, the HR MS image can be almost identically reconstructed by multiplying the HR dictionary and the calculated sparse coefficient vector with the learned regression relationship. The simulated and real experimental results illustrate that the proposed method outperforms several well-known methods, both quantitatively and perceptually.
Wagner, Brandie; Riggs, Paula; Mikulich-Gilbertson, Susan
2015-01-01
It is important to correctly understand the associations among addiction to multiple drugs and between co-occurring substance use and psychiatric disorders. Substance-specific outcomes (e.g. number of days used cannabis) have distributional characteristics which range widely depending on the substance and the sample being evaluated. We recommend a four-part strategy for determining the appropriate distribution for modeling substance use data. We demonstrate this strategy by comparing the model fit and resulting inferences from applying four different distributions to model use of substances that range greatly in the prevalence and frequency of their use. Using Timeline Followback (TLFB) data from a previously-published study, we used negative binomial, beta-binomial and their zero-inflated counterparts to model proportion of days during treatment of cannabis, cigarettes, alcohol, and opioid use. The fit for each distribution was evaluated with statistical model selection criteria, visual plots and a comparison of the resulting inferences. We demonstrate the feasibility and utility of modeling each substance individually and show that no single distribution provides the best fit for all substances. Inferences regarding use of each substance and associations with important clinical variables were not consistent across models and differed by substance. Thus, the distribution chosen for modeling substance use must be carefully selected and evaluated because it may impact the resulting conclusions. Furthermore, the common procedure of aggregating use across different substances may not be ideal.
Possibility and challenges of conversion of current virus species names to Linnaean binomials
Thomas, Postler; Clawson, Anna N.; Amarasinghe, Gaya K.; Basler, Christopher F.; Bavari, Sina; Benko, Maria; Blasdell, Kim R.; Briese, Thomas; Buchmeier, Michael J.; Bukreyev, Alexander; Calisher, Charles H.; Chandran, Kartik; Charrel, Remi; Clegg, Christopher S.; Collins, Peter L.; De la Torre, Juan Carlos; DeRisi, Joseph L.; Dietzgen, Ralf G.; Dolnik, Olga; Durrwald, Ralf; Dye, John M.; Easton, Andrew J.; Emonet, Sebastian; Formenty, Pierre; Fouchier, Ron A. M.; Ghedin, Elodie; Gonzalez, Jean-Paul; Harrach, Balazs; Hewson, Roger; Horie, Masayuki; Jiang, Daohong; Kobinger, Gary P.; Kondo, Hideki; Kropinski, Andrew; Krupovic, Mart; Kurath, Gael; Lamb, Robert A.; Leroy, Eric M.; Lukashevich, Igor S.; Maisner, Andrea; Mushegian, Arcady; Netesov, Sergey V.; Nowotny, Norbert; Patterson, Jean L.; Payne, Susan L.; Paweska, Janusz T.; Peters, C.J.; Radoshitzky, Sheli; Rima, Bertus K.; Romanowski, Victor; Rubbenstroth, Dennis; Sabanadzovic, Sead; Sanfacon, Helene; Salvato , Maria; Schwemmle, Martin; Smither, Sophie J.; Stenglein, Mark; Stone, D.M.; Takada , Ayato; Tesh, Robert B.; Tomonaga, Keizo; Tordo, N.; Towner, Jonathan S.; Vasilakis, Nikos; Volchkov, Victor E.; Jensen, Victoria; Walker, Peter J.; Wang, Lin-Fa; Varsani, Arvind; Whitfield , Anna E.; Zerbini, Francisco Murilo; Kuhn, Jens H.
2017-01-01
Botanical, mycological, zoological, and prokaryotic species names follow the Linnaean format, consisting of an italicized Latinized binomen with a capitalized genus name and a lower case species epithet (e.g., Homo sapiens). Virus species names, however, do not follow a uniform format, and, even when binomial, are not Linnaean in style. In this thought exercise, we attempted to convert all currently official names of species included in the virus family Arenaviridae and the virus order Mononegavirales to Linnaean binomials, and to identify and address associated challenges and concerns. Surprisingly, this endeavor was not as complicated or time-consuming as even the authors of this article expected when conceiving the experiment.
Poisson and negative binomial item count techniques for surveys with sensitive question.
Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin
2017-04-01
Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide closed form variance estimate and confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that it yields accurate parameter estimate and confidence interval.
The option to expand a project: its assessment with the binomial options pricing model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
Full Text Available Traditional methods of investment appraisal, like the Net Present Value, are not able to include the value of the operational flexibility of the project. In this paper, real options, and more specifically the option to expand, are assumed to be included in the project information in addition to the expected cash flows. Thus, to calculate the total value of the project, we are going to apply the methodology of the Net Present Value to the different scenarios derived from the existence of the real option to expand. Taking into account the analogy between real and financial options, the value of including an option to expand is explored by using the binomial options pricing model. In this way, estimating the value of the option to expand is a tool which facilitates the control of the uncertainty element implicit in the project. Keywords: Real options, Option to expand, Binomial options pricing model, Investment project appraisal
A Bayesian non-inferiority test for two independent binomial proportions.
Kawasaki, Yohei; Miyaoka, Etsuo
2013-01-01
In drug development, non-inferiority tests are often employed to determine the difference between two independent binomial proportions. Many test statistics for non-inferiority are based on the frequentist framework. However, research on non-inferiority in the Bayesian framework is limited. In this paper, we suggest a new Bayesian index τ = P(π₁ > π₂-Δ₀|X₁, X₂), where X₁ and X₂ denote binomial random variables for trials n1 and n₂, and parameters π₁ and π₂ , respectively, and the non-inferiority margin is Δ₀> 0. We show two calculation methods for τ, an approximate method that uses normal approximation and an exact method that uses an exact posterior PDF. We compare the approximate probability with the exact probability for τ. Finally, we present the results of actual clinical trials to show the utility of index τ. Copyright © 2013 John Wiley & Sons, Ltd.
Application of a binomial cusum control chart to monitor one drinking water indicator
Directory of Open Access Journals (Sweden)
Elisa Henning
2014-02-01
Full Text Available The aim of this study is to analyze the use of a binomial cumulative sum chart (CUSUM to monitor the presence of total coliforms, biological indicators of quality of water supplies in water treatment processes. The sample series were monthly taken from a water treatment plant and were analyzed from 2007 to 2009. The statistical treatment of the data was performed using GNU R, and routines were created for the approximation of the upper limit of the binomial CUSUM chart. Furthermore, a comparative study was conducted to investigate whether there is a significant difference in sensitivity between the use of CUSUM and the traditional Shewhart chart, the most commonly used chart in process monitoring. The results obtained demonstrate that this study was essential for making the right choice in selecting a chart for the statistical analysis of this process.
Directory of Open Access Journals (Sweden)
Mok Tik
2014-06-01
Full Text Available This study formulates regression of vector data that will enable statistical analysis of various geodetic phenomena such as, polar motion, ocean currents, typhoon/hurricane tracking, crustal deformations, and precursory earthquake signals. The observed vector variable of an event (dependent vector variable is expressed as a function of a number of hypothesized phenomena realized also as vector variables (independent vector variables and/or scalar variables that are likely to impact the dependent vector variable. The proposed representation has the unique property of solving the coefficients of independent vector variables (explanatory variables also as vectors, hence it supersedes multivariate multiple regression models, in which the unknown coefficients are scalar quantities. For the solution, complex numbers are used to rep- resent vector information, and the method of least squares is deployed to estimate the vector model parameters after transforming the complex vector regression model into a real vector regression model through isomorphism. Various operational statistics for testing the predictive significance of the estimated vector parameter coefficients are also derived. A simple numerical example demonstrates the use of the proposed vector regression analysis in modeling typhoon paths.
Nested (inverse) binomial sums and new iterated integrals for massive Feynman diagrams
International Nuclear Information System (INIS)
Ablinger, Jakob; Schneider, Carsten; Bluemlein, Johannes; Raab, Clemens G.
2014-07-01
Nested sums containing binomial coefficients occur in the computation of massive operatormatrix elements. Their associated iterated integrals lead to alphabets including radicals, for which we determined a suitable basis. We discuss algorithms for converting between sum and integral representations, mainly relying on the Mellin transform. To aid the conversion we worked out dedicated rewrite rules, based on which also some general patterns emerging in the process can be obtained.
Study on Emission Measurement of Vehicle on Road Based on Binomial Logit Model
Aly, Sumarni Hamid; Selintung, Mary; Ramli, Muhammad Isran; Sumi, Tomonori
2011-01-01
This research attempts to evaluate emission measurement of on road vehicle. In this regard, the research develops failure probability model of vehicle emission test for passenger car which utilize binomial logit model. The model focuses on failure of CO and HC emission test for gasoline cars category and Opacity emission test for diesel-fuel cars category as dependent variables, while vehicle age, engine size, brand and type of the cars as independent variables. In order to imp...
Directory of Open Access Journals (Sweden)
Patricio Peña-Rehbein
Full Text Available This paper describes the frequency and number of Sphyrion laevigatum in the skin of Genypterus blacodes, an important economic resource in Chile. The analysis of a spatial distribution model indicated that the parasites tended to cluster. Variations in the number of parasites per host could be described by a negative binomial distribution. The maximum number of parasites observed per host was two.
Computation of Clebsch-Gordan and Gaunt coefficients using binomial coefficients
International Nuclear Information System (INIS)
Guseinov, I.I.; Oezmen, A.; Atav, Ue
1995-01-01
Using binomial coefficients the Clebsch-Gordan and Gaunt coefficients were calculated for extremely large quantum numbers. The main advantage of this approach is directly calculating these coefficients, instead of using recursion relations. Accuracy of the results is quite high for quantum numbers l 1 , and l 2 up to 100. Despite direct calculation, the CPU times are found comparable with those given in the related literature. 11 refs., 1 fig., 2 tabs
Negative binomial multiplicity distributions, a new empirical law for high energy collisions
International Nuclear Information System (INIS)
Van Hove, L.; Giovannini, A.
1987-01-01
For a variety of high energy hadron production reactions, recent experiments have confirmed the findings of the UA5 Collaboration that charged particle multiplicities in central (pseudo) rapidity intervals and in full phase space obey negative binomial (NB) distributions. The authors discuss the meaning of this new empirical law on the basis of new data and they show that they support the interpretation of the NB distributions in terms of a cascading mechanism of hardron production
Generalized harmonic, cyclotomic, and binomial sums, their polylogarithms and special numbers
Energy Technology Data Exchange (ETDEWEB)
Ablinger, J.; Schneider, C. [Johannes Kepler Univ., Linz (Austria). Research Inst. for Symbolic Computation (RISC); Bluemlein, J. [Deutsches Elektronen-Synchrotron (DESY), Zeuthen (Germany)
2013-10-15
A survey is given on mathematical structures which emerge in multi-loop Feynman diagrams. These are multiply nested sums, and, associated to them by an inverse Mellin transform, specific iterated integrals. Both classes lead to sets of special numbers. Starting with harmonic sums and polylogarithms we discuss recent extensions of these quantities as cyclotomic, generalized (cyclotomic), and binomially weighted sums, associated iterated integrals and special constants and their relations.
Estimasi Harga Multi-State European Call Option Menggunakan Model Binomial
Directory of Open Access Journals (Sweden)
Mila Kurniawaty, Endah Rokhmati
2011-05-01
Full Text Available Option merupakan kontrak yang memberikan hak kepada pemiliknya untuk membeli (call option atau menjual (put option sejumlah aset dasar tertentu (underlying asset dengan harga tertentu (strike price dalam jangka waktu tertentu (sebelum atau saat expiration date. Perkembangan option belakangan ini memunculkan banyak model pricing untuk mengestimasi harga option, salah satu model yang digunakan adalah formula Black-Scholes. Multi-state option merupakan sebuah option yang payoff-nya didasarkan pada dua atau lebih aset dasar. Ada beberapa metode yang dapat digunakan dalam mengestimasi harga call option, salah satunya masyarakat finance sering menggunakan model binomial untuk estimasi berbagai model option yang lebih luas seperti multi-state call option. Selanjutnya, dari hasil estimasi call option dengan model binomial didapatkan formula terbaik berdasarkan penghitungan eror dengan mean square error. Dari penghitungan eror didapatkan eror rata-rata dari masing-masing formula pada model binomial. Hasil eror rata-rata menunjukkan bahwa estimasi menggunakan formula 5 titik lebih baik dari pada estimasi menggunakan formula 4 titik.
Discrimination of numerical proportions: A comparison of binomial and Gaussian models.
Raidvee, Aire; Lember, Jüri; Allik, Jüri
2017-01-01
Observers discriminated the numerical proportion of two sets of elements (N = 9, 13, 33, and 65) that differed either by color or orientation. According to the standard Thurstonian approach, the accuracy of proportion discrimination is determined by irreducible noise in the nervous system that stochastically transforms the number of presented visual elements onto a continuum of psychological states representing numerosity. As an alternative to this customary approach, we propose a Thurstonian-binomial model, which assumes discrete perceptual states, each of which is associated with a certain visual element. It is shown that the probability β with which each visual element can be noticed and registered by the perceptual system can explain data of numerical proportion discrimination at least as well as the continuous Thurstonian-Gaussian model, and better, if the greater parsimony of the Thurstonian-binomial model is taken into account using AIC model selection. We conclude that Gaussian and binomial models represent two different fundamental principles-internal noise vs. using only a fraction of available information-which are both plausible descriptions of visual perception.
Genome-enabled predictions for binomial traits in sugar beet populations.
Biscarini, Filippo; Stevanato, Piergiorgio; Broccanello, Chiara; Stella, Alessandra; Saccomani, Massimo
2014-07-22
Genomic information can be used to predict not only continuous but also categorical (e.g. binomial) traits. Several traits of interest in human medicine and agriculture present a discrete distribution of phenotypes (e.g. disease status). Root vigor in sugar beet (B. vulgaris) is an example of binomial trait of agronomic importance. In this paper, a panel of 192 SNPs (single nucleotide polymorphisms) was used to genotype 124 sugar beet individual plants from 18 lines, and to classify them as showing "high" or "low" root vigor. A threshold model was used to fit the relationship between binomial root vigor and SNP genotypes, through the matrix of genomic relationships between individuals in a genomic BLUP (G-BLUP) approach. From a 5-fold cross-validation scheme, 500 testing subsets were generated. The estimated average cross-validation error rate was 0.000731 (0.073%). Only 9 out of 12326 test observations (500 replicates for an average test set size of 24.65) were misclassified. The estimated prediction accuracy was quite high. Such accurate predictions may be related to the high estimated heritability for root vigor (0.783) and to the few genes with large effect underlying the trait. Despite the sparse SNP panel, there was sufficient within-scaffold LD where SNPs with large effect on root vigor were located to allow for genome-enabled predictions to work.
Tollerup, Kris E; Marcum, Daniel; Wilson, Rob; Godfrey, Larry
2013-08-01
The two-spotted spider mite, Tetranychus urticae Koch, is an economic pest on peppermint [Mentha x piperita (L.), 'Black Mitcham'] grown in California. A sampling plan for T. urticae was developed under Pacific Northwest conditions in the early 1980s and has been used by California growers since approximately 1998. This sampling plan, however, is cumbersome and a poor predictor of T. urticae densities in California. Between June and August, the numbers of immature and adult T. urticae were counted on leaves at three commercial peppermint fields (sites) in 2010 and a single field in 2011. In each of seven locations per site, 45 leaves were sampled, that is, 9 leaves per five stems. Leaf samples were stratified by collecting three leaves from the top, middle, and bottom strata per stem. The on-plant distribution of T. urticae did not significantly differ among the stem strata through the growing season. Binomial and enumerative sampling plans were developed using generic Taylor's power law coefficient values. The best fit of our data for binomial sampling occurred using a tally threshold of T = 0. The optimum number of leaves required for T urticae at the critical density of five mites per leaf was 20 for the binomial and 23 for the enumerative sampling plans, respectively. Sampling models were validated using Resampling for Validation of Sampling Plan Software.
O'Donnell, Katherine M; Thompson, Frank R; Semlitsch, Raymond D
2015-01-01
Detectability of individual animals is highly variable and nearly always binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model's potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3-5 surveys each spring and fall 2010-2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and protocols that maximize species availability and conditional detection probability to increase population parameter estimate reliability.
Analysis of railroad tank car releases using a generalized binomial model.
Liu, Xiang; Hong, Yili
2015-11-01
The United States is experiencing an unprecedented boom in shale oil production, leading to a dramatic growth in petroleum crude oil traffic by rail. In 2014, U.S. railroads carried over 500,000 tank carloads of petroleum crude oil, up from 9500 in 2008 (a 5300% increase). In light of continual growth in crude oil by rail, there is an urgent national need to manage this emerging risk. This need has been underscored in the wake of several recent crude oil release incidents. In contrast to highway transport, which usually involves a tank trailer, a crude oil train can carry a large number of tank cars, having the potential for a large, multiple-tank-car release incident. Previous studies exclusively assumed that railroad tank car releases in the same train accident are mutually independent, thereby estimating the number of tank cars releasing given the total number of tank cars derailed based on a binomial model. This paper specifically accounts for dependent tank car releases within a train accident. We estimate the number of tank cars releasing given the number of tank cars derailed based on a generalized binomial model. The generalized binomial model provides a significantly better description for the empirical tank car accident data through our numerical case study. This research aims to provide a new methodology and new insights regarding the further development of risk management strategies for improving railroad crude oil transportation safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
DEFF Research Database (Denmark)
Bache, Stefan Holst
A new and alternative quantile regression estimator is developed and it is shown that the estimator is root n-consistent and asymptotically normal. The estimator is based on a minimax ‘deviance function’ and has asymptotically equivalent properties to the usual quantile regression estimator. It is......, however, a different and therefore new estimator. It allows for both linear- and nonlinear model specifications. A simple algorithm for computing the estimates is proposed. It seems to work quite well in practice but whether it has theoretical justification is still an open question....
Practical Session: Logistic Regression
Clausel, M.; Grégoire, G.
2014-12-01
An exercise is proposed to illustrate the logistic regression. One investigates the different risk factors in the apparition of coronary heart disease. It has been proposed in Chapter 5 of the book of D.G. Kleinbaum and M. Klein, "Logistic Regression", Statistics for Biology and Health, Springer Science Business Media, LLC (2010) and also by D. Chessel and A.B. Dufour in Lyon 1 (see Sect. 6 of http://pbil.univ-lyon1.fr/R/pdf/tdr341.pdf). This example is based on data given in the file evans.txt coming from http://www.sph.emory.edu/dkleinb/logreg3.htm#data.
Prediction, Regression and Critical Realism
DEFF Research Database (Denmark)
Næss, Petter
2004-01-01
This paper considers the possibility of prediction in land use planning, and the use of statistical research methods in analyses of relationships between urban form and travel behaviour. Influential writers within the tradition of critical realism reject the possibility of predicting social...... phenomena. This position is fundamentally problematic to public planning. Without at least some ability to predict the likely consequences of different proposals, the justification for public sector intervention into market mechanisms will be frail. Statistical methods like regression analyses are commonly...... seen as necessary in order to identify aggregate level effects of policy measures, but are questioned by many advocates of critical realist ontology. Using research into the relationship between urban structure and travel as an example, the paper discusses relevant research methods and the kinds...
Brenner, Tom; Chen, Johnny; Stait-Gardner, Tim; Zheng, Gang; Matsukawa, Shingo; Price, William S.
2018-03-01
A new family of binomial-like inversion sequences, named jump-and-return sandwiches (JRS), has been developed by inserting a binomial-like sequence into a standard jump-and-return sequence, discovered through use of a stochastic Genetic Algorithm optimisation. Compared to currently used binomial-like inversion sequences (e.g., 3-9-19 and W5), the new sequences afford wider inversion bands and narrower non-inversion bands with an equal number of pulses. As an example, two jump-and-return sandwich 10-pulse sequences achieved 95% inversion at offsets corresponding to 9.4% and 10.3% of the non-inversion band spacing, compared to 14.7% for the binomial-like W5 inversion sequence, i.e., they afforded non-inversion bands about two thirds the width of the W5 non-inversion band.
Software Regression Verification
2013-12-11
of recursive procedures. Acta Informatica , 45(6):403 – 439, 2008. [GS11] Benny Godlin and Ofer Strichman. Regression verifica- tion. Technical Report...functions. Therefore, we need to rede - fine m-term. – Mutual termination. If either function f or function f ′ (or both) is non- deterministic, then their
Multiple linear regression analysis
Edwards, T. R.
1980-01-01
Program rapidly selects best-suited set of coefficients. User supplies only vectors of independent and dependent data and specifies confidence level required. Program uses stepwise statistical procedure for relating minimal set of variables to set of observations; final regression contains only most statistically significant coefficients. Program is written in FORTRAN IV for batch execution and has been implemented on NOVA 1200.
Mechanisms of neuroblastoma regression
Brodeur, Garrett M.; Bagatell, Rochelle
2014-01-01
Recent genomic and biological studies of neuroblastoma have shed light on the dramatic heterogeneity in the clinical behaviour of this disease, which spans from spontaneous regression or differentiation in some patients, to relentless disease progression in others, despite intensive multimodality therapy. This evidence also suggests several possible mechanisms to explain the phenomena of spontaneous regression in neuroblastomas, including neurotrophin deprivation, humoral or cellular immunity, loss of telomerase activity and alterations in epigenetic regulation. A better understanding of the mechanisms of spontaneous regression might help to identify optimal therapeutic approaches for patients with these tumours. Currently, the most druggable mechanism is the delayed activation of developmentally programmed cell death regulated by the tropomyosin receptor kinase A pathway. Indeed, targeted therapy aimed at inhibiting neurotrophin receptors might be used in lieu of conventional chemotherapy or radiation in infants with biologically favourable tumours that require treatment. Alternative approaches consist of breaking immune tolerance to tumour antigens or activating neurotrophin receptor pathways to induce neuronal differentiation. These approaches are likely to be most effective against biologically favourable tumours, but they might also provide insights into treatment of biologically unfavourable tumours. We describe the different mechanisms of spontaneous neuroblastoma regression and the consequent therapeutic approaches. PMID:25331179
Bounded Gaussian process regression
DEFF Research Database (Denmark)
Jensen, Bjørn Sand; Nielsen, Jens Brehm; Larsen, Jan
2013-01-01
We extend the Gaussian process (GP) framework for bounded regression by introducing two bounded likelihood functions that model the noise on the dependent variable explicitly. This is fundamentally different from the implicit noise assumption in the previously suggested warped GP framework. We...
Bayesian logistic regression analysis
Van Erp, H.R.N.; Van Gelder, P.H.A.J.M.
2012-01-01
In this paper we present a Bayesian logistic regression analysis. It is found that if one wishes to derive the posterior distribution of the probability of some event, then, together with the traditional Bayes Theorem and the integrating out of nuissance parameters, the Jacobian transformation is an
Lee, JuHee; Park, Chang Gi; Choi, Moonki
2016-05-01
This study was conducted to identify risk factors that influence regular exercise among patients with Parkinson's disease in Korea. Parkinson's disease is prevalent in the elderly, and may lead to a sedentary lifestyle. Exercise can enhance physical and psychological health. However, patients with Parkinson's disease are less likely to exercise than are other populations due to physical disability. A secondary data analysis and cross-sectional descriptive study were conducted. A convenience sample of 106 patients with Parkinson's disease was recruited at an outpatient neurology clinic of a tertiary hospital in Korea. Demographic characteristics, disease-related characteristics (including disease duration and motor symptoms), self-efficacy for exercise, balance, and exercise level were investigated. Negative binomial regression and zero-inflated negative binomial regression for exercise count data were utilized to determine factors involved in exercise. The mean age of participants was 65.85 ± 8.77 years, and the mean duration of Parkinson's disease was 7.23 ± 6.02 years. Most participants indicated that they engaged in regular exercise (80.19%). Approximately half of participants exercised at least 5 days per week for 30 min, as recommended (51.9%). Motor symptoms were a significant predictor of exercise in the count model, and self-efficacy for exercise was a significant predictor of exercise in the zero model. Severity of motor symptoms was related to frequency of exercise. Self-efficacy contributed to the probability of exercise. Symptom management and improvement of self-efficacy for exercise are important to encourage regular exercise in patients with Parkinson's disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Bakbergenuly, Ilyas; Morgenthaler, Stephan
2016-01-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group‐level studies or in meta‐analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log‐odds and arcsine transformations of the estimated probability p^, both for single‐group studies and in combining results from several groups or studies in meta‐analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta‐analysis and result in abysmal coverage of the combined effect for large K. We also propose bias‐correction for the arcsine transformation. Our simulations demonstrate that this bias‐correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta‐analyses of prevalence. PMID:27192062
Bakbergenuly, Ilyas; Kulinskaya, Elena; Morgenthaler, Stephan
2016-07-01
We study bias arising as a result of nonlinear transformations of random variables in random or mixed effects models and its effect on inference in group-level studies or in meta-analysis. The findings are illustrated on the example of overdispersed binomial distributions, where we demonstrate considerable biases arising from standard log-odds and arcsine transformations of the estimated probability p̂, both for single-group studies and in combining results from several groups or studies in meta-analysis. Our simulations confirm that these biases are linear in ρ, for small values of ρ, the intracluster correlation coefficient. These biases do not depend on the sample sizes or the number of studies K in a meta-analysis and result in abysmal coverage of the combined effect for large K. We also propose bias-correction for the arcsine transformation. Our simulations demonstrate that this bias-correction works well for small values of the intraclass correlation. The methods are applied to two examples of meta-analyses of prevalence. © 2016 The Authors. Biometrical Journal Published by Wiley-VCH Verlag GmbH & Co. KGaA.
Archer, A.W.; Maples, C.G.
1989-01-01
Numerous departures from ideal relationships are revealed by Monte Carlo simulations of widely accepted binomial coefficients. For example, simulations incorporating varying levels of matrix sparseness (presence of zeros indicating lack of data) and computation of expected values reveal that not only are all common coefficients influenced by zero data, but also that some coefficients do not discriminate between sparse or dense matrices (few zero data). Such coefficients computationally merge mutually shared and mutually absent information and do not exploit all the information incorporated within the standard 2 ?? 2 contingency table; therefore, the commonly used formulae for such coefficients are more complicated than the actual range of values produced. Other coefficients do differentiate between mutual presences and absences; however, a number of these coefficients do not demonstrate a linear relationship to matrix sparseness. Finally, simulations using nonrandom matrices with known degrees of row-by-row similarities signify that several coefficients either do not display a reasonable range of values or are nonlinear with respect to known relationships within the data. Analyses with nonrandom matrices yield clues as to the utility of certain coefficients for specific applications. For example, coefficients such as Jaccard, Dice, and Baroni-Urbani and Buser are useful if correction of sparseness is desired, whereas the Russell-Rao coefficient is useful when sparseness correction is not desired. ?? 1989 International Association for Mathematical Geology.
Subset selection in regression
Miller, Alan
2002-01-01
Originally published in 1990, the first edition of Subset Selection in Regression filled a significant gap in the literature, and its critical and popular success has continued for more than a decade. Thoroughly revised to reflect progress in theory, methods, and computing power, the second edition promises to continue that tradition. The author has thoroughly updated each chapter, incorporated new material on recent developments, and included more examples and references. New in the Second Edition:A separate chapter on Bayesian methodsComplete revision of the chapter on estimationA major example from the field of near infrared spectroscopyMore emphasis on cross-validationGreater focus on bootstrappingStochastic algorithms for finding good subsets from large numbers of predictors when an exhaustive search is not feasible Software available on the Internet for implementing many of the algorithms presentedMore examplesSubset Selection in Regression, Second Edition remains dedicated to the techniques for fitting...
Ridge Regression Signal Processing
Kuhl, Mark R.
1990-01-01
The introduction of the Global Positioning System (GPS) into the National Airspace System (NAS) necessitates the development of Receiver Autonomous Integrity Monitoring (RAIM) techniques. In order to guarantee a certain level of integrity, a thorough understanding of modern estimation techniques applied to navigational problems is required. The extended Kalman filter (EKF) is derived and analyzed under poor geometry conditions. It was found that the performance of the EKF is difficult to predict, since the EKF is designed for a Gaussian environment. A novel approach is implemented which incorporates ridge regression to explain the behavior of an EKF in the presence of dynamics under poor geometry conditions. The basic principles of ridge regression theory are presented, followed by the derivation of a linearized recursive ridge estimator. Computer simulations are performed to confirm the underlying theory and to provide a comparative analysis of the EKF and the recursive ridge estimator.
Regression in organizational leadership.
Kernberg, O F
1979-02-01
The choice of good leaders is a major task for all organizations. Inforamtion regarding the prospective administrator's personality should complement questions regarding his previous experience, his general conceptual skills, his technical knowledge, and the specific skills in the area for which he is being selected. The growing psychoanalytic knowledge about the crucial importance of internal, in contrast to external, object relations, and about the mutual relationships of regression in individuals and in groups, constitutes an important practical tool for the selection of leaders.
Classification and regression trees
Breiman, Leo; Olshen, Richard A; Stone, Charles J
1984-01-01
The methodology used to construct tree structured rules is the focus of this monograph. Unlike many other statistical procedures, which moved from pencil and paper to calculators, this text's use of trees was unthinkable before computers. Both the practical and theoretical sides have been developed in the authors' study of tree methods. Classification and Regression Trees reflects these two sides, covering the use of trees as a data analysis method, and in a more mathematical framework, proving some of their fundamental properties.
DEFF Research Database (Denmark)
Hansen, Henrik; Tarp, Finn
2001-01-01
. There are, however, decreasing returns to aid, and the estimated effectiveness of aid is highly sensitive to the choice of estimator and the set of control variables. When investment and human capital are controlled for, no positive effect of aid is found. Yet, aid continues to impact on growth via...... investment. We conclude by stressing the need for more theoretical work before this kind of cross-country regressions are used for policy purposes....
Hilbe, Joseph M
2009-01-01
This book really does cover everything you ever wanted to know about logistic regression … with updates available on the author's website. Hilbe, a former national athletics champion, philosopher, and expert in astronomy, is a master at explaining statistical concepts and methods. Readers familiar with his other expository work will know what to expect-great clarity.The book provides considerable detail about all facets of logistic regression. No step of an argument is omitted so that the book will meet the needs of the reader who likes to see everything spelt out, while a person familiar with some of the topics has the option to skip "obvious" sections. The material has been thoroughly road-tested through classroom and web-based teaching. … The focus is on helping the reader to learn and understand logistic regression. The audience is not just students meeting the topic for the first time, but also experienced users. I believe the book really does meet the author's goal … .-Annette J. Dobson, Biometric...
DEFF Research Database (Denmark)
Ozenne, Brice; Sørensen, Anne Lyngholm; Scheike, Thomas
2017-01-01
In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interfac...... functionals. The software presented here is implemented in the riskRegression package.......In the presence of competing risks a prediction of the time-dynamic absolute risk of an event can be based on cause-specific Cox regression models for the event and the competing risks (Benichou and Gail, 1990). We present computationally fast and memory optimized C++ functions with an R interface...... for predicting the covariate specific absolute risks, their confidence intervals, and their confidence bands based on right censored time to event data. We provide explicit formulas for our implementation of the estimator of the (stratified) baseline hazard function in the presence of tied event times. As a by...
Brian S. Cade; Barry R. Noon; Rick D. Scherer; John J. Keane
2017-01-01
Counts of avian fledglings, nestlings, or clutch size that are bounded below by zero and above by some small integer form a discrete random variable distribution that is not approximated well by conventional parametric count distributions such as the Poisson or negative binomial. We developed a logistic quantile regression model to provide estimates of the empirical...
Directory of Open Access Journals (Sweden)
Kai Wu
2014-01-01
Full Text Available Background: In surgeries of closed calcaneal fractures, the lateral L-shaped incision is usually adopted. Undesirable post-operative healing of the incision is a common complication. In this retrospective study, controllable risk factors of incision complications after closed calcaneal fracture surgery through a lateral L-shaped incision are discussed and the effectiveness of clinical intervention is assessed. Materials and Methods: A review of medical records was conducted of 209 patients (239 calcaneal fractures surgically treated from June 2005 to October 2012. Univariate analyses were performed of seven controllable factors that might influence complications associated with the surgical incision. Binomial multiple logistic regression analysis was performed to determine factors of statistical significance. Results: Twenty-one fractures (8.79% involved surgical incision complications, including 8 (3.35% cases of wound dehiscence, 7 (2.93% of flap margin necrosis, 5 (2.09% of hematoma, and 1 (0.42% of osteomyelitis. Five factors were statistically significant : t0 he time from injury to surgery, operative duration, post-operative drainage, retraction of skin flap, bone grafting, and patients′ smoking habits. The results of multivariate analyses showed that surgeries performed within 7 days after fracture, operative time > 1.5 h, no drainage after surgery, static skin distraction, and patient smoking were risk factors for calcaneal incision complications. The post-operative duration of antibiotics and bone grafting made no significant difference. Conclusion: Complications after calcaneal surgeries may be reduced by postponing the surgery at least 7 days after fracture, shortening the time in surgery, implementing post-operative drainage, retracting skin flaps gently and for as short a time as possible, and prohibiting smoking.
Ridge Regression: A Regression Procedure for Analyzing correlated Independent Variables
Rakow, Ernest A.
1978-01-01
Ridge regression is a technique used to ameliorate the problem of highly correlated independent variables in multiple regression analysis. This paper explains the fundamentals of ridge regression and illustrates its use. (JKS)
[Ordinal logistic regression in epidemiological studies].
Abreu, Mery Natali Silva; Siqueira, Arminda Lucia; Caiaffa, Waleska Teixeira
2009-02-01
Ordinal logistic regression models have been developed for analysis of epidemiological studies. However, the adequacy of such models for adjustment has so far received little attention. In this article, we reviewed the most important ordinal regression models and common approaches used to verify goodness-of-fit, using R or Stata programs. We performed formal and graphical analyses to compare ordinal models using data sets on health conditions from the National Health and Nutrition Examination Survey (NHANES II).
Beta-binomial model for meta-analysis of odds ratios.
Bakbergenuly, Ilyas; Kulinskaya, Elena
2017-05-20
In meta-analysis of odds ratios (ORs), heterogeneity between the studies is usually modelled via the additive random effects model (REM). An alternative, multiplicative REM for ORs uses overdispersion. The multiplicative factor in this overdispersion model (ODM) can be interpreted as an intra-class correlation (ICC) parameter. This model naturally arises when the probabilities of an event in one or both arms of a comparative study are themselves beta-distributed, resulting in beta-binomial distributions. We propose two new estimators of the ICC for meta-analysis in this setting. One is based on the inverted Breslow-Day test, and the other on the improved gamma approximation by Kulinskaya and Dollinger (2015, p. 26) to the distribution of Cochran's Q. The performance of these and several other estimators of ICC on bias and coverage is studied by simulation. Additionally, the Mantel-Haenszel approach to estimation of ORs is extended to the beta-binomial model, and we study performance of various ICC estimators when used in the Mantel-Haenszel or the inverse-variance method to combine ORs in meta-analysis. The results of the simulations show that the improved gamma-based estimator of ICC is superior for small sample sizes, and the Breslow-Day-based estimator is the best for n⩾100. The Mantel-Haenszel-based estimator of OR is very biased and is not recommended. The inverse-variance approach is also somewhat biased for ORs≠1, but this bias is not very large in practical settings. Developed methods and R programs, provided in the Web Appendix, make the beta-binomial model a feasible alternative to the standard REM for meta-analysis of ORs. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
DEFF Research Database (Denmark)
Bordacconi, Mats Joe; Larsen, Martin Vinæs
2014-01-01
Humans are fundamentally primed for making causal attributions based on correlations. This implies that researchers must be careful to present their results in a manner that inhibits unwarranted causal attribution. In this paper, we present the results of an experiment that suggests regression...... more likely. Our experiment drew on a sample of 235 university students from three different social science degree programs (political science, sociology and economics), all of whom had received substantial training in statistics. The subjects were asked to compare and evaluate the validity...
Adaptive metric kernel regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
2000-01-01
Kernel smoothing is a widely used non-parametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this contribution, we propose an algorithm that adapts the input metric used in multivariate...... regression by minimising a cross-validation estimate of the generalisation error. This allows to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms...
Adaptive Metric Kernel Regression
DEFF Research Database (Denmark)
Goutte, Cyril; Larsen, Jan
1998-01-01
Kernel smoothing is a widely used nonparametric pattern recognition technique. By nature, it suffers from the curse of dimensionality and is usually difficult to apply to high input dimensions. In this paper, we propose an algorithm that adapts the input metric used in multivariate regression...... by minimising a cross-validation estimate of the generalisation error. This allows one to automatically adjust the importance of different dimensions. The improvement in terms of modelling performance is illustrated on a variable selection task where the adaptive metric kernel clearly outperforms the standard...
Binomial tree method for pricing a regime-switching volatility stock loans
Putri, Endah R. M.; Zamani, Muhammad S.; Utomo, Daryono B.
2018-03-01
Binomial model with regime switching may represents the price of stock loan which follows the stochastic process. Stock loan is one of alternative that appeal investors to get the liquidity without selling the stock. The stock loan mechanism resembles that of American call option when someone can exercise any time during the contract period. From the resembles both of mechanism, determination price of stock loan can be interpreted from the model of American call option. The simulation result shows the behavior of the price of stock loan under a regime-switching with respect to various interest rate and maturity.
CUANDO FALLA EL SUPUESTO DE HOMOCEDASTICIDAD EN VARIABLES CON DISTRIBUCIÓN BINOMIAL
Edison Ramiro Vásquez; Alberto Caballero Núñez
2011-01-01
Se utilizó el proceso de Simulación de Monte Carlopara generar poblaciones de variables aleatorias con distribuciónBinomial con varianzas homogéneas y heterogéneas; para cinco,10 y 30 observaciones por unidad experimental (n) y probabi-lidad de éxito del evento de 0,10, 0,20, ¿,0,90(p). Se conformaronexperimentos en Diseño Bloques al Azar con tres, cinco y nuevetratamientos (t); cuatro y ocho réplicas(r); para cada combinaciónt-r-n, se generaron 100 experimentos. A modo de disponer deun refer...
Constructing Binomial Trees Via Random Maps for Analysis of Financial Assets
Directory of Open Access Journals (Sweden)
Antonio Airton Carneiro de Freitas
2010-04-01
Full Text Available Random maps can be constructed from a priori knowledge of the financial assets. It is also addressed the reverse problem, i.e. from a function of an empirical stationary probability density function we set up a random map that naturally leads to an implied binomial tree, allowing the adjustment of models, including the ability to incorporate jumps. An applica- tion related to the options market is presented. It is emphasized that the quality of the model to incorporate a priori knowledge of the financial asset may be affected, for example, by the skewed vision of the analyst.
Directory of Open Access Journals (Sweden)
M. Subbiah
2017-01-01
Full Text Available Extensive statistical practice has shown the importance and relevance of the inferential problem of estimating probability parameters in a binomial experiment; especially on the issues of competing intervals from frequentist, Bayesian, and Bootstrap approaches. The package written in the free R environment and presented in this paper tries to take care of the issues just highlighted, by pooling a number of widely available and well-performing methods and apporting on them essential variations. A wide range of functions helps users with differing skills to estimate, evaluate, summarize, numerically and graphically, various measures adopting either the frequentist or the Bayesian paradigm.
A new inverse regression model applied to radiation biodosimetry
Higueras, Manuel; Puig, Pedro; Ainsbury, Elizabeth A.; Rothkamm, Kai
2015-01-01
Biological dosimetry based on chromosome aberration scoring in peripheral blood lymphocytes enables timely assessment of the ionizing radiation dose absorbed by an individual. Here, new Bayesian-type count data inverse regression methods are introduced for situations where responses are Poisson or two-parameter compound Poisson distributed. Our Poisson models are calculated in a closed form, by means of Hermite and negative binomial (NB) distributions. For compound Poisson responses, complete and simplified models are provided. The simplified models are also expressible in a closed form and involve the use of compound Hermite and compound NB distributions. Three examples of applications are given that demonstrate the usefulness of these methodologies in cytogenetic radiation biodosimetry and in radiotherapy. We provide R and SAS codes which reproduce these examples. PMID:25663804
Lecture notes on ridge regression
van Wieringen, Wessel N.
2015-01-01
The linear regression model cannot be fitted to high-dimensional data, as the high-dimensionality brings about empirical non-identifiability. Penalized regression overcomes this non-identifiability by augmentation of the loss function by a penalty (i.e. a function of regression coefficients). The ridge penalty is the sum of squared regression coefficients, giving rise to ridge regression. Here many aspect of ridge regression are reviewed e.g. moments, mean squared error, its equivalence to co...
Kuhl, Mark R.
1990-01-01
Current navigation requirements depend on a geometric dilution of precision (GDOP) criterion. As long as the GDOP stays below a specific value, navigation requirements are met. The GDOP will exceed the specified value when the measurement geometry becomes too collinear. A new signal processing technique, called Ridge Regression Processing, can reduce the effects of nearly collinear measurement geometry; thereby reducing the inflation of the measurement errors. It is shown that the Ridge signal processor gives a consistently better mean squared error (MSE) in position than the Ordinary Least Mean Squares (OLS) estimator. The applicability of this technique is currently being investigated to improve the following areas: receiver autonomous integrity monitoring (RAIM), coverage requirements, availability requirements, and precision approaches.
Marginal likelihood estimation of negative binomial parameters with applications to RNA-seq data.
León-Novelo, Luis; Fuentes, Claudio; Emerson, Sarah
2017-10-01
RNA-Seq data characteristically exhibits large variances, which need to be appropriately accounted for in any proposed model. We first explore the effects of this variability on the maximum likelihood estimator (MLE) of the dispersion parameter of the negative binomial distribution, and propose instead to use an estimator obtained via maximization of the marginal likelihood in a conjugate Bayesian framework. We show, via simulation studies, that the marginal MLE can better control this variation and produce a more stable and reliable estimator. We then formulate a conjugate Bayesian hierarchical model, and use this new estimator to propose a Bayesian hypothesis test to detect differentially expressed genes in RNA-Seq data. We use numerical studies to show that our much simpler approach is competitive with other negative binomial based procedures, and we use a real data set to illustrate the implementation and flexibility of the procedure. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bayesian analysis of overdispersed chromosome aberration data with the negative binomial model
International Nuclear Information System (INIS)
Brame, R.S.; Groer, P.G.
2002-01-01
The usual assumption of a Poisson model for the number of chromosome aberrations in controlled calibration experiments implies variance equal to the mean. However, it is known that chromosome aberration data from experiments involving high linear energy transfer radiations can be overdispersed, i.e. the variance is greater than the mean. Present methods for dealing with overdispersed chromosome data rely on frequentist statistical techniques. In this paper, the problem of overdispersion is considered from a Bayesian standpoint. The Bayes Factor is used to compare Poisson and negative binomial models for two previously published calibration data sets describing the induction of dicentric chromosome aberrations by high doses of neutrons. Posterior densities for the model parameters, which characterise dose response and overdispersion are calculated and graphed. Calibrative densities are derived for unknown neutron doses from hypothetical radiation accident data to determine the impact of different model assumptions on dose estimates. The main conclusion is that an initial assumption of a negative binomial model is the conservative approach to chromosome dosimetry for high LET radiations. (author)
International Nuclear Information System (INIS)
Shultis, J.K.; Buranapan, W.; Eckhoff, N.D.
1981-12-01
Of considerable importance in the safety analysis of nuclear power plants are methods to estimate the probability of failure-on-demand, p, of a plant component that normally is inactive and that may fail when activated or stressed. Properties of five methods for estimating from failure-on-demand data the parameters of the beta prior distribution in a compound beta-binomial probability model are examined. Simulated failure data generated from a known beta-binomial marginal distribution are used to estimate values of the beta parameters by (1) matching moments of the prior distribution to those of the data, (2) the maximum likelihood method based on the prior distribution, (3) a weighted marginal matching moments method, (4) an unweighted marginal matching moments method, and (5) the maximum likelihood method based on the marginal distribution. For small sample sizes (N = or < 10) with data typical of low failure probability components, it was found that the simple prior matching moments method is often superior (e.g. smallest bias and mean squared error) while for larger sample sizes the marginal maximum likelihood estimators appear to be best
Estimation of component failure probability from masked binomial system testing data
International Nuclear Information System (INIS)
Tan Zhibin
2005-01-01
The component failure probability estimates from analysis of binomial system testing data are very useful because they reflect the operational failure probability of components in the field which is similar to the test environment. In practice, this type of analysis is often confounded by the problem of data masking: the status of tested components is unknown. Methods in considering this type of uncertainty are usually computationally intensive and not practical to solve the problem for complex systems. In this paper, we consider masked binomial system testing data and develop a probabilistic model to efficiently estimate component failure probabilities. In the model, all system tests are classified into test categories based on component coverage. Component coverage of test categories is modeled by a bipartite graph. Test category failure probabilities conditional on the status of covered components are defined. An EM algorithm to estimate component failure probabilities is developed based on a simple but powerful concept: equivalent failures and tests. By simulation we not only demonstrate the convergence and accuracy of the algorithm but also show that the probabilistic model is capable of analyzing systems in series, parallel and any other user defined structures. A case study illustrates an application in test case prioritization
International Nuclear Information System (INIS)
Zhang Yu; Wang Guangyi; Lu Xinmiao; Hu Yongcai; Xu Jiangtao
2016-01-01
The random telegraph signal noise in the pixel source follower MOSFET is the principle component of the noise in the CMOS image sensor under low light. In this paper, the physical and statistical model of the random telegraph signal noise in the pixel source follower based on the binomial distribution is set up. The number of electrons captured or released by the oxide traps in the unit time is described as the random variables which obey the binomial distribution. As a result, the output states and the corresponding probabilities of the first and the second samples of the correlated double sampling circuit are acquired. The standard deviation of the output states after the correlated double sampling circuit can be obtained accordingly. In the simulation section, one hundred thousand samples of the source follower MOSFET have been simulated, and the simulation results show that the proposed model has the similar statistical characteristics with the existing models under the effect of the channel length and the density of the oxide trap. Moreover, the noise histogram of the proposed model has been evaluated at different environmental temperatures. (paper)
Chang, Yu-Wei; Tsong, Yi; Zhao, Zhigen
2017-01-01
Assessing equivalence or similarity has drawn much attention recently as many drug products have lost or will lose their patents in the next few years, especially certain best-selling biologics. To claim equivalence between the test treatment and the reference treatment when assay sensitivity is well established from historical data, one has to demonstrate both superiority of the test treatment over placebo and equivalence between the test treatment and the reference treatment. Thus, there is urgency for practitioners to derive a practical way to calculate sample size for a three-arm equivalence trial. The primary endpoints of a clinical trial may not always be continuous, but may be discrete. In this paper, the authors derive power function and discuss sample size requirement for a three-arm equivalence trial with Poisson and negative binomial clinical endpoints. In addition, the authors examine the effect of the dispersion parameter on the power and the sample size by varying its coefficient from small to large. In extensive numerical studies, the authors demonstrate that required sample size heavily depends on the dispersion parameter. Therefore, misusing a Poisson model for negative binomial data may easily lose power up to 20%, depending on the value of the dispersion parameter.
Ma, Zhuanglin; Zhang, Honglu; Chien, Steven I-Jy; Wang, Jin; Dong, Chunjiao
2017-01-01
To investigate the relationship between crash frequency and potential influence factors, the accident data for events occurring on a 50km long expressway in China, including 567 crash records (2006-2008), were collected and analyzed. Both the fixed-length and the homogeneous longitudinal grade methods were applied to divide the study expressway section into segments. A negative binomial (NB) model and a random effect negative binomial (RENB) model were developed to predict crash frequency. The parameters of both models were determined using the maximum likelihood (ML) method, and the mixed stepwise procedure was applied to examine the significance of explanatory variables. Three explanatory variables, including longitudinal grade, road width, and ratio of longitudinal grade and curve radius (RGR), were found as significantly affecting crash frequency. The marginal effects of significant explanatory variables to the crash frequency were analyzed. The model performance was determined by the relative prediction error and the cumulative standardized residual. The results show that the RENB model outperforms the NB model. It was also found that the model performance with the fixed-length segment method is superior to that with the homogeneous longitudinal grade segment method. Copyright © 2016. Published by Elsevier Ltd.
Directory of Open Access Journals (Sweden)
David Shilane
2013-01-01
Full Text Available The negative binomial distribution becomes highly skewed under extreme dispersion. Even at moderately large sample sizes, the sample mean exhibits a heavy right tail. The standard normal approximation often does not provide adequate inferences about the data's expected value in this setting. In previous work, we have examined alternative methods of generating confidence intervals for the expected value. These methods were based upon Gamma and Chi Square approximations or tail probability bounds such as Bernstein's inequality. We now propose growth estimators of the negative binomial mean. Under high dispersion, zero values are likely to be overrepresented in the data. A growth estimator constructs a normal-style confidence interval by effectively removing a small, predetermined number of zeros from the data. We propose growth estimators based upon multiplicative adjustments of the sample mean and direct removal of zeros from the sample. These methods do not require estimating the nuisance dispersion parameter. We will demonstrate that the growth estimators' confidence intervals provide improved coverage over a wide range of parameter values and asymptotically converge to the sample mean. Interestingly, the proposed methods succeed despite adding both bias and variance to the normal approximation.
Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia
2017-08-31
As a newly emerged research area, RNA epigenetics has drawn increasing attention recently for the participation of RNA methylation and other modifications in a number of crucial biological processes. Thanks to high throughput sequencing techniques, such as, MeRIP-Seq, transcriptome-wide RNA methylation profile is now available in the form of count-based data, with which it is often of interests to study the dynamics at epitranscriptomic layer. However, the sample size of RNA methylation experiment is usually very small due to its costs; and additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as DRME model based on a statistical test covering the IP samples only with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in the way, the input control samples are also properly taken care of. In addition, different from DRME approach, which relies only the input control sample only for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. And the QNB model is also applicable to other datasets related RNA modifications, including but not limited to RNA bisulfite sequencing, m 1 A-Seq, Par-CLIP, RIP-Seq, etc.
Directory of Open Access Journals (Sweden)
Guégan Jean-François
2009-12-01
Full Text Available Abstract Background Combining multiple independent tests, when all test the same hypothesis and in the same direction, has been the subject of several approaches. Besides the inappropriate (in this case Bonferroni procedure, the Fisher's method has been widely used, in particular in population genetics. This last method has nevertheless been challenged by the SGM (symmetry around the geometric mean and Stouffer's Z-transformed methods that are less sensitive to asymmetry and deviations from uniformity of the distribution of the partial P-values. Performances of these different procedures were never compared on proportional data such as those currently used in population genetics. Results We present new software that implements a more recent method, the generalised binomial procedure, which tests for the deviation of the observed proportion of P-values lying under a chosen threshold from the expected proportion of such P-values under the null hypothesis. The respective performances of all available procedures were evaluated using simulated data under the null hypothesis with standard P-values distribution (differentiation tests. All procedures more or less behaved consistently with ~5% significant tests at α = 0.05. Then, linkage disequilibrium tests with increasing signal strength (rate of clonal reproduction, known to generate highly non-standard P-value distributions are undertaken and finally real population genetics data are analysed. In these cases, all procedures appear, more or less equally, very conservative, though SGM seems slightly more conservative. Conclusion Based on our results and those discussed in the literature we conclude that the generalised binomial and Stouffer's Z procedures should be preferred and Z when the number of tests is very small. The more conservative SGM might still be appropriate for meta-analyses when a strong publication bias in favour of significant results is expected to inflate type 2 error.
Calculating a Stepwise Ridge Regression.
Morris, John D.
1986-01-01
Although methods for using ordinary least squares regression computer programs to calculate a ridge regression are available, the calculation of a stepwise ridge regression requires a special purpose algorithm and computer program. The correct stepwise ridge regression procedure is given, and a parallel FORTRAN computer program is described.…
Modified Regression Correlation Coefficient for Poisson Regression Model
Kaengthong, Nattacha; Domthong, Uthumporn
2017-09-01
This study gives attention to indicators in predictive power of the Generalized Linear Model (GLM) which are widely used; however, often having some restrictions. We are interested in regression correlation coefficient for a Poisson regression model. This is a measure of predictive power, and defined by the relationship between the dependent variable (Y) and the expected value of the dependent variable given the independent variables [E(Y|X)] for the Poisson regression model. The dependent variable is distributed as Poisson. The purpose of this research was modifying regression correlation coefficient for Poisson regression model. We also compare the proposed modified regression correlation coefficient with the traditional regression correlation coefficient in the case of two or more independent variables, and having multicollinearity in independent variables. The result shows that the proposed regression correlation coefficient is better than the traditional regression correlation coefficient based on Bias and the Root Mean Square Error (RMSE).
Multinomial logistic regression ensembles.
Lee, Kyewon; Ahn, Hongshik; Moon, Hojin; Kodell, Ralph L; Chen, James J
2013-05-01
This article proposes a method for multiclass classification problems using ensembles of multinomial logistic regression models. A multinomial logit model is used as a base classifier in ensembles from random partitions of predictors. The multinomial logit model can be applied to each mutually exclusive subset of the feature space without variable selection. By combining multiple models the proposed method can handle a huge database without a constraint needed for analyzing high-dimensional data, and the random partition can improve the prediction accuracy by reducing the correlation among base classifiers. The proposed method is implemented using R, and the performance including overall prediction accuracy, sensitivity, and specificity for each category is evaluated on two real data sets and simulation data sets. To investigate the quality of prediction in terms of sensitivity and specificity, the area under the receiver operating characteristic (ROC) curve (AUC) is also examined. The performance of the proposed model is compared to a single multinomial logit model and it shows a substantial improvement in overall prediction accuracy. The proposed method is also compared with other classification methods such as the random forest, support vector machines, and random multinomial logit model.
Luo, Chongliang; Liu, Jin; Dey, Dipak K; Chen, Kun
2016-07-01
In many fields, multi-view datasets, measuring multiple distinct but interrelated sets of characteristics on the same set of subjects, together with data on certain outcomes or phenotypes, are routinely collected. The objective in such a problem is often two-fold: both to explore the association structures of multiple sets of measurements and to develop a parsimonious model for predicting the future outcomes. We study a unified canonical variate regression framework to tackle the two problems simultaneously. The proposed criterion integrates multiple canonical correlation analysis with predictive modeling, balancing between the association strength of the canonical variates and their joint predictive power on the outcomes. Moreover, the proposed criterion seeks multiple sets of canonical variates simultaneously to enable the examination of their joint effects on the outcomes, and is able to handle multivariate and non-Gaussian outcomes. An efficient algorithm based on variable splitting and Lagrangian multipliers is proposed. Simulation studies show the superior performance of the proposed approach. We demonstrate the effectiveness of the proposed approach in an [Formula: see text] intercross mice study and an alcohol dependence study. © The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Measurement Error in Education and Growth Regressions
Portela, M.; Teulings, C.N.; Alessie, R.
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Measurement error in education and growth regressions
Portela, Miguel; Teulings, Coen; Alessie, R.
2004-01-01
The perpetual inventory method used for the construction of education data per country leads to systematic measurement error. This paper analyses the effect of this measurement error on GDP regressions. There is a systematic difference in the education level between census data and observations
Is ‘hit and run’ a single word? The processing of irreversible binomials in neglect dyslexia
Directory of Open Access Journals (Sweden)
Giorgio eArcara
2012-02-01
Full Text Available The present study is the first neuropsychological investigation into the problem of the mental representation and processing of irreversible binomials, i.e. word pairs linked by a conjunction (e.g. ‘hit and run’, ‘dead or alive’. In order to test their lexical status, the phenomenon of neglect dyslexia is explored.People with left-sided neglect dyslexia show a clear lexical effect: they can read irreversible binomials better (i.e., by dropping the leftmost words less frequently when their components are presented in their correct order. This may be taken as an indication that they treat these constructions as lexical, not decomposable, elements. This finding therefore constitutes strong evidence that irreversible binomials tend to be stored in the mental lexicon as a whole and that this whole form is preferably addressed in the retrieval process.
The effect of the negative binomial distribution on the line-width of the micromaser cavity field
International Nuclear Information System (INIS)
Kremid, A. M.
2009-01-01
The influence of negative binomial distribution (NBD) on the line-width of the negative binomial distribution (NBD) on the line-width of the micromaser is considered. The threshold of the micromaser is shifted towards higher values of the pumping parameter q. Moreover the line-width exhibits sharp dips 'resonances' when the cavity temperature reduces to a very low value. These dips are very clear evidence for the occurrence of the so-called trapping states regime in the micromaser. This statistics prevents the appearance of these trapping states, namely by increasing the negative binomial parameter q these dips wash out and the line-width becomes more broadening. For small values of the parameter q the line-width at large values of q randomly oscillates around its transition line. As q becomes large this oscillatory behavior occurs at rarely values of q. (author)
Directory of Open Access Journals (Sweden)
Nuray GİRGİNER
2008-01-01
Full Text Available In this study, it has been investigated traveller satisfaction about the tram which is one of the mass transportation vehicles on case of Eskisehir’s Tram System (Estram using Binomial Logistic Regression Analysis. Eskisehir’s population have become dense on students and their’s satisfactions as traveller have important. So, sample of this study has formed from 300 students of Anatolia University and Eskisehir Osmangazi University which are in Eskisehir and they have selected with Simple Random Sampling. As a consequence, utilizing some of subjective and objective variables, it is investigated whether or not Estram satisfies these students. Considering latent variable about satisfaction at the binomial level, binomial logistic regression is implemented about student satisfaction. The result of analysis showed that whole independent variables had negative effect on the satisfaction of students about Estram.
Is "hit and run" a single word? The processing of irreversible binomials in neglect dyslexia.
Arcara, Giorgio; Lacaita, Graziano; Mattaloni, Elisa; Passarini, Laura; Mondini, Sara; Benincà, Paola; Semenza, Carlo
2012-01-01
The present study is the first neuropsychological investigation into the problem of the mental representation and processing of irreversible binomials (IBs), i.e., word pairs linked by a conjunction (e.g., "hit and run," "dead or alive"). In order to test their lexical status, the phenomenon of neglect dyslexia is explored. People with left-sided neglect dyslexia show a clear lexical effect: they can read IBs better (i.e., by dropping the leftmost words less frequently) when their components are presented in their correct order. This may be taken as an indication that they treat these constructions as lexical, not decomposable, elements. This finding therefore constitutes strong evidence that IBs tend to be stored in the mental lexicon as a whole and that this whole form is preferably addressed in the retrieval process.
Binomial moments of the distance distribution and the probability of undetected error
Energy Technology Data Exchange (ETDEWEB)
Barg, A. [Lucent Technologies, Murray Hill, NJ (United States). Bell Labs.; Ashikhmin, A. [Los Alamos National Lab., NM (United States)
1998-09-01
In [1] K.A.S. Abdel-Ghaffar derives a lower bound on the probability of undetected error for unrestricted codes. The proof relies implicitly on the binomial moments of the distance distribution of the code. The authors use the fact that these moments count the size of subcodes of the code to give a very simple proof of the bound in [1] by showing that it is essentially equivalent to the Singleton bound. They discuss some combinatorial connections revealed by this proof. They also discuss some improvements of this bound. Finally, they analyze asymptotics. They show that an upper bound on the undetected error exponent that corresponds to the bound of [1] improves known bounds on this function.
The multi-class binomial failure rate model for the treatment of common-cause failures
International Nuclear Information System (INIS)
Hauptmanns, U.
1995-01-01
The impact of common cause failures (CCF) on PSA results for NPPs is in sharp contrast with the limited quality which can be achieved in their assessment. This is due to the dearth of observations and cannot be remedied in the short run. Therefore the methods employed for calculating failure rates should be devised such as to make the best use of the few available observations on CCF. The Multi-Class Binomial Failure Rate (MCBFR) Model achieves this by assigning observed failures to different classes according to their technical characteristics and applying the BFR formalism to each of these. The results are hence determined by a superposition of BFR type expressions for each class, each of them with its own coupling factor. The model thus obtained flexibly reproduces the dependence of CCF rates on failure multiplicity insinuated by the observed failure multiplicities. This is demonstrated by evaluating CCFs observed for combined impulse pilot valves in German NPPs. (orig.) [de
Xie, Wen-Jie; Han, Rui-Qi; Jiang, Zhi-Qiang; Wei, Lijian; Zhou, Wei-Xing
2017-08-01
Complex network is not only a powerful tool for the analysis of complex system, but also a promising way to analyze time series. The algorithm of horizontal visibility graph (HVG) maps time series into graphs, whose degree distributions are numerically and analytically investigated for certain time series. We derive the degree distributions of HVGs through an iterative construction process of HVGs. The degree distributions of the HVG and the directed HVG for random series are derived to be exponential, which confirms the analytical results from other methods. We also obtained the analytical expressions of degree distributions of HVGs and in-degree and out-degree distributions of directed HVGs transformed from multifractal binomial measures, which agree excellently with numerical simulations.
Directory of Open Access Journals (Sweden)
Xiong Wang
2013-01-01
Full Text Available Based on characteristics of the nonlife joint-stock insurance company, this paper presents a compound binomial risk model that randomizes the premium income on unit time and sets the threshold for paying dividends to shareholders. In this model, the insurance company obtains the insurance policy in unit time with probability and pays dividends to shareholders with probability when the surplus is no less than . We then derive the recursive formulas of the expected discounted penalty function and the asymptotic estimate for it. And we will derive the recursive formulas and asymptotic estimates for the ruin probability and the distribution function of the deficit at ruin. The numerical examples have been shown to illustrate the accuracy of the asymptotic estimations.
Connecting Binomial and Black-Scholes Option Pricing Models: A Spreadsheet-Based Illustration
Directory of Open Access Journals (Sweden)
Yi Feng
2012-07-01
Full Text Available The Black-Scholes option pricing model is part of the modern financial curriculum, even at the introductory level. However, as the derivation of the model, which requires advanced mathematical tools, is well beyond the scope of standard finance textbooks, the model has remained a great, but mysterious, recipe for many students. This paper illustrates, from a pedagogic perspective, how a simple binomial model, which converges to the Black-Scholes formula, can capture the economic insight in the original derivation. Microsoft ExcelTM plays an important pedagogic role in connecting the two models. The interactivity as provided by scroll bars, in conjunction with Excel's graphical features, will allow students to visualize the impacts of individual input parameters on option pricing.
Organ procurement and the body donor-family binomial: instruments to subsidize nursing approach
Directory of Open Access Journals (Sweden)
Gisele da Cruz Ferreira
2013-05-01
Full Text Available We aimed to describe the design of instruments to subsidize the care for the body donor-family binomial in the perspective of the process of organ procurement. The Activities of Living Model grounded the instruments for data collection. We identified 33 possible diagnoses, 14 associated to the body preservation and 19 to responses from family members facing grieving and the decision on whether to authorize the donation. We selected 31 interventions to preserve the body for organs/tissues procurement, and 25 to meet the needs for information, coping and support for the family decision. The nursing diagnoses, interventions, and outcomes were registered according to the North American Nursing Diagnosis Association, Nursing Intervention Classification, and Nursing Outcome Classification, respectively. The instruments follow the legislation of the Board of Nursing and the donor/organ procurement, needing to be validated by field experts.
A comparison of LMC and SDL complexity measures on binomial distributions
Piqueira, José Roberto C.
2016-02-01
The concept of complexity has been widely discussed in the last forty years, with a lot of thinking contributions coming from all areas of the human knowledge, including Philosophy, Linguistics, History, Biology, Physics, Chemistry and many others, with mathematicians trying to give a rigorous view of it. In this sense, thermodynamics meets information theory and, by using the entropy definition, López-Ruiz, Mancini and Calbet proposed a definition for complexity that is referred as LMC measure. Shiner, Davison and Landsberg, by slightly changing the LMC definition, proposed the SDL measure and the both, LMC and SDL, are satisfactory to measure complexity for a lot of problems. Here, SDL and LMC measures are applied to the case of a binomial probability distribution, trying to clarify how the length of the data set implies complexity and how the success probability of the repeated trials determines how complex the whole set is.
Polynomial regression analysis and significance test of the regression function
International Nuclear Information System (INIS)
Gao Zhengming; Zhao Juan; He Shengping
2012-01-01
In order to analyze the decay heating power of a certain radioactive isotope per kilogram with polynomial regression method, the paper firstly demonstrated the broad usage of polynomial function and deduced its parameters with ordinary least squares estimate. Then significance test method of polynomial regression function is derived considering the similarity between the polynomial regression model and the multivariable linear regression model. Finally, polynomial regression analysis and significance test of the polynomial function are done to the decay heating power of the iso tope per kilogram in accord with the authors' real work. (authors)
Recursive Algorithm For Linear Regression
Varanasi, S. V.
1988-01-01
Order of model determined easily. Linear-regression algorithhm includes recursive equations for coefficients of model of increased order. Algorithm eliminates duplicative calculations, facilitates search for minimum order of linear-regression model fitting set of data satisfactory.
Panel data specifications in nonparametric kernel regression
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We discuss nonparametric regression models for panel data. A fully nonparametric panel data specification that uses the time variable and the individual identifier as additional (categorical) explanatory variables is considered to be the most suitable. We use this estimator and conventional...... parametric panel data estimators to analyse the production technology of Polish crop farms. The results of our nonparametric kernel regressions generally differ from the estimates of the parametric models but they only slightly depend on the choice of the kernel functions. Based on economic reasoning, we...... found the estimates of the fully nonparametric panel data model to be more reliable....
Pradhan, Vivek; Saha, Krishna K; Banerjee, Tathagata; Evans, John C
2014-07-30
Inference on the difference between two binomial proportions in the paired binomial setting is often an important problem in many biomedical investigations. Tang et al. (2010, Statistics in Medicine) discussed six methods to construct confidence intervals (henceforth, we abbreviate it as CI) for the difference between two proportions in paired binomial setting using method of variance estimates recovery. In this article, we propose weighted profile likelihood-based CIs for the difference between proportions of a paired binomial distribution. However, instead of the usual likelihood, we use weighted likelihood that is essentially making adjustments to the cell frequencies of a 2 × 2 table in the spirit of Agresti and Min (2005, Statistics in Medicine). We then conduct numerical studies to compare the performances of the proposed CIs with that of Tang et al. and Agresti and Min in terms of coverage probabilities and expected lengths. Our numerical study clearly indicates that the weighted profile likelihood-based intervals and Jeffreys interval (cf. Tang et al.) are superior in terms of achieving the nominal level, and in terms of expected lengths, they are competitive. Finally, we illustrate the use of the proposed CIs with real-life examples. Copyright © 2014 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Jianwei Zhou
2014-01-01
Full Text Available The explicit formulae of spectral norms for circulant-type matrices are investigated; the matrices are circulant matrix, skew-circulant matrix, and g-circulant matrix, respectively. The entries are products of binomial coefficients with harmonic numbers. Explicit identities for these spectral norms are obtained. Employing these approaches, some numerical tests are listed to verify the results.
An algorithm for sequential tail value at risk for path-independent payoffs in a binomial tree
Roorda, Berend
2010-01-01
We present an algorithm that determines Sequential Tail Value at Risk (STVaR) for path-independent payoffs in a binomial tree. STVaR is a dynamic version of Tail-Value-at-Risk (TVaR) characterized by the property that risk levels at any moment must be in the range of risk levels later on. The
Combining Alphas via Bounded Regression
Directory of Open Access Journals (Sweden)
Zura Kakushadze
2015-11-01
Full Text Available We give an explicit algorithm and source code for combining alpha streams via bounded regression. In practical applications, typically, there is insufficient history to compute a sample covariance matrix (SCM for a large number of alphas. To compute alpha allocation weights, one then resorts to (weighted regression over SCM principal components. Regression often produces alpha weights with insufficient diversification and/or skewed distribution against, e.g., turnover. This can be rectified by imposing bounds on alpha weights within the regression procedure. Bounded regression can also be applied to stock and other asset portfolio construction. We discuss illustrative examples.
Linear regression in astronomy. I
Isobe, Takashi; Feigelson, Eric D.; Akritas, Michael G.; Babu, Gutti Jogesh
1990-01-01
Five methods for obtaining linear regression fits to bivariate data with unknown or insignificant measurement errors are discussed: ordinary least-squares (OLS) regression of Y on X, OLS regression of X on Y, the bisector of the two OLS lines, orthogonal regression, and 'reduced major-axis' regression. These methods have been used by various researchers in observational astronomy, most importantly in cosmic distance scale applications. Formulas for calculating the slope and intercept coefficients and their uncertainties are given for all the methods, including a new general form of the OLS variance estimates. The accuracy of the formulas was confirmed using numerical simulations. The applicability of the procedures is discussed with respect to their mathematical properties, the nature of the astronomical data under consideration, and the scientific purpose of the regression. It is found that, for problems needing symmetrical treatment of the variables, the OLS bisector performs significantly better than orthogonal or reduced major-axis regression.
Lara, Jesus R; Hoddle, Mark S
2015-08-01
Oligonychus perseae Tuttle, Baker, & Abatiello is a foliar pest of 'Hass' avocados [Persea americana Miller (Lauraceae)]. The recommended action threshold is 50-100 motile mites per leaf, but this count range and other ecological factors associated with O. perseae infestations limit the application of enumerative sampling plans in the field. Consequently, a comprehensive modeling approach was implemented to compare the practical application of various binomial sampling models for decision-making of O. perseae in California. An initial set of sequential binomial sampling models were developed using three mean-proportion modeling techniques (i.e., Taylor's power law, maximum likelihood, and an empirical model) in combination with two-leaf infestation tally thresholds of either one or two mites. Model performance was evaluated using a robust mite count database consisting of >20,000 Hass avocado leaves infested with varying densities of O. perseae and collected from multiple locations. Operating characteristic and average sample number results for sequential binomial models were used as the basis to develop and validate a standardized fixed-size binomial sampling model with guidelines on sample tree and leaf selection within blocks of avocado trees. This final validated model requires a leaf sampling cost of 30 leaves and takes into account the spatial dynamics of O. perseae to make reliable mite density classifications for a 50-mite action threshold. Recommendations for implementing this fixed-size binomial sampling plan to assess densities of O. perseae in commercial California avocado orchards are discussed. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Linear regression in astronomy. II
Feigelson, Eric D.; Babu, Gutti J.
1992-01-01
A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.
Time-adaptive quantile regression
DEFF Research Database (Denmark)
Møller, Jan Kloppenborg; Nielsen, Henrik Aalborg; Madsen, Henrik
2008-01-01
An algorithm for time-adaptive quantile regression is presented. The algorithm is based on the simplex algorithm, and the linear optimization formulation of the quantile regression problem is given. The observations have been split to allow a direct use of the simplex algorithm. The simplex method...... and an updating procedure are combined into a new algorithm for time-adaptive quantile regression, which generates new solutions on the basis of the old solution, leading to savings in computation time. The suggested algorithm is tested against a static quantile regression model on a data set with wind power...... production, where the models combine splines and quantile regression. The comparison indicates superior performance for the time-adaptive quantile regression in all the performance parameters considered....
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Bias-corrected quantile regression estimation of censored regression models
Cizek, Pavel; Sadikoglu, Serhan
2018-01-01
In this paper, an extension of the indirect inference methodology to semiparametric estimation is explored in the context of censored regression. Motivated by weak small-sample performance of the censored regression quantile estimator proposed by Powell (J Econom 32:143–155, 1986a), two- and
Quantile regression theory and applications
Davino, Cristina; Vistocco, Domenico
2013-01-01
A guide to the implementation and interpretation of Quantile Regression models This book explores the theory and numerous applications of quantile regression, offering empirical data analysis as well as the software tools to implement the methods. The main focus of this book is to provide the reader with a comprehensivedescription of the main issues concerning quantile regression; these include basic modeling, geometrical interpretation, estimation and inference for quantile regression, as well as issues on validity of the model, diagnostic tools. Each methodological aspect is explored and
Naznin, Farhana; Currie, Graham; Logan, David; Sarvi, Majid
2016-07-01
Safety is a key concern in the design, operation and development of light rail systems including trams or streetcars as they impose crash risks on road users in terms of crash frequency and severity. The aim of this study is to identify key traffic, transit and route factors that influence tram-involved crash frequencies along tram route sections in Melbourne. A random effects negative binomial (RENB) regression model was developed to analyze crash frequency data obtained from Yarra Trams, the tram operator in Melbourne. The RENB modelling approach can account for spatial and temporal variations within observation groups in panel count data structures by assuming that group specific effects are randomly distributed across locations. The results identify many significant factors effecting tram-involved crash frequency including tram service frequency (2.71), tram stop spacing (-0.42), tram route section length (0.31), tram signal priority (-0.25), general traffic volume (0.18), tram lane priority (-0.15) and ratio of platform tram stops (-0.09). Findings provide useful insights on route section level tram-involved crashes in an urban tram or streetcar operating environment. The method described represents a useful planning tool for transit agencies hoping to improve safety performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Andrei ACHIMAŞ CADARIU
2004-08-01
Full Text Available Assessments of a controlled clinical trial suppose to interpret some key parameters as the controlled event rate, experimental event date, relative risk, absolute risk reduction, relative risk reduction, number needed to treat when the effect of the treatment are dichotomous variables. Defined as the difference in the event rate between treatment and control groups, the absolute risk reduction is the parameter that allowed computing the number needed to treat. The absolute risk reduction is compute when the experimental treatment reduces the risk for an undesirable outcome/event. In medical literature when the absolute risk reduction is report with its confidence intervals, the method used is the asymptotic one, even if it is well know that may be inadequate. The aim of this paper is to introduce and assess nine methods of computing confidence intervals for absolute risk reduction and absolute risk reduction – like function.Computer implementations of the methods use the PHP language. Methods comparison uses the experimental errors, the standard deviations, and the deviation relative to the imposed significance level for specified sample sizes. Six methods of computing confidence intervals for absolute risk reduction and absolute risk reduction-like functions were assessed using random binomial variables and random sample sizes.The experiments shows that the ADAC, and ADAC1 methods obtains the best overall performance of computing confidence intervals for absolute risk reduction.
Generazio, Edward R.
2011-01-01
The capability of an inspection system is established by applications of various methodologies to determine the probability of detection (POD). One accepted metric of an adequate inspection system is that for a minimum flaw size and all greater flaw sizes, there is 0.90 probability of detection with 95% confidence (90/95 POD). Directed design of experiments for probability of detection (DOEPOD) has been developed to provide an efficient and accurate methodology that yields estimates of POD and confidence bounds for both Hit-Miss or signal amplitude testing, where signal amplitudes are reduced to Hit-Miss by using a signal threshold Directed DOEPOD uses a nonparametric approach for the analysis or inspection data that does require any assumptions about the particular functional form of a POD function. The DOEPOD procedure identifies, for a given sample set whether or not the minimum requirement of 0.90 probability of detection with 95% confidence is demonstrated for a minimum flaw size and for all greater flaw sizes (90/95 POD). The DOEPOD procedures are sequentially executed in order to minimize the number of samples needed to demonstrate that there is a 90/95 POD lower confidence bound at a given flaw size and that the POD is monotonic for flaw sizes exceeding that 90/95 POD flaw size. The conservativeness of the DOEPOD methodology results is discussed. Validated guidelines for binomial estimation of POD for fracture critical inspection are established.
Design of Ultra-Wideband Tapered Slot Antenna by Using Binomial Transformer with Corrugation
Chareonsiri, Yosita; Thaiwirot, Wanwisa; Akkaraekthalin, Prayoot
2017-05-01
In this paper, the tapered slot antenna (TSA) with corrugation is proposed for UWB applications. The multi-section binomial transformer is used to design taper profile of the proposed TSA that does not involve using time consuming optimization. A step-by-step procedure for synthesis of the step impedance values related with step slot widths of taper profile is presented. The smooth taper can be achieved by fitting the smoothing curve to the entire step slot. The design of TSA based on this method yields results with a quite flat gain and wide impedance bandwidth covering UWB spectrum from 3.1 GHz to 10.6 GHz. To further improve the radiation characteristics, the corrugation is added on the both edges of the proposed TSA. The effects of different corrugation shapes on the improvement of antenna gain and front-to-back ratio (F-to-B ratio) are investigated. To demonstrate the validity of the design, the prototypes of TSA without and with corrugation are fabricated and measured. The results show good agreement between simulation and measurement.
Hilpert, Markus; Rasmuson, Anna; Johnson, William
2017-04-01
Transport of colloids in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their down-gradient translation relative to colloids in bulk fluid. Near surface fluid domain colloids may re-enter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via strong primary minimum interactions, or they may move along a grain-to-grain contact to the near surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization onto grain surfaces. Colloid movement is described by a sequence of trials in a series of unit cells, and the binomial distribution is used to calculate the probabilities of each possible sequence. Pore-scale simulations provide mechanistically-determined likelihoods and timescales associated with the above pore-scale colloid mass transfer processes, whereas the network-scale model employs pore and grain topology to determine probabilities of transfer from up-gradient bulk and near-surface fluid domains to down-gradient bulk and near-surface fluid domains. Inter-grain transport of colloids in the near surface fluid domain can cause extended tailing.
The binomial work-health in the transit of Curitiba city.
Tokars, Eunice; Moro, Antonio Renato Pereira; Cruz, Roberto Moraes
2012-01-01
The working activity in traffic of the big cities complex interacts with the environment is often in unsafe and unhealthy imbalance favoring the binomial work - health. The aim of this paper was to analyze the relationship between work and health of taxi drivers in Curitiba, Brazil. This cross-sectional observational study with 206 individuals used a questionnaire on the organization's profile and perception of the environment and direct observation of work. It was found that the majority are male, aged between 26 and 49 years and has a high school degree. They are sedentary, like making a journey from 8 to 12 hours. They consider a stressful profession, related low back pain and are concerned about safety and accidents. 40% are smokers and consume alcoholic drink and 65% do not have or do not use devices of comfort. Risk factors present in the daily taxi constraints cause physical, cognitive and organizational and can affect your performance. It is concluded that the taxi drivers must change the unhealthy lifestyle, requiring a more efficient management of government authorities for this work is healthy and safe for all involved.
Logistic Regression: Concept and Application
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Testing discontinuities in nonparametric regression
Dai, Wenlin
2017-01-19
In nonparametric regression, it is often needed to detect whether there are jump discontinuities in the mean function. In this paper, we revisit the difference-based method in [13 H.-G. Müller and U. Stadtmüller, Discontinuous versus smooth regression, Ann. Stat. 27 (1999), pp. 299–337. doi: 10.1214/aos/1018031100
Illuminance Flow Estimation by Regression
Karlsson, S.M.; Pont, S.C.; Koenderink, J.J.; Zisserman, A.
2010-01-01
We investigate the estimation of illuminance flow using Histograms of Oriented Gradient features (HOGs). In a regression setting, we found for both ridge regression and support vector machines, that the optimal solution shows close resemblance to the gradient based structure tensor (also known as
Fungible weights in logistic regression.
Jones, Jeff A; Waller, Niels G
2016-06-01
In this article we develop methods for assessing parameter sensitivity in logistic regression models. To set the stage for this work, we first review Waller's (2008) equations for computing fungible weights in linear regression. Next, we describe 2 methods for computing fungible weights in logistic regression. To demonstrate the utility of these methods, we compute fungible logistic regression weights using data from the Centers for Disease Control and Prevention's (2010) Youth Risk Behavior Surveillance Survey, and we illustrate how these alternate weights can be used to evaluate parameter sensitivity. To make our work accessible to the research community, we provide R code (R Core Team, 2015) that will generate both kinds of fungible logistic regression weights. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Tumor regression patterns in retinoblastoma
International Nuclear Information System (INIS)
Zafar, S.N.; Siddique, S.N.; Zaheer, N.
2016-01-01
To observe the types of tumor regression after treatment, and identify the common pattern of regression in our patients. Study Design: Descriptive study. Place and Duration of Study: Department of Pediatric Ophthalmology and Strabismus, Al-Shifa Trust Eye Hospital, Rawalpindi, Pakistan, from October 2011 to October 2014. Methodology: Children with unilateral and bilateral retinoblastoma were included in the study. Patients were referred to Pakistan Institute of Medical Sciences, Islamabad, for chemotherapy. After every cycle of chemotherapy, dilated funds examination under anesthesia was performed to record response of the treatment. Regression patterns were recorded on RetCam II. Results: Seventy-four tumors were included in the study. Out of 74 tumors, 3 were ICRB group A tumors, 43 were ICRB group B tumors, 14 tumors belonged to ICRB group C, and remaining 14 were ICRB group D tumors. Type IV regression was seen in 39.1% (n=29) tumors, type II in 29.7% (n=22), type III in 25.6% (n=19), and type I in 5.4% (n=4). All group A tumors (100%) showed type IV regression. Seventeen (39.5%) group B tumors showed type IV regression. In group C, 5 tumors (35.7%) showed type II regression and 5 tumors (35.7%) showed type IV regression. In group D, 6 tumors (42.9%) regressed to type II non-calcified remnants. Conclusion: The response and success of the focal and systemic treatment, as judged by the appearance of different patterns of tumor regression, varies with the ICRB grouping of the tumor. (author)
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Predicting Word Reading Ability: A Quantile Regression Study
McIlraith, Autumn L.
2018-01-01
Predictors of early word reading are well established. However, it is unclear if these predictors hold for readers across a range of word reading abilities. This study used quantile regression to investigate predictive relationships at different points in the distribution of word reading. Quantile regression analyses used preschool and…
Augmenting Data with Published Results in Bayesian Linear Regression
de Leeuw, Christiaan; Klugkist, Irene
2012-01-01
In most research, linear regression analyses are performed without taking into account published results (i.e., reported summary statistics) of similar previous studies. Although the prior density in Bayesian linear regression could accommodate such prior knowledge, formal models for doing so are absent from the literature. The goal of this…
Logic regression and its extensions.
Schwender, Holger; Ruczinski, Ingo
2010-01-01
Logic regression is an adaptive classification and regression procedure, initially developed to reveal interacting single nucleotide polymorphisms (SNPs) in genetic association studies. In general, this approach can be used in any setting with binary predictors, when the interaction of these covariates is of primary interest. Logic regression searches for Boolean (logic) combinations of binary variables that best explain the variability in the outcome variable, and thus, reveals variables and interactions that are associated with the response and/or have predictive capabilities. The logic expressions are embedded in a generalized linear regression framework, and thus, logic regression can handle a variety of outcome types, such as binary responses in case-control studies, numeric responses, and time-to-event data. In this chapter, we provide an introduction to the logic regression methodology, list some applications in public health and medicine, and summarize some of the direct extensions and modifications of logic regression that have been proposed in the literature. Copyright © 2010 Elsevier Inc. All rights reserved.
Galvan, T L; Burkness, E C; Hutchison, W D
2007-06-01
To develop a practical integrated pest management (IPM) system for the multicolored Asian lady beetle, Harmonia axyridis (Pallas) (Coleoptera: Coccinellidae), in wine grapes, we assessed the spatial distribution of H. axyridis and developed eight sampling plans to estimate adult density or infestation level in grape clusters. We used 49 data sets collected from commercial vineyards in 2004 and 2005, in Minnesota and Wisconsin. Enumerative plans were developed using two precision levels (0.10 and 0.25); the six binomial plans reflected six unique action thresholds (3, 7, 12, 18, 22, and 31% of cluster samples infested with at least one H. axyridis). The spatial distribution of H. axyridis in wine grapes was aggregated, independent of cultivar and year, but it was more randomly distributed as mean density declined. The average sample number (ASN) for each sampling plan was determined using resampling software. For research purposes, an enumerative plan with a precision level of 0.10 (SE/X) resulted in a mean ASN of 546 clusters. For IPM applications, the enumerative plan with a precision level of 0.25 resulted in a mean ASN of 180 clusters. In contrast, the binomial plans resulted in much lower ASNs and provided high probabilities of arriving at correct "treat or no-treat" decisions, making these plans more efficient for IPM applications. For a tally threshold of one adult per cluster, the operating characteristic curves for the six action thresholds provided binomial sequential sampling plans with mean ASNs of only 19-26 clusters, and probabilities of making correct decisions between 83 and 96%. The benefits of the binomial sampling plans are discussed within the context of improving IPM programs for wine grapes.
Directory of Open Access Journals (Sweden)
Giselle Garcia-Sepulveda
2017-06-01
Full Text Available This paper describes the frequency and number of Trifur tortuosus in the skin of Merluccius gayi, an important economic resource in Chile. Analysis of a spatial distribution model indicated that the parasites tended to cluster. Variations in the number of parasites per host can be described by a negative binomial distribution. The maximum number of parasites observed per host was one, similar patterns was described for other parasites in Chilean marine fishes.
Criterios sobre el uso de la distribución normal para aproximar la distribución binomial
Ortiz Pinilla, Jorge; Castro, Amparo; Neira, Tito; Torres, Pedro; Castañeda, Javier
2012-01-01
Las dos reglas empíricas más conocidas para aceptar la aproximación normal de la distribución binomial carecen de regularidad en el control del margen de error cometido al utilizar la aproximación normal. Se propone un criterio y algunas fórmulas para controlarlo cerca de algunos valores escogidos para el error máximo.
Chen, Connie; Gribble, Matthew O; Bartroff, Jay; Bay, Steven M; Goldstein, Larry
2017-05-01
The United States's Clean Water Act stipulates in section 303(d) that states must identify impaired water bodies for which total maximum daily loads (TMDLs) of pollution inputs into water bodies are developed. Decision-making procedures about how to list, or delist, water bodies as impaired, or not, per Clean Water Act 303(d) differ across states. In states such as California, whether or not a particular monitoring sample suggests that water quality is impaired can be regarded as a binary outcome variable, and California's current regulatory framework invokes a version of the exact binomial test to consolidate evidence across samples and assess whether the overall water body complies with the Clean Water Act. Here, we contrast the performance of California's exact binomial test with one potential alternative, the Sequential Probability Ratio Test (SPRT). The SPRT uses a sequential testing framework, testing samples as they become available and evaluating evidence as it emerges, rather than measuring all the samples and calculating a test statistic at the end of the data collection process. Through simulations and theoretical derivations, we demonstrate that the SPRT on average requires fewer samples to be measured to have comparable Type I and Type II error rates as the current fixed-sample binomial test. Policymakers might consider efficient alternatives such as SPRT to current procedure. Copyright © 2017 Elsevier Ltd. All rights reserved.
Abstract Expression Grammar Symbolic Regression
Korns, Michael F.
This chapter examines the use of Abstract Expression Grammars to perform the entire Symbolic Regression process without the use of Genetic Programming per se. The techniques explored produce a symbolic regression engine which has absolutely no bloat, which allows total user control of the search space and output formulas, which is faster, and more accurate than the engines produced in our previous papers using Genetic Programming. The genome is an all vector structure with four chromosomes plus additional epigenetic and constraint vectors, allowing total user control of the search space and the final output formulas. A combination of specialized compiler techniques, genetic algorithms, particle swarm, aged layered populations, plus discrete and continuous differential evolution are used to produce an improved symbolic regression sytem. Nine base test cases, from the literature, are used to test the improvement in speed and accuracy. The improved results indicate that these techniques move us a big step closer toward future industrial strength symbolic regression systems.
Model building in nonproportional hazard regression.
Rodríguez-Girondo, Mar; Kneib, Thomas; Cadarso-Suárez, Carmen; Abu-Assi, Emad
2013-12-30
Recent developments of statistical methods allow for a very flexible modeling of covariates affecting survival times via the hazard rate, including also the inspection of possible time-dependent associations. Despite their immediate appeal in terms of flexibility, these models typically introduce additional difficulties when a subset of covariates and the corresponding modeling alternatives have to be chosen, that is, for building the most suitable model for given data. This is particularly true when potentially time-varying associations are given. We propose to conduct a piecewise exponential representation of the original survival data to link hazard regression with estimation schemes based on of the Poisson likelihood to make recent advances for model building in exponential family regression accessible also in the nonproportional hazard regression context. A two-stage stepwise selection approach, an approach based on doubly penalized likelihood, and a componentwise functional gradient descent approach are adapted to the piecewise exponential regression problem. These three techniques were compared via an intensive simulation study. An application to prognosis after discharge for patients who suffered a myocardial infarction supplements the simulation to demonstrate the pros and cons of the approaches in real data analyses. Copyright © 2013 John Wiley & Sons, Ltd.
Forecasting with Dynamic Regression Models
Pankratz, Alan
2012-01-01
One of the most widely used tools in statistical forecasting, single equation regression models is examined here. A companion to the author's earlier work, Forecasting with Univariate Box-Jenkins Models: Concepts and Cases, the present text pulls together recent time series ideas and gives special attention to possible intertemporal patterns, distributed lag responses of output to input series and the auto correlation patterns of regression disturbance. It also includes six case studies.
From Rasch scores to regression
DEFF Research Database (Denmark)
Christensen, Karl Bang
2006-01-01
Rasch models provide a framework for measurement and modelling latent variables. Having measured a latent variable in a population a comparison of groups will often be of interest. For this purpose the use of observed raw scores will often be inadequate because these lack interval scale propertie....... This paper compares two approaches to group comparison: linear regression models using estimated person locations as outcome variables and latent regression models based on the distribution of the score....
Testing Heteroscedasticity in Robust Regression
Czech Academy of Sciences Publication Activity Database
Kalina, Jan
2011-01-01
Roč. 1, č. 4 (2011), s. 25-28 ISSN 2045-3345 Grant - others:GA ČR(CZ) GA402/09/0557 Institutional research plan: CEZ:AV0Z10300504 Keywords : robust regression * heteroscedasticity * regression quantiles * diagnostics Subject RIV: BB - Applied Statistics , Operational Research http://www.researchjournals.co.uk/documents/Vol4/06%20Kalina.pdf
Regression methods for medical research
Tai, Bee Choo
2013-01-01
Regression Methods for Medical Research provides medical researchers with the skills they need to critically read and interpret research using more advanced statistical methods. The statistical requirements of interpreting and publishing in medical journals, together with rapid changes in science and technology, increasingly demands an understanding of more complex and sophisticated analytic procedures.The text explains the application of statistical models to a wide variety of practical medical investigative studies and clinical trials. Regression methods are used to appropriately answer the
Dimension Reduction Regression in R
Directory of Open Access Journals (Sweden)
Sanford Weisberg
2002-01-01
Full Text Available Regression is the study of the dependence of a response variable y on a collection predictors p collected in x. In dimension reduction regression, we seek to find a few linear combinations β1x,...,βdx, such that all the information about the regression is contained in these linear combinations. If d is very small, perhaps one or two, then the regression problem can be summarized using simple graphics; for example, for d=1, the plot of y versus β1x contains all the regression information. When d=2, a 3D plot contains all the information. Several methods for estimating d and relevant functions of β1,..., βdhave been suggested in the literature. In this paper, we describe an R package for three important dimension reduction methods: sliced inverse regression or sir, sliced average variance estimates, or save, and principal Hessian directions, or phd. The package is very general and flexible, and can be easily extended to include other methods of dimension reduction. It includes tests and estimates of the dimension , estimates of the relevant information including β1,..., βd, and some useful graphical summaries as well.
Hilpert, Markus; Rasmuson, Anna; Johnson, William P.
2017-07-01
Colloid transport in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their downgradient translation relative to colloids in bulk fluid. Near-surface fluid domain colloids may reenter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via primary minimum interactions, or they may move along a grain-to-grain contact to the near-surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization. Colloid movement is described by a Markov chain, i.e., a sequence of trials in a 1-D network of unit cells, which contain a pore and a grain. Using combinatorial analysis, which utilizes the binomial coefficient, we derive the residence time distribution, i.e., an inventory of the discrete colloid travel times through the network and of their probabilities to occur. To parameterize the network model, we performed mechanistic pore-scale simulations in a single unit cell that determined the likelihoods and timescales associated with the above colloid mass transfer processes. We found that intergrain transport of colloids in the near-surface fluid domain can cause extended tailing, which has traditionally been attributed to hydrodynamic dispersion emanating from flow tortuosity of solute trajectories.
Directory of Open Access Journals (Sweden)
Germán Lobos
2008-12-01
Full Text Available The main objective of this research was to identify the determining factors of the use of public instruments to manage risk in the Chilean wine industry. A binomial logistic regression model was proposed. Based on a survey of 104 viticulture and winemaking companies, a database was constructed between January and October 2007. The model was fitted using maximum likelihood estimation. The variables that turned out to be statistically significant were: risk of the wine price, availability of external consultancy and number of permanent workers. From the Public Management point of view, the main conclusion suggests that the use of public instruments could be increased if viticulturists and winemakers had more external counseling.
International Nuclear Information System (INIS)
Ghosh, D.; Deb, A.; Haldar, P.K.; Sahoo, S.R.; Maity, D.
2004-01-01
This work studies the validity of the negative binomial distribution in the multiplicity distribution of charged secondaries in 16 O and 32 S interactions with AgBr at 60 GeV/c per nucleon and 200 GeV/c per nucleon, respectively. The validity of negative binomial distribution (NBD) is studied in different azimuthal phase spaces. It is observed that the data can be well parameterized in terms of the NBD law for different azimuthal phase spaces. (authors)
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Regression filter for signal resolution
International Nuclear Information System (INIS)
Matthes, W.
1975-01-01
The problem considered is that of resolving a measured pulse height spectrum of a material mixture, e.g. gamma ray spectrum, Raman spectrum, into a weighed sum of the spectra of the individual constituents. The model on which the analytical formulation is based is described. The problem reduces to that of a multiple linear regression. A stepwise linear regression procedure was constructed. The efficiency of this method was then tested by transforming the procedure in a computer programme which was used to unfold test spectra obtained by mixing some spectra, from a library of arbitrary chosen spectra, and adding a noise component. (U.K.)
Bayesian variable selection in regression
Energy Technology Data Exchange (ETDEWEB)
Mitchell, T.J.; Beauchamp, J.J.
1987-01-01
This paper is concerned with the selection of subsets of ''predictor'' variables in a linear regression model for the prediction of a ''dependent'' variable. We take a Bayesian approach and assign a probability distribution to the dependent variable through a specification of prior distributions for the unknown parameters in the regression model. The appropriate posterior probabilities are derived for each submodel and methods are proposed for evaluating the family of prior distributions. Examples are given that show the application of the Bayesian methodology. 23 refs., 3 figs.
A Matlab program for stepwise regression
Directory of Open Access Journals (Sweden)
Yanhong Qi
2016-03-01
Full Text Available The stepwise linear regression is a multi-variable regression for identifying statistically significant variables in the linear regression equation. In present study, we presented the Matlab program of stepwise regression.
Survival analysis II: Cox regression
Stel, Vianda S.; Dekker, Friedo W.; Tripepi, Giovanni; Zoccali, Carmine; Jager, Kitty J.
2011-01-01
In contrast to the Kaplan-Meier method, Cox proportional hazards regression can provide an effect estimate by quantifying the difference in survival between patient groups and can adjust for confounding effects of other variables. The purpose of this article is to explain the basic concepts of the
Regression of lumbar disk herniation
Directory of Open Access Journals (Sweden)
G. Yu Evzikov
2015-01-01
Full Text Available Compression of the spinal nerve root, giving rise to pain and sensory and motor disorders in the area of its innervation is the most vivid manifestation of herniated intervertebral disk. Different treatment modalities, including neurosurgery, for evolving these conditions are discussed. There has been recent evidence that spontaneous regression of disk herniation can regress. The paper describes a female patient with large lateralized disc extrusion that has caused compression of the nerve root S1, leading to obvious myotonic and radicular syndrome. Magnetic resonance imaging has shown that the clinical manifestations of discogenic radiculopathy, as well myotonic syndrome and morphological changes completely regressed 8 months later. The likely mechanism is inflammation-induced resorption of a large herniated disk fragment, which agrees with the data available in the literature. A decision to perform neurosurgery for which the patient had indications was made during her first consultation. After regression of discogenic radiculopathy, there was only moderate pain caused by musculoskeletal diseases (facet syndrome, piriformis syndrome that were successfully eliminated by minimally invasive techniques.
Ridge Regression for Interactive Models.
Tate, Richard L.
1988-01-01
An exploratory study of the value of ridge regression for interactive models is reported. Assuming that the linear terms in a simple interactive model are centered to eliminate non-essential multicollinearity, a variety of common models, representing both ordinal and disordinal interactions, are shown to have "orientations" that are…
Regression Models for Repairable Systems
Czech Academy of Sciences Publication Activity Database
Novák, Petr
2015-01-01
Roč. 17, č. 4 (2015), s. 963-972 ISSN 1387-5841 Institutional support: RVO:67985556 Keywords : Reliability analysis * Repair models * Regression Subject RIV: BB - Applied Statistics , Operational Research Impact factor: 0.782, year: 2015 http://library.utia.cas.cz/separaty/2015/SI/novak-0450902.pdf
Cactus: An Introduction to Regression
Hyde, Hartley
2008-01-01
When the author first used "VisiCalc," the author thought it a very useful tool when he had the formulas. But how could he design a spreadsheet if there was no known formula for the quantities he was trying to predict? A few months later, the author relates he learned to use multiple linear regression software and suddenly it all clicked into…
Linear regression and the normality assumption.
Schmidt, Amand F; Finan, Chris
2017-12-16
Researchers often perform arbitrary outcome transformations to fulfill the normality assumption of a linear regression model. This commentary explains and illustrates that in large data settings, such transformations are often unnecessary, and worse may bias model estimates. Linear regression assumptions are illustrated using simulated data and an empirical example on the relation between time since type 2 diabetes diagnosis and glycated hemoglobin levels. Simulation results were evaluated on coverage; i.e., the number of times the 95% confidence interval included the true slope coefficient. Although outcome transformations bias point estimates, violations of the normality assumption in linear regression analyses do not. The normality assumption is necessary to unbiasedly estimate standard errors, and hence confidence intervals and P-values. However, in large sample sizes (e.g., where the number of observations per variable is >10) violations of this normality assumption often do not noticeably impact results. Contrary to this, assumptions on, the parametric model, absence of extreme observations, homoscedasticity, and independency of the errors, remain influential even in large sample size settings. Given that modern healthcare research typically includes thousands of subjects focusing on the normality assumption is often unnecessary, does not guarantee valid results, and worse may bias estimates due to the practice of outcome transformations. Copyright © 2017 Elsevier Inc. All rights reserved.
Directory of Open Access Journals (Sweden)
Gisela Lundberg
2008-08-01
Full Text Available Amplification of the oncogene MYCN in double minutes (DMs is a common finding in neuroblastoma (NB. Because DMs lack centromeric sequences it has been unclear how NB cells retain and amplify extrachromosomal MYCN copies during tumour development.We show that MYCN-carrying DMs in NB cells translocate from the nuclear interior to the periphery of the condensing chromatin at transition from interphase to prophase and are preferentially located adjacent to the telomere repeat sequences of the chromosomes throughout cell division. However, DM segregation was not affected by disruption of the telosome nucleoprotein complex and DMs readily migrated from human to murine chromatin in human/mouse cell hybrids, indicating that they do not bind to specific positional elements in human chromosomes. Scoring DM copy-numbers in ana/telophase cells revealed that DM segregation could be closely approximated by a binomial random distribution. Colony-forming assay demonstrated a strong growth-advantage for NB cells with high DM (MYCN copy-numbers, compared to NB cells with lower copy-numbers. In fact, the overall distribution of DMs in growing NB cell populations could be readily reproduced by a mathematical model assuming binomial segregation at cell division combined with a proliferative advantage for cells with high DM copy-numbers.Binomial segregation at cell division explains the high degree of MYCN copy-number variability in NB. Our findings also provide a proof-of-principle for oncogene amplification through creation of genetic diversity by random events followed by Darwinian selection.
International Nuclear Information System (INIS)
Ghosh, D.; Mukhopadhyay, A.; Ghosh, A.; Roy, J.
1989-01-01
This letter presents new data on the multiplicity distribution of charged secondaries in 24 Mg interactions with AgBr at 4.5 GeV/c per nucleon. The validity of the negative binomial distribution (NBD) is studied. It is observed that the data can be well parametrized in terms of the NBD law for the whole phase space and also for different pseudo-rapidity bins. A comparison of different parameters with those in the case of h-h interactions reveals some interesting results, the implications of which are discussed. (orig.)
Directory of Open Access Journals (Sweden)
Robert Kurniawan
2017-11-01
Full Text Available AbstractThe incidence rates of DHF in Jakarta in 2010 to 2014 are always higher than that of the national rates. Therefore, this study aims to find the effect of weather parameter on DHF cases. Weather is chosen because it can be observed daily and can be predicted so that it can be used as earlydetection in estimating the number of DHF cases. Data use includes DHF cases which is collected daily and weather data including lowest and highest temperatures and rainfall. Data analysis used is zero-truncated negative binomial analysis at 10% significance level. Based on the periodic dataof selected variables from January 1st 2015 until May 31st 2015, the study revealed that weather factors consisting of highest temperature, lowest temperature, and rainfall rate were significant enough to predict the number of DHF patients in DKI Jakarta. The three variables had positiveeffects in influencing the number of DHF patients in the same period. However, the weather factors cannot be controlled by humans, so that appropriate preventions are required whenever weather’s predictions indicate the increasing number of DHF cases in DKI Jakarta.Keywords: Dengue Hemorrhagic Fever, zero truncated negative binomial, early warning.AbstrakAngka kesakitan DBD pada tahun 2010 hingga 2014 selalu lebih tinggi dibandingkan dengan angka kesakitan DBD nasional. Oleh karena itu, penelitian ini bertujuan untuk mencari pengaruh faktor cuaca terhadap kasus DBD. Faktor cuaca dipilih karena dapat diamati setiap harinya dan dapat diprediksi sehingga dapat dijadikan deteksi dini dalam perkiraan jumlah penderita DBD. Data yang digunakan dalam penelitian ini adalah data jumlah penderita DBD di DKI Jakarta per hari dan data cuaca yang meliputi suhu terendah, suhu tertinggi dan curah hujan. Untuk mengetahui pengaruh faktor cuaca tersebut terhadap jumlah penderita DBD di DKI Jakarta digunakan metode analisis zero-truncated negative binomial. Berdasarkan data periode 1 Januari 2015 hingga
Non-binomial distribution of palladium Kα Ln X-ray satellites emitted after excitation by 16O ions
International Nuclear Information System (INIS)
Rymuza, P.; Sujkowski, Z.; Carlen, M.; Dousse, J.C.; Gasser, M.; Kern, J.; Perny, B.; Rheme, C.
1988-02-01
The palladium K α L n X-ray satellites spectrum emitted after excitation by 5.4 MeV/u 16 O ions has been measured. The distribution of the satellites yield is found to be significantly narrower than the binomial one. The deviations can be accounted for by assuming that the L-shell ionization is due to two uncorrelated processes: the direct ionization by impact and the electron capture to the K-shell of the projectile. 12 refs., 1 fig., 1 tab. (author)
Directory of Open Access Journals (Sweden)
Paweł Mielcarz
2007-06-01
Full Text Available The article presents a case study of valuation of real options included in a investment project. The main goal of the article is to present the calculation and methodological issues of application the methodology for real option valuation. In order to do it there are used the binomial model and Market Asset Declaimer methodology. The project presented in the article concerns the introduction of radio station to a new market. It includes two valuable real options: to abandon the project and to expand.
Quantile Regression With Measurement Error
Wei, Ying
2009-08-27
Regression quantiles can be substantially biased when the covariates are measured with error. In this paper we propose a new method that produces consistent linear quantile estimation in the presence of covariate measurement error. The method corrects the measurement error induced bias by constructing joint estimating equations that simultaneously hold for all the quantile levels. An iterative EM-type estimation algorithm to obtain the solutions to such joint estimation equations is provided. The finite sample performance of the proposed method is investigated in a simulation study, and compared to the standard regression calibration approach. Finally, we apply our methodology to part of the National Collaborative Perinatal Project growth data, a longitudinal study with an unusual measurement error structure. © 2009 American Statistical Association.
Active set support vector regression.
Musicant, David R; Feinberg, Alexander
2004-03-01
This paper presents active set support vector regression (ASVR), a new active set strategy to solve a straightforward reformulation of the standard support vector regression problem. This new algorithm is based on the successful ASVM algorithm for classification problems, and consists of solving a finite number of linear equations with a typically large dimensionality equal to the number of points to be approximated. However, by making use of the Sherman-Morrison-Woodbury formula, a much smaller matrix of the order of the original input space is inverted at each step. The algorithm requires no specialized quadratic or linear programming code, but merely a linear equation solver which is publicly available. ASVR is extremely fast, produces comparable generalization error to other popular algorithms, and is available on the web for download.
Producing The New Regressive Left
DEFF Research Database (Denmark)
Crone, Christine
This thesis is the first comprehensive research work conducted on the Beirut based TV station, an important representative of the post-2011 generation of Arab satellite news media. The launch of al-Mayadeen in June 2012 was closely linked to the political developments across the Arab world...... members, this thesis investigates a growing political trend and ideological discourse in the Arab world that I have called The New Regressive Left. On the premise that a media outlet can function as a forum for ideology production, the thesis argues that an analysis of this material can help to trace...... the contexture of The New Regressive Left. If the first part of the thesis lays out the theoretical approach and draws the contextual framework, through an exploration of the surrounding Arab media-and ideoscapes, the second part is an analytical investigation of the discourse that permeates the programmes aired...
AUTISTIC EPILEPTIFORM REGRESSION (A REVIEW
Directory of Open Access Journals (Sweden)
L. Yu. Glukhova
2012-01-01
Full Text Available The author represents the review of current scientific literature devoted to autistic epileptiform regression — the special form of autistic disorder, characterized by development of severe communicative disorders in children as a result of continuous prolonged epileptiform activity on EEG. This condition has been described by R.F. Tuchman and I. Rapin in 1997. The author describes the aspects of pathogenesis, clinical pictures and diagnostics of this disorder, including the peculiar anomalies on EEG (benign epileptiform patterns of childhood, with a high index of epileptiform activity, especially in the sleep. The especial attention is given to approaches to the treatment of autistic epileptiform regression. Efficacy of valproates, corticosteroid hormones and antiepileptic drugs of other groups is considered.
Polynomial Regressions and Nonsense Inference
Directory of Open Access Journals (Sweden)
Daniel Ventosa-Santaulària
2013-11-01
Full Text Available Polynomial specifications are widely used, not only in applied economics, but also in epidemiology, physics, political analysis and psychology, just to mention a few examples. In many cases, the data employed to estimate such specifications are time series that may exhibit stochastic nonstationary behavior. We extend Phillips’ results (Phillips, P. Understanding spurious regressions in econometrics. J. Econom. 1986, 33, 311–340. by proving that an inference drawn from polynomial specifications, under stochastic nonstationarity, is misleading unless the variables cointegrate. We use a generalized polynomial specification as a vehicle to study its asymptotic and finite-sample properties. Our results, therefore, lead to a call to be cautious whenever practitioners estimate polynomial regressions.
Spontaneous regression of colon cancer.
Kihara, Kyoichi; Fujita, Shin; Ohshiro, Taihei; Yamamoto, Seiichiro; Sekine, Shigeki
2015-01-01
A case of spontaneous regression of transverse colon cancer is reported. A 64-year-old man was diagnosed as having cancer of the transverse colon at a local hospital. Initial and second colonoscopy examinations revealed a typical cancer of the transverse colon, which was diagnosed as moderately differentiated adenocarcinoma. The patient underwent right hemicolectomy 6 weeks after the initial colonoscopy. The resected specimen showed only a scar at the tumor site, and no cancerous tissue was proven histologically. The patient is alive with no evidence of recurrence 1 year after surgery. Although an antitumor immune response is the most likely explanation, the exact nature of the phenomenon was unclear. We describe this rare case and review the literature pertaining to spontaneous regression of colorectal cancer. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Directional quantile regression in R
Czech Academy of Sciences Publication Activity Database
Boček, Pavel; Šiman, Miroslav
2017-01-01
Roč. 53, č. 3 (2017), s. 480-492 ISSN 0023-5954 R&D Projects: GA ČR GA14-07234S Institutional support: RVO:67985556 Keywords : multivariate quantile * regression quantile * halfspace depth * depth contour Subject RIV: BD - Theory of Information OBOR OECD: Applied mathematics Impact factor: 0.379, year: 2016 http://library.utia.cas.cz/separaty/2017/SI/bocek-0476587.pdf
Jakaitiene, Audrone; Avino, Mariano; Guarracino, Mario Rosario
2017-04-01
Against diminishing costs, next-generation sequencing (NGS) still remains expensive for studies with a large number of individuals. As cost saving, sequencing genome of pools containing multiple samples might be used. Currently, there are many software available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and data analyzed, indicating that all software have space for improvement. We use beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with ability being specific up to two patients affected by neuromuscular disorders (NMD). We assessed the results comparing with The Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying beta-binomial model is accurate in predicting rare mutations at 0.01 fraction. Finally, we explored the concordance of mutations between the model and software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
On Weighted Support Vector Regression
DEFF Research Database (Denmark)
Han, Xixuan; Clemmensen, Line Katrine Harder
2014-01-01
We propose a new type of weighted support vector regression (SVR), motivated by modeling local dependencies in time and space in prediction of house prices. The classic weights of the weighted SVR are added to the slack variables in the objective function (OF‐weights). This procedure directly...... the differences and similarities of the two types of weights by demonstrating the connection between the Least Absolute Shrinkage and Selection Operator (LASSO) and the SVR. We show that an SVR problem can be transformed to a LASSO problem plus a linear constraint and a box constraint. We demonstrate...
Directory of Open Access Journals (Sweden)
Goovaerts Pierre
2011-12-01
Full Text Available Abstract Background Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Methods Time series (1981-2007 of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. Results State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated
Goovaerts, Pierre; Xiao, Hong
2011-12-05
Although prostate cancer-related incidence and mortality have declined recently, striking racial/ethnic differences persist in the United States. Visualizing and modelling temporal trends of prostate cancer late-stage incidence, and how they vary according to geographic locations and race, should help explaining such disparities. Joinpoint regression is increasingly used to identify the timing and extent of changes in time series of health outcomes. Yet, most analyses of temporal trends are aspatial and conducted at the national level or for a single cancer registry. Time series (1981-2007) of annual proportions of prostate cancer late-stage cases were analyzed for non-Hispanic Whites and non-Hispanic Blacks in each county of Florida. Noise in the data was first filtered by binomial kriging and results were modelled using joinpoint regression. A similar analysis was also conducted at the state level and for groups of metropolitan and non-metropolitan counties. Significant racial differences were detected using tests of parallelism and coincidence of time trends. A new disparity statistic was introduced to measure spatial and temporal changes in the frequency of racial disparities. State-level percentage of late-stage diagnosis decreased 50% since 1981; a decline that accelerated in the 90's when Prostate Specific Antigen (PSA) screening was introduced. Analysis at the metropolitan and non-metropolitan levels revealed that the frequency of late-stage diagnosis increased recently in urban areas, and this trend was significant for white males. The annual rate of decrease in late-stage diagnosis and the onset years for significant declines varied greatly among counties and racial groups. Most counties with non-significant average annual percent change (AAPC) were located in the Florida Panhandle for white males, whereas they clustered in South-eastern Florida for black males. The new disparity statistic indicated that the spatial extent of racial disparities reached a
Variable Selection in ROC Regression
Directory of Open Access Journals (Sweden)
Binhuan Wang
2013-01-01
Full Text Available Regression models are introduced into the receiver operating characteristic (ROC analysis to accommodate effects of covariates, such as genes. If many covariates are available, the variable selection issue arises. The traditional induced methodology separately models outcomes of diseased and nondiseased groups; thus, separate application of variable selections to two models will bring barriers in interpretation, due to differences in selected models. Furthermore, in the ROC regression, the accuracy of area under the curve (AUC should be the focus instead of aiming at the consistency of model selection or the good prediction performance. In this paper, we obtain one single objective function with the group SCAD to select grouped variables, which adapts to popular criteria of model selection, and propose a two-stage framework to apply the focused information criterion (FIC. Some asymptotic properties of the proposed methods are derived. Simulation studies show that the grouped variable selection is superior to separate model selections. Furthermore, the FIC improves the accuracy of the estimated AUC compared with other criteria.
Directory of Open Access Journals (Sweden)
Taliha KELEŞ
2016-12-01
Full Text Available Regression analysis is a statistical technique for investigating and modeling the relationship between variables. The purpose of this study was the trivial presentation of the equation for orthogonal regression (OR and the comparison of classical linear regression (CLR and OR techniques with respect to the sum of squared perpendicular distances. For that purpose, the analyses were shown by an example. It was found that the sum of squared perpendicular distances of OR is smaller. Thus, it was seen that OR line has appeared to present a much better fit for the data than CLR line. Depending on those results, the OR is thought to be a regression technique to obtain more accurate results than CLR at simple linear regression studies.
Isotonic Regression under Lipschitz Constraint
Wilbur, W.J.
2018-01-01
The pool adjacent violators (PAV) algorithm is an efficient technique for the class of isotonic regression problems with complete ordering. The algorithm yields a stepwise isotonic estimate which approximates the function and assigns maximum likelihood to the data. However, if one has reasons to believe that the data were generated by a continuous function, a smoother estimate may provide a better approximation to that function. In this paper, we consider the formulation which assumes that the data were generated by a continuous monotonic function obeying the Lipschitz condition. We propose a new algorithm, the Lipschitz pool adjacent violators (LPAV) algorithm, which approximates that function; we prove the convergence of the algorithm and examine its complexity. PMID:29456266
Lumbar herniated disc: spontaneous regression.
Altun, Idiris; Yüksel, Kasım Zafer
2017-01-01
Low back pain is a frequent condition that results in substantial disability and causes admission of patients to neurosurgery clinics. To evaluate and present the therapeutic outcomes in lumbar disc hernia (LDH) patients treated by means of a conservative approach, consisting of bed rest and medical therapy. This retrospective cohort was carried out in the neurosurgery departments of hospitals in Kahramanmaraş city and 23 patients diagnosed with LDH at the levels of L3-L4, L4-L5 or L5-S1 were enrolled. The average age was 38.4 ± 8.0 and the chief complaint was low back pain and sciatica radiating to one or both lower extremities. Conservative treatment was administered. Neurological examination findings, durations of treatment and intervals until symptomatic recovery were recorded. Laségue tests and neurosensory examination revealed that mild neurological deficits existed in 16 of our patients. Previously, 5 patients had received physiotherapy and 7 patients had been on medical treatment. The number of patients with LDH at the level of L3-L4, L4-L5, and L5-S1 were 1, 13, and 9, respectively. All patients reported that they had benefit from medical treatment and bed rest, and radiologic improvement was observed simultaneously on MRI scans. The average duration until symptomatic recovery and/or regression of LDH symptoms was 13.6 ± 5.4 months (range: 5-22). It should be kept in mind that lumbar disc hernias could regress with medical treatment and rest without surgery, and there should be an awareness that these patients could recover radiologically. This condition must be taken into account during decision making for surgical intervention in LDH patients devoid of indications for emergent surgery.
Directory of Open Access Journals (Sweden)
Patricio Peña-Rehbein
2012-03-01
Full Text Available Nematodes of the genus Anisakis have marine fishes as intermediate hosts. One of these hosts is Thyrsites atun, an important fishery resource in Chile between 38 and 41° S. This paper describes the frequency and number of Anisakis nematodes in the internal organs of Thyrsites atun. An analysis based on spatial distribution models showed that the parasites tend to be clustered. The variation in the number of parasites per host could be described by the negative binomial distribution. The maximum observed number of parasites was nine parasites per host. The environmental and zoonotic aspects of the study are also discussed.Nematóides do gênero Anisakis têm nos peixes marinhos seus hospedeiros intermediários. Um desses hospedeiros é Thyrsites atun, um importante recurso pesqueiro no Chile entre 38 e 41° S. Este artigo descreve a freqüência e o número de nematóides Anisakis nos órgãos internos de Thyrsites atun. Uma análise baseada em modelos de distribuição espacial demonstrou que os parasitos tendem a ficar agrupados. A variação numérica de parasitas por hospedeiro pôde ser descrita por distribuição binomial negativa. O número máximo observado de parasitas por hospedeiro foi nove. Os aspectos ambientais e zoonóticos desse estudo também serão discutidos.
Shirazi, Mohammadali; Dhavala, Soma Sekhar; Lord, Dominique; Geedipally, Srinivas Reddy
2017-10-01
Safety analysts usually use post-modeling methods, such as the Goodness-of-Fit statistics or the Likelihood Ratio Test, to decide between two or more competitive distributions or models. Such metrics require all competitive distributions to be fitted to the data before any comparisons can be accomplished. Given the continuous growth in introducing new statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, in addition to all theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuitions into why a specific distribution (or model) is preferred over another (Goodness-of-Logic). This paper ponders into these issues by proposing a methodology to design heuristics for Model Selection based on the characteristics of data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools: (1) Monte-Carlo Simulations and (2) Machine Learning Classifiers, to design easy heuristics to predict the label of the 'most-likely-true' distribution for analyzing data. The proposed methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two distributions, given a set of prescribed summary statistics of data. The proposed heuristics were successfully compared against classical tests for several real or observed datasets. Not only they are easy to use and do not need any post-modeling inputs, but also, using these heuristics, the analyst can attain useful information about why the NB-L is preferred over the NB - or vice versa- when modeling data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Examples of mixed-effects modeling with crossed random effects and with binomial data
Quené, H.; van den Bergh, H.
2008-01-01
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not
Tang, Yongqiang
2017-05-25
We derive the sample size formulae for comparing two negative binomial rates based on both the relative and absolute rate difference metrics in noninferiority and equivalence trials with unequal follow-up times, and establish an approximate relationship between the sample sizes required for the treatment comparison based on the two treatment effect metrics. The proposed method allows the dispersion parameter to vary by treatment groups. The accuracy of these methods is assessed by simulations. It is demonstrated that ignoring the between-subject variation in the follow-up time by setting the follow-up time for all individuals to be the mean follow-up time may greatly underestimate the required size, resulting in underpowered studies. Methods are provided for back-calculating the dispersion parameter based on the published summary results.
Is “Hit and Run” a Single Word? The Processing of Irreversible Binomials in Neglect Dyslexia
Arcara, Giorgio; Lacaita, Graziano; Mattaloni, Elisa; Passarini, Laura; Mondini, Sara; Benincà, Paola; Semenza, Carlo
2012-01-01
The present study is the first neuropsychological investigation into the problem of the mental representation and processing of irreversible binomials (IBs), i.e., word pairs linked by a conjunction (e.g., “hit and run,” “dead or alive”). In order to test their lexical status, the phenomenon of neglect dyslexia is explored. People with left-sided neglect dyslexia show a clear lexical effect: they can read IBs better (i.e., by dropping the leftmost words less frequently) when their components are presented in their correct order. This may be taken as an indication that they treat these constructions as lexical, not decomposable, elements. This finding therefore constitutes strong evidence that IBs tend to be stored in the mental lexicon as a whole and that this whole form is preferably addressed in the retrieval process. PMID:22347199
Silva, Ivair R
2018-01-15
Type I error probability spending functions are commonly used for designing sequential analysis of binomial data in clinical trials, but it is also quickly emerging for near-continuous sequential analysis of post-market drug and vaccine safety surveillance. It is well known that, for clinical trials, when the null hypothesis is not rejected, it is still important to minimize the sample size. Unlike in post-market drug and vaccine safety surveillance, that is not important. In post-market safety surveillance, specially when the surveillance involves identification of potential signals, the meaningful statistical performance measure to be minimized is the expected sample size when the null hypothesis is rejected. The present paper shows that, instead of the convex Type I error spending shape conventionally used in clinical trials, a concave shape is more indicated for post-market drug and vaccine safety surveillance. This is shown for both, continuous and group sequential analysis. Copyright © 2017 John Wiley & Sons, Ltd.
Using Ridge Regression with Non-Cognitive Variables by Race in Admissions.
Tracey, Terrence J.; Sedlacek, William E.
1984-01-01
A study of the effectiveness of ridge regression over ordinary least squares regression as applied to both cognitive and noncognitive admissions data is reported. Separate race equations and a general equation were used. The analysis used did not improve on existing regression analyses. (MSE)
What do you do when the binomial cannot value real options? The LSM model
Directory of Open Access Journals (Sweden)
S. Alonso
2014-12-01
Full Text Available The Least-Squares Monte Carlo model (LSM model has emerged as the derivative valuation technique with the greatest impact in current practice. As with other options valuation models, the LSM algorithm was initially posited in the field of financial derivatives and its extension to the realm of real options requires considering certain questions which might hinder understanding of the algorithm and which the present paper seeks to address. The implementation of the LSM model combines Monte Carlo simulation, dynamic programming and statistical regression in a flexible procedure suitable for application to valuing nearly all types of corporate investments. The goal of this paper is to show how the LSM algorithm is applied in the context of a corporate investment, thus contributing to the understanding of the principles of its operation.
Insulin resistance: regression and clustering.
Yoon, Sangho; Assimes, Themistocles L; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A
2014-01-01
In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be.
Insulin resistance: regression and clustering.
Directory of Open Access Journals (Sweden)
Sangho Yoon
Full Text Available In this paper we try to define insulin resistance (IR precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ, a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT. We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with "main effects" is not satisfactory, but prediction that includes interactions may be.
Regularized Label Relaxation Linear Regression.
Fang, Xiaozhao; Xu, Yong; Li, Xuelong; Lai, Zhihui; Wong, Wai Keung; Fang, Bingwu
2018-04-01
Linear regression (LR) and some of its variants have been widely used for classification problems. Most of these methods assume that during the learning phase, the training samples can be exactly transformed into a strict binary label matrix, which has too little freedom to fit the labels adequately. To address this problem, in this paper, we propose a novel regularized label relaxation LR method, which has the following notable characteristics. First, the proposed method relaxes the strict binary label matrix into a slack variable matrix by introducing a nonnegative label relaxation matrix into LR, which provides more freedom to fit the labels and simultaneously enlarges the margins between different classes as much as possible. Second, the proposed method constructs the class compactness graph based on manifold learning and uses it as the regularization item to avoid the problem of overfitting. The class compactness graph is used to ensure that the samples sharing the same labels can be kept close after they are transformed. Two different algorithms, which are, respectively, based on -norm and -norm loss functions are devised. These two algorithms have compact closed-form solutions in each iteration so that they are easily implemented. Extensive experiments show that these two algorithms outperform the state-of-the-art algorithms in terms of the classification accuracy and running time.
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance
Nathans, Laura L.; Oswald, Frederick L.; Nimon, Kim
2012-01-01
Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights, often resulting in very limited interpretations of variable importance. It appears that few researchers employ other methods to obtain a fuller understanding of what…
Hierarchical Logistic Regression: Accounting for Multilevel Data in DIF Detection
French, Brian F.; Finch, W. Holmes
2010-01-01
The purpose of this study was to examine the performance of differential item functioning (DIF) assessment in the presence of a multilevel structure that often underlies data from large-scale testing programs. Analyses were conducted using logistic regression (LR), a popular, flexible, and effective tool for DIF detection. Data were simulated…
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in details. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
Estimating the exceedance probability of rain rate by logistic regression
Chiu, Long S.; Kedem, Benjamin
1990-01-01
Recent studies have shown that the fraction of an area with rain intensity above a fixed threshold is highly correlated with the area-averaged rain rate. To estimate the fractional rainy area, a logistic regression model, which estimates the conditional probability that rain rate over an area exceeds a fixed threshold given the values of related covariates, is developed. The problem of dependency in the data in the estimation procedure is bypassed by the method of partial likelihood. Analyses of simulated scanning multichannel microwave radiometer and observed electrically scanning microwave radiometer data during the Global Atlantic Tropical Experiment period show that the use of logistic regression in pixel classification is superior to multiple regression in predicting whether rain rate at each pixel exceeds a given threshold, even in the presence of noisy data. The potential of the logistic regression technique in satellite rain rate estimation is discussed.
Use of probabilistic weights to enhance linear regression myoelectric control.
Smith, Lauren H; Kuiken, Todd A; Hargrove, Levi J
2015-12-01
Clinically available prostheses for transradial amputees do not allow simultaneous myoelectric control of degrees of freedom (DOFs). Linear regression methods can provide simultaneous myoelectric control, but frequently also result in difficulty with isolating individual DOFs when desired. This study evaluated the potential of using probabilistic estimates of categories of gross prosthesis movement, which are commonly used in classification-based myoelectric control, to enhance linear regression myoelectric control. Gaussian models were fit to electromyogram (EMG) feature distributions for three movement classes at each DOF (no movement, or movement in either direction) and used to weight the output of linear regression models by the probability that the user intended the movement. Eight able-bodied and two transradial amputee subjects worked in a virtual Fitts' law task to evaluate differences in controllability between linear regression and probability-weighted regression for an intramuscular EMG-based three-DOF wrist and hand system. Real-time and offline analyses in able-bodied subjects demonstrated that probability weighting improved performance during single-DOF tasks (p linear regression control. Use of probability weights can improve the ability to isolate individual during linear regression myoelectric control, while maintaining the ability to simultaneously control multiple DOFs.
Hadayeghi, Alireza; Shalaby, Amer S; Persaud, Bhagwant N
2010-03-01
A common technique used for the calibration of collision prediction models is the Generalized Linear Modeling (GLM) procedure with the assumption of Negative Binomial or Poisson error distribution. In this technique, fixed coefficients that represent the average relationship between the dependent variable and each explanatory variable are estimated. However, the stationary relationship assumed may hide some important spatial factors of the number of collisions at a particular traffic analysis zone. Consequently, the accuracy of such models for explaining the relationship between the dependent variable and the explanatory variables may be suspected since collision frequency is likely influenced by many spatially defined factors such as land use, demographic characteristics, and traffic volume patterns. The primary objective of this study is to investigate the spatial variations in the relationship between the number of zonal collisions and potential transportation planning predictors, using the Geographically Weighted Poisson Regression modeling technique. The secondary objective is to build on knowledge comparing the accuracy of Geographically Weighted Poisson Regression models to that of Generalized Linear Models. The results show that the Geographically Weighted Poisson Regression models are useful for capturing spatially dependent relationships and generally perform better than the conventional Generalized Linear Models. Copyright 2009 Elsevier Ltd. All rights reserved.
Accounting for Zero Inflation of Mussel Parasite Counts Using Discrete Regression Models
Directory of Open Access Journals (Sweden)
Emel Çankaya
2017-06-01
Full Text Available In many ecological applications, the absences of species are inevitable due to either detection faults in samples or uninhabitable conditions for their existence, resulting in high number of zero counts or abundance. Usual practice for modelling such data is regression modelling of log(abundance+1 and it is well know that resulting model is inadequate for prediction purposes. New discrete models accounting for zero abundances, namely zero-inflated regression (ZIP and ZINB, Hurdle-Poisson (HP and Hurdle-Negative Binomial (HNB amongst others are widely preferred to the classical regression models. Due to the fact that mussels are one of the economically most important aquatic products of Turkey, the purpose of this study is therefore to examine the performances of these four models in determination of the significant biotic and abiotic factors on the occurrences of Nematopsis legeri parasite harming the existence of Mediterranean mussels (Mytilus galloprovincialis L.. The data collected from the three coastal regions of Sinop city in Turkey showed more than 50% of parasite counts on the average are zero-valued and model comparisons were based on information criterion. The results showed that the probability of the occurrence of this parasite is here best formulated by ZINB or HNB models and influential factors of models were found to be correspondent with ecological differences of the regions.
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces all indices of multicollinearity diagnoses, the basic principle of principal component regression and determination of 'best' equation method. The paper uses an example to describe how to do principal component regression analysis with SPSS 10.0: including all calculating processes of the principal component regression and all operations of linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. The principal component regression analysis can be used to overcome disturbance of the multicollinearity. The simplified, speeded up and accurate statistical effect is reached through the principal component regression analysis with SPSS.
Gaussian process regression analysis for functional data
Shi, Jian Qing
2011-01-01
Gaussian Process Regression Analysis for Functional Data presents nonparametric statistical methods for functional regression analysis, specifically the methods based on a Gaussian process prior in a functional space. The authors focus on problems involving functional response variables and mixed covariates of functional and scalar variables.Covering the basics of Gaussian process regression, the first several chapters discuss functional data analysis, theoretical aspects based on the asymptotic properties of Gaussian process regression models, and new methodological developments for high dime
Unbalanced Regressions and the Predictive Equation
DEFF Research Database (Denmark)
Osterrieder, Daniela; Ventosa-Santaulària, Daniel; Vera-Valdés, J. Eduardo
Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness in the theoreti......Predictive return regressions with persistent regressors are typically plagued by (asymptotically) biased/inconsistent estimates of the slope, non-standard or potentially even spurious statistical inference, and regression unbalancedness. We alleviate the problem of unbalancedness...
Semiparametric regression during 2003–2007
Ruppert, David
2009-01-01
Semiparametric regression is a fusion between parametric regression and nonparametric regression that integrates low-rank penalized splines, mixed model and hierarchical Bayesian methodology – thus allowing more streamlined handling of longitudinal and spatial correlation. We review progress in the field over the five-year period between 2003 and 2007. We find semiparametric regression to be a vibrant field with substantial involvement and activity, continual enhancement and widespread application.
New ridge parameters for ridge regression
Dorugade, A.V.
2014-01-01
Hoerl and Kennard (1970a) introduced the ridge regression estimator as an alternative to the ordinary least squares (OLS) estimator in the presence of multicollinearity. In ridge regression, ridge parameter plays an important role in parameter estimation. In this article, a new method for estimating ridge parameters in both situations of ordinary ridge regression (ORR) and generalized ridge regression (GRR) is proposed. The simulation study evaluates the performance of the proposed estimator ...
A Seemingly Unrelated Poisson Regression Model
King, Gary
1989-01-01
This article introduces a new estimator for the analysis of two contemporaneously correlated endogenous event count variables. This seemingly unrelated Poisson regression model (SUPREME) estimator combines the efficiencies created by single equation Poisson regression model estimators and insights from "seemingly unrelated" linear regression models.
Standards for Standardized Logistic Regression Coefficients
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
An Interactive Approach to Ridge Regression
Marquette, J. F.; Dufala, M. M.
1978-01-01
Ridge regression is an approach to ameliorating the problem of large standard errors of regression estimates when predictor variables are highly intercorrelated. An interactive computer program is presented which allows for investigation of the effects of using various ridge regression adjustment values. (JKS)
Mohebbi, Mohammadreza; Wolfe, Rory; Forbes, Andrew
2014-01-01
This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when overdispersion and residual spatial autocorrelation are present. These considerations lead to nine regression models derived from using three probability distributions for count data: Poisson, generalised Poisson and negative binomial, and three different autocorrelation structures. We employ the framework of Bayesian variable selection and a Gibbs sampling based technique to identify significant cancer risk factors. The framework deals with situations where the number of possible models based on different combinations of candidate explanatory variables is large enough such that calculation of posterior probabilities for all models is difficult or infeasible. The evidence from applying the modelling methodology suggests that modelling strategies based on the use of generalised Poisson and negative binomial with spatial autocorrelation work well and provide a robust basis for inference. PMID:24413702
Energy Technology Data Exchange (ETDEWEB)
Hernandez I, S.; Ortiz C, E.; Chavez M, C., E-mail: lunitza@gmail.com [UNAM, Facultad de Ingenieria, Circuito Interior, Ciudad Universitaria, 04510 Mexico D. F. (Mexico)
2011-11-15
At the present time, is an unquestionable fact that the nuclear electrical energy is a topic of vital importance, no more because eliminates the dependence of the hydrocarbons and is friendly with the environment, but because is also a sure and reliable energy source, and represents a viable alternative before the claims in the growing demand of electricity in Mexico. Before this panorama, was intended several scenarios to elevate the capacity of electric generation of nuclear origin with a variable participation. One of the contemplated scenarios is represented by the expansion project of the nuclear power plant Laguna Verde through the addition of a third reactor that serves as detonator of an integral program that proposes the installation of more nuclear reactors in the country. Before this possible scenario, the Federal Commission of Electricity like responsible organism of supplying energy to the population should have tools that offer it the flexibility to be adapted to the possible changes that will be presented along the project and also gives a value to the risk to future. The methodology denominated Real Options, Binomial model was proposed as an evaluation tool that allows to quantify the value of the expansion proposal, demonstrating the feasibility of the project through a periodic visualization of their evolution, all with the objective of supplying a financial analysis that serves as base and justification before the evident apogee of the nuclear energy that will be presented in future years. (Author)
Comparing parametric and nonparametric regression methods for panel data
DEFF Research Database (Denmark)
Czekaj, Tomasz Gerard; Henningsen, Arne
We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb-Douglas and......We investigate and compare the suitability of parametric and non-parametric stochastic regression methods for analysing production technologies and the optimal firm size. Our theoretical analysis shows that the most commonly used functional forms in empirical production analysis, Cobb......-Douglas and Translog, are unsuitable for analysing the optimal firm size. We show that the Translog functional form implies an implausible linear relationship between the (logarithmic) firm size and the elasticity of scale, where the slope is artificially related to the substitutability between the inputs....... The practical applicability of the parametric and non-parametric regression methods is scrutinised and compared by an empirical example: we analyse the production technology and investigate the optimal size of Polish crop farms based on a firm-level balanced panel data set. A nonparametric specification test...
The purpose of this report is to provide a reference manual that could be used by investigators for making informed use of logistic regression using two methods (standard logistic regression and MARS). The details for analyses of relationships between a dependent binary response ...
Regression with Sparse Approximations of Data
DEFF Research Database (Denmark)
Noorzad, Pardis; Sturm, Bob L.
2012-01-01
We propose sparse approximation weighted regression (SPARROW), a method for local estimation of the regression function that uses sparse approximation with a dictionary of measurements. SPARROW estimates the regression function at a point with a linear combination of a few regressands selected...... by a sparse approximation of the point in terms of the regressors. We show SPARROW can be considered a variant of \\(k\\)-nearest neighbors regression (\\(k\\)-NNR), and more generally, local polynomial kernel regression. Unlike \\(k\\)-NNR, however, SPARROW can adapt the number of regressors to use based...
Assumptions of Multiple Regression: Correcting Two Misconceptions
Directory of Open Access Journals (Sweden)
Matt N. Williams
2013-09-01
Full Text Available In 2002, an article entitled - Four assumptions of multiple regression that researchers should always test- by.Osborne and Waters was published in PARE. This article has gone on to be viewed more than 275,000 times.(as of August 2013, and it is one of the first results displayed in a Google search for - regression.assumptions- . While Osborne and Waters' efforts in raising awareness of the need to check assumptions.when using regression are laudable, we note that the original article contained at least two fairly important.misconceptions about the assumptions of multiple regression: Firstly, that multiple regression requires the.assumption of normally distributed variables; and secondly, that measurement errors necessarily cause.underestimation of simple regression coefficients. In this article, we clarify that multiple regression models.estimated using ordinary least squares require the assumption of normally distributed errors in order for.trustworthy inferences, at least in small samples, but not the assumption of normally distributed response or.predictor variables. Secondly, we point out that regression coefficients in simple regression models will be.biased (toward zero estimates of the relationships between variables of interest when measurement error is.uncorrelated across those variables, but that when correlated measurement error is present, regression.coefficients may be either upwardly or downwardly biased. We conclude with a brief corrected summary of.the assumptions of multiple regression when using ordinary least squares.
Xiao, Yongling; Abrahamowicz, Michal
2010-03-30
We propose two bootstrap-based methods to correct the standard errors (SEs) from Cox's model for within-cluster correlation of right-censored event times. The cluster-bootstrap method resamples, with replacement, only the clusters, whereas the two-step bootstrap method resamples (i) the clusters, and (ii) individuals within each selected cluster, with replacement. In simulations, we evaluate both methods and compare them with the existing robust variance estimator and the shared gamma frailty model, which are available in statistical software packages. We simulate clustered event time data, with latent cluster-level random effects, which are ignored in the conventional Cox's model. For cluster-level covariates, both proposed bootstrap methods yield accurate SEs, and type I error rates, and acceptable coverage rates, regardless of the true random effects distribution, and avoid serious variance under-estimation by conventional Cox-based standard errors. However, the two-step bootstrap method over-estimates the variance for individual-level covariates. We also apply the proposed bootstrap methods to obtain confidence bands around flexible estimates of time-dependent effects in a real-life analysis of cluster event times.
Gilstrap, Donald L.
2013-01-01
In addition to qualitative methods presented in chaos and complexity theories in educational research, this article addresses quantitative methods that may show potential for future research studies. Although much in the social and behavioral sciences literature has focused on computer simulations, this article explores current chaos and…
Scott, Neil W.; Fayers, Peter M.; Aaronson, Neil K.; Bottomley, Andrew; de Graeff, Alexander; Groenvold, Mogens; Gundy, Chad; Koller, Michael; Petersen, Morten A.; Sprangers, Mirjam A. G.
2010-01-01
Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews issues that arise
Scott, N.W.; Fayers, P.M.; Aaronson, N.K.; Bottomley, A.; de Graeff, A.; Groenvold, M.; Gundy, C.; Koller, M.; Petersen, M.A.; Sprangers, M.A.G.
2010-01-01
ABSTRACT: BACKGROUND: Differential item functioning (DIF) methods can be used to determine whether different subgroups respond differently to particular items within a health-related quality of life (HRQoL) subscale, after allowing for overall subgroup differences in that scale. This article reviews
Analyses of developmental rate isomorphy in ectotherms: Introducing the dirichlet regression
Czech Academy of Sciences Publication Activity Database
Boukal S., David; Ditrich, Tomáš; Kutcherov, D.; Sroka, Pavel; Dudová, Pavla; Papáček, M.
2015-01-01
Roč. 10, č. 6 (2015), e0129341 E-ISSN 1932-6203 R&D Projects: GA ČR GAP505/10/0096 Grant - others:European Fund(CZ) PERG04-GA-2008-239543; GA JU(CZ) 145/2013/P Institutional support: RVO:60077344 Keywords : ectotherms Subject RIV: ED - Physiology Impact factor: 3.057, year: 2015 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0129341