Microbial comparative pan-genomics using binomial mixture models
Directory of Open Access Journals (Sweden)
Ussery David W
2009-08-01
Background: The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. Results: We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families) in Buchnera aphidicola to large (around 43000 gene families) in Escherichia coli. Results for Escherichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely occurring genes in the population. Conclusion: Analyzing pan-genomics data with binomial mixture models is a way to handle dependencies between genomes, which we find are always present. A bottleneck in the estimation procedure is the annotation of rarely occurring genes.
Microbial comparative pan-genomics using binomial mixture models
DEFF Research Database (Denmark)
Ussery, David; Snipen, L; Almøy, T
2009-01-01
The size of the core- and pan-genome of bacterial species is a topic of increasing interest due to the growing number of sequenced prokaryote genomes, many from the same species. Attempts to estimate these quantities have been made, using regression methods or mixture models. We extend the latter approach by using statistical ideas developed for capture-recapture problems in ecology and epidemiology. RESULTS: We estimate core- and pan-genome sizes for 16 different bacterial species. The results reveal a complex dependency structure for most species, manifested as heterogeneous detection probabilities. Estimated pan-genome sizes range from small (around 2600 gene families) in Buchnera aphidicola to large (around 43000 gene families) in Escherichia coli. Results for Escherichia coli show that as more data become available, a larger diversity is estimated, indicating an extensive pool of rarely...
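As a rough illustration of the capture-recapture idea described in this abstract, the sketch below treats the number of genomes in which a gene family is detected as a draw from a binomial mixture, then inflates the observed family count by the probability of a family never being detected. It is only a sketch under assumptions: the function names, the fixed mixture weights and detection probabilities, and the toy numbers are hypothetical, and a real analysis would estimate the mixture parameters from data rather than fix them.

```python
from math import comb


def binom_pmf(y, g, p):
    """Probability that a gene family is seen in y of g genomes."""
    return comb(g, y) * p**y * (1 - p) ** (g - y)


def mixture_pmf(y, g, weights, probs):
    """Binomial mixture: each component has its own detection probability."""
    return sum(w * binom_pmf(y, g, p) for w, p in zip(weights, probs))


def estimate_pan_size(n_observed, g, weights, probs):
    """Scale the observed family count by 1 / P(detected at least once).

    Families seen in zero genomes are never observed, so the observed
    count is divided by the probability of being seen in >= 1 genome.
    """
    p_unseen = mixture_pmf(0, g, weights, probs)
    return n_observed / (1.0 - p_unseen)


# Hypothetical toy input: 2 genomes, one component with p = 0.5.
print(estimate_pan_size(75, 2, [1.0], [0.5]))
```

With two genomes and a single component with detection probability 0.5, a family is missed with probability 0.25, so 75 observed families correspond to an estimated 100 in total.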
O'Donnell, Katherine M; Thompson, Frank R; Semlitsch, Raymond D
2015-01-01
Detectability of individual animals is highly variable and nearly always < 1; imperfect detection must be accounted for to reliably estimate population sizes and trends. Hierarchical models can simultaneously estimate abundance and effective detection probability, but there are several different mechanisms that cause variation in detectability. Neglecting temporary emigration can lead to biased population estimates because availability and conditional detection probability are confounded. In this study, we extend previous hierarchical binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model's potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3-5 surveys each spring and fall 2010-2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and protocols that maximize species availability and conditional detection probability to increase population parameter estimate reliability.
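To make the "standard binomial mixture model" mentioned in this abstract concrete, the sketch below evaluates the likelihood of repeated counts at one site when the latent abundance is Poisson and each count is a binomial draw from it. This is the basic model only, not the authors' extension with temporary emigration; the function names and the truncation bound n_max are assumptions made for illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(n, lam):
    """Probability of latent abundance n under a Poisson(lam) state process."""
    return exp(-lam) * lam**n / factorial(n)


def site_likelihood(counts, lam, p, n_max=100):
    """Marginal likelihood of repeated counts at one site.

    The unknown abundance N is summed out from max(counts) up to n_max;
    each count is Binomial(N, p), with p the per-individual detection
    probability.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        obs = 1.0
        for y in counts:
            obs *= comb(n, y) * p**y * (1 - p) ** (n - y)
        total += poisson_pmf(n, lam) * obs
    return total


# Hypothetical counts from three visits to one site.
print(site_likelihood([3, 1, 2], lam=4.0, p=0.5))
```

Multiplying such site likelihoods across sites gives the full likelihood, which can then be maximized over lam and p. The truncation n_max needs to be large enough that the Poisson tail beyond it is negligible.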
Directory of Open Access Journals (Sweden)
Katherine M O'Donnell
Detectability of individual animals is highly variable and nearly always < 1; imperfect detection must be accounted for to reliably estimate population sizes and trends. Hierarchical models can simultaneously estimate abundance and effective detection probability, but there are several different mechanisms that cause variation in detectability. Neglecting temporary emigration can lead to biased population estimates because availability and conditional detection probability are confounded. In this study, we extend previous hierarchical binomial mixture models to account for multiple sources of variation in detectability. The state process of the hierarchical model describes ecological mechanisms that generate spatial and temporal patterns in abundance, while the observation model accounts for the imperfect nature of counting individuals due to temporary emigration and false absences. We illustrate our model's potential advantages, including the allowance of temporary emigration between sampling periods, with a case study of southern red-backed salamanders Plethodon serratus. We fit our model and a standard binomial mixture model to counts of terrestrial salamanders surveyed at 40 sites during 3-5 surveys each spring and fall 2010-2012. Our models generated similar parameter estimates to standard binomial mixture models. Aspect was the best predictor of salamander abundance in our case study; abundance increased as aspect became more northeasterly. Increased time-since-rainfall strongly decreased salamander surface activity (i.e. availability for sampling), while higher amounts of woody cover objects and rocks increased conditional detection probability (i.e. probability of capture, given an animal is exposed to sampling). By explicitly accounting for both components of detectability, we increased congruence between our statistical modeling and our ecological understanding of the system. We stress the importance of choosing survey locations and
Chain binomial models and binomial autoregressive processes.
Weiss, Christian H; Pollett, Philip K
2012-09-01
We establish a connection between a class of chain-binomial models of use in ecology and epidemiology and binomial autoregressive (AR) processes. New results are obtained for the latter, including expressions for the lag-conditional distribution and related quantities. We focus on two types of chain-binomial model, extinction-colonization and colonization-extinction models, and present two approaches to parameter estimation. The asymptotic distributions of the resulting estimators are studied, as well as their finite-sample performance, and we give an application to real data. A connection is made with standard AR models, which also has implications for parameter estimation. © 2011, The International Biometric Society.
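The link between chain-binomial models and binomial AR processes can be illustrated with the generic binomial AR(1) recursion, in which the next count is the sum of two binomial draws: occupied sites that persist and empty sites that become occupied. The sketch below simulates that recursion. It is an assumption-laden illustration rather than either of the paper's two specific models, and the names n_sites, alpha and beta are hypothetical.

```python
import random


def binomial_draw(n, p, rng):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_binomial_ar1(n_sites, alpha, beta, x0, steps, seed=0):
    """Simulate X_t = Bin(X_{t-1}, alpha) + Bin(n_sites - X_{t-1}, beta)."""
    rng = random.Random(seed)
    path = [x0]
    for _ in range(steps):
        x = path[-1]
        persisting = binomial_draw(x, alpha, rng)          # occupied sites that stay occupied
        colonised = binomial_draw(n_sites - x, beta, rng)  # empty sites that become occupied
        path.append(persisting + colonised)
    return path


def stationary_mean(n_sites, alpha, beta):
    """Long-run mean occupancy of the recursion above."""
    return n_sites * beta / (1 - alpha + beta)


path = simulate_binomial_ar1(n_sites=20, alpha=0.7, beta=0.2, x0=10, steps=50)
print(path[:10], stationary_mean(20, 0.7, 0.2))
```

The stationary mean follows from setting E[X_t] = E[X_{t-1}] in the recursion. With alpha = 0.7 and beta = 0.2 on 20 sites it is 8 occupied sites.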
Comparison: Binomial model and Black Scholes model
Directory of Open Access Journals (Sweden)
Amir Ahmad Dar
2018-03-01
The binomial model and the Black-Scholes model are popular methods for solving option pricing problems. The binomial model is a simple statistical method, whereas the Black-Scholes model requires the solution of a stochastic differential equation. Pricing European call and put options is a difficult task faced by actuaries. The main goal of this study is to compare the binomial model and the Black-Scholes model at one period using two statistical methods: the t-test and the Tukey method. The results showed no significant difference between the mean European option prices obtained from the two models.
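The comparison described in this abstract can be reproduced in miniature by pricing the same European call with both models. The sketch below uses the Cox-Ross-Rubinstein parameterization of the binomial tree and the closed-form Black-Scholes formula. The function names and the market inputs are hypothetical, and the study's own data and statistical tests are not reproduced here.

```python
from math import erf, exp, log, sqrt


def crr_call(s0, k, r, sigma, t, n):
    """European call price from an n-step Cox-Ross-Rubinstein binomial tree."""
    dt = t / n
    u = exp(sigma * sqrt(dt))          # up factor
    d = 1.0 / u                        # down factor
    q = (exp(r * dt) - d) / (u - d)    # risk-neutral up probability
    disc = exp(-r * dt)
    # Payoffs at maturity, indexed by the number of up moves j.
    values = [max(s0 * u**j * d ** (n - j) - k, 0.0) for j in range(n + 1)]
    # Roll back through the tree one step at a time.
    for step in range(n, 0, -1):
        values = [disc * (q * values[j + 1] + (1 - q) * values[j])
                  for j in range(step)]
    return values[0]


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Closed-form Black-Scholes price of a European call."""
    d1 = (log(s0 / k) + (r + 0.5 * sigma**2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return s0 * norm_cdf(d1) - k * exp(-r * t) * norm_cdf(d2)


# Hypothetical inputs: spot 100, strike 100, 5% rate, 20% volatility, 1 year.
print(crr_call(100, 100, 0.05, 0.2, 1.0, 500))
print(black_scholes_call(100, 100, 0.05, 0.2, 1.0))
```

As the number of tree steps grows, the binomial price approaches the Black-Scholes price, which is consistent with the abstract's finding of no significant difference between the two.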
Buckland, Steeves; Cole, Nik C; Aguirre-Gutiérrez, Jesús; Gallagher, Laura E; Henshaw, Sion M; Besnard, Aurélien; Tucker, Rachel M; Bachraz, Vishnu; Ruhomaun, Kevin; Harris, Stephen
2014-01-01
The invasion of the giant Madagascar day gecko Phelsuma grandis has increased the threats to the four endemic Mauritian day geckos (Phelsuma spp.) that have survived on mainland Mauritius. We had two main aims: (i) to predict the spatial distribution and overlap of P. grandis and the endemic geckos at a landscape level; and (ii) to investigate the effects of P. grandis on the abundance and risks of extinction of the endemic geckos at a local scale. An ensemble forecasting approach was used to predict the spatial distribution and overlap of P. grandis and the endemic geckos. We used hierarchical binomial mixture models and repeated visual estimate surveys to calculate the abundance of the endemic geckos in sites with and without P. grandis. The predicted range of each species varied from 85 km2 to 376 km2. Sixty percent of the predicted range of P. grandis overlapped with the combined predicted ranges of the four endemic geckos; 15% of the combined predicted ranges of the four endemic geckos overlapped with P. grandis. Levin's niche breadth varied from 0.140 to 0.652 between P. grandis and the four endemic geckos. The abundance of endemic geckos was 89% lower in sites with P. grandis compared to sites without P. grandis, and the endemic geckos had been extirpated at four of ten sites we surveyed with P. grandis. Species Distribution Modelling, together with the breadth metrics, predicted that P. grandis can partly share the equivalent niche with endemic species and survive in a range of environmental conditions. We provide strong evidence that smaller endemic geckos are unlikely to survive in sympatry with P. grandis. This is a cause of concern in both Mauritius and other countries with endemic species of Phelsuma.
Log-binomial models: exploring failed convergence.
Williamson, Tyler; Eliasziw, Misha; Fick, Gordon Hilton
2013-12-13
Relative risk is a summary metric that is commonly used in epidemiological investigations. Increasingly, epidemiologists are using log-binomial models to study the impact of a set of predictor variables on a single binary outcome, as they naturally offer relative risks. However, standard statistical software may report failed convergence when attempting to fit log-binomial models in certain settings. The methods that have been proposed in the literature for dealing with failed convergence use approximate solutions to avoid the issue. This research looks directly at the log-likelihood function for the simplest log-binomial model where failed convergence has been observed, a model with a single linear predictor with three levels. The possible causes of failed convergence are explored and potential solutions are presented for some cases. Among the principal causes is a failure of the fitting algorithm to converge despite the log-likelihood function having a single finite maximum. Despite these limitations, log-binomial models are a viable option for epidemiologists wishing to describe the relationship between a set of predictors and a binary outcome where relative risk is the desired summary measure. Epidemiologists are encouraged to continue to use log-binomial models and advocate for improvements to the fitting algorithms to promote the widespread use of log-binomial models.
Binomial test models and item difficulty
van der Linden, Willem J.
1979-01-01
In choosing a binomial test model, it is important to know exactly what conditions are imposed on item difficulty. In this paper these conditions are examined for both a deterministic and a stochastic conception of item responses. It appears that they are more restrictive than is generally
The Validation of a Beta-Binomial Model for Overdispersed Binomial Data.
Kim, Jongphil; Lee, Ji-Hyun
2017-01-01
The beta-binomial model has been widely used as an analytically tractable alternative that captures the overdispersion of an intra-correlated, binomial random variable, X . However, the model validation for X has been rarely investigated. As a beta-binomial mass function takes on a few different shapes, the model validation is examined for each of the classified shapes in this paper. Further, the mean square error (MSE) is illustrated for each shape by the maximum likelihood estimator (MLE) based on a beta-binomial model approach and the method of moments estimator (MME) in order to gauge when and how much the MLE is biased.
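To show what "overdispersion" means for the beta-binomial model discussed here, the sketch below computes the beta-binomial mass function and compares its variance with that of a plain binomial with the same mean. The shape parameters a and b and the function names are assumptions chosen for illustration; the paper's validation procedure and its MLE and MME comparison are not reproduced.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(x, n, a, b):
    """P(X = x) when X | p ~ Binomial(n, p) and p ~ Beta(a, b)."""
    log_comb = lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
    return exp(log_comb + log_beta(x + a, n - x + b) - log_beta(a, b))


def beta_binomial_variance(n, a, b):
    """Variance of the beta-binomial distribution."""
    return n * a * b * (a + b + n) / ((a + b) ** 2 * (a + b + 1))


def binomial_variance(n, p):
    """Variance of a plain binomial with success probability p."""
    return n * p * (1 - p)


# Hypothetical example: n = 10 trials, Beta(1, 1) mixing, so the mean p is 0.5.
print(beta_binomial_variance(10, 1, 1), binomial_variance(10, 0.5))
```

With a = b = 1 the mass function is flat over 0 to 10 and the variance is 10, four times the binomial variance of 2.5 at the same mean.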
Longitudinal beta-binomial modeling using GEE for overdispersed binomial data.
Wu, Hongqian; Zhang, Ying; Long, Jeffrey D
2017-03-15
Longitudinal binomial data are frequently generated from multiple questionnaires and assessments in various scientific settings for which the binomial data are often overdispersed. The standard generalized linear mixed effects model may result in severe underestimation of standard errors of estimated regression parameters in such cases and hence potentially bias the statistical inference. In this paper, we propose a longitudinal beta-binomial model for overdispersed binomial data and estimate the regression parameters under a probit model using the generalized estimating equation method. A hybrid algorithm of the Fisher scoring and the method of moments is implemented for computing the method. Extensive simulation studies are conducted to justify the validity of the proposed method. Finally, the proposed method is applied to analyze functional impairment in subjects who are at risk of Huntington disease from a multisite observational study of prodromal Huntington disease. Copyright © 2016 John Wiley & Sons, Ltd.
Using the Binomial Model to Price Employee Stock Options (Penggunaan Model Binomial Pada Penentuan Harga Opsi Saham Karyawan)
Directory of Open Access Journals (Sweden)
Dara Puspita Anggraeni
2015-11-01
Binomial Model for Valuing Employee Stock Options. Employee stock options (ESOs) differ from standard exchange-traded options. A valuation model for employee stock options must account for three main differences: the vesting period, the exit rate, and non-transferability. This thesis discusses a model for valuing employee stock options, implemented with a generalized binomial model.
Correlated binomial models and correlation structures
International Nuclear Information System (INIS)
Hisakado, Masato; Kitsukawa, Kenji; Mori, Shintaro
2006-01-01
We discuss a general method to construct correlated binomial distributions by imposing several consistent relations on the joint probability function. We obtain self-consistency relations for the conditional correlations and conditional probabilities. The beta-binomial distribution is derived by a strong symmetric assumption on the conditional correlations. Our derivation clarifies the 'correlation' structure of the beta-binomial distribution. It is also possible to study the correlation structures of other probability distributions of exchangeable (homogeneous) correlated Bernoulli random variables. We study some distribution functions and discuss their behaviours in terms of their correlation structures
Directory of Open Access Journals (Sweden)
Xavier A. Harrison
2015-07-01
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability.
Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial
Harrison, Xavier A
2015-01-01
Overdispersion is a common feature of models of biological data, but researchers often fail to model the excess variation driving the overdispersion, resulting in biased parameter estimates and standard errors. Quantifying and modeling overdispersion when it is present is therefore critical for robust biological inference. One means to account for overdispersion is to add an observation-level random effect (OLRE) to a model, where each data point receives a unique level of a random effect that can absorb the extra-parametric variation in the data. Although some studies have investigated the utility of OLRE to model overdispersion in Poisson count data, studies doing so for Binomial proportion data are scarce. Here I use a simulation approach to investigate the ability of both OLRE models and Beta-Binomial models to recover unbiased parameter estimates in mixed effects models of Binomial data under various degrees of overdispersion. In addition, as ecologists often fit random intercept terms to models when the random effect sample size is low (<5 levels), I investigate the performance of both model types under a range of random effect sample sizes when overdispersion is present. Simulation results revealed that the efficacy of OLRE depends on the process that generated the overdispersion; OLRE failed to cope with overdispersion generated from a Beta-Binomial mixture model, leading to biased slope and intercept estimates, but performed well for overdispersion generated by adding random noise to the linear predictor. Comparison of parameter estimates from an OLRE model with those from its corresponding Beta-Binomial model readily identified when OLRE were performing poorly due to disagreement between effect sizes, and this strategy should be employed whenever OLRE are used for Binomial data to assess their reliability. Beta-Binomial models performed well across all contexts, but showed a tendency to underestimate effect sizes when modelling non-Beta-Binomial data. Finally, both OLRE and Beta-Binomial models performed
Amaliana, Luthfatul; Sa'adah, Umu; Wayan Surya Wardhani, Ni
2017-12-01
Tetanus Neonatorum is an infectious disease that can be prevented by immunization. Through 2015, the number of Tetanus Neonatorum cases in East Java Province was the highest in Indonesia. Tetanus Neonatorum data exhibit overdispersion and a large proportion of zeros (zero-inflation). Negative Binomial (NB) regression is an alternative method when overdispersion occurs in Poisson regression. However, data with both overdispersion and zero-inflation are more appropriately analyzed using Zero-Inflated Negative Binomial (ZINB) regression. The purposes of this study are: (1) to model Tetanus Neonatorum cases in East Java Province, with a 71.05 percent proportion of zeros, using NB and ZINB regression, and (2) to obtain the best model. The results indicate that ZINB regression is better than NB regression, with a smaller AIC.
[Using log-binomial model for estimating the prevalence ratio].
Ye, Rong; Gao, Yan-hui; Yang, Yi; Chen, Yue
2010-05-01
To estimate the prevalence ratios, using a log-binomial model with or without continuous covariates. Prevalence ratios for individuals' attitude towards smoking-ban legislation associated with smoking status, estimated by using a log-binomial model were compared with odds ratios estimated by logistic regression model. In the log-binomial modeling, maximum likelihood method was used when there were no continuous covariates and COPY approach was used if the model did not converge, for example due to the existence of continuous covariates. We examined the association between individuals' attitude towards smoking-ban legislation and smoking status in men and women. Prevalence ratio and odds ratio estimation provided similar results for the association in women since smoking was not common. In men however, the odds ratio estimates were markedly larger than the prevalence ratios due to a higher prevalence of outcome. The log-binomial model did not converge when age was included as a continuous covariate and COPY method was used to deal with the situation. All analysis was performed by SAS. Prevalence ratio seemed to better measure the association than odds ratio when prevalence is high. SAS programs were provided to calculate the prevalence ratios with or without continuous covariates in the log-binomial regression analysis.
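The abstract's point that odds ratios overstate prevalence ratios when the outcome is common can be checked with simple 2x2 arithmetic. The sketch below computes both measures from cell counts. The function names and the counts are hypothetical and are not taken from the smoking-ban study.

```python
def prevalence_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Prevalence in the exposed group divided by prevalence in the unexposed."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


def odds_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Odds of the outcome in the exposed divided by odds in the unexposed."""
    exp_odds = exp_cases / (exp_total - exp_cases)
    unexp_odds = unexp_cases / (unexp_total - unexp_cases)
    return exp_odds / unexp_odds


# Hypothetical common outcome: 60/100 exposed vs 30/100 unexposed.
print(prevalence_ratio(60, 100, 30, 100), odds_ratio(60, 100, 30, 100))

# Hypothetical rare outcome: 2/100 exposed vs 1/100 unexposed.
print(prevalence_ratio(2, 100, 1, 100), odds_ratio(2, 100, 1, 100))
```

In both toy tables the prevalence ratio is 2. The odds ratio is 3.5 for the common outcome but only about 2.02 for the rare one, which mirrors the contrast the abstract reports between men and women.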
Analysis of hypoglycemic events using negative binomial models.
Luo, Junxiang; Qu, Yongming
2013-01-01
Negative binomial regression is a standard model to analyze hypoglycemic events in diabetes clinical trials. Adjusting for baseline covariates could potentially increase the estimation efficiency of negative binomial regression. However, adjusting for covariates raises concerns about model misspecification, in which the negative binomial regression is not robust because of its requirement for strong model assumptions. In some literature, it was suggested to correct the standard error of the maximum likelihood estimator through introducing overdispersion, which can be estimated by the Deviance or Pearson Chi-square. We proposed to conduct the negative binomial regression using Sandwich estimation to calculate the covariance matrix of the parameter estimates together with Pearson overdispersion correction (denoted by NBSP). In this research, we compared several commonly used negative binomial model options with our proposed NBSP. Simulations and real data analyses showed that NBSP is the most robust to model misspecification, and the estimation efficiency will be improved by adjusting for baseline hypoglycemia. Copyright © 2013 John Wiley & Sons, Ltd.
Speech-discrimination scores modeled as a binomial variable.
Thornton, A R; Raffin, M J
1978-09-01
Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
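The "basic mathematical properties inherent in the use of percentage scores" can be illustrated by treating each word as an independent Bernoulli trial, so that the standard deviation of a percentage score depends only on the true score and the list length. The sketch below is a minimal illustration under that assumption. The function name is hypothetical, and the study's own table of significant deviations is not reproduced.

```python
from math import sqrt


def score_sd_percent(true_proportion, n_words):
    """Binomial standard deviation of a percent-correct score.

    Assumes each of the n_words items is an independent trial with the
    same probability of a correct response.
    """
    return 100.0 * sqrt(true_proportion * (1.0 - true_proportion) / n_words)


# Compare 25-word and 50-word lists at a true score of 50% and of 90%.
for p in (0.5, 0.9):
    print(p, score_sd_percent(p, 25), score_sd_percent(p, 50))
```

At a true score of 50%, a 25-word list has a standard deviation of 10 percentage points and a 50-word list about 7.1. Variability shrinks toward the extremes of the scale, which is why a single fixed margin for "significant" differences between scores does not hold across the range.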
Quantum Theory for the Binomial Model in Finance Theory
Chen, Zeqian
2001-01-01
In this paper, a quantum model for the binomial market in finance is proposed. We show that its risk-neutral world exhibits an intriguing structure as a disk in the unit ball of $\mathbf{R}^3$, whose radius is a function of the risk-free interest rate with two thresholds which prevent arbitrage opportunities from this quantum market. Furthermore, from the quantum mechanical point of view we re-deduce the Cox-Ross-Rubinstein binomial option pricing formula by considering Maxwell-Boltzmann statist...
Selecting Tools to Model Integer and Binomial Multiplication
Pratt, Sarah Smitherman; Eddy, Colleen M.
2017-01-01
Mathematics teachers frequently provide concrete manipulatives to students during instruction; however, the rationale for using certain manipulatives in conjunction with concepts may not be explored. This article focuses on area models that are currently used in classrooms to provide concrete examples of integer and binomial multiplication. The…
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
S. Mori; K. Kitsukawa; M. Hisakado
2006-01-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution h...
Data analysis using the Binomial Failure Rate common cause model
International Nuclear Information System (INIS)
Atwood, C.L.
1983-09-01
This report explains how to use the Binomial Failure Rate (BFR) method to estimate common cause failure rates. The entire method is described, beginning with the conceptual model, and covering practical issues of data preparation, treatment of variation in the failure rates, Bayesian estimation of the quantities of interest, checking the model assumptions for lack of fit to the data, and the ultimate application of the answers
Standardized binomial models for risk or prevalence ratios and differences.
Richardson, David B; Kinlaw, Alan C; MacLehose, Richard F; Cole, Stephen R
2015-10-01
Epidemiologists often analyse binary outcomes in cohort and cross-sectional studies using multivariable logistic regression models, yielding estimates of adjusted odds ratios. It is widely known that the odds ratio closely approximates the risk or prevalence ratio when the outcome is rare, and it does not do so when the outcome is common. Consequently, investigators may decide to directly estimate the risk or prevalence ratio using a log binomial regression model. We describe the use of a marginal structural binomial regression model to estimate standardized risk or prevalence ratios and differences. We illustrate the proposed approach using data from a cohort study of coronary heart disease status in Evans County, Georgia, USA. The approach reduces problems with model convergence typical of log binomial regression by shifting all explanatory variables except the exposures of primary interest from the linear predictor of the outcome regression model to a model for the standardization weights. The approach also facilitates evaluation of departures from additivity in the joint effects of two exposures. Epidemiologists should consider reporting standardized risk or prevalence ratios and differences in cohort and cross-sectional studies. These are readily-obtained using the SAS, Stata and R statistical software packages. The proposed approach estimates the exposure effect in the total population. © The Author 2015; all rights reserved. Published by Oxford University Press on behalf of the International Epidemiological Association.
Correlation Structures of Correlated Binomial Models and Implied Default Distribution
Mori, Shintaro; Kitsukawa, Kenji; Hisakado, Masato
2008-11-01
We show how to analyze and interpret the correlation structures, the conditional expectation values and correlation coefficients of exchangeable Bernoulli random variables. We study implied default distributions for the iTraxx-CJ tranches and some popular probabilistic models, including the Gaussian copula model, Beta binomial distribution model and long-range Ising model. We interpret the differences in their profiles in terms of the correlation structures. The implied default distribution has singular correlation structures, reflecting the credit market implications. We point out two possible origins of the singular behavior.
Negative binomial models for abundance estimation of multiple closed populations
Boyce, Mark S.; MacKenzie, Darry I.; Manly, Bryan F.J.; Haroldson, Mark A.; Moody, David W.
2001-01-01
Counts of uniquely identified individuals in a population offer opportunities to estimate abundance. However, for various reasons such counts may be burdened by heterogeneity in the probability of being detected. Theoretical arguments and empirical evidence demonstrate that the negative binomial distribution (NBD) is a useful characterization for counts from biological populations with heterogeneity. We propose a method that focuses on estimating multiple populations by simultaneously using a suite of models derived from the NBD. We used this approach to estimate the number of female grizzly bears (Ursus arctos) with cubs-of-the-year in the Yellowstone ecosystem, for each year, 1986-1998. Akaike's Information Criteria (AIC) indicated that a negative binomial model with a constant level of heterogeneity across all years was best for characterizing the sighting frequencies of female grizzly bears. A lack-of-fit test indicated the model adequately described the collected data. Bootstrap techniques were used to estimate standard errors and 95% confidence intervals. We provide a Monte Carlo technique, which confirms that the Yellowstone ecosystem grizzly bear population increased during the period 1986-1998.
Estimation Parameters And Modelling Zero Inflated Negative Binomial
Directory of Open Access Journals (Sweden)
Cindy Cahyaning Astuti
2016-11-01
Regression analysis is used to determine the relationship between one or several response variables (Y) and one or several predictor variables (X). A regression model relating predictor variables to a Poisson-distributed response variable is called a Poisson regression model. Poisson regression is commonly used to analyze count data, but because it requires the mean and variance to be equal, it is not appropriate for overdispersed data (variance higher than the mean). Count data also often contain a large proportion of zero values in the response variable (zero inflation), and Poisson regression cannot handle this excess of zeros. An alternative model that is more suitable for overdispersed data and can handle excess zeros in the response variable is the Zero-Inflated Negative Binomial (ZINB) model. In this research, ZINB is applied to cases of Tetanus Neonatorum in East Java. The aims of this research are to examine the likelihood function, to form an algorithm for estimating the parameters of ZINB, and to apply the ZINB model to cases of Tetanus Neonatorum in East Java. The Maximum Likelihood Estimation (MLE) method is used to estimate the ZINB parameters, and the likelihood function is maximized using the Expectation Maximization (EM) algorithm. Tests of the ZINB regression model showed that the predictor variables with a significant partial effect in the negative binomial model are the percentage of pregnant women's visits and the percentage assisted by maternal health personnel, while the predictor variable with a significant partial effect in the zero-inflation model is the percentage of neonatal visits.
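The ZINB model described above mixes a point mass at zero with a negative binomial count distribution. The sketch below writes out that mass function and the log-likelihood that MLE would maximize. It uses the mean and dispersion parameterization of the negative binomial and omits covariates and the EM steps; the parameter names pi, mu and k and the toy counts are assumptions for illustration.

```python
from math import exp, lgamma, log


def nb_pmf(y, mu, k):
    """Negative binomial mass function with mean mu and dispersion k."""
    log_p = (lgamma(y + k) - lgamma(k) - lgamma(y + 1)
             + k * log(k / (k + mu)) + y * log(mu / (k + mu)))
    return exp(log_p)


def zinb_pmf(y, pi, mu, k):
    """Zero-inflated NB: an extra zero with probability pi, otherwise NB."""
    nb = nb_pmf(y, mu, k)
    if y == 0:
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def zinb_log_likelihood(counts, pi, mu, k):
    """Log-likelihood of independent counts under a common ZINB."""
    return sum(log(zinb_pmf(y, pi, mu, k)) for y in counts)


# Hypothetical counts with many zeros.
counts = [0, 0, 0, 0, 0, 0, 0, 1, 2, 5]
print(zinb_pmf(0, 0.5, 2.0, 1.0), zinb_log_likelihood(counts, 0.5, 2.0, 1.0))
```

In a regression setting, mu and pi would each be linked to predictors (commonly through log and logit links), which is where the two sets of significant variables in the abstract come from.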
Low reheating temperatures in monomial and binomial inflationary models
International Nuclear Information System (INIS)
Rehagen, Thomas; Gelmini, Graciela B.
2015-01-01
We investigate the allowed range of reheating temperature values in light of the Planck 2015 results and the recent joint analysis of Cosmic Microwave Background (CMB) data from the BICEP2/Keck Array and Planck experiments, using monomial and binomial inflationary potentials. While the well studied $\phi^2$ inflationary potential is no longer favored by current CMB data, as well as $\phi^p$ with $p>2$, a $\phi^1$ potential and canonical reheating ($w_{\rm re}=0$) provide a good fit to the CMB measurements. In this last case, we find that the Planck 2015 68% confidence limit upper bound on the spectral index, $n_s$, implies an upper bound on the reheating temperature of $T_{\rm re} \lesssim 6\times10^{10}$ GeV, and excludes instantaneous reheating. The low reheating temperatures allowed by this model open the possibility that dark matter could be produced during the reheating period instead of when the Universe is radiation dominated, which could lead to very different predictions for the relic density and momentum distribution of WIMPs, sterile neutrinos, and axions. We also study binomial inflationary potentials and show the effects of a small departure from a $\phi^1$ potential. We find that as a subdominant $\phi^2$ term in the potential increases, first instantaneous reheating becomes allowed, and then the lowest possible reheating temperature of $T_{\rm re} = 4$ MeV is excluded by the Planck 2015 68% confidence limit.
Negative binomial mixed models for analyzing microbiome count data.
Zhang, Xinyan; Mallick, Himel; Tang, Zaixiang; Zhang, Lei; Cui, Xiangqin; Benson, Andrew K; Yi, Nengjun
2017-01-03
Recent advances in next-generation sequencing (NGS) technology enable researchers to collect a large volume of metagenomic sequencing data. These data provide valuable resources for investigating interactions between the microbiome and host environmental/clinical factors. In addition to the well-known properties of microbiome count measurements, for example, varied total sequence reads across samples, over-dispersion and zero-inflation, microbiome studies usually collect samples with hierarchical structures, which introduce correlation among the samples and thus further complicate the analysis and interpretation of microbiome count data. In this article, we propose negative binomial mixed models (NBMMs) for detecting the association between the microbiome and host environmental/clinical factors for correlated microbiome count data. Although they do not deal with zero-inflation, the proposed mixed-effects models account for correlation among the samples by incorporating random effects into the commonly used fixed-effects negative binomial model, and can efficiently handle over-dispersion and varying total reads. We have developed a flexible and efficient IWLS (Iterative Weighted Least Squares) algorithm to fit the proposed NBMMs by taking advantage of the standard procedure for fitting linear mixed models. We evaluate and demonstrate the proposed method via extensive simulation studies and an application to mouse gut microbiome data. The results show that the proposed method has desirable properties and outperforms the previously used methods in terms of both empirical power and Type I error. The method has been incorporated into the freely available R package BhGLM ( http://www.ssg.uab.edu/bhglm/ and http://github.com/abbyyan3/BhGLM ), providing a useful tool for analyzing microbiome data.
Measured PET Data Characterization with the Negative Binomial Distribution Model.
Santarelli, Maria Filomena; Positano, Vincenzo; Landini, Luigi
2017-01-01
Accurate statistical model of PET measurements is a prerequisite for a correct image reconstruction when using statistical image reconstruction algorithms, or when pre-filtering operations must be performed. Although radioactive decay follows a Poisson distribution, deviation from Poisson statistics occurs on projection data prior to reconstruction due to physical effects, measurement errors, correction of scatter and random coincidences. Modelling projection data can aid in understanding the statistical nature of the data in order to develop efficient processing methods and to reduce noise. This paper outlines the statistical behaviour of measured emission data evaluating the goodness of fit of the negative binomial (NB) distribution model to PET data for a wide range of emission activity values. An NB distribution model is characterized by the mean of the data and the dispersion parameter α that describes the deviation from Poisson statistics. Monte Carlo simulations were performed to evaluate: (a) the performances of the dispersion parameter α estimator, (b) the goodness of fit of the NB model for a wide range of activity values. We focused on the effect produced by correction for random and scatter events in the projection (sinogram) domain, due to their importance in quantitative analysis of PET data. The analysis developed herein allowed us to assess the accuracy of the NB distribution model to fit corrected sinogram data, and to evaluate the sensitivity of the dispersion parameter α to quantify deviation from Poisson statistics. By the sinogram ROI-based analysis, it was demonstrated that deviation on the measured data from Poisson statistics can be quantitatively characterized by the dispersion parameter α, in any noise conditions and corrections.
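How a dispersion parameter α quantifies deviation from Poisson statistics can be sketched with a simple moment-based estimator. This assumes the common parameterization variance = mean + α · mean², which the abstract does not state explicitly, and it is not necessarily the estimator evaluated in the paper; the function name is hypothetical.

```python
def nb_dispersion_moments(counts):
    """Moment estimate of alpha, assuming var = mean + alpha * mean**2.

    Returns 0.0 when the sample is not overdispersed (Poisson-like or tighter).
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / (n - 1)  # sample variance
    return max(0.0, (var - mean) / mean ** 2)
```

An estimate near zero indicates Poisson-like data; larger values indicate stronger overdispersion.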
Meta-analysis of studies with bivariate binary outcomes: a marginal beta-binomial model approach.
Chen, Yong; Hong, Chuan; Ning, Yang; Su, Xiao
2016-01-15
When conducting a meta-analysis of studies with bivariate binary outcomes, challenges arise when the within-study correlation and between-study heterogeneity should be taken into account. In this paper, we propose a marginal beta-binomial model for the meta-analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as bivariate generalized linear mixed model (Chu and Cole, 2006) and Sarmanov beta-binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having closed-form expression of likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta-binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study-specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta-binomial model with the bivariate generalized linear mixed model and the Sarmanov beta-binomial model by simulation studies. Interestingly, the results show that the marginal beta-binomial model performs better than the Sarmanov beta-binomial model, whether or not the true model is Sarmanov beta-binomial, and the marginal beta-binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta-analyses of diagnostic accuracy studies and a meta-analysis of case-control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.
An efficient binomial model-based measure for sequence comparison and its application.
Liu, Xiaoqing; Dai, Qi; Li, Lihua; He, Zerong
2011-04-01
Sequence comparison is one of the major tasks in bioinformatics, which could serve as evidence of structural and functional conservation, as well as of evolutionary relations. There are several similarity/dissimilarity measures for sequence comparison, but challenges remain. This paper presents a binomial model-based measure to analyze biological sequences. With the help of a random indicator, the occurrence of a word at any position of a sequence can be regarded as a random Bernoulli variable, and the distribution of the sum of the word occurrences is well known to be binomial. By using a recursive formula, we computed the binomial probability of the word count and proposed a binomial model-based measure based on the relative entropy. The proposed measure was tested by extensive experiments including classification of HEV genotypes and phylogenetic analysis, and further compared with alignment-based and alignment-free measures. The results demonstrate that the proposed measure based on the binomial model is more efficient.
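A recursive evaluation of binomial probabilities, of the kind alluded to here, can be sketched as follows. The abstract does not give the authors' recursion, so this uses the standard ratio P(k+1) = P(k) · (n−k)/(k+1) · p/(1−p), and the function name is hypothetical.

```python
def binomial_pmf_recursive(n, p):
    """Return [P(X=0), ..., P(X=n)] for X ~ Binomial(n, p), with 0 < p < 1.

    Starts from P(0) = (1-p)**n and applies the standard ratio recursion,
    avoiding explicit factorials.
    """
    probs = [(1.0 - p) ** n]
    ratio = p / (1.0 - p)
    for k in range(n):
        probs.append(probs[-1] * (n - k) / (k + 1) * ratio)
    return probs
```

Here `n` would be the number of positions at which a word can occur and `p` its per-position occurrence probability.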
Simulation on Poisson and negative binomial models of count road accident modeling
Sapuan, M. S.; Razali, A. M.; Zamzuri, Z. H.; Ibrahim, K.
2016-11-01
Accident count data have often been shown to be overdispersed. In addition, the data might contain excess zero counts. A simulation study was conducted to create scenarios in which accidents happen at a T-junction, under the assumption that the dependent variable of the generated data follows a given distribution, namely Poisson or negative binomial, with sample sizes from n=30 to n=500. The study objective was accomplished by fitting Poisson regression, negative binomial regression and hurdle negative binomial models to the simulated data. Model validation was compared, and the simulation results show that, for each sample size, not all models fit the data well even when the data were generated from the model's own distribution, especially when the sample size is larger. Furthermore, larger sample sizes yield more zero accident counts in the dataset.
[Evaluation of estimation of prevalence ratio using bayesian log-binomial regression model].
Gao, W L; Lin, H; Liu, X N; Ren, X W; Li, J S; Shen, X P; Zhu, S L
2017-03-10
To evaluate the estimation of the prevalence ratio (PR) using the Bayesian log-binomial regression model and its application, we estimated the PR of medical care-seeking prevalence in relation to caregivers' recognition of risk signs of diarrhea in their infants by using a Bayesian log-binomial regression model in OpenBUGS software. The results showed that caregivers' recognition of infants' risk signs of diarrhea was significantly associated with a 13% increase in medical care-seeking. Meanwhile, we compared the point and interval estimates of the PR of medical care-seeking prevalence in relation to caregivers' recognition of risk signs of diarrhea, as well as the convergence of three models (model 1: not adjusting for covariates; model 2: adjusting for duration of caregivers' education; model 3: adjusting for distance between village and township and child age in months, based on model 2), between the Bayesian log-binomial regression model and the conventional log-binomial regression model. The results showed that all three Bayesian log-binomial regression models converged, and the estimated PRs were 1.130 (95% CI: 1.005-1.265), 1.128 (95% CI: 1.001-1.264) and 1.132 (95% CI: 1.004-1.267), respectively. Conventional log-binomial regression models 1 and 2 converged, and their PRs were 1.130 (95% CI: 1.055-1.206) and 1.126 (95% CI: 1.051-1.203), respectively, but model 3 failed to converge, so the COPY method was used to estimate the PR, which was 1.125 (95% CI: 1.051-1.200). In addition, the point and interval estimates of the PRs from the three Bayesian log-binomial regression models differed slightly from those of the conventional log-binomial regression model, but they showed good consistency in estimating the PR. Therefore, the Bayesian log-binomial regression model can effectively estimate the PR with fewer convergence failures and has more advantages in application than the conventional log-binomial regression model.
Identifiability in N-mixture models: a large-scale screening test with bird data.
Kéry, Marc
2018-02-01
Binomial N-mixture models have proven very useful in ecology, conservation, and monitoring: they allow estimation and modeling of abundance separately from detection probability using simple counts. Recently, doubts about parameter identifiability have been voiced. I conducted a large-scale screening test with 137 bird data sets from 2,037 sites. I found virtually no identifiability problems for Poisson and zero-inflated Poisson (ZIP) binomial N-mixture models, but negative-binomial (NB) models had problems in 25% of all data sets. The corresponding multinomial N-mixture models had no problems. Parameter estimates under Poisson and ZIP binomial and multinomial N-mixture models were extremely similar. Identifiability problems became a little more frequent with smaller sample sizes (267 and 50 sites), but were unaffected by whether the models did or did not include covariates. Hence, binomial N-mixture model parameters with Poisson and ZIP mixtures typically appeared identifiable. In contrast, NB mixtures were often unidentifiable, which is worrying since these were often selected by Akaike's information criterion. Identifiability of binomial N-mixture models should always be checked. If problems are found, simpler models, integrated models that combine different observation models or the use of external information via informative priors or penalized likelihoods, may help. © 2017 by the Ecological Society of America.
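The structure of a binomial N-mixture model with a Poisson mixture can be sketched as the likelihood of repeated counts at one site. This is a minimal illustration, not the code used in the study; the truncation point `n_max` and all names are assumptions.

```python
import math


def site_likelihood(counts, lam, p, n_max=200):
    """Likelihood of repeated counts at one site.

    Latent abundance N ~ Poisson(lam); each visit's count y ~ Binomial(N, p).
    N is summed out from max(counts) up to the truncation point n_max.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        # log Poisson(N = n; lam)
        log_term = -lam + n * math.log(lam) - math.lgamma(n + 1)
        # plus log Binomial(y; n, p) for each visit
        for y in counts:
            log_term += (math.log(math.comb(n, y))
                         + y * math.log(p) + (n - y) * math.log(1.0 - p))
        total += math.exp(log_term)
    return total
```

With a single visit the likelihood depends on `lam` and `p` only through their product (Poisson thinning), which is one way to see why repeated visits are needed to separate abundance from detection.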
Poissonian and binomial models in radionuclide metrology by liquid scintillation counting
International Nuclear Information System (INIS)
Grau Malonda, A.
1990-01-01
Binomial and Poissonian models developed for calculating the counting efficiency from a free parameter are analysed in this paper. These models have been applied to liquid scintillator counting systems with two or three photomultipliers. It is mathematically demonstrated that both models are equivalent and that the counting efficiencies calculated from either model are identical. (Author)
Modeling and Predistortion of Envelope Tracking Power Amplifiers using a Memory Binomial Model
DEFF Research Database (Denmark)
Tafuri, Felice Francesco; Sira, Daniel; Larsen, Torben
2013-01-01
. The model definition is based on binomial series, hence the name of memory binomial model (MBM). The MBM is here applied to measured data-sets acquired from an ET measurement set-up. When used as a PA model the MBM showed an NMSE (Normalized Mean Squared Error) as low as −40dB and an ACEPR (Adjacent Channel...... Error Power Ratio) below −51 dB. The simulated predistortion results showed that the MBM can improve the compensation of distortion in the adjacent channel of 5.8 dB and 5.7 dB compared to a memory polynomial predistorter (MPPD). The predistortion performance in the time domain showed an NMSE...
Tran, Phoebe; Waller, Lance
2015-01-01
Lyme disease has been the subject of many studies due to increasing incidence rates year after year and the severe complications that can arise in later stages of the disease. Negative binomial models have been used to model Lyme disease in the past with some success. However, there has been little focus on the reliability and consistency of these models when they are used to study Lyme disease at multiple spatial scales. This study seeks to explore how sensitive/consistent negative binomial models are when they are used to study Lyme disease at different spatial scales (at the regional and sub-regional levels). The study area includes the thirteen states in the Northeastern United States with the highest Lyme disease incidence during the 2002-2006 period. Lyme disease incidence at county level for the period of 2002-2006 was linked with several previously identified key landscape and climatic variables in a negative binomial regression model for the Northeastern region and two smaller sub-regions (the New England sub-region and the Mid-Atlantic sub-region). This study found that negative binomial models, indeed, were sensitive/inconsistent when used at different spatial scales. We discuss various plausible explanations for such behavior of negative binomial models. Further investigation of the inconsistency and sensitivity of negative binomial models when used at different spatial scales is important for not only future Lyme disease studies and Lyme disease risk assessment/management but any study that requires use of this model type in a spatial context. Copyright © 2014 Elsevier Inc. All rights reserved.
On extinction time of a generalized endemic chain-binomial model.
Aydogmus, Ozgur
2016-09-01
We considered a chain-binomial epidemic model not conferring immunity after infection. Mean field dynamics of the model has been analyzed and conditions for the existence of a stable endemic equilibrium are determined. The behavior of the chain-binomial process is probabilistically linked to the mean field equation. As a result of this link, we were able to show that the mean extinction time of the epidemic increases at least exponentially as the population size grows. We also present simulation results for the process to validate our analytical findings. Copyright © 2016 Elsevier Inc. All rights reserved.
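A chain-binomial process without immunity can be sketched as a simple simulation. The exact transition rules of the paper are not given in the abstract, so the infection probability 1 − (1 − β/N)^I, the recovery probability γ, and all names below are illustrative assumptions.

```python
import random


def binomial_draw(rng, n, p):
    """Draw from Binomial(n, p) as a sum of n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def simulate_sis_chain_binomial(pop, i0, beta, gamma, steps, seed=0):
    """Simulate infected counts for an SIS-type chain-binomial model.

    Each step, every susceptible escapes infection from each infected
    individual with probability 1 - beta/pop, and every infected
    individual recovers (back to susceptible) with probability gamma.
    """
    rng = random.Random(seed)
    infected = i0
    path = [infected]
    for _ in range(steps):
        susceptible = pop - infected
        p_inf = 1.0 - (1.0 - beta / pop) ** infected
        new_infections = binomial_draw(rng, susceptible, p_inf)
        still_infected = binomial_draw(rng, infected, 1.0 - gamma)
        infected = new_infections + still_infected
        path.append(infected)
    return path
```

Zero infected is an absorbing state, so the extinction time of a run is the first index at which the path reaches 0.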
Investigating Individual Differences in Toddler Search with Mixture Models
Berthier, Neil E.; Boucher, Kelsea; Weisner, Nina
2015-01-01
Children's performance on cognitive tasks is often described in categorical terms in that a child is described as either passing or failing a test, or knowing or not knowing some concept. We used binomial mixture models to determine whether individual children could be classified as passing or failing two search tasks, the DeLoache model room…
Joint Analysis of Binomial and Continuous Traits with a Recursive Model
DEFF Research Database (Denmark)
Varona, Louis; Sorensen, Daniel
2014-01-01
This work presents a model for the joint analysis of a binomial and a Gaussian trait using a recursive parametrization that leads to a computationally efficient implementation. The model is illustrated in an analysis of mortality and litter size in two breeds of Danish pigs, Landrace and Yorkshir...
A binomial random sum of present value models in investment analysis
Βουδούρη, Αγγελική; Ντζιαχρήστος, Ευάγγελος
1997-01-01
Stochastic present value models have been widely adopted in financial theory and practice and play a very important role in capital budgeting and profit planning. The purpose of this paper is to introduce a binomial random sum of stochastic present value models and offer an application in investment analysis.
A Bayesian Approach to Functional Mixed Effect Modeling for Longitudinal Data with Binomial Outcomes
Kliethermes, Stephanie; Oleson, Jacob
2014-01-01
Longitudinal growth patterns are routinely seen in medical studies where individual and population growth is followed over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear, quadratic); however, these relationships may not accurately capture growth over time. Functional mixed effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well-developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible and thus estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. PMID:24723495
Kliethermes, Stephanie; Oleson, Jacob
2014-08-15
Longitudinal growth patterns are routinely seen in medical studies where individual growth and population growth are followed up over a period of time. Many current methods for modeling growth presuppose a parametric relationship between the outcome and time (e.g., linear and quadratic); however, these relationships may not accurately capture growth over time. Functional mixed-effects (FME) models provide flexibility in handling longitudinal data with nonparametric temporal trends. Although FME methods are well developed for continuous, normally distributed outcome measures, nonparametric methods for handling categorical outcomes are limited. We consider the situation with binomially distributed longitudinal outcomes. Although percent correct data can be modeled assuming normality, estimates outside the parameter space are possible, and thus, estimated curves can be unrealistic. We propose a binomial FME model using Bayesian methodology to account for growth curves with binomial (percentage) outcomes. The usefulness of our methods is demonstrated using a longitudinal study of speech perception outcomes from cochlear implant users where we successfully model both the population and individual growth trajectories. Simulation studies also advocate the usefulness of the binomial model particularly when outcomes occur near the boundary of the probability parameter space and in situations with a small number of trials. Copyright © 2014 John Wiley & Sons, Ltd.
Bianca N.I. Eskelson; Hailemariam Temesgen; Tara M. Barrett
2009-01-01
Cavity tree and snag abundance data are highly variable and contain many zero observations. We predict cavity tree and snag abundance from variables that are readily available from forest cover maps or remotely sensed data using negative binomial (NB), zero-inflated NB, and zero-altered NB (ZANB) regression models as well as nearest neighbor (NN) imputation methods....
Justin S. Crotteau; Martin W. Ritchie; J. Morgan. Varner
2014-01-01
Many western USA fire regimes are typified by mixed-severity fire, which compounds the variability inherent to natural regeneration densities in associated forests. Tree regeneration data are often discrete and nonnegative; accordingly, we fit a series of Poisson and negative binomial variation models to conifer seedling counts across four distinct burn severities and...
Confidence Intervals for Weighted Composite Scores under the Compound Binomial Error Model
Kim, Kyung Yong; Lee, Won-Chan
2018-01-01
Reporting confidence intervals with test scores helps test users make important decisions about examinees by providing information about the precision of test scores. Although a variety of estimation procedures based on the binomial error model are available for computing intervals for test scores, these procedures assume that items are randomly…
Computational results on the compound binomial risk model with nonhomogeneous claim occurrences
Tuncel, A.; Tank, F.
2013-01-01
The aim of this paper is to give a recursive formula for non-ruin (survival) probability when the claim occurrences are nonhomogeneous in the compound binomial risk model. We give recursive formulas for non-ruin (survival) probability and for distribution of the total number of claims under the
A mixed-binomial model for Likert-type personality measures.
Allik, Jüri
2014-01-01
Personality measurement is based on the idea that values on an unobservable latent variable determine the distribution of answers on a manifest response scale. Typically, it is assumed in the Item Response Theory (IRT) that latent variables are related to the observed responses through continuous normal or logistic functions, determining the probability with which one of the ordered response alternatives on a Likert-scale item is chosen. Based on an analysis of 1731 self- and other-rated responses on the 240 NEO PI-3 questionnaire items, it was proposed that a viable alternative is a finite number of latent events which are related to manifest responses through a binomial function which has only one parameter: the probability with which a given statement is approved. For the majority of items, the best fit was obtained with a mixed-binomial distribution, which assumes two different subpopulations who endorse items with two different probabilities. It was shown that the fit of the binomial IRT model can be improved by assuming that about 10% of random noise is contained in the answers and by taking into account response biases toward one of the response categories. It was concluded that the binomial response model for the measurement of personality traits may be a workable alternative to the more habitual normal and logistic IRT models.
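The mixed-binomial distribution described here, two subpopulations endorsing an item with different probabilities, can be sketched as a two-component mixture. The names and the mapping of a five-point Likert item onto n = 4 latent events are assumptions made for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def mixed_binomial_pmf(k, n, weight, p1, p2):
    """Two-component binomial mixture.

    A fraction `weight` of respondents endorses with probability p1,
    the remainder with probability p2.
    """
    return weight * binom_pmf(k, n, p1) + (1.0 - weight) * binom_pmf(k, n, p2)
```

For a five-point item coded 0 to 4, `[mixed_binomial_pmf(k, 4, 0.6, 0.2, 0.8) for k in range(5)]` gives a bimodal response distribution that a single binomial cannot reproduce.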
A Mechanistic Beta-Binomial Probability Model for mRNA Sequencing Data.
Smith, Gregory R; Birtwistle, Marc R
2016-01-01
A main application for mRNA sequencing (mRNAseq) is determining lists of differentially-expressed genes (DEGs) between two or more conditions. Several software packages exist to produce DEGs from mRNAseq data, but they typically yield different DEGs, sometimes markedly so. The underlying probability model used to describe mRNAseq data is central to deriving DEGs, and not surprisingly most softwares use different models and assumptions to analyze mRNAseq data. Here, we propose a mechanistic justification to model mRNAseq as a binomial process, with data from technical replicates given by a binomial distribution, and data from biological replicates well-described by a beta-binomial distribution. We demonstrate good agreement of this model with two large datasets. We show that an emergent feature of the beta-binomial distribution, given parameter regimes typical for mRNAseq experiments, is the well-known quadratic polynomial scaling of variance with the mean. The so-called dispersion parameter controls this scaling, and our analysis suggests that the dispersion parameter is a continually decreasing function of the mean, as opposed to current approaches that impose an asymptotic value to the dispersion parameter at moderate mean read counts. We show how this leads to current approaches overestimating variance for moderately to highly expressed genes, which inflates false negative rates. Describing mRNAseq data with a beta-binomial distribution thus may be preferred since its parameters are relatable to the mechanistic underpinnings of the technique and may improve the consistency of DEG analysis across softwares, particularly for moderately to highly expressed genes.
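The extra variance of a beta-binomial relative to a binomial can be checked numerically with the standard moment formulas. This sketch uses the usual (n, a, b) parameterization, which is an assumption here and not the paper's mechanistic parameterization; the function name is hypothetical.

```python
def beta_binomial_moments(n, a, b):
    """Mean and variance of a beta-binomial with n trials and Beta(a, b) mixing.

    The variance equals the binomial variance n*p*(1-p), with p = a/(a+b),
    multiplied by the overdispersion factor 1 + (n-1)/(a+b+1).
    """
    mean = n * a / (a + b)
    var = n * a * b * (a + b + n) / ((a + b) ** 2 * (a + b + 1))
    return mean, var
```

For n = 10, a = 2, b = 3 the mean is 4 and the variance is 6, compared with 2.4 for a binomial with the same mean.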
International Nuclear Information System (INIS)
Valor, Alma; Alfonso, Lester; Caleyo, Francisco; Vidal, Julio; Perez-Baruch, Eloy; Hallen, José M.
2015-01-01
Highlights: • Observed external-corrosion defects in underground pipelines revealed a tendency to cluster. • The Poisson distribution is unable to fit extensive count data for this type of defect. • In contrast, the negative binomial distribution provides a suitable count model for them. • Two spatial stochastic processes lead to the negative binomial distribution for defect counts. • They are the Gamma-Poisson mixed process and the compound Poisson process. • A Roger's process also arises as a plausible temporal stochastic process leading to corrosion defect clustering and to negative binomially distributed defect counts. - Abstract: The spatial distribution of external corrosion defects in buried pipelines is usually described as a Poisson process, which leads to corrosion defects being randomly distributed along the pipeline. However, in real operating conditions, the spatial distribution of defects considerably departs from Poisson statistics due to the aggregation of defects in groups or clusters. In this work, the statistical analysis of real corrosion data from underground pipelines operating in southern Mexico leads to the conclusion that the negative binomial distribution provides a better description for defect counts. The origin of this distribution from several processes is discussed. The analysed processes are: mixed Gamma-Poisson, compound Poisson and Roger's processes. The physical reasons behind them are discussed for the specific case of soil corrosion.
Study on Emission Measurement of Vehicle on Road Based on Binomial Logit Model
Aly, Sumarni Hamid; Selintung, Mary; Ramli, Muhammad Isran; Sumi, Tomonori
2011-01-01
This research attempts to evaluate emission measurement of on road vehicle. In this regard, the research develops failure probability model of vehicle emission test for passenger car which utilize binomial logit model. The model focuses on failure of CO and HC emission test for gasoline cars category and Opacity emission test for diesel-fuel cars category as dependent variables, while vehicle age, engine size, brand and type of the cars as independent variables. In order to imp...
Discrimination of numerical proportions: A comparison of binomial and Gaussian models.
Raidvee, Aire; Lember, Jüri; Allik, Jüri
2017-01-01
Observers discriminated the numerical proportion of two sets of elements (N = 9, 13, 33, and 65) that differed either by color or orientation. According to the standard Thurstonian approach, the accuracy of proportion discrimination is determined by irreducible noise in the nervous system that stochastically transforms the number of presented visual elements onto a continuum of psychological states representing numerosity. As an alternative to this customary approach, we propose a Thurstonian-binomial model, which assumes discrete perceptual states, each of which is associated with a certain visual element. It is shown that the probability β with which each visual element can be noticed and registered by the perceptual system can explain data of numerical proportion discrimination at least as well as the continuous Thurstonian-Gaussian model, and better, if the greater parsimony of the Thurstonian-binomial model is taken into account using AIC model selection. We conclude that Gaussian and binomial models represent two different fundamental principles (internal noise vs. using only a fraction of available information), which are both plausible descriptions of visual perception.
The option to expand a project: its assessment with the binomial options pricing model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
Full Text Available Traditional methods of investment appraisal, like the Net Present Value, are not able to include the value of the operational flexibility of the project. In this paper, real options, and more specifically the option to expand, are assumed to be included in the project information in addition to the expected cash flows. Thus, to calculate the total value of the project, we are going to apply the methodology of the Net Present Value to the different scenarios derived from the existence of the real option to expand. Taking into account the analogy between real and financial options, the value of including an option to expand is explored by using the binomial options pricing model. In this way, estimating the value of the option to expand is a tool which facilitates the control of the uncertainty element implicit in the project. Keywords: Real options, Option to expand, Binomial options pricing model, Investment project appraisal
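The binomial options pricing model referred to here can be sketched with a standard Cox-Ross-Rubinstein tree for a European call. This is a generic illustration and not the authors' valuation of the option to expand; in a real-options reading, `s0` would stand for the present value of the expansion's cash flows and `strike` for the expansion cost, which is an interpretive assumption.

```python
import math


def crr_call_value(s0, strike, rate, sigma, maturity, steps):
    """Value a European call on a Cox-Ross-Rubinstein binomial tree."""
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))   # up factor per step
    down = 1.0 / up                        # down factor per step
    growth = math.exp(rate * dt)           # risk-free growth per step
    q = (growth - down) / (up - down)      # risk-neutral up probability
    # Payoffs at maturity, indexed by the number of up moves j.
    values = [max(s0 * up ** j * down ** (steps - j) - strike, 0.0)
              for j in range(steps + 1)]
    # Roll back through the tree, discounting expected values.
    for _ in range(steps):
        values = [(q * values[j + 1] + (1.0 - q) * values[j]) / growth
                  for j in range(len(values) - 1)]
    return values[0]
```

With one step, zero interest rate, and up/down factors of 1.25 and 0.8, an at-the-money call on an asset worth 100 is valued at 100/9, about 11.11.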
International Nuclear Information System (INIS)
Wang Huan; Guo Xiuhua; Jia Zhongwei; Li Hongkai; Liang Zhigang; Li Kuncheng; He Qian
2010-01-01
Purpose: To introduce a computer-aided diagnostic (CAD) method for small solitary pulmonary nodules (SPNs), based on a multilevel binomial logistic prediction model that combines patient characteristics with textural features of CT images. Materials and methods: Fourteen gray level co-occurrence matrix textural features were obtained from 2171 benign and malignant small solitary pulmonary nodules belonging to 185 patients. A multilevel binomial logistic model was applied to gain initial insights. Results: Five texture features (Inertia, Entropy, Correlation, Difference-mean, and Sum-Entropy) and patient age show aggregation at the patient level and differ significantly (P < 0.05) between benign and malignant small solitary pulmonary nodules. Conclusion: Some gray level co-occurrence matrix textural features are effective descriptors of CT images of small solitary pulmonary nodules and, to some extent, can aid the diagnosis of early-stage lung cancer when combined with patient-level characteristics.
Use of the Beta-Binomial Model for Central Statistical Monitoring of Multicenter Clinical Trials
Desmet, Lieven; Venet, David; Doffagne, Erik; Timmermans, Catherine; Legrand, Catherine; Burzykowski, Tomasz; Buyse, Marc
2017-01-01
As part of central statistical monitoring of multicenter clinical trial data, we propose a procedure based on the beta-binomial distribution for the detection of centers with atypical values for the probability of some event. The procedure makes no assumptions about the typical event proportion and uses the event counts from all centers to derive a reference model. The procedure is shown through simulations to have high sensitivity and high specificity if the contamination rate is small and t...
Zero inflated Poisson and negative binomial regression models: application in education.
Salehi, Masoud; Roudbari, Masoud
2015-01-01
The number of failed courses and semesters in students are indicators of their performance. These amounts have zero inflated (ZI) distributions. Using ZI Poisson and negative binomial distributions we can model these count data to find the associated factors and estimate the parameters. This study aims to investigate the important factors related to the educational performance of students. This cross-sectional study was performed in 2008-2009 at Iran University of Medical Sciences (IUMS); from a population of almost 6000 students, 670 students were selected using stratified random sampling. The educational and demographical data were collected using the University records. The study design was approved at IUMS and the students' data were kept confidential. The descriptive statistics and ZI Poisson and negative binomial regressions were used to analyze the data. The data were analyzed using STATA. For the number of failed semesters, in both the ZI Poisson and ZI negative binomial models, students' total average and the quota system had the largest roles. For the number of failed courses, total average and being at the undergraduate or master's level had the largest effect in both models. In all models the total average has the largest effect on the number of failed courses or semesters. The next most important factor is the quota system for failed semesters and undergraduate or master's level for failed courses. Therefore, average has an important inverse effect on the numbers of failed courses and semesters.
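For readers unfamiliar with zero inflation, the following minimal sketch shows the standard zero-inflated Poisson probability mass function: a point mass at zero mixed with an ordinary Poisson count. The function name and parameter names (`lam` for the Poisson rate, `pi` for the extra-zero probability) are illustrative assumptions; the study itself fitted regression versions of this model in STATA.

```python
import math

def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson with rate lam and extra-zero probability pi."""
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    if k == 0:
        # A zero arises either from the inflation component or from the Poisson part.
        return pi + (1 - pi) * poisson
    return (1 - pi) * poisson
```

Under this parameterization the mean count is `(1 - pi) * lam`, which is lower than the plain Poisson mean because of the excess zeros.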
Analysis of railroad tank car releases using a generalized binomial model.
Liu, Xiang; Hong, Yili
2015-11-01
The United States is experiencing an unprecedented boom in shale oil production, leading to a dramatic growth in petroleum crude oil traffic by rail. In 2014, U.S. railroads carried over 500,000 tank carloads of petroleum crude oil, up from 9500 in 2008 (a 5300% increase). In light of continual growth in crude oil by rail, there is an urgent national need to manage this emerging risk. This need has been underscored in the wake of several recent crude oil release incidents. In contrast to highway transport, which usually involves a tank trailer, a crude oil train can carry a large number of tank cars, having the potential for a large, multiple-tank-car release incident. Previous studies exclusively assumed that railroad tank car releases in the same train accident are mutually independent, thereby estimating the number of tank cars releasing given the total number of tank cars derailed based on a binomial model. This paper specifically accounts for dependent tank car releases within a train accident. We estimate the number of tank cars releasing given the number of tank cars derailed based on a generalized binomial model. The generalized binomial model provides a significantly better description for the empirical tank car accident data through our numerical case study. This research aims to provide a new methodology and new insights regarding the further development of risk management strategies for improving railroad crude oil transportation safety. Copyright © 2015 Elsevier Ltd. All rights reserved.
Wagner, Brandie; Riggs, Paula; Mikulich-Gilbertson, Susan
2015-01-01
It is important to correctly understand the associations among addiction to multiple drugs and between co-occurring substance use and psychiatric disorders. Substance-specific outcomes (e.g. number of days used cannabis) have distributional characteristics which range widely depending on the substance and the sample being evaluated. We recommend a four-part strategy for determining the appropriate distribution for modeling substance use data. We demonstrate this strategy by comparing the model fit and resulting inferences from applying four different distributions to model use of substances that range greatly in the prevalence and frequency of their use. Using Timeline Followback (TLFB) data from a previously-published study, we used negative binomial, beta-binomial and their zero-inflated counterparts to model proportion of days during treatment of cannabis, cigarettes, alcohol, and opioid use. The fit for each distribution was evaluated with statistical model selection criteria, visual plots and a comparison of the resulting inferences. We demonstrate the feasibility and utility of modeling each substance individually and show that no single distribution provides the best fit for all substances. Inferences regarding use of each substance and associations with important clinical variables were not consistent across models and differed by substance. Thus, the distribution chosen for modeling substance use must be carefully selected and evaluated because it may impact the resulting conclusions. Furthermore, the common procedure of aggregating use across different substances may not be ideal.
Negative binomial multiplicity distribution from binomial cluster production
International Nuclear Information System (INIS)
Iso, C.; Mori, K.
1990-01-01
A two-step interpretation of the negative binomial multiplicity distribution as a compound of binomial cluster production and a negative-binomial-like cluster decay distribution is proposed. In this model we can expect the average multiplicity for cluster production to increase with increasing energy, unlike in a compound Poisson-logarithmic distribution. (orig.)
Beta-binomial model for meta-analysis of odds ratios.
Bakbergenuly, Ilyas; Kulinskaya, Elena
2017-05-20
In meta-analysis of odds ratios (ORs), heterogeneity between the studies is usually modelled via the additive random effects model (REM). An alternative, multiplicative REM for ORs uses overdispersion. The multiplicative factor in this overdispersion model (ODM) can be interpreted as an intra-class correlation (ICC) parameter. This model naturally arises when the probabilities of an event in one or both arms of a comparative study are themselves beta-distributed, resulting in beta-binomial distributions. We propose two new estimators of the ICC for meta-analysis in this setting. One is based on the inverted Breslow-Day test, and the other on the improved gamma approximation by Kulinskaya and Dollinger (2015, p. 26) to the distribution of Cochran's Q. The performance of these and several other estimators of ICC on bias and coverage is studied by simulation. Additionally, the Mantel-Haenszel approach to estimation of ORs is extended to the beta-binomial model, and we study performance of various ICC estimators when used in the Mantel-Haenszel or the inverse-variance method to combine ORs in meta-analysis. The results of the simulations show that the improved gamma-based estimator of ICC is superior for small sample sizes, and the Breslow-Day-based estimator is the best for n⩾100. The Mantel-Haenszel-based estimator of OR is very biased and is not recommended. The inverse-variance approach is also somewhat biased for ORs≠1, but this bias is not very large in practical settings. Developed methods and R programs, provided in the Web Appendix, make the beta-binomial model a feasible alternative to the standard REM for meta-analysis of ORs. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
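The link between the beta-binomial distribution and a multiplicative overdispersion factor can be written out explicitly. The block below uses the standard textbook parameterization; the symbols $\alpha$, $\beta$, $\pi$ and $\rho$ are notation assumed here and may differ from the paper's.

```latex
% X | p ~ Binomial(n, p), with p ~ Beta(alpha, beta)
\[
  \pi = \frac{\alpha}{\alpha + \beta}, \qquad
  \rho = \frac{1}{\alpha + \beta + 1},
\]
\[
  \mathrm{E}[X] = n\pi, \qquad
  \mathrm{Var}[X] = n\pi(1 - \pi)\,\bigl[1 + (n - 1)\rho\bigr].
\]
```

The bracketed term is the factor by which the variance exceeds the plain binomial variance, and $\rho$ plays the role of the intra-class correlation.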
Prevalence Incidence Mixture Models
The R package and webtool fits Prevalence Incidence Mixture models to left-censored and irregularly interval-censored time to event data that is commonly found in screening cohorts assembled from electronic health records. Absolute and relative risk can be estimated for simple random sampling, and stratified sampling (the two approaches of superpopulation and a finite population are supported for target populations). Non-parametric (absolute risks only), semi-parametric, weakly-parametric (using B-splines), and some fully parametric (such as the logistic-Weibull) models are supported.
Ma, Zhuanglin; Zhang, Honglu; Chien, Steven I-Jy; Wang, Jin; Dong, Chunjiao
2017-01-01
To investigate the relationship between crash frequency and potential influence factors, the accident data for events occurring on a 50km long expressway in China, including 567 crash records (2006-2008), were collected and analyzed. Both the fixed-length and the homogeneous longitudinal grade methods were applied to divide the study expressway section into segments. A negative binomial (NB) model and a random effect negative binomial (RENB) model were developed to predict crash frequency. The parameters of both models were determined using the maximum likelihood (ML) method, and the mixed stepwise procedure was applied to examine the significance of explanatory variables. Three explanatory variables, including longitudinal grade, road width, and ratio of longitudinal grade and curve radius (RGR), were found as significantly affecting crash frequency. The marginal effects of significant explanatory variables to the crash frequency were analyzed. The model performance was determined by the relative prediction error and the cumulative standardized residual. The results show that the RENB model outperforms the NB model. It was also found that the model performance with the fixed-length segment method is superior to that with the homogeneous longitudinal grade segment method. Copyright © 2016. Published by Elsevier Ltd.
Bayesian analysis of overdispersed chromosome aberration data with the negative binomial model
International Nuclear Information System (INIS)
Brame, R.S.; Groer, P.G.
2002-01-01
The usual assumption of a Poisson model for the number of chromosome aberrations in controlled calibration experiments implies variance equal to the mean. However, it is known that chromosome aberration data from experiments involving high linear energy transfer radiations can be overdispersed, i.e. the variance is greater than the mean. Present methods for dealing with overdispersed chromosome data rely on frequentist statistical techniques. In this paper, the problem of overdispersion is considered from a Bayesian standpoint. The Bayes Factor is used to compare Poisson and negative binomial models for two previously published calibration data sets describing the induction of dicentric chromosome aberrations by high doses of neutrons. Posterior densities for the model parameters, which characterise dose response and overdispersion are calculated and graphed. Calibrative densities are derived for unknown neutron doses from hypothetical radiation accident data to determine the impact of different model assumptions on dose estimates. The main conclusion is that an initial assumption of a negative binomial model is the conservative approach to chromosome dosimetry for high LET radiations. (author)
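The overdispersion contrast between the two models can be checked numerically. The sketch below assumes the common mean/dispersion parameterization of the negative binomial, under which the variance is mean + mean²/size; the function names and the truncation point `upper` are illustrative and not from the paper, and no Bayesian machinery is included.

```python
import math

def nb_pmf(k, mean, size):
    """Negative binomial pmf parameterized by its mean and dispersion parameter `size`."""
    p = size / (size + mean)
    log_pmf = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
               + size * math.log(p) + k * math.log(1 - p))
    return math.exp(log_pmf)

def moments(pmf, upper=500):
    """Mean and variance of a count distribution, summing the pmf over 0..upper-1."""
    m = sum(k * pmf(k) for k in range(upper))
    v = sum((k - m) ** 2 * pmf(k) for k in range(upper))
    return m, v
```

For example, `moments(lambda k: nb_pmf(k, 4.0, 2.0))` gives a variance three times the mean, whereas a Poisson model with the same mean would force the two to be equal.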
International Nuclear Information System (INIS)
Zhang Yu; Wang Guangyi; Lu Xinmiao; Hu Yongcai; Xu Jiangtao
2016-01-01
The random telegraph signal noise in the pixel source follower MOSFET is the principal component of the noise in the CMOS image sensor under low light. In this paper, a physical and statistical model of the random telegraph signal noise in the pixel source follower, based on the binomial distribution, is set up. The numbers of electrons captured or released by the oxide traps per unit time are described as random variables that obey the binomial distribution. As a result, the output states and the corresponding probabilities of the first and the second samples of the correlated double sampling circuit are acquired. The standard deviation of the output states after the correlated double sampling circuit can be obtained accordingly. In the simulation section, one hundred thousand samples of the source follower MOSFET have been simulated, and the simulation results show that the proposed model has statistical characteristics similar to those of the existing models under the effect of the channel length and the density of the oxide traps. Moreover, the noise histogram of the proposed model has been evaluated at different environmental temperatures. (paper)
International Nuclear Information System (INIS)
Shultis, J.K.; Buranapan, W.; Eckhoff, N.D.
1981-12-01
Of considerable importance in the safety analysis of nuclear power plants are methods to estimate the probability of failure-on-demand, p, of a plant component that normally is inactive and that may fail when activated or stressed. Properties of five methods for estimating from failure-on-demand data the parameters of the beta prior distribution in a compound beta-binomial probability model are examined. Simulated failure data generated from a known beta-binomial marginal distribution are used to estimate values of the beta parameters by (1) matching moments of the prior distribution to those of the data, (2) the maximum likelihood method based on the prior distribution, (3) a weighted marginal matching moments method, (4) an unweighted marginal matching moments method, and (5) the maximum likelihood method based on the marginal distribution. For small sample sizes (N ≤ 10) with data typical of low failure probability components, it was found that the simple prior matching moments method is often superior (e.g. smallest bias and mean squared error), while for larger sample sizes the marginal maximum likelihood estimators appear to be best.
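A minimal sketch of a moment-matching estimator for the beta parameters follows. It assumes the simplest variant, in which the observed failure fractions k_i / n_i are treated as direct draws from the beta prior; whether this corresponds exactly to the report's method (1) is an assumption, and the function name is hypothetical.

```python
from statistics import mean, variance

def beta_moment_match(failures, demands):
    """Estimate beta parameters (a, b) by matching the mean and variance
    of the observed failure fractions k_i / n_i."""
    fractions = [k / n for k, n in zip(failures, demands)]
    m = mean(fractions)
    v = variance(fractions)  # sample variance (n - 1 denominator)
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("moments incompatible with a beta distribution")
    # For Beta(a, b): mean = a / (a + b), variance = m (1 - m) / (a + b + 1).
    common = m * (1 - m) / v - 1  # equals a + b
    return m * common, (1 - m) * common
```

Because the fractions also carry binomial sampling noise, this naive version tends to overstate the prior variance, which is one reason estimators based on the marginal distribution exist.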
Gomes, Marcos José Timbó Lima; Cunto, Flávio; da Silva, Alan Ricardo
2017-09-01
Generalized Linear Models (GLM) with negative binomial distribution for errors, have been widely used to estimate safety at the level of transportation planning. The limited ability of this technique to take spatial effects into account can be overcome through the use of local models from spatial regression techniques, such as Geographically Weighted Poisson Regression (GWPR). Although GWPR is a system that deals with spatial dependency and heterogeneity and has already been used in some road safety studies at the planning level, it fails to account for the possible overdispersion that can be found in the observations on road-traffic crashes. Two approaches were adopted for the Geographically Weighted Negative Binomial Regression (GWNBR) model to allow discrete data to be modeled in a non-stationary form and to take note of the overdispersion of the data: the first examines the constant overdispersion for all the traffic zones and the second includes the variable for each spatial unit. This research conducts a comparative analysis between non-spatial global crash prediction models and spatial local GWPR and GWNBR at the level of traffic zones in Fortaleza/Brazil. A geographic database of 126 traffic zones was compiled from the available data on exposure, network characteristics, socioeconomic factors and land use. The models were calibrated by using the frequency of injury crashes as a dependent variable and the results showed that GWPR and GWNBR achieved a better performance than GLM for the average residuals and likelihood as well as reducing the spatial autocorrelation of the residuals, and the GWNBR model was more able to capture the spatial heterogeneity of the crash frequency. Copyright © 2017 Elsevier Ltd. All rights reserved.
Irwin, Brian J.; Wagner, Tyler; Bence, James R.; Kepler, Megan V.; Liu, Weihai; Hayes, Daniel B.
2013-01-01
Partitioning total variability into its component temporal and spatial sources is a powerful way to better understand time series and elucidate trends. The data available for such analyses of fish and other populations are usually nonnegative integer counts of the number of organisms, often dominated by many low values with few observations of relatively high abundance. These characteristics are not well approximated by the Gaussian distribution. We present a detailed description of a negative binomial mixed-model framework that can be used to model count data and quantify temporal and spatial variability. We applied these models to data from four fishery-independent surveys of Walleyes Sander vitreus across the Great Lakes basin. Specifically, we fitted models to gill-net catches from Wisconsin waters of Lake Superior; Oneida Lake, New York; Saginaw Bay in Lake Huron, Michigan; and Ohio waters of Lake Erie. These long-term monitoring surveys varied in overall sampling intensity, the total catch of Walleyes, and the proportion of zero catches. Parameter estimation included the negative binomial scaling parameter, and we quantified the random effects as the variations among gill-net sampling sites, the variations among sampled years, and site × year interactions. This framework (i.e., the application of a mixed model appropriate for count data in a variance-partitioning context) represents a flexible approach that has implications for monitoring programs (e.g., trend detection) and for examining the potential of individual variance components to serve as response metrics to large-scale anthropogenic perturbations or ecological changes.
The multi-class binomial failure rate model for the treatment of common-cause failures
International Nuclear Information System (INIS)
Hauptmanns, U.
1995-01-01
The impact of common cause failures (CCF) on PSA results for NPPs is in sharp contrast with the limited quality which can be achieved in their assessment. This is due to the dearth of observations and cannot be remedied in the short run. Therefore the methods employed for calculating failure rates should be devised such as to make the best use of the few available observations on CCF. The Multi-Class Binomial Failure Rate (MCBFR) Model achieves this by assigning observed failures to different classes according to their technical characteristics and applying the BFR formalism to each of these. The results are hence determined by a superposition of BFR type expressions for each class, each of them with its own coupling factor. The model thus obtained flexibly reproduces the dependence of CCF rates on failure multiplicity insinuated by the observed failure multiplicities. This is demonstrated by evaluating CCFs observed for combined impulse pilot valves in German NPPs. (orig.)
Directory of Open Access Journals (Sweden)
Xiong Wang
2013-01-01
Based on characteristics of the nonlife joint-stock insurance company, this paper presents a compound binomial risk model that randomizes the premium income on unit time and sets the threshold for paying dividends to shareholders. In this model, the insurance company obtains the insurance policy in unit time with probability and pays dividends to shareholders with probability when the surplus is no less than . We then derive the recursive formulas of the expected discounted penalty function and the asymptotic estimate for it. We also derive the recursive formulas and asymptotic estimates for the ruin probability and the distribution function of the deficit at ruin. Numerical examples are given to illustrate the accuracy of the asymptotic estimations.
Lea, Amanda J.
2015-01-01
Identifying sources of variation in DNA methylation levels is important for understanding gene regulation. Recently, bisulfite sequencing has become a popular tool for investigating DNA methylation levels. However, modeling bisulfite sequencing data is complicated by dramatic variation in coverage across sites and individual samples, and because of the computational challenges of controlling for genetic covariance in count data. To address these challenges, we present a binomial mixed model and an efficient, sampling-based algorithm (MACAU: Mixed model association for count data via data augmentation) for approximate parameter estimation and p-value computation. This framework allows us to simultaneously account for both the over-dispersed, count-based nature of bisulfite sequencing data, as well as genetic relatedness among individuals. Using simulations and two real data sets (whole genome bisulfite sequencing (WGBS) data from Arabidopsis thaliana and reduced representation bisulfite sequencing (RRBS) data from baboons), we show that our method provides well-calibrated test statistics in the presence of population structure. Further, it improves power to detect differentially methylated sites: in the RRBS data set, MACAU detected 1.6-fold more age-associated CpG sites than a beta-binomial model (the next best approach). Changes in these sites are consistent with known age-related shifts in DNA methylation levels, and are enriched near genes that are differentially expressed with age in the same population. Taken together, our results indicate that MACAU is an efficient, effective tool for analyzing bisulfite sequencing data, with particular salience to analyses of structured populations. MACAU is freely available at www.xzlab.org/software.html. PMID:26599596
Xiao, Chuan-Le; Chen, Xiao-Zhou; Du, Yang-Li; Sun, Xuesong; Zhang, Gong; He, Qing-Yu
2013-01-04
Mass spectrometry has become one of the most important technologies in proteomic analysis. Tandem mass spectrometry (LC-MS/MS) is a major tool for the analysis of peptide mixtures from protein samples. The key step of MS data processing is the identification of peptides from experimental spectra by searching public sequence databases. Although a number of algorithms to identify peptides from MS/MS data have been already proposed, e.g. Sequest, OMSSA, X!Tandem, Mascot, etc., they are mainly based on statistical models considering only peak-matches between experimental and theoretical spectra, but not peak intensity information. Moreover, different algorithms gave different results from the same MS data, implying their probable incompleteness and questionable reproducibility. We developed a novel peptide identification algorithm, ProVerB, based on a binomial probability distribution model of protein tandem mass spectrometry combined with a new scoring function, making full use of peak intensity information and, thus, enhancing the ability of identification. Compared with Mascot, Sequest, and SQID, ProVerB identified significantly more peptides from LC-MS/MS data sets than the current algorithms at 1% False Discovery Rate (FDR) and provided more confident peptide identifications. ProVerB is also compatible with various platforms and experimental data sets, showing its robustness and versatility. The open-source program ProVerB is available at http://bioinformatics.jnu.edu.cn/software/proverb/ .
Forecasting asthma-related hospital admissions in London using negative binomial models.
Soyiri, Ireneous N; Reidpath, Daniel D; Sarran, Christophe
2013-05-01
Health forecasting can improve health service provision and individual patient outcomes. Environmental factors are known to impact chronic respiratory conditions such as asthma, but little is known about the extent to which these factors can be used for forecasting. Using weather, air quality and hospital asthma admissions, in London (2005-2006), two related negative binomial models were developed and compared with a naive seasonal model. In the first approach, predictive forecasting models were fitted with 7-day averages of each potential predictor, and then a subsequent multivariable model was constructed. In the second strategy, an exhaustive search of the best fitting models between possible combinations of lags (0-14 days) of all the environmental effects on asthma admission was conducted. Three models were considered: a base model (seasonal effects), contrasted with a 7-day average model and a selected lags model (weather and air quality effects). Season is the best predictor of asthma admissions. The 7-day average and seasonal models were trivial to implement. The selected lags model was computationally intensive, but of no real value over much more easily implemented models. Seasonal factors can predict daily hospital asthma admissions in London, and there is little evidence that additional weather and air quality information would add to forecast accuracy.
Assessing the Option to Abandon an Investment Project by the Binomial Options Pricing Model
Directory of Open Access Journals (Sweden)
Salvador Cruz Rambaud
2016-01-01
Usually, traditional methods for investment project appraisal such as the net present value (hereinafter NPV) do not incorporate in their values the operational flexibility offered by a real option included in the project. In this paper, real options, and more specifically the option to abandon, are analysed as a complement to the cash flow sequence which quantifies the project. In this way, by considering the existing analogy with financial options, a mathematical expression is derived by using the binomial options pricing model. This methodology provides the value of the option to abandon the project within one, two, and in general n periods. Therefore, this paper aims to be a useful tool in determining the value of the option to abandon according to its residual value, thus making easier the control of the uncertainty element within the project.
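As an illustration of the one-period case, the standard binomial pricing argument gives the expression below. The notation is assumed here and is not taken from the paper: $V_0$ is the project value without flexibility, $u$ and $d$ are the up and down multipliers, $r$ is the one-period risk-free rate, and $A$ is the residual value obtained on abandonment.

```latex
\[
  q = \frac{1 + r - d}{u - d}, \qquad
  V_0^{\text{flex}} = \frac{q \max(uV_0, A) + (1 - q)\max(dV_0, A)}{1 + r},
\]
\[
  \text{value of the option to abandon} = V_0^{\text{flex}} - V_0 .
\]
```

The option has positive value only when $A$ exceeds the project value in at least one end-of-period state, typically the down state $dV_0$.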
Directory of Open Access Journals (Sweden)
Branka Remenarić
2018-01-01
In July 2014, the International Accounting Standards Board (IASB) published International Financial Reporting Standard 9 Financial Instruments (IFRS 9). This standard introduces an expected credit loss (ECL) impairment model that applies to financial instruments, including trade and lease receivables. IFRS 9 applies to annual periods beginning on or after 1 January 2018 in the European Union member states. While the main reason for amending the current model was to require major banks to recognize losses in advance of a credit event occurring, this new model also applies to all receivables, including trade receivables, lease receivables, and related party loan receivables in non-financial sector entities. The new impairment model is intended to result in earlier recognition of credit losses. The previous model described in International Accounting Standard 39 Financial Instruments (IAS 39) was based on incurred losses. One of the major questions now is what models to use to predict expected credit losses in non-financial sector entities. The purpose of this paper is to research the application of the current impairment model, the extent to which the current impairment model can be modified to satisfy new impairment model requirements and the applicability of the binomial model for measuring expected credit losses from accounts receivable.
Goodness-of-fit tests and model diagnostics for negative binomial regression of RNA sequencing data.
Mi, Gu; Di, Yanming; Schafer, Daniel W
2015-01-01
This work is about assessing model adequacy for negative binomial (NB) regression, particularly (1) assessing the adequacy of the NB assumption, and (2) assessing the appropriateness of models for NB dispersion parameters. Tools for the first are appropriate for NB regression generally; those for the second are primarily intended for RNA sequencing (RNA-Seq) data analysis. The typically small number of biological samples and large number of genes in RNA-Seq analysis motivate us to address the trade-offs between robustness and statistical power using NB regression models. One widely-used power-saving strategy, for example, is to assume some commonalities of NB dispersion parameters across genes via simple models relating them to mean expression rates, and many such models have been proposed. As RNA-Seq analysis is becoming ever more popular, it is appropriate to make more thorough investigations into power and robustness of the resulting methods, and into practical tools for model assessment. In this article, we propose simulation-based statistical tests and diagnostic graphics to address model adequacy. We provide simulated and real data examples to illustrate that our proposed methods are effective for detecting the misspecification of the NB mean-variance relationship as well as judging the adequacy of fit of several NB dispersion models.
Hilpert, Markus; Rasmuson, Anna; Johnson, William
2017-04-01
Transport of colloids in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their down-gradient translation relative to colloids in bulk fluid. Near surface fluid domain colloids may re-enter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via strong primary minimum interactions, or they may move along a grain-to-grain contact to the near surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization onto grain surfaces. Colloid movement is described by a sequence of trials in a series of unit cells, and the binomial distribution is used to calculate the probabilities of each possible sequence. Pore-scale simulations provide mechanistically-determined likelihoods and timescales associated with the above pore-scale colloid mass transfer processes, whereas the network-scale model employs pore and grain topology to determine probabilities of transfer from up-gradient bulk and near-surface fluid domains to down-gradient bulk and near-surface fluid domains. Inter-grain transport of colloids in the near surface fluid domain can cause extended tailing.
Robust inference in the negative binomial regression model with an application to falls data.
Aeberhard, William H; Cantoni, Eva; Heritier, Stephane
2014-12-01
A popular way to model overdispersed count data, such as the number of falls reported during intervention studies, is by means of the negative binomial (NB) distribution. Classical estimating methods are well-known to be sensitive to model misspecifications, taking the form of patients falling much more than expected in such intervention studies where the NB regression model is used. We extend in this article two approaches for building robust M-estimators of the regression parameters in the class of generalized linear models to the NB distribution. The first approach achieves robustness in the response by applying a bounded function on the Pearson residuals arising in the maximum likelihood estimating equations, while the second approach achieves robustness by bounding the unscaled deviance components. For both approaches, we explore different choices for the bounding functions. Through a unified notation, we show how close these approaches may actually be as long as the bounding functions are chosen and tuned appropriately, and provide the asymptotic distributions of the resulting estimators. Moreover, we introduce a robust weighted maximum likelihood estimator for the overdispersion parameter, specific to the NB distribution. Simulations under various settings show that redescending bounding functions yield estimates with smaller biases under contamination while keeping high efficiency at the assumed model, and this for both approaches. We present an application to a recent randomized controlled trial measuring the effectiveness of an exercise program at reducing the number of falls among people suffering from Parkinson's disease to illustrate the diagnostic use of such robust procedures and their need for reliable inference. © 2014, The International Biometric Society.
On a Fractional Binomial Process
Cahoy, Dexter O.; Polito, Federico
2012-02-01
The classical binomial process has been studied by Jakeman (J. Phys. A 23:2815-2825, 1990) (and the references therein) and has been used to characterize a series of radiation states in quantum optics. In particular, he studied a classical birth-death process where the chance of birth is proportional to the difference between a larger fixed number and the number of individuals present. It is shown that at large times, an equilibrium is reached which follows a binomial process. In this paper, the classical binomial process is generalized using the techniques of fractional calculus and is called the fractional binomial process. The fractional binomial process is shown to preserve the binomial limit at large times while expanding the class of models that include non-binomial fluctuations (non-Markovian) at regular and small times. As a direct consequence, the generality of the fractional binomial model makes the proposed model more desirable than its classical counterpart in describing real physical processes. More statistical properties are also derived.
Predicting Cumulative Incidence Probability by Direct Binomial Regression
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
Binomial modelling; cumulative incidence probability; cause-specific hazards; subdistribution hazard...
Hilpert, Markus; Rasmuson, Anna; Johnson, William P.
2017-07-01
Colloid transport in saturated porous media is significantly influenced by colloidal interactions with grain surfaces. Near-surface fluid domain colloids experience relatively low fluid drag and relatively strong colloidal forces that slow their downgradient translation relative to colloids in bulk fluid. Near-surface fluid domain colloids may reenter into the bulk fluid via diffusion (nanoparticles) or expulsion at rear flow stagnation zones, they may immobilize (attach) via primary minimum interactions, or they may move along a grain-to-grain contact to the near-surface fluid domain of an adjacent grain. We introduce a simple model that accounts for all possible permutations of mass transfer within a dual pore and grain network. The primary phenomena thereby represented in the model are mass transfer of colloids between the bulk and near-surface fluid domains and immobilization. Colloid movement is described by a Markov chain, i.e., a sequence of trials in a 1-D network of unit cells, which contain a pore and a grain. Using combinatorial analysis, which utilizes the binomial coefficient, we derive the residence time distribution, i.e., an inventory of the discrete colloid travel times through the network and of their probabilities to occur. To parameterize the network model, we performed mechanistic pore-scale simulations in a single unit cell that determined the likelihoods and timescales associated with the above colloid mass transfer processes. We found that intergrain transport of colloids in the near-surface fluid domain can cause extended tailing, which has traditionally been attributed to hydrodynamic dispersion emanating from flow tortuosity of solute trajectories.
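A deliberately simplified sketch of how a binomial coefficient produces a discrete residence time distribution is given below. It assumes that in each unit cell the colloid independently travels either in the bulk fluid or in the near-surface domain, which is a stronger assumption than the Markov chain described in the abstract; the parameter names and values are hypothetical.

```python
from math import comb


def residence_time_distribution(n_cells, p_near, t_bulk, t_near):
    """Return (travel_time, probability) pairs for a chain of n_cells unit cells.

    Simplifying assumption: each cell is traversed in the near-surface domain
    with probability p_near (taking t_near) and otherwise in the bulk fluid
    (taking t_bulk), independently of the other cells.
    """
    distribution = []
    for k in range(n_cells + 1):  # k = number of near-surface traversals
        time = k * t_near + (n_cells - k) * t_bulk
        prob = comb(n_cells, k) * p_near ** k * (1 - p_near) ** (n_cells - k)
        distribution.append((time, prob))
    return distribution


if __name__ == "__main__":
    for time, prob in residence_time_distribution(5, 0.2, t_bulk=1.0, t_near=10.0):
        print(f"t = {time:5.1f}  P = {prob:.4f}")
```

Because near-surface traversals are slow, the few sequences with many such traversals form the long-time tail of the distribution.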
Moghimbeigi, Abbas
2015-05-07
Poisson regression models provide a standard framework for quantitative trait locus (QTL) mapping of count traits. In practice, however, count traits are often over-dispersed relative to the Poisson distribution. In these situations, the zero-inflated Poisson (ZIP), zero-inflated generalized Poisson (ZIGP) and zero-inflated negative binomial (ZINB) regression may be useful for QTL mapping of count traits. Adding genetic variables to the negative binomial part of the equation may also affect the extra zeros. In this study, to overcome these challenges, I apply a two-part ZINB model. The EM algorithm, with the Newton-Raphson method in the M-step, is used to estimate the parameters. An application of the two-part ZINB model for QTL mapping is considered to detect associations between the formation of gallstone and the genotype of markers. Copyright © 2015 Elsevier Ltd. All rights reserved.
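For orientation, the zero-inflated negative binomial probability mass function can be sketched as follows. This is a covariate-free illustration under the NB2 parameterization (mean mu, dispersion alpha); the function and parameter names are hypothetical, and the two-part regression structure of the study is not reproduced.

```python
import math


def nb_log_pmf(y, mu, alpha):
    """Log pmf of the NB2 distribution with mean mu and variance mu + alpha*mu**2."""
    r = 1.0 / alpha
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu)))


def zinb_pmf(y, pi_zero, mu, alpha):
    """ZINB pmf: a structural zero with probability pi_zero, else an NB2 count."""
    nb = math.exp(nb_log_pmf(y, mu, alpha))
    structural = pi_zero if y == 0 else 0.0
    return structural + (1.0 - pi_zero) * nb


if __name__ == "__main__":
    # With mu = 2 and alpha = 0.5 the NB part gives P(0) = 0.25;
    # 30% structural zeros raise the total probability of zero to 0.475.
    print(zinb_pmf(0, pi_zero=0.3, mu=2.0, alpha=0.5))
```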
Salmerón, Diego; Cano, Juan A; Chirlaque, María D
2015-08-30
In cohort studies, binary outcomes are very often analyzed by logistic regression. However, it is well known that when the goal is to estimate a risk ratio, the logistic regression is inappropriate if the outcome is common. In these cases, a log-binomial regression model is preferable. On the other hand, the estimation of the regression coefficients of the log-binomial model is difficult owing to the constraints that must be imposed on these coefficients. Bayesian methods allow a straightforward approach for log-binomial regression models and produce smaller mean squared errors in the estimation of risk ratios than the frequentist methods, and the posterior inferences can be obtained using the software WinBUGS. However, Markov chain Monte Carlo methods implemented in WinBUGS can lead to large Monte Carlo errors in the approximations to the posterior inferences because they produce correlated simulations, and the accuracy of the approximations is inversely related to this correlation. To reduce correlation and to improve accuracy, we propose a reparameterization based on a Poisson model and a sampling algorithm coded in R. Copyright © 2015 John Wiley & Sons, Ltd.
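A small arithmetic sketch shows why the odds ratio from logistic regression is a poor stand-in for the risk ratio when the outcome is common. The risks used below are made-up illustrative values, not data from the study.

```python
def risk_ratio(p_exposed, p_unexposed):
    """Ratio of outcome probabilities (the target of log-binomial regression)."""
    return p_exposed / p_unexposed


def odds_ratio(p_exposed, p_unexposed):
    """Ratio of outcome odds (what a logistic regression coefficient measures)."""
    return (p_exposed / (1 - p_exposed)) / (p_unexposed / (1 - p_unexposed))


if __name__ == "__main__":
    # Rare outcome: the odds ratio is close to the risk ratio of 2.
    print(risk_ratio(0.02, 0.01), odds_ratio(0.02, 0.01))
    # Common outcome: the same risk ratio of 2 corresponds to an odds ratio of 3.5.
    print(risk_ratio(0.6, 0.3), odds_ratio(0.6, 0.3))
```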
Chen, Wansu; Shi, Jiaxiao; Qian, Lei; Azen, Stanley P
2014-06-26
To estimate relative risks or risk ratios for common binary outcomes, the most popular model-based methods are the robust (also known as modified) Poisson and the log-binomial regression. Of the two methods, it is believed that the log-binomial regression yields more efficient estimators because it is maximum likelihood based, while the robust Poisson model may be less affected by outliers. Evidence to support the robustness of robust Poisson models in comparison with log-binomial models is very limited. In this study a simulation was conducted to evaluate the performance of the two methods in several scenarios where outliers existed. The findings indicate that for data coming from a population where the relationship between the outcome and the covariate was in a simple form (e.g. log-linear), the two models yielded comparable biases and mean square errors. However, if the true relationship contained a higher order term, the robust Poisson models consistently outperformed the log-binomial models even when the level of contamination is low. The robust Poisson models are more robust (or less sensitive) to outliers compared to the log-binomial models when estimating relative risks or risk ratios for common binary outcomes. Users should be aware of the limitations when choosing appropriate models to estimate relative risks or risk ratios.
2014-01-01
Background Large-scale public health interventions with rapid scale-up are increasingly being implemented worldwide. Such implementation allows for a large target population to be reached in a short period of time. But when the time comes to investigate the effectiveness of these interventions, the rapid scale-up creates several methodological challenges, such as the lack of baseline data and the absence of control groups. One example of such an intervention is Avahan, the India HIV/AIDS initiative of the Bill & Melinda Gates Foundation. One question of interest is the effect of Avahan on condom use by female sex workers with their clients. By retrospectively reconstructing condom use and sex work history from survey data, it is possible to estimate how condom use rates evolve over time. However formal inference about how this rate changes at a given point in calendar time remains challenging. Methods We propose a new statistical procedure based on a mixture of binomial regression and Cox regression. We compare this new method to an existing approach based on generalized estimating equations through simulations and application to Indian data. Results Both methods are unbiased, but the proposed method is more powerful than the existing method, especially when initial condom use is high. When applied to the Indian data, the new method mostly agrees with the existing method, but seems to have corrected some implausible results of the latter in a few districts. We also show how the new method can be used to analyze the data of all districts combined. Conclusions The use of both methods can be recommended for exploratory data analysis. However for formal statistical inference, the new method has better power. PMID:24397563
Modeling abundance using N-mixture models: the importance of considering ecological mechanisms.
Joseph, Liana N; Elkin, Ché; Martin, Tara G; Possingham, Hugh P
2009-04-01
Predicting abundance across a species' distribution is useful for studies of ecology and biodiversity management. Modeling of survey data in relation to environmental variables can be a powerful method for extrapolating abundances across a species' distribution and, consequently, calculating total abundances and ultimately trends. Research in this area has demonstrated that models of abundance are often unstable and produce spurious estimates, and until recently our ability to remove detection error limited the development of accurate models. The N-mixture model accounts for detection and abundance simultaneously and has been a significant advance in abundance modeling. Case studies that have tested these new models have demonstrated success for some species, but doubt remains over the appropriateness of standard N-mixture models for many species. Here we develop the N-mixture model to accommodate zero-inflated data, a common occurrence in ecology, by employing zero-inflated count models. To our knowledge, this is the first application of this method to modeling count data. We use four variants of the N-mixture model (Poisson, zero-inflated Poisson, negative binomial, and zero-inflated negative binomial) to model abundance, occupancy (zero-inflated models only) and detection probability of six birds in South Australia. We assess models by their statistical fit and the ecological realism of the parameter estimates. Specifically, we assess the statistical fit with AIC and assess the ecological realism by comparing the parameter estimates with expected values derived from literature, ecological theory, and expert opinion. We demonstrate that, despite being frequently ranked the "best model" according to AIC, the negative binomial variants of the N-mixture often produce ecologically unrealistic parameter estimates. The zero-inflated Poisson variant is preferable to the negative binomial variants of the N-mixture, as it models an ecological mechanism rather than a
Najera-Zuloaga, Josu; Lee, Dae-Jin; Arostegui, Inmaculada
2017-01-01
Health-related quality of life has become an increasingly important indicator of health status in clinical trials and epidemiological research. Moreover, the study of the relationship of health-related quality of life with patients and disease characteristics has become one of the primary aims of many health-related quality of life studies. Health-related quality of life scores are usually assumed to be distributed as binomial random variables and are often highly skewed. The use of the beta-binomial distribution in the regression context has been proposed to model such data; however, the beta-binomial regression has been performed by means of two different approaches in the literature: (i) beta-binomial distribution with a logistic link; and (ii) hierarchical generalized linear models. No existing study in the analysis of health-related quality of life survey data has compared the two approaches in terms of adequacy and regression parameter interpretation. This paper is motivated by the analysis of a real data application of health-related quality of life outcomes in patients with Chronic Obstructive Pulmonary Disease, where the use of both approaches yields contradictory results in terms of the significance of covariate effects and consequently the interpretation of the most relevant factors in health-related quality of life. We present an explanation of the results in both methodologies through a simulation study and address the need to apply the proper approach in the analysis of health-related quality of life survey data for practitioners, providing an R package.
DEFF Research Database (Denmark)
Vilsen, Søren B.; Tvedebrink, Torben; Mogensen, Helle Smidt
2015-01-01
We present a model fitting the distribution of non-systematic errors in STR second generation sequencing, SGS, analysis. The model fits the distribution of non-systematic errors, i.e. the noise, using a one-inflated, zero-truncated, negative binomial model. The model is a two component model...
Shirazi, Mohammadali; Lord, Dominique; Dhavala, Soma Sekhar; Geedipally, Srinivas Reddy
2016-06-01
Crash data can often be characterized by over-dispersion, a heavy (long) tail and many observations with the value zero. Over the last few years, a small number of researchers have started developing and applying novel and innovative multi-parameter models to analyze such data. These multi-parameter models have been proposed for overcoming the limitations of the traditional negative binomial (NB) model, which cannot handle this kind of data efficiently. The research documented in this paper continues the work related to multi-parameter models. The objective of this paper is to document the development and application of a flexible NB generalized linear model with randomly distributed mixed effects characterized by the Dirichlet process (NB-DP) to model crash data. The objective of the study was accomplished using two datasets. The new model was compared to the NB and the recently introduced model based on the mixture of the NB and Lindley (NB-L) distributions. Overall, the research study shows that the NB-DP model offers a better performance than the NB model when data are over-dispersed and have a heavy tail. The NB-DP performed better than the NB-L when the dataset has a heavy tail, but a smaller percentage of zeros. However, both models performed similarly when the dataset contained a large number of zeros. In addition to a greater flexibility, the NB-DP provides a clustering by-product that allows the safety analyst to better understand the characteristics of the data, such as the identification of outliers and sources of dispersion. Copyright © 2016 Elsevier Ltd. All rights reserved.
Modelling of an homogeneous equilibrium mixture model
International Nuclear Information System (INIS)
Bernard-Champmartin, A.; Poujade, O.; Mathiaud, J.; Ghidaglia, J.M.
2014-01-01
We present here a model for two-phase flows which is simpler than the 6-equation models (with two densities, two velocities, two temperatures) but more accurate than the standard mixture models with 4 equations (with two densities, one velocity and one temperature). We are interested in the case when the two phases have been interacting long enough for the drag force to be small but still not negligible. The so-called Homogeneous Equilibrium Mixture Model (HEM) that we present deals with both mixture and relative quantities, allowing in particular both a mixture velocity and a relative velocity to be followed. This relative velocity is not tracked by a conservation law but by a closure law (drift relation), whose expression is related to the drag force terms of the two-phase flow. After the derivation of the model, a stability analysis and numerical experiments are presented. (authors)
Directory of Open Access Journals (Sweden)
Paweł Mielcarz
2007-06-01
Full Text Available The article presents a case study of the valuation of real options included in an investment project. The main goal of the article is to present the calculation and methodological issues involved in applying real option valuation. To this end, the binomial model and the Marketed Asset Disclaimer methodology are used. The project presented in the article concerns the introduction of a radio station to a new market. It includes two valuable real options: to abandon the project and to expand.
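A minimal sketch of how a binomial lattice can value an option to abandon is shown below. It assumes a Cox-Ross-Rubinstein tree for the project value and a fixed salvage value available at every node; these modelling choices, the function name, and all numbers are hypothetical and are not taken from the article's case study.

```python
import math


def project_value_with_abandonment(v0, sigma, r, years, steps, salvage):
    """Value a project that can be abandoned for `salvage` at any tree node.

    Assumes a Cox-Ross-Rubinstein lattice: up factor u = exp(sigma*sqrt(dt)),
    down factor d = 1/u, and risk-neutral probability q of an up move.
    """
    dt = years / steps
    u = math.exp(sigma * math.sqrt(dt))
    d = 1.0 / u
    q = (math.exp(r * dt) - d) / (u - d)
    disc = math.exp(-r * dt)
    # Terminal nodes: keep the project or take the salvage value.
    values = [max(v0 * u ** j * d ** (steps - j), salvage) for j in range(steps + 1)]
    # Backward induction: compare continuation with abandonment at each node.
    for n in range(steps - 1, -1, -1):
        values = [max(disc * (q * values[j + 1] + (1 - q) * values[j]), salvage)
                  for j in range(n + 1)]
    return values[0]


if __name__ == "__main__":
    base = project_value_with_abandonment(100.0, 0.3, 0.05, 2.0, 50, salvage=0.0)
    flexible = project_value_with_abandonment(100.0, 0.3, 0.05, 2.0, 50, salvage=90.0)
    print("value of the abandonment option:", flexible - base)
```

With a salvage value of zero the lattice returns the initial project value, so the difference between the two runs isolates the value added by the flexibility to abandon.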
DEFF Research Database (Denmark)
Elmasry, Amr; Jensen, Claus; Katajainen, Jyrki
2017-01-01
the (total) number of elements stored in the data structure(s) prior to the operation. As the resulting data structure consists of two components that are different variants of binomial heaps, we call it a bipartite binomial heap. Compared to its counterpart, a multipartite binomial heap, the new structure...
Jakaitiene, Audrone; Avino, Mariano; Guarracino, Mario Rosario
2017-04-01
Despite diminishing costs, next-generation sequencing (NGS) remains expensive for studies with a large number of individuals. To save costs, pools containing multiple samples can be sequenced instead. Currently, many software packages are available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and the data analyzed, indicating that all software has room for improvement. We use a beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with the ability to be specific up to two patients affected by neuromuscular disorders (NMD). We assessed the results by comparison with The Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying the beta-binomial model is accurate in predicting rare mutations at 0.01 fraction. Finally, we explored the concordance of mutations between the model and software, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which the functional analysis produced enriched terms related to locomotion and musculature.
Nonparametric Mixture of Regression Models.
Huang, Mian; Li, Runze; Wang, Shaoli
2013-07-01
Motivated by an analysis of US house price index data, we propose nonparametric finite mixture of regression models. We study the identifiability issue of the proposed models, and develop an estimation procedure by employing kernel regression. We further systematically study the sampling properties of the proposed estimators, and establish their asymptotic normality. A modified EM algorithm is proposed to carry out the estimation procedure. We show that our algorithm preserves the ascent property of the EM algorithm in an asymptotic sense. Monte Carlo simulations are conducted to examine the finite sample performance of the proposed estimation procedure. An empirical analysis of the US house price index data is illustrated for the proposed methodology.
Batra, Manu; Shah, Aasim Farooq; Rajput, Prashant; Shah, Ishrat Aasim
2016-01-01
Dental caries among children has been described as a pandemic disease with a multifactorial nature. Various sociodemographic factors and oral hygiene practices are commonly tested for their influence on dental caries. In recent years, a statistical model that allows for covariate adjustment has been developed, commonly referred to as the zero-inflated negative binomial (ZINB) model. To compare the fit of two models, the conventional linear regression (LR) model and the ZINB model, in assessing the risk factors associated with dental caries. A cross-sectional survey was conducted on 1138 12-year-old school children in Moradabad Town, Uttar Pradesh during the months of February-August 2014. Selected participants were interviewed using a questionnaire. Dental caries was assessed by recording the decayed, missing, or filled teeth (DMFT) index. To assess the risk factors associated with dental caries in children, two approaches were applied: the LR model and the ZINB model. The prevalence of caries-free subjects was 24.1%, and mean DMFT was 3.4 ± 1.8. In the LR model, all the variables were statistically significant. In the ZINB model, by contrast, the negative binomial part showed place of residence, father's education level, tooth brushing frequency, and dental visit to be statistically significant, implying that the degree of being caries-free (DMFT = 0) increases for children who live in urban areas, whose father is a university graduate, who brush twice a day, and who have ever visited a dentist. The current study reports that the LR model is a poorly fitted model and may lead to spurious conclusions, whereas the ZINB model has shown better goodness of fit (Akaike information criterion values - LR: 3.94; ZINB: 2.39) and can be preferred if high variance and an excess of zeroes are present.
Zheng, Han; Kimber, Alan; Goodwin, Victoria A; Pickering, Ruth M
2018-01-01
A common design for a falls prevention trial is to assess falling at baseline, randomize participants into an intervention or control group, and ask them to record the number of falls they experience during a follow-up period of time. This paper addresses how best to include the baseline count in the analysis of the follow-up count of falls in negative binomial (NB) regression. We examine the performance of various approaches in simulated datasets where both counts are generated from a mixed Poisson distribution with shared random subject effect. Including the baseline count after log-transformation as a regressor in NB regression (NB-logged) or as an offset (NB-offset) resulted in greater power than including the untransformed baseline count (NB-unlogged). Cook and Wei's conditional negative binomial (CNB) model replicates the underlying process generating the data. In our motivating dataset, a statistically significant intervention effect resulted from the NB-logged, NB-offset, and CNB models, but not from NB-unlogged, and large, outlying baseline counts were overly influential in NB-unlogged but not in NB-logged. We conclude that there is little to lose by including the log-transformed baseline count in standard NB regression compared to CNB for moderate to larger sized datasets. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Consistency of the MLE under mixture models
Chen, Jiahua
2016-01-01
The large-sample properties of likelihood-based statistical inference under mixture models have received much attention from statisticians. Although the consistency of the nonparametric MLE is regarded as a standard conclusion, many researchers ignore the precise conditions required on the mixture model. An incorrect claim of consistency can lead to false conclusions even if the mixture model under investigation seems well behaved. Under a finite normal mixture model, for instance, the consis...
Beta-binomial regression and bimodal utilization.
Liu, Chuan-Fen; Burgess, James F; Manning, Willard G; Maciejewski, Matthew L
2013-10-01
To illustrate how the analysis of bimodal U-shaped distributed utilization can be modeled with beta-binomial regression, which is rarely used in health services research. Veterans Affairs (VA) administrative data and Medicare claims in 2001-2004 for 11,123 Medicare-eligible VA primary care users in 2000. We compared means and distributions of VA reliance (the proportion of all VA/Medicare primary care visits occurring in VA) predicted from beta-binomial, binomial, and ordinary least-squares (OLS) models. Beta-binomial model fits the bimodal distribution of VA reliance better than binomial and OLS models due to the nondependence on normality and the greater flexibility in shape parameters. Increased awareness of beta-binomial regression may help analysts apply appropriate methods to outcomes with bimodal or U-shaped distributions. © Health Research and Educational Trust.
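The flexibility in shape that the abstract attributes to the beta-binomial distribution can be seen in a short sketch of its probability mass function. The shape parameters a and b below are illustrative values chosen to produce a U-shape; they are not estimates from the VA data.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(K = k) when K | p ~ Binomial(n, p) and p ~ Beta(a, b)."""
    log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_comb + log_beta(k + a, n - k + b) - log_beta(a, b))


if __name__ == "__main__":
    # a, b < 1 pushes mass toward 0 and n, e.g. patients who use almost
    # only one of two care systems for their n = 10 visits.
    for k in range(11):
        print(k, round(beta_binomial_pmf(k, 10, 0.5, 0.5), 4))
```

A plain binomial model with the same mean cannot reproduce the two modes at the extremes, which is the motivation given in the abstract for preferring the beta-binomial.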
Ali, Asad; Zaidi, Farrah; Fatima, Syeda Hira; Adnan, Muhammad; Ullah, Saleem
2018-03-24
In this study, we propose to develop a geostatistical computational framework to model the distribution of rat bite infestation of epidemic proportion in Peshawar valley, Pakistan. Two species Rattus norvegicus and Rattus rattus are suspected to spread the infestation. The framework combines strengths of maximum entropy algorithm and binomial kriging with logistic regression to spatially model the distribution of infestation and to determine the individual role of environmental predictors in modeling the distribution trends. Our results demonstrate the significance of a number of social and environmental factors in rat infestations such as (I) high human population density; (II) greater dispersal ability of rodents due to the availability of better connectivity routes such as roads, and (III) temperature and precipitation influencing rodent fecundity and life cycle.
Kondo, Yumi; Zhao, Yinshan; Petkau, John
2015-06-15
We develop a new modeling approach to enhance a recently proposed method to detect increases of contrast-enhancing lesions (CELs) on repeated magnetic resonance imaging, which have been used as an indicator for potential adverse events in multiple sclerosis clinical trials. The method signals patients with unusual increases in CEL activity by estimating the probability of observing CEL counts as large as those observed on a patient's recent scans conditional on the patient's CEL counts on previous scans. This conditional probability index (CPI), computed based on a mixed-effect negative binomial regression model, can vary substantially depending on the choice of distribution for the patient-specific random effects. Therefore, we relax this parametric assumption to model the random effects with an infinite mixture of beta distributions, using the Dirichlet process, which effectively allows any form of distribution. To our knowledge, no previous literature considers a mixed-effect regression for longitudinal count variables where the random effect is modeled with a Dirichlet process mixture. As our inference is in the Bayesian framework, we adopt a meta-analytic approach to develop an informative prior based on previous clinical trials. This is particularly helpful at the early stages of trials when less data are available. Our enhanced method is illustrated with CEL data from 10 previous multiple sclerosis clinical trials. Our simulation study shows that our procedure estimates the CPI more accurately than parametric alternatives when the patient-specific random effect distribution is misspecified and that an informative prior improves the accuracy of the CPI estimates. Copyright © 2015 John Wiley & Sons, Ltd.
Directory of Open Access Journals (Sweden)
Federica Giardina
Full Text Available The Research Center for Human Development in Dakar (CRDH) with the technical assistance of ICF Macro and the National Malaria Control Programme (NMCP) conducted in 2008/2009 the Senegal Malaria Indicator Survey (SMIS), the first nationally representative household survey collecting parasitological data and malaria-related indicators. In this paper, we present spatially explicit parasitaemia risk estimates and number of infected children below 5 years. Geostatistical Zero-Inflated Binomial models (ZIB) were developed to take into account the large number of zero-prevalence survey locations (70%) in the data. Bayesian variable selection methods were incorporated within a geostatistical framework in order to choose the best set of environmental and climatic covariates associated with the parasitaemia risk. Model validation confirmed that the ZIB model had a better predictive ability than the standard Binomial analogue. Markov chain Monte Carlo (MCMC) methods were used for inference. Several insecticide treated nets (ITN) coverage indicators were calculated to assess the effectiveness of interventions. After adjusting for climatic and socio-economic factors, the presence of at least one ITN per every two household members and living in urban areas reduced the odds of parasitaemia by 86% and 81% respectively. Posterior estimates of the ORs related to the wealth index show a decreasing trend with the quintiles. Infection odds appear to be increasing with age. The population-adjusted prevalence ranges from 0.12% in Thillé-Boubacar to 13.1% in Dabo. Tambacounda has the highest population-adjusted predicted prevalence (8.08%) whereas the region with the highest estimated number of infected children under the age of 5 years is Kolda (13940). The contemporary map and estimates of malaria burden identify the priority areas for future control interventions and provide baseline information for monitoring and evaluation. Zero-Inflated formulations are more appropriate
Jumps in binomial AR(1) processes
Weiß, Christian H.
2009-01-01
Abstract We consider the binomial AR(1) model for serially dependent processes of binomial counts. After a review of its definition and known properties, we investigate marginal and serial properties of jumps in such processes. Based on these results, we propose the jumps control chart for monitoring a binomial AR(1) process. We show how to evaluate the performance of this control chart and give design recommendations.
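A short simulation sketch of a binomial AR(1) process and its jumps follows. It assumes the usual thinning-based definition, X_t = alpha∘X_{t-1} + beta∘(n - X_{t-1}) with beta = pi*(1 - rho) and alpha = beta + rho, so that the marginal distribution is Binomial(n, pi); the function names and parameter values are hypothetical.

```python
import random


def binomial_draw(rng, trials, prob):
    """Draw from Binomial(trials, prob) as a sum of Bernoulli trials."""
    return sum(rng.random() < prob for _ in range(trials))


def simulate_binomial_ar1(n, pi, rho, length, seed=0):
    """Simulate X_t = alpha o X_{t-1} + beta o (n - X_{t-1}) via binomial thinning."""
    rng = random.Random(seed)
    beta = pi * (1 - rho)
    alpha = beta + rho
    x = binomial_draw(rng, n, pi)  # start in the stationary distribution
    path = [x]
    for _ in range(length - 1):
        survivors = binomial_draw(rng, x, alpha)      # units staying in state 1
        arrivals = binomial_draw(rng, n - x, beta)    # units switching to state 1
        x = survivors + arrivals
        path.append(x)
    return path


def jumps(path):
    """Jumps J_t = X_t - X_{t-1}, the quantity a jumps control chart monitors."""
    return [b - a for a, b in zip(path, path[1:])]


if __name__ == "__main__":
    path = simulate_binomial_ar1(n=10, pi=0.4, rho=0.5, length=20)
    print(path)
    print(jumps(path))
```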
Kuss, Oliver; Hoyer, Annika; Solms, Alexander
2014-01-15
There are still challenges when meta-analyzing data from studies on diagnostic accuracy. This is mainly due to the bivariate nature of the response where information on sensitivity and specificity must be summarized while accounting for their correlation within a single trial. In this paper, we propose a new statistical model for the meta-analysis of diagnostic accuracy studies. This model uses beta-binomial distributions for the marginal numbers of true positives and true negatives and links these margins by a bivariate copula distribution. The new model comes with all the features of the current standard model, a bivariate logistic regression model with random effects, but has the additional advantages of a closed likelihood function and a larger flexibility for the correlation structure of sensitivity and specificity. In a simulation study, which compares three copula models and two implementations of the standard model, the Plackett and the Gauss copula rarely perform worse but frequently perform better than the standard model. For illustration, we use an example from a meta-analysis to judge the diagnostic accuracy of telomerase (a urinary tumor marker) for the diagnosis of primary bladder cancer. Copyright © 2013 John Wiley & Sons, Ltd.
Shirazi, Mohammadali; Dhavala, Soma Sekhar; Lord, Dominique; Geedipally, Srinivas Reddy
2017-10-01
Safety analysts usually use post-modeling methods, such as the Goodness-of-Fit statistics or the Likelihood Ratio Test, to decide between two or more competitive distributions or models. Such metrics require all competitive distributions to be fitted to the data before any comparisons can be accomplished. Given the continuous growth in introducing new statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, in addition to all theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuition as to why a specific distribution (or model) is preferred over another (Goodness-of-Logic). This paper addresses these issues by proposing a methodology to design heuristics for Model Selection based on the characteristics of data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools: (1) Monte-Carlo Simulations and (2) Machine Learning Classifiers, to design easy heuristics to predict the label of the 'most-likely-true' distribution for analyzing data. The proposed methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two distributions, given a set of prescribed summary statistics of data. The proposed heuristics were successfully compared against classical tests for several real or observed datasets. Not only are they easy to use without needing any post-modeling inputs, but, using these heuristics, the analyst can also attain useful information about why the NB-L is preferred over the NB, or vice versa, when modeling data. Copyright © 2017 Elsevier Ltd. All rights reserved.
Mixture Modeling: Applications in Educational Psychology
Harring, Jeffrey R.; Hodis, Flaviu A.
2016-01-01
Model-based clustering methods, commonly referred to as finite mixture modeling, have been applied to a wide variety of cross-sectional and longitudinal data to account for heterogeneity in population characteristics. In this article, we elucidate 2 such approaches: growth mixture modeling and latent profile analysis. Both techniques are…
Stochastic background of negative binomial distribution
International Nuclear Information System (INIS)
Suzuki, N.; Biyajima, M.; Wilk, G.
1991-01-01
A branching equation of the birth process with immigration is taken as a model for the particle production process. Using it, we investigate cases in which its solution becomes the negative binomial distribution. Furthermore, we compare our approach with the modified negative binomial distribution proposed recently by Chliapnikov and Tchikilev and use it to analyse the observed multiplicity distributions. (orig.)
Examples of mixed-effects modeling with crossed random effects and with binomial data
Quené, H.; van den Bergh, H.
2008-01-01
Psycholinguistic data are often analyzed with repeated-measures analyses of variance (ANOVA), but this paper argues that mixed-effects (multilevel) models provide a better alternative method. First, models are discussed in which the two random factors of participants and items are crossed, and not
What do you do when the binomial cannot value real options? The LSM model
Directory of Open Access Journals (Sweden)
S. Alonso
2014-12-01
The Least-Squares Monte Carlo model (LSM model) has emerged as the derivative valuation technique with the greatest impact in current practice. As with other options valuation models, the LSM algorithm was initially posited in the field of financial derivatives, and its extension to the realm of real options requires considering certain questions which might hinder understanding of the algorithm and which the present paper seeks to address. The implementation of the LSM model combines Monte Carlo simulation, dynamic programming and statistical regression in a flexible procedure suitable for valuing nearly all types of corporate investments. The goal of this paper is to show how the LSM algorithm is applied in the context of a corporate investment, thus contributing to the understanding of the principles of its operation.
Rusli, Rusdi; Haque, Md Mazharul; King, Mark; Voon, Wong Shaw
2017-05-01
Mountainous highways are generally associated with complex driving environments because of constrained road geometries, limited cross-section elements, inappropriate roadside features, and adverse weather conditions. As a result, single-vehicle (SV) crashes are overrepresented along mountainous roads, particularly in developing countries, but little is known about the roadway geometric, traffic and weather factors contributing to these SV crashes. As such, the main objective of the present study is to investigate SV crashes using detailed data obtained from a rigorous site survey and existing databases. The final dataset included a total of 56 variables representing road geometries including horizontal and vertical alignment, traffic characteristics, real-time weather condition, cross-sectional elements, roadside features, and spatial characteristics. To account for structured heterogeneities resulting from multiple observations within a site and other unobserved heterogeneities, the study applied a random parameters negative binomial model. Results suggest that rainfall during the crash is positively associated with SV crashes, but real-time visibility is negatively associated. The presence of a road shoulder, particularly a bitumen shoulder or wider shoulders, along mountainous highways is associated with fewer SV crashes. While speeding along downgrade slopes increases the likelihood of SV crashes, proper delineation decreases the likelihood. Findings of this study have significant implications for designing safer highways in mountainous areas, particularly in the context of a developing country. Copyright © 2017 Elsevier Ltd. All rights reserved.
Assefa, Enyew; Tadesse, Mekonnen
2017-08-01
The major causes for poor health in developing countries are inadequate access and under-use of modern health care services. The objective of this study was to identify and examine factors related to the use of antenatal care services using the 2011 Ethiopia Demographic and Health Survey data. The number of antenatal care visits during the last pregnancy by mothers aged 15 to 49 years (n = 7,737) was analyzed. More than 55% of the mothers did not use antenatal care (ANC) services, while more than 22% of the women used antenatal care services less than four times. More than half of the women (52%) who had access to health services had at least four antenatal care visits. The zero-inflated negative binomial model was found to be more appropriate for analyzing the data. Place of residence, age of mothers, woman's educational level, employment status, mass media exposure, religion, and access to health services were significantly associated with the use of antenatal care services. Accordingly, there should be progress toward a health-education program that enables more women to utilize ANC services, with the program targeting women in rural areas, uneducated women, and mothers with higher birth orders through appropriate media.
Directory of Open Access Journals (Sweden)
Xuedong Yan
2012-01-01
In this study, the traffic crash rate, total crash frequency, and injury and fatal crash frequency were taken into consideration for distinguishing between rural and urban road segment safety. GIS-based crash data covering four and a half years in the Pikes Peak Area, US, were used for the analyses. The comparative statistical results show that the crash rates in rural segments are consistently lower than in urban segments. Further, the regression results based on Zero-Inflated Negative Binomial (ZINB) regression models indicate that the urban areas have a higher crash risk in terms of both total crash frequency and injury and fatal crash frequency, compared to rural areas. Additionally, it is found that crash frequencies increase as traffic volume and segment length increase, though higher traffic volume lowers the likelihood of severe crash occurrence; compared to 2-lane roads, 4-lane roads have lower crash frequencies but a higher probability of severe crash occurrence; and better road facilities with higher free flow speed benefit from high-standard design features, resulting in a lower total crash frequency, but they cannot mitigate the severe crash risk.
International Nuclear Information System (INIS)
Hernandez I, S.; Ortiz C, E.; Chavez M, C.
2011-11-01
At present, it is an unquestionable fact that nuclear electrical energy is a topic of vital importance, not only because it eliminates dependence on hydrocarbons and is environmentally friendly, but also because it is a safe and reliable energy source and represents a viable alternative given the growing demand for electricity in Mexico. Against this backdrop, several scenarios were proposed to raise the capacity of nuclear electricity generation with a variable participation. One of the scenarios considered is the expansion project of the Laguna Verde nuclear power plant through the addition of a third reactor, which would serve as the trigger of an integral program that proposes the installation of more nuclear reactors in the country. Given this possible scenario, the Federal Commission of Electricity, as the body responsible for supplying energy to the population, should have tools that give it the flexibility to adapt to the changes that may arise over the course of the project and that also assign a value to future risk. The Real Options methodology with the binomial model was proposed as an evaluation tool that makes it possible to quantify the value of the expansion proposal, demonstrating the feasibility of the project through a periodic view of its evolution, with the objective of supplying a financial analysis that serves as a basis and justification ahead of the expected rise of nuclear energy in future years. (Author)
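The binomial model named in the abstract above can be illustrated with a generic lattice valuation. The sketch below is not the authors' calculation: it assumes a Cox-Ross-Rubinstein tree and treats the option to expand as a European call on the project value, and every name and number in it (`crr_call`, the spot value, volatility, and so on) is hypothetical.

```python
import math


def crr_call(spot, strike, rate, sigma, maturity, steps):
    """Value a European call on a Cox-Ross-Rubinstein binomial tree.

    In a real-options reading (an assumption made here), `spot` is the
    present value of the project and `strike` is the investment cost.
    """
    dt = maturity / steps
    up = math.exp(sigma * math.sqrt(dt))      # up factor per step
    down = 1.0 / up                           # down factor per step
    growth = math.exp(rate * dt)              # risk-free growth per step
    q = (growth - down) / (up - down)         # risk-neutral up probability

    # Payoffs at maturity, indexed by the number of up moves j.
    values = [max(spot * up**j * down**(steps - j) - strike, 0.0)
              for j in range(steps + 1)]

    # Roll back through the tree, discounting expected values.
    for _ in range(steps):
        values = [(q * values[j + 1] + (1.0 - q) * values[j]) / growth
                  for j in range(len(values) - 1)]
    return values[0]


# Hypothetical inputs, chosen only to exercise the function.
option_value = crr_call(spot=100.0, strike=100.0, rate=0.05,
                        sigma=0.2, maturity=1.0, steps=500)
```

Rolling the tree back period by period is what gives the "periodic view" of the project's evolution: each intermediate list of values is the option value at every node of that period.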
Mathematical finance theory review and exercises from binomial model to risk measures
Gianin, Emanuela Rosazza
2013-01-01
The book collects over 120 exercises on different subjects of Mathematical Finance, including Option Pricing, Risk Theory, and Interest Rate Models. Many of the exercises are solved, while others are only proposed. Every chapter contains an introductory section illustrating the main theoretical results necessary to solve the exercises. The book is intended as an exercise textbook to accompany graduate courses in mathematical finance offered at many universities as part of degree programs in Applied and Industrial Mathematics, Mathematical Engineering, and Quantitative Finance.
Hilpert, Markus; Johnson, William P.
2018-01-01
We used a recently developed simple mathematical network model to upscale pore-scale colloid transport information determined under unfavorable attachment conditions. Classical log-linear and nonmonotonic retention profiles, both well-reported under favorable and unfavorable attachment conditions, respectively, emerged from our upscaling. The primary attribute of the network is colloid transfer between bulk pore fluid, the near-surface fluid domain (NSFD), and attachment (treated as irreversible). The network model accounts for colloid transfer to the NSFD of downgradient grains and for reentrainment to bulk pore fluid via diffusion or via expulsion at rear flow stagnation zones (RFSZs). The model describes colloid transport by a sequence of random trials in a one-dimensional (1-D) network of Happel cells, which contain a grain and a pore. Using combinatorial analysis that capitalizes on the binomial coefficient, we derived from the pore-scale information the theoretical residence time distribution of colloids in the network. The transition from log-linear to nonmonotonic retention profiles occurs when the conditions underlying classical filtration theory are not fulfilled, i.e., when an NSFD colloid population is maintained. Then, nonmonotonic retention profiles result potentially both for attached and NSFD colloids. The concentration maxima shift downgradient depending on specific parameter choice. The concentration maxima were also shown to shift downgradient temporally (with continued elution) under conditions where attachment is negligible, explaining experimentally observed downgradient transport of retained concentration maxima of adhesion-deficient bacteria. For the case of zero reentrainment, we develop closed-form, analytical expressions for the shape, and the maximum of the colloid retention profile.
Directory of Open Access Journals (Sweden)
Alfonso Palmer
2010-07-01
Alcohol is currently the most consumed substance among the Spanish adolescent population. Some of the variables that influence this consumption include ease of access, use of alcohol by friends and some personality factors. The aim of this study was to analyze and quantify the predictive value of these variables specifically on alcohol consumption in the adolescent population. The usable sample was made up of 6,145 adolescents (49.8% boys and 50.2% girls) with a mean age of 15.4 years (SE = 1.2). The data were analyzed using the statistical model for a count variable and Data Mining techniques. The results show the influence of ease of access, alcohol consumption by the group of friends, and certain personality factors on alcohol intake, allowing us to quantify the intensity of this influence according to age and gender. Knowing these factors is the starting point in elaborating specific preventive actions against alcohol consumption.
On modeling of structured multiphase mixtures
International Nuclear Information System (INIS)
Dobran, F.
1987-01-01
The usual modeling of multiphase mixtures involves a set of conservation and balance equations of mass, momentum, energy and entropy (the basic set) constructed by an averaging procedure or postulated. The averaged models are constructed by averaging, over space or time segments, the local macroscopic field equations of each phase, whereas the postulated models are usually motivated by the single phase multicomponent mixture models. In both situations, the resulting equations yield superimposed continua models and are closed by the constitutive equations which place restrictions on the possible material response during the motion and phase change. In modeling structured multiphase mixtures, the intrinsic motion of grains or particles is modeled by adjoining additional balance equations to the basic set of field equations, thereby placing restrictions on the motion of phases only within the imposed extrinsic and intrinsic sources. Additional balance equations have been advocated primarily in the postulatory theories of multiphase mixtures and are usually derived through very special assumptions about the material deformation. Nevertheless, the resulting mixture models can predict a wide variety of complex phenomena such as the Mohr-Coulomb yield criterion in granular media, the Rayleigh bubble equation, wave dispersion and dilatancy. Fundamental to the construction of structured models of multiphase mixtures are the problems pertaining to the existence and number of additional balance equations to model the structural characteristics of a mixture. Utilizing a volume averaging procedure, it is possible to derive not only the basic set of field equations discussed above, but also a very general set of additional balance equations for modeling the structural properties of the mixture.
Model structure selection in convolutive mixtures
DEFF Research Database (Denmark)
Dyrholm, Mads; Makeig, S.; Hansen, Lars Kai
2006-01-01
The CICAAR algorithm (convolutive independent component analysis with an auto-regressive inverse model) allows separation of white (i.i.d) source signals from convolutive mixtures. We introduce a source color model as a simple extension to the CICAAR which allows for a more parsimonious representation in many practical mixtures. The new filter-CICAAR allows Bayesian model selection and can help answer questions like: 'Are we actually dealing with a convolutive mixture?'. We try to answer this question for EEG data.
Directory of Open Access Journals (Sweden)
Silva-Aguilar Martín
2011-01-01
Metals are ubiquitous pollutants present as mixtures. In particular, the arsenic-cadmium-lead mixture is among the leading toxic agents detected in the environment. These metals have carcinogenic and cell-transforming potential. In this study, we used a two-step cell transformation model to determine the role of oxidative stress in transformation induced by a mixture of arsenic-cadmium-lead. Oxidative damage and antioxidant response were determined. Metal mixture treatment induced an increase in damage markers and in the antioxidant response. Loss of cell viability and increased transforming potential were observed during the promotion phase. This finding correlated significantly with generation of reactive oxygen species. Cotreatment with N-acetyl-cysteine affected the transforming capacity: a reduction was found in the initiation phase, while in the promotion phase the transforming capacity was blocked completely. Our results suggest that oxidative stress generated by the metal mixture plays an important role only in the promotion phase, where it promotes transforming capacity.
Probabilistic mixture-based image modelling
Czech Academy of Sciences Publication Activity Database
Haindl, Michal; Havlíček, Vojtěch; Grim, Jiří
2011-01-01
Vol. 47, No. 3 (2011), pp. 482-500 ISSN 0023-5954 R&D Projects: GA MŠk 1M0572; GA ČR GA102/08/0593 Grant - others:CESNET(CZ) 387/2010; GA MŠk(CZ) 2C06019; GA ČR(CZ) GA103/11/0335 Institutional research plan: CEZ:AV0Z10750506 Keywords : BTF texture modelling * discrete distribution mixtures * Bernoulli mixture * Gaussian mixture * multi-spectral texture modelling Subject RIV: BD - Theory of Information Impact factor: 0.454, year: 2011 http://library.utia.cas.cz/separaty/2011/RO/haindl-0360244.pdf
Zero inflated negative binomial-Sushila distribution and its application
Yamrubboon, Darika; Thongteeraparp, Ampai; Bodhisuwan, Winai; Jampachaisri, Katechan
2017-11-01
A new zero inflated distribution is proposed in this work, namely the zero inflated negative binomial-Sushila distribution. The new distribution which is a mixture of the Bernoulli and negative binomial-Sushila distributions is an alternative distribution for the excessive zero counts and over-dispersion. Some characteristics of the proposed distribution are derived including probability mass function, mean and variance. The parameter estimation of the zero inflated negative binomial-Sushila distribution is also implemented by maximum likelihood method. In application, the proposed distribution can provide a better fit than traditional distributions: zero inflated Poisson and zero inflated negative binomial distributions.
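The zero-inflation construction described above, a Bernoulli gate mixed with a count distribution, can be sketched for the ordinary negative binomial. This is not the negative binomial-Sushila distribution itself, whose probability mass function is not given here; the function names and parameter values are assumptions chosen for illustration.

```python
import math


def nb_pmf(y, r, p):
    """Negative binomial pmf: P(Y = y) for y failures before the r-th success.

    r may be any positive real; p is the success probability.
    """
    log_coef = math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
    return math.exp(log_coef + r * math.log(p) + y * math.log(1.0 - p))


def zinb_pmf(y, pi, r, p):
    """Zero-inflated negative binomial pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from the negative binomial, which can also produce zeros.
    """
    base = (1.0 - pi) * nb_pmf(y, r, p)
    return pi + base if y == 0 else base


# Hypothetical parameters: 30% structural zeros.
PI, R, P = 0.3, 2.5, 0.4
total = sum(zinb_pmf(y, PI, R, P) for y in range(400))
mean = sum(y * zinb_pmf(y, PI, R, P) for y in range(400))
```

The mean of the mixture is (1 - pi) times the negative binomial mean r(1 - p)/p, so the inflation parameter lowers the mean while raising the share of zeros.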
Najaf, Pooya; Duddu, Venkata R; Pulugurtha, Srinivas S
2018-03-01
Machine learning (ML) techniques have higher prediction accuracy compared to conventional statistical methods for crash frequency modelling. However, their black-box nature limits the interpretability. The objective of this research is to combine both ML and statistical methods to develop hybrid link-level crash frequency models with high predictability and interpretability. For this purpose, M5' model trees method (M5') is introduced and applied to classify the crash data and then calibrate a model for each homogenous class. The data for 1134 and 345 randomly selected links on urban arterials in the city of Charlotte, North Carolina was used to develop and validate models, respectively. The outputs from the hybrid approach are compared with the outputs from cluster-based negative binomial regression (NBR) and general NBR models. Findings indicate that M5' has high predictability and is very reliable to interpret the role of different attributes on crash frequency compared to other developed models.
Modeling text with generalizable Gaussian mixtures
DEFF Research Database (Denmark)
Hansen, Lars Kai; Sigurdsson, Sigurdur; Kolenda, Thomas
2000-01-01
We apply and discuss generalizable Gaussian mixture (GGM) models for text mining. The model automatically adapts model complexity for a given text representation. We show that the generalizability of these models depends on the dimensionality of the representation and the sample size. We discuss...
Pooling overdispersed binomial data to estimate event rate.
Young-Xu, Yinong; Chan, K Arnold
2008-08-19
The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
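The reported parameters can be checked against the summary event rate with a few lines of code. The sketch below uses only the standard formulas for the beta-binomial distribution (mean rate alpha / (alpha + beta)); the function name and the arm size of 50 are assumptions, and the reported alpha and beta are rounded, so the result matches the published 3.44% only approximately.

```python
import math


def beta_binomial_pmf(k, n, alpha, beta):
    """P(K = k) events out of n when the event rate follows Beta(alpha, beta)."""
    def log_beta(a, b):
        return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    log_choose = (math.lgamma(n + 1) - math.lgamma(k + 1)
                  - math.lgamma(n - k + 1))
    return math.exp(log_choose
                    + log_beta(k + alpha, n - k + beta)
                    - log_beta(alpha, beta))


ALPHA, BETA = 1.24, 34.73           # values reported in the abstract

# Mean of the underlying Beta distribution: the pooled event rate.
summary_rate = ALPHA / (ALPHA + BETA)

# Intra-class correlation, the usual measure of overdispersion.
overdispersion = 1.0 / (ALPHA + BETA + 1.0)

# Distribution of event counts in a hypothetical arm of 50 patients.
arm_size = 50
pmf = [beta_binomial_pmf(k, arm_size, ALPHA, BETA) for k in range(arm_size + 1)]
```

A plain binomial model corresponds to an intra-class correlation of zero; a positive value means the arms vary more than a single common rate would allow.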
Pooling overdispersed binomial data to estimate event rate
Directory of Open Access Journals (Sweden)
Chan K Arnold
2008-08-01
Background: The beta-binomial model is one of the methods that can be used to validly combine event rates from overdispersed binomial data. Our objective is to provide a full description of this method and to update and broaden its applications in clinical and public health research. Methods: We describe the statistical theories behind the beta-binomial model and the associated estimation methods. We supply information about statistical software that can provide beta-binomial estimations. Using a published example, we illustrate the application of the beta-binomial model when pooling overdispersed binomial data. Results: In an example regarding the safety of oral antifungal treatments, we had 41 treatment arms with event rates varying from 0% to 13.89%. Using the beta-binomial model, we obtained a summary event rate of 3.44% with a standard error of 0.59%. The parameters of the beta-binomial model took the values of 1.24 for alpha and 34.73 for beta. Conclusion: The beta-binomial model can provide a robust estimate for the summary event rate by pooling overdispersed binomial data from different studies. The explanation of the method and the demonstration of its applications should help researchers incorporate the beta-binomial method as they aggregate probabilities of events from heterogeneous studies.
Turi, Christina E; Murch, Susan J
2013-07-09
Ethnobotanical research and the study of plants used for rituals, ceremonies and to connect with the spirit world have led to the discovery of many novel psychoactive compounds such as nicotine, caffeine, and cocaine. In North America, spiritual and ceremonial uses of plants are well documented and can be accessed online via the University of Michigan's Native American Ethnobotany Database. The objective of the study was to compare Residual, Bayesian, Binomial and Imprecise Dirichlet Model (IDM) analyses of ritual, ceremonial and spiritual plants in Moerman's ethnobotanical database and to identify genera that may be good candidates for the discovery of novel psychoactive compounds. The database was queried with the following format "Family Name AND Ceremonial OR Spiritual" for 263 North American botanical families. Spiritual and ceremonial flora consisted of 86 families with 517 species belonging to 292 genera. Spiritual taxa were then grouped further into ceremonial medicines and items categories. Residual, Bayesian, Binomial and IDM analyses were performed to identify over- and under-utilized families. The 4 statistical approaches were in good agreement when identifying under-utilized families, but large families (>393 species) were underemphasized by Binomial, Bayesian and IDM approaches for over-utilization. Residual, Binomial, and IDM analysis identified similar families as over-utilized in the medium (92-392 species) and small (<92 species) classes. The families Apiaceae, Asteraceae, Ericaceae, Pinaceae and Salicaceae were identified as significantly over-utilized as ceremonial medicines in medium and large sized families. Analysis of genera within the Apiaceae and Asteraceae suggests that the genera Ligusticum and Artemisia are good candidates for facilitating the discovery of novel psychoactive compounds. The 4 statistical approaches were not consistent in the selection of over-utilization of flora. Residual analysis revealed overall trends that were supported
Energy Technology Data Exchange (ETDEWEB)
Hernandez I, S.; Ortiz C, E.; Chavez M, C., E-mail: lunitza@gmail.com [UNAM, Facultad de Ingenieria, Circuito Interior, Ciudad Universitaria, 04510 Mexico D. F. (Mexico)
2011-11-15
At present, it is an unquestionable fact that nuclear electrical energy is a topic of vital importance, not only because it eliminates dependence on hydrocarbons and is environmentally friendly, but also because it is a safe and reliable energy source and represents a viable alternative given the growing demand for electricity in Mexico. Against this backdrop, several scenarios were proposed to raise the capacity of nuclear electricity generation with a variable participation. One of the scenarios considered is the expansion project of the Laguna Verde nuclear power plant through the addition of a third reactor, which would serve as the trigger of an integral program that proposes the installation of more nuclear reactors in the country. Given this possible scenario, the Federal Commission of Electricity, as the body responsible for supplying energy to the population, should have tools that give it the flexibility to adapt to the changes that may arise over the course of the project and that also assign a value to future risk. The Real Options methodology with the binomial model was proposed as an evaluation tool that makes it possible to quantify the value of the expansion proposal, demonstrating the feasibility of the project through a periodic view of its evolution, with the objective of supplying a financial analysis that serves as a basis and justification ahead of the expected rise of nuclear energy in future years. (Author)
Alekseeva, N P; Alekseev, A O; Vakhtin, Iu B; Kravtsov, V Iu; Kuzovatov, S N; Skorikova, T I
2008-01-01
Distributions of nuclear morphology anomalies in transplantable rhabdomyosarcoma RA-23 cell populations were investigated under the effect of ionizing radiation from 0 to 45 Gy. Internuclear bridges, nuclear protrusions and dumbbell-shaped nuclei were counted as morphological anomalies. Empirical distributions of the number of anomalies per 100 nuclei were used. The reentrant binomial distribution was found to be an adequate model. A sum of binomial random variables with a binomial number of summands has this distribution. The means of these random variables were named, accordingly, the internal and external average reentrant components. Their maximum likelihood estimates were obtained, and the statistical properties of these estimates were investigated by means of statistical modeling. It was found that, at an equally significant correlation between the radiation dose and the average number of nuclear anomalies, in cell populations two to three cell cycles after irradiation in vivo the radiation dose correlates significantly with the internal average reentrant component, whereas in remote descendants of cell transplants irradiated in vitro it correlates with the external one.
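The construction named in the abstract, a sum of binomial random variables with a binomial number of summands, is easy to simulate. The following is a minimal sketch under assumed parameter names (`m`, `q` for the number of summands; `n`, `p` for each summand) and hypothetical values; it does not reproduce the authors' maximum likelihood estimation.

```python
import random


def reentrant_binomial(rng, m, q, n, p):
    """Draw one value: N ~ Binomial(m, q) summands, each ~ Binomial(n, p)."""
    summands = sum(rng.random() < q for _ in range(m))
    return sum(sum(rng.random() < p for _ in range(n))
               for _ in range(summands))


rng = random.Random(12345)           # fixed seed for reproducibility
M, Q, N, P = 10, 0.3, 5, 0.2         # hypothetical parameters
draws = [reentrant_binomial(rng, M, Q, N, P) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)

# By iterated expectation, the mean is the product of the two component
# means: E[number of summands] * E[one summand] = (M * Q) * (N * P).
theoretical_mean = (M * Q) * (N * P)
```

The product form of the mean shows why two separate "average components" are needed: the same overall mean can arise from many summands with small means or from few summands with large means.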
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, θ, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for θ may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of θ and, in particular, how there will always be a region around θ = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used as a kind of 'plan B metric' in the case that the analysis shows that θ is close to 1/2, and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
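The posterior probability that the new code is better can be computed exactly for this kind of binomial model. The sketch below assumes a uniform Beta(1, 1) prior on θ, which the abstract does not specify, and the function name is hypothetical. It relies on the standard identity between the regularized incomplete beta function and a binomial tail: with w wins in n experiments, P(θ > 1/2 | data) equals P(X ≤ w) for X ~ Binomial(n + 1, 1/2).

```python
from math import comb


def prob_new_code_better(wins, trials):
    """P(theta > 1/2 | wins, trials) under an assumed uniform prior.

    The posterior is Beta(wins + 1, trials - wins + 1); its upper tail
    at 1/2 equals the lower tail of a Binomial(trials + 1, 1/2) at wins.
    """
    n = trials + 1
    return sum(comb(n, j) for j in range(wins + 1)) / 2**n


# Hypothetical outcomes from 10 paired experiments.
even_split = prob_new_code_better(5, 10)     # no evidence either way
strong_case = prob_new_code_better(8, 10)    # new code wins 8 of 10
clean_sweep = prob_new_code_better(10, 10)   # new code wins every time
```

An even split gives a posterior probability of exactly one half, which illustrates the region around θ = 1/2 where no sample of this size can establish an improvement.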
Binomial collisions and near collisions
Blokhuis, Aart; Brouwer, Andries; de Weger, Benne
2017-01-01
We describe efficient algorithms to search for cases in which binomial coefficients are equal or almost equal, give a conjecturally complete list of all cases where two binomial coefficients differ by 1, and give some identities for binomial coefficients that seem to be new.
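To make the notion of a collision concrete, the sketch below runs a naive exhaustive search over small arguments. It is a brute-force illustration only, not the efficient algorithms the authors describe, and the function name and search bound are assumptions. Trivial repeats are excluded by requiring 2 ≤ k ≤ n/2.

```python
from collections import defaultdict
from math import comb


def binomial_collisions(max_n):
    """Map each value C(n, k) hit more than once to its (n, k) pairs.

    Only 2 <= k <= n // 2 is searched, which removes the trivial
    repeats C(n, k) = C(n, n - k) and C(n, 1) = n.
    """
    seen = defaultdict(list)
    for n in range(4, max_n + 1):
        for k in range(2, n // 2 + 1):
            seen[comb(n, k)].append((n, k))
    return {value: pairs for value, pairs in seen.items() if len(pairs) > 1}


collisions = binomial_collisions(30)
```

For example, the search finds that 120 is both C(10, 3) and C(16, 2), and that 3003 is both C(14, 6) and C(15, 5).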
Statistical inference involving binomial and negative binomial parameters.
García-Pérez, Miguel A; Núñez-Antón, Vicente
2009-05-01
Statistical inference about two binomial parameters implies that they are both estimated by binomial sampling. There are occasions in which one aims at testing the equality of two binomial parameters before and after the occurrence of the first success along a sequence of Bernoulli trials. In these cases, the binomial parameter before the first success is estimated by negative binomial sampling whereas that after the first success is estimated by binomial sampling, and both estimates are related. This paper derives statistical tools to test two hypotheses, namely, that both binomial parameters equal some specified value and that both parameters are equal though unknown. Simulation studies are used to show that in small samples both tests are accurate in keeping the nominal Type-I error rates, and also to determine sample size requirements to detect large, medium, and small effects with adequate power. Additional simulations also show that the tests are sufficiently robust to certain violations of their assumptions.
Silver, Toni O.
2013-01-01
2013 dissertation for MSc in Finance and Risk Management. Selected by academic staff as a good example of a masters level dissertation. This study investigated the two major methods of modelling the frequency of operational losses under the BCBS Accord of 1998 known as Basel II Capital Accord. It compared the Poisson method of modelling the frequency of losses to that of the Negative Binomial. The frequency of operational losses was investigated using a cross section of se...
An, Qingyu; Wu, Jun; Fan, Xuesong; Pan, Liyang; Sun, Wei
2016-01-01
Hand, foot and mouth disease (HFMD) is a human syndrome caused by intestinal viruses such as coxsackievirus A16 and enterovirus 71, and it readily develops into outbreaks in kindergartens and schools. Accurate early detection of the start of an HFMD epidemic is key to planning control measures and minimizing the impact of HFMD. The objective of this study was to establish a reliable early detection model for the start of the HFMD epidemic in Dalian and to evaluate the model's performance by analyzing its detection sensitivity. A negative binomial regression model was used to estimate the weekly baseline number of HFMD cases, and the optimal alerting threshold was identified among the candidate threshold values tested for epidemic and non-epidemic years. The circular distribution method was used to calculate the gold standard for the start of the HFMD epidemic. From 2009 to 2014, a total of 62022 HFMD cases were reported (36879 males and 25143 females) in Dalian, Liaoning Province, China, including 15 fatal cases. The median age of the patients was 3 years. The incidence rate in epidemic years ranged from 137.54 to 231.44 per 100,000 population; the incidence rate in non-epidemic years was lower than 112 per 100,000 population. The negative binomial regression model with an AIC value of 147.28 was finally selected to construct the baseline level. Threshold values of 100 for epidemic years and 50 for non-epidemic years had the highest sensitivity (100%) in both retrospective and prospective early warning, and detection occurred 2 weeks before the actual start of the HFMD epidemic. The negative binomial regression model could provide early warning of the start of an HFMD epidemic in Dalian with good sensitivity and appropriate detection time.
Optimal designs for linear mixture models
Mendieta, E.J.; Linssen, H.N.; Doornbos, R.
1975-01-01
In a recent paper Snee and Marquardt (1974) considered designs for linear mixture models, where the components are subject to individual lower and/or upper bounds. When the number of components is large their algorithm XVERT yields designs far too extensive for practical purposes. The purpose of
Directory of Open Access Journals (Sweden)
James A Wiley
Full Text Available We put forward a new item response model which is an extension of the binomial error model first introduced by Keats and Lord. Like the binomial error model, the basic latent variable can be interpreted as a probability of responding in a certain way to an arbitrarily specified item. For a set of dichotomous items, this model gives predictions that are similar to other single parameter IRT models (such as the Rasch model) but has certain advantages in more complex cases. The first is that in specifying a flexible two-parameter Beta distribution for the latent variable, it is easy to formulate models for randomized experiments in which there is no reason to believe that either the latent variable or its distribution vary over randomly composed experimental groups. Second, the elementary response function is such that extensions to more complex cases (e.g., polychotomous responses, unfolding scales) are straightforward. Third, the probability metric of the latent trait allows tractable extensions to cover a wide variety of stochastic response processes.
CUMBIN - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CUMBIN, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), can be used independently of one another. CUMBIN can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CUMBIN calculates the probability that a system of n components has at least k operating if the probability that any one is operating is p and the components are independent. Equivalently, this is the reliability of a k-out-of-n system having independent components with common reliability p. CUMBIN can evaluate the incomplete beta distribution for two positive integer arguments. CUMBIN can also evaluate the cumulative F distribution and the negative binomial distribution, and can determine the sample size in a test design. CUMBIN is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. The CUMBIN program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CUMBIN was developed in 1988.
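The k-out-of-n reliability described in this record can be illustrated with a short sketch. This is not the CUMBIN source (which is written in C and is not reproduced here); it is a minimal Python illustration of the upper binomial tail, and the function name `k_out_of_n_reliability` is a hypothetical choice made for this example.

```python
from math import comb


def k_out_of_n_reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent components operate,
    when each component operates with the same probability p.

    Hypothetical helper for illustration; not taken from CUMBIN itself.
    """
    if not (0 < k <= n):
        raise ValueError("require 0 < k <= n")
    if not (0.0 <= p <= 1.0):
        raise ValueError("require 0 <= p <= 1")
    # Sum the binomial probabilities of exactly i operating components, i = k..n.
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))
```

For example, a 2-out-of-3 system with component reliability 0.9 gives 3(0.9)^2(0.1) + (0.9)^3 = 0.972. Unlike the program described above, this sketch rejects out-of-range inputs instead of trusting the user.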
CROSSER - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, CROSSER, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), can be used independently of one another. CROSSER can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. CROSSER calculates the point at which the reliability of a k-out-of-n system equals the common reliability of the n components. It is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. The program is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The CROSSER program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. CROSSER was developed in 1988.
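The crossing point described in this record, the component reliability p at which a k-out-of-n system is exactly as reliable as one of its components, can be sketched as follows. The record states that CROSSER uses Newton's method; the sketch below deliberately swaps in bisection, which is simpler to make robust, and it assumes 1 < k < n so that an interior crossing point exists. The function names are hypothetical.

```python
from math import comb


def reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent components operate."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def crossing_point(k: int, n: int, tol: float = 1e-12) -> float:
    """Find p in (0, 1) with reliability(k, n, p) == p by bisection.

    Assumption: 1 < k < n, so the system is less reliable than a single
    component for small p and more reliable for p close to 1.
    """
    if not (1 < k < n):
        raise ValueError("an interior crossing point requires 1 < k < n")
    low, high = 1e-9, 1.0 - 1e-9
    while high - low > tol:
        mid = 0.5 * (low + high)
        if reliability(k, n, mid) < mid:
            low = mid  # system still worse than one component: move right
        else:
            high = mid  # system at least as good as one component: move left
    return 0.5 * (low + high)
```

For a 2-out-of-3 system the condition 3p^2 - 2p^3 = p reduces to 2p^2 - 3p + 1 = 0, whose interior root is p = 0.5, which is what the sketch converges to.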
Direct Importance Estimation with Gaussian Mixture Models
Yamada, Makoto; Sugiyama, Masashi
The ratio of two probability densities is called the importance and its estimation has gathered a great deal of attention these days since the importance can be used for various data processing purposes. In this paper, we propose a new importance estimation method using Gaussian mixture models (GMMs). Our method is an extension of the Kullback-Leibler importance estimation procedure (KLIEP), an importance estimation method using linear or kernel models. An advantage of GMMs is that covariance matrices can also be learned through an expectation-maximization procedure, so the proposed method, which we call the Gaussian mixture KLIEP (GM-KLIEP), is expected to work well when the true importance function has high correlation. Through experiments, we show the validity of the proposed approach.
Text document classification based on mixture models
Czech Academy of Sciences Publication Activity Database
Novovičová, Jana; Malík, Antonín
2004-01-01
Vol. 40, No. 3 (2004), pp. 293-304 ISSN 0023-5954 R&D Projects: GA AV ČR IAA2075302; GA ČR GA102/03/0049; GA AV ČR KSK1019101 Institutional research plan: CEZ:AV0Z1075907 Keywords : text classification * text categorization * multinomial mixture model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.224, year: 2004
A turbulence model in mixtures. First part: Statistical description of mixture
International Nuclear Information System (INIS)
Besnard, D.
1987-03-01
Classical theory of mixtures gives a model for molecular mixtures. This kind of model is based on a small gradient approximation for concentration, temperature, and pressure. We present here a mixture model, allowing for large gradients in the flow. We also show that, with a local balance assumption between material diffusion and flow gradients evolution, we obtain a model similar to those mentioned above.
NEWTONP - CUMULATIVE BINOMIAL PROGRAMS
Bowerman, P. N.
1994-01-01
The cumulative binomial program, NEWTONP, is one of a set of three programs which calculate cumulative binomial probability distributions for arbitrary inputs. The three programs, NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), can be used independently of one another. NEWTONP can be used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. The program has been used for reliability/availability calculations. NEWTONP calculates the probability p required to yield a given system reliability V for a k-out-of-n system. It can also be used to determine the Clopper-Pearson confidence limits (either one-sided or two-sided) for the parameter p of a Bernoulli distribution. NEWTONP can determine Bayesian probability limits for a proportion (if the beta prior has positive integer parameters). It can determine the percentiles of incomplete beta distributions with positive integer parameters. It can also determine the percentiles of F distributions and the median plotting positions in probability plotting. NEWTONP is designed to work well with all integer values 0 < k <= n. To run the program, the user simply runs the executable version and inputs the information requested by the program. NEWTONP is not designed to weed out incorrect inputs, so the user must take care to make sure the inputs are correct. Once all input has been entered, the program calculates and lists the result. It also lists the number of iterations of Newton's method required to calculate the answer within the given error. The NEWTONP program is written in C. It was developed on an IBM AT with a numeric co-processor using Microsoft C 5.0. Because the source code is written using standard C structures and functions, it should compile correctly with most C compilers. The program format is interactive. It has been implemented under DOS 3.2 and has a memory requirement of 26K. NEWTONP was developed in 1988.
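The inverse problem in this record, finding the component probability p that yields a given reliability V for a k-out-of-n system, can be sketched with Newton's method. This is an illustrative Python reconstruction under stated assumptions, not the NEWTONP source: the starting value 0.5, the tolerance, the safeguard that keeps iterates inside (0, 1), and all function names are choices made for this example. The derivative used is the standard identity d/dp P(X >= k) = k C(n, k) p^(k-1) (1 - p)^(n-k).

```python
from math import comb


def reliability(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent components operate."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


def reliability_derivative(k: int, n: int, p: float) -> float:
    """Derivative of reliability(k, n, p) with respect to p."""
    return k * comb(n, k) * p ** (k - 1) * (1.0 - p) ** (n - k)


def solve_component_probability(k, n, target, tol=1e-12, max_iter=100):
    """Return (p, evaluations) with reliability(k, n, p) within tol of target.

    Hypothetical helper: Newton's method started at p = 0.5, with a simple
    safeguard that halves the distance to the boundary if a step would
    leave the open interval (0, 1).
    """
    if not (0 < k <= n):
        raise ValueError("require 0 < k <= n")
    if not (0.0 < target < 1.0):
        raise ValueError("require 0 < target < 1")
    p = 0.5
    for evaluations in range(1, max_iter + 1):
        error = reliability(k, n, p) - target
        if abs(error) < tol:
            return p, evaluations
        candidate = p - error / reliability_derivative(k, n, p)
        if candidate <= 0.0:
            candidate = 0.5 * p
        elif candidate >= 1.0:
            candidate = 0.5 * (p + 1.0)
        p = candidate
    raise RuntimeError("Newton iteration did not converge")
```

Calling `solve_component_probability(2, 3, 0.972)` recovers a component probability of about 0.9, consistent with the 2-out-of-3 relation 3p^2 - 2p^3 = 0.972. The second return value mirrors the iteration count that the record says the program lists.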
Distinguishing between Binomial, Hypergeometric and Negative Binomial Distributions
Wroughton, Jacqueline; Cole, Tarah
2013-01-01
Recognizing the differences between three discrete distributions (Binomial, Hypergeometric and Negative Binomial) can be challenging for students. We present an activity designed to help students differentiate among these distributions. In addition, we present assessment results in the form of pre- and post-tests that were designed to assess the…
Gaussian Mixture Model of Heart Rate Variability
Costa, Tommaso; Boccignone, Giuseppe; Ferraro, Mario
2012-01-01
Heart rate variability (HRV) is an important measure of sympathetic and parasympathetic functions of the autonomic nervous system and a key indicator of cardiovascular condition. This paper proposes a novel method to investigate HRV, namely by modelling it as a linear combination of Gaussians. Results show that three Gaussians are enough to describe the stationary statistics of heart variability and to provide a straightforward interpretation of the HRV power spectrum. Comparisons have been made also with synthetic data generated from different physiologically based models showing the plausibility of the Gaussian mixture parameters. PMID:22666386
Marginalized zero-inflated negative binomial regression with application to dental caries.
Preisser, John S; Das, Kalyan; Long, D Leann; Divaris, Kimon
2016-05-10
The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared with marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. Copyright © 2015 John Wiley & Sons, Ltd.
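The latent class structure described in this record can be made concrete with a small sketch of the ZINB probability mass function and its overall (marginal) mean. The parameterization below, a structural-zero probability `zero_prob`, a negative binomial mean `mean` and a dispersion parameter `size`, is an assumption made for illustration, as are the function names; the sketch does not reproduce the MZINB regression model itself, only the quantity (1 - zero_prob) * mean that a marginalized model would target.

```python
from math import exp, lgamma, log


def nb_pmf(y: int, mean: float, size: float) -> float:
    """Negative binomial pmf with the given mean and dispersion (size)."""
    log_p = (
        lgamma(y + size) - lgamma(size) - lgamma(y + 1)
        + size * log(size / (size + mean))
        + y * log(mean / (size + mean))
    )
    return exp(log_p)


def zinb_pmf(y: int, zero_prob: float, mean: float, size: float) -> float:
    """Mixture of a point mass at zero (non-susceptible class) and a
    negative binomial count (susceptible class)."""
    base = (1.0 - zero_prob) * nb_pmf(y, mean, size)
    return zero_prob + base if y == 0 else base


def zinb_marginal_mean(zero_prob: float, mean: float) -> float:
    """Mean count in the overall mixture population."""
    return (1.0 - zero_prob) * mean
```

With `zero_prob = 0.3` and `mean = 3.0`, the susceptible class averages 3 counts but the overall population averages 2.1, which shows why coefficients on the latent class mean do not directly describe overall exposure effects.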
Poisson Mixture Regression Models for Heart Disease Prediction.
Mufudza, Chipo; Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criteria value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model.
Binomial Rings: Axiomatisation, Transfer and Classification
Xantcha, Qimh Richey
2011-01-01
Hall's binomial rings, rings with binomial coefficients, are given an axiomatisation and proved identical to the numerical rings studied by Ekedahl. The Binomial Transfer Principle is established, enabling combinatorial proofs of algebraical identities. The finitely generated binomial rings are completely classified. An application to modules over binomial rings is given.
Liu, Lian; Zhang, Shao-Wu; Huang, Yufei; Meng, Jia
2017-08-31
As a newly emerged research area, RNA epigenetics has drawn increasing attention recently because RNA methylation and other modifications participate in a number of crucial biological processes. Thanks to high-throughput sequencing techniques such as MeRIP-Seq, transcriptome-wide RNA methylation profiles are now available in the form of count-based data, with which it is often of interest to study the dynamics at the epitranscriptomic layer. However, the sample size of an RNA methylation experiment is usually very small due to its cost; additionally, there usually exist a large number of genes whose methylation level cannot be accurately estimated due to their low expression level, making differential RNA methylation analysis a difficult task. We present QNB, a statistical approach for differential RNA methylation analysis with count-based small-sample sequencing data. Compared with previous approaches such as the DRME model, which is based on a statistical test covering only the IP samples with 2 negative binomial distributions, QNB is based on 4 independent negative binomial distributions with their variances and means linked by local regressions, and in this way the input control samples are also properly taken care of. In addition, unlike the DRME approach, which relies only on the input control sample for estimating the background, QNB uses a more robust estimator for gene expression by combining information from both input and IP samples, which could largely improve the testing performance for very lowly expressed genes. QNB showed improved performance on both simulated and real MeRIP-Seq datasets when compared with competing algorithms. The QNB model is also applicable to other datasets related to RNA modifications, including but not limited to RNA bisulfite sequencing, m1A-Seq, Par-CLIP, RIP-Seq, etc.
Lee, JuHee; Park, Chang Gi; Choi, Moonki
2016-05-01
This study was conducted to identify risk factors that influence regular exercise among patients with Parkinson's disease in Korea. Parkinson's disease is prevalent in the elderly, and may lead to a sedentary lifestyle. Exercise can enhance physical and psychological health. However, patients with Parkinson's disease are less likely to exercise than are other populations due to physical disability. A secondary data analysis and cross-sectional descriptive study were conducted. A convenience sample of 106 patients with Parkinson's disease was recruited at an outpatient neurology clinic of a tertiary hospital in Korea. Demographic characteristics, disease-related characteristics (including disease duration and motor symptoms), self-efficacy for exercise, balance, and exercise level were investigated. Negative binomial regression and zero-inflated negative binomial regression for exercise count data were utilized to determine factors involved in exercise. The mean age of participants was 65.85 ± 8.77 years, and the mean duration of Parkinson's disease was 7.23 ± 6.02 years. Most participants indicated that they engaged in regular exercise (80.19%). Approximately half of participants exercised at least 5 days per week for 30 min, as recommended (51.9%). Motor symptoms were a significant predictor of exercise in the count model, and self-efficacy for exercise was a significant predictor of exercise in the zero model. Severity of motor symptoms was related to frequency of exercise. Self-efficacy contributed to the probability of exercise. Symptom management and improvement of self-efficacy for exercise are important to encourage regular exercise in patients with Parkinson's disease. Copyright © 2015 Elsevier Inc. All rights reserved.
Barle, Stanko
In this dissertation, two dynamical systems with many degrees of freedom are analyzed. One is the system of highly correlated electrons in the two-impurity Kondo problem. The other deals with building a realistic model of diffusion underlying financial markets. The simplest mean-field theory capable of mimicking the non-Fermi liquid behavior of the critical point in the two-impurity Kondo problem is presented. In this approach Landau's adiabaticity assumption--of a one-to-one correspondence between the low-energy excitations of the interacting and noninteracting systems--is violated through the presence of decoupled local degrees of freedom. These do not couple directly to external fields but appear indirectly in the physical properties leading, for example, to the log(T, omega) behavior of the staggered magnetic susceptibility. Also, as observed previously, the correlation function = -1/4 is a consequence of the equal weights of the singlet and triplet impurity configurations at the critical point. In the second problem, a numerical model is developed to describe the diffusion of prices in the market. Implied binomial (or multinomial) trees are constructed to enable practical pricing of derivative securities in consistency with the existing market. The method developed here is capable of accounting for both the strike price and term structure of the implied volatility. It includes the correct treatment of interest rate and dividends which proves robust even if these quantities are unusually large. The method is explained both as a set of individual innovations and, from a different perspective, as a consequence of a single plausible transformation from the tree of spot prices to the tree of futures prices.
Mixture of Regression Models with Single-Index
Xiang, Sijia; Yao, Weixin
2016-01-01
In this article, we propose a class of semiparametric mixture regression models with single-index. We argue that many recently proposed semiparametric/nonparametric mixture regression models can be considered special cases of the proposed model. However, unlike existing semiparametric mixture regression models, the new proposed model can easily incorporate multivariate predictors into the nonparametric components. Backfitting estimates and the corresponding algorithms have been proposed for...
Maximum likelihood estimation of finite mixture model for economic data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model of finite dimension. These models provide a natural representation of heterogeneity in a finite number of latent classes. Finite mixture models are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that provides consistent estimates as the sample size increases to infinity. Thus, maximum likelihood estimation is used to fit a finite mixture model in the present paper in order to explore the relationship between nonlinear economic data. A two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship between stock market price and rubber price for the sampled countries. Results show that there is a negative effect between rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia.
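Maximum likelihood fitting of a two-component normal mixture is commonly carried out with the EM algorithm, and a minimal sketch is given below. It is an assumption that EM is an appropriate stand-in here: the record does not say which optimizer the authors used, and the sketch is univariate and meant for synthetic data rather than the stock and rubber price series. The function names, the initialization at the sample minimum and maximum, and the fixed iteration count are all illustrative choices.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_normal_mixture(data, n_iter=100):
    """EM for a two-component univariate normal mixture (illustrative sketch).

    Returns (weight1, (mean1, sd1), (mean2, sd2)), where component 1 is
    initialized at the smallest observation and component 2 at the largest.
    """
    n = len(data)
    overall_mean = sum(data) / n
    overall_sd = math.sqrt(sum((x - overall_mean) ** 2 for x in data) / n)
    mean1, mean2 = min(data), max(data)
    sd1 = sd2 = overall_sd
    weight1 = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each point came from component 1.
        resp = []
        for x in data:
            a = weight1 * normal_pdf(x, mean1, sd1)
            b = (1.0 - weight1) * normal_pdf(x, mean2, sd2)
            resp.append(a / max(a + b, 1e-300))
        # M-step: weighted maximum likelihood updates of all parameters.
        n1 = sum(resp)
        n2 = n - n1
        weight1 = n1 / n
        mean1 = sum(r * x for r, x in zip(resp, data)) / n1
        mean2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        sd1 = math.sqrt(sum(r * (x - mean1) ** 2 for r, x in zip(resp, data)) / n1)
        sd2 = math.sqrt(sum((1.0 - r) * (x - mean2) ** 2 for r, x in zip(resp, data)) / n2)
    return weight1, (mean1, sd1), (mean2, sd2)
```

On well-separated synthetic data, such as equal-sized samples drawn around -2 and 3 with unit standard deviation, the estimates should land close to the generating values. EM only finds a local maximum of the likelihood, so real applications usually try several starting points.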
The Binomial Distribution in Shooting
Chalikias, Miltiadis S.
2009-01-01
The binomial distribution is used to predict the winner of the 49th International Shooting Sport Federation World Championship in double trap shooting held in 2006 in Zagreb, Croatia. The outcome of the competition was definitely unexpected.
Nonparametric Mixture Models for Supervised Image Parcellation.
Sabuncu, Mert R; Yeo, B T Thomas; Van Leemput, Koen; Fischl, Bruce; Golland, Polina
2009-09-01
We present a nonparametric, probabilistic mixture model for the supervised parcellation of images. The proposed model yields segmentation algorithms conceptually similar to the recently developed label fusion methods, which register a new image with each training image separately. Segmentation is achieved via the fusion of transferred manual labels. We show that in our framework various settings of a model parameter yield algorithms that use image intensity information differently in determining the weight of a training subject during fusion. One particular setting computes a single, global weight per training subject, whereas another setting uses locally varying weights when fusing the training data. The proposed nonparametric parcellation approach capitalizes on recently developed fast and robust pairwise image alignment tools. The use of multiple registrations allows the algorithm to be robust to occasional registration failures. We report experiments on 39 volumetric brain MRI scans with expert manual labels for the white matter, cerebral cortex, ventricles and subcortical structures. The results demonstrate that the proposed nonparametric segmentation framework yields significantly better segmentation than state-of-the-art algorithms.
Stochastic radiative transfer model for mixture of discontinuous vegetation canopies
International Nuclear Information System (INIS)
Shabanov, Nikolay V.; Huang, D.; Knjazikhin, Y.; Dickinson, R.E.; Myneni, Ranga B.
2007-01-01
Modeling of the radiation regime of a mixture of vegetation species is a fundamental problem of the Earth's land remote sensing and climate applications. The major existing approaches, including the linear mixture model and the turbid medium (TM) mixture radiative transfer model, provide only an approximate solution to this problem. In this study, we developed the stochastic mixture radiative transfer (SMRT) model, a mathematically exact tool to evaluate radiation regime in a natural canopy with spatially varying optical properties, that is, canopy, which exhibits a structured mixture of vegetation species and gaps. The model solves for the radiation quantities, direct input to the remote sensing/climate applications: mean radiation fluxes over whole mixture and over individual species. The canopy structure is parameterized in the SMRT model in terms of two stochastic moments: the probability of finding species and the conditional pair-correlation of species. The second moment is responsible for the 3D radiation effects, namely, radiation streaming through gaps without interaction with vegetation and variation of the radiation fluxes between different species. We performed analytical and numerical analysis of the radiation effects, simulated with the SMRT model for the three cases of canopy structure: (a) non-ordered mixture of species and gaps (TM); (b) ordered mixture of species without gaps; and (c) ordered mixture of species with gaps. The analysis indicates that the variation of radiation fluxes between different species is proportional to the variation of species optical properties (leaf albedo, density of foliage, etc.) Gaps introduce significant disturbance to the radiation regime in the canopy as their optical properties constitute major contrast to those of any vegetation species. The SMRT model resolves deficiencies of the major existing mixture models: ignorance of species radiation coupling via multiple scattering of photons (the linear mixture model
mixtools: An R Package for Analyzing Mixture Models
Directory of Open Access Journals (Sweden)
Tatiana Benaglia
2009-10-01
Full Text Available The mixtools package for R provides a set of functions for analyzing a variety of finite mixture models. These functions include both traditional methods, such as EM algorithms for univariate and multivariate normal mixtures, and newer methods that reflect some recent research in finite mixture models. In the latter category, mixtools provides algorithms for estimating parameters in a wide range of different mixture-of-regression contexts, in multinomial mixtures such as those arising from discretizing continuous multivariate data, in nonparametric situations where the multivariate component densities are completely unspecified, and in semiparametric situations such as a univariate location mixture of symmetric but otherwise unspecified densities. Many of the algorithms of the mixtools package are EM algorithms or are based on EM-like ideas, so this article includes an overview of EM algorithms for finite mixture models.
Jafarzadeh, S Reza; Norris, Michelle; Thurmond, Mark C
2014-08-01
To identify events that could predict province-level frequency of foot-and-mouth disease (FMD) outbreaks in Iran, 5707 outbreaks reported from April 1995 to March 2002 were studied. A zero-inflated negative binomial model was used to estimate the probability of a 'no-outbreak' status and the number of outbreaks in a province, using the number of previous occurrences of FMD for the same or adjacent provinces and season as covariates. For each province, the probability of observing no outbreak was negatively associated with the number of outbreaks in the same province in the previous month (odds ratio [OR]=0.06, 95% confidence interval [CI]: 0.01, 0.30) and in 'the second previous month' (OR=0.10, 95% CI: 0.02, 0.51), the total number of outbreaks in the second previous month in adjacent provinces (OR=0.57, 95% CI: 0.36, 0.91) and the season (winter [OR=0.18, 95% CI: 0.06, 0.55] and spring [OR=0.27, 95% CI: 0.09, 0.81], compared with summer). The expected number of outbreaks in a province was positively associated with number of outbreaks in the same province in previous month (coefficient [coef]=0.74, 95% CI: 0.66, 0.82) and in the second previous month (coef=0.23, 95% CI: 0.16, 0.31), total number of outbreaks in adjacent provinces in the previous month (coef=0.32, 95% CI: 0.22, 0.41) and season (fall [coef=0.20, 95% CI: 0.07, 0.33] and spring [coef=0.18, 95% CI: 0.05, 0.31], compared to summer); however, number of outbreaks was negatively associated with the total number of outbreaks in adjacent provinces in the second previous month (coef=-0.19, 95% CI: -0.28, -0.09). The findings indicate that the probability of an outbreak (and the expected number of outbreaks if any) may be predicted based on previous province information, which could help decision-makers allocate resources more efficiently for province-level disease control measures. Further, the study illustrates use of zero inflated negative binomial model to study diseases occurrence where disease is
Evaluating Mixture Modeling for Clustering: Recommendations and Cautions
Steinley, Douglas; Brusco, Michael J.
2011-01-01
This article provides a large-scale investigation into several of the properties of mixture-model clustering techniques (also referred to as latent class cluster analysis, latent profile analysis, model-based clustering, probabilistic clustering, Bayesian classification, unsupervised learning, and finite mixture models; see Vermunt & Magdison,…
Naznin, Farhana; Currie, Graham; Logan, David; Sarvi, Majid
2016-07-01
Safety is a key concern in the design, operation and development of light rail systems including trams or streetcars as they impose crash risks on road users in terms of crash frequency and severity. The aim of this study is to identify key traffic, transit and route factors that influence tram-involved crash frequencies along tram route sections in Melbourne. A random effects negative binomial (RENB) regression model was developed to analyze crash frequency data obtained from Yarra Trams, the tram operator in Melbourne. The RENB modelling approach can account for spatial and temporal variations within observation groups in panel count data structures by assuming that group specific effects are randomly distributed across locations. The results identify many significant factors affecting tram-involved crash frequency including tram service frequency (2.71), tram stop spacing (-0.42), tram route section length (0.31), tram signal priority (-0.25), general traffic volume (0.18), tram lane priority (-0.15) and ratio of platform tram stops (-0.09). Findings provide useful insights on route section level tram-involved crashes in an urban tram or streetcar operating environment. The method described represents a useful planning tool for transit agencies hoping to improve safety performance. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kim, Dae-Hwan; Ramjan, Lucie M; Mak, Kwok-Kei
2016-01-01
Traffic safety is a significant public health challenge, and vehicle crashes account for the majority of injuries. This study aims to identify whether drivers' characteristics and past traffic violations may predict vehicle crashes in Korea. A total of 500,000 drivers were randomly selected from the 11.6 million driver records of the Ministry of Land, Transport and Maritime Affairs in Korea. Records of traffic crashes were obtained from the archives of the Korea Insurance Development Institute. After matching the past violation history for the period 2004-2005 with the number of crashes in year 2006, a total of 488,139 observations were used for the analysis. Zero-inflated negative binomial model was used to determine the incident risk ratio (IRR) of vehicle crashes by past violations of individual drivers. The included covariates were driver's age, gender, district of residence, vehicle choice, and driving experience. Drivers violating (1) a hit-and-run or drunk driving regulation at least once and (2) a signal, central line, or speed regulation more than once had a higher risk of a vehicle crash with respective IRRs of 1.06 and 1.15. Furthermore, female gender, a younger age, fewer years of driving experience, and middle-sized vehicles were all significantly associated with a higher likelihood of vehicle crashes. Drivers' demographic characteristics and past traffic violations could predict vehicle crashes in Korea. Greater resources should be assigned to the provision of traffic safety education programs for the high-risk driver groups.
Modeling the effects of binary mixtures on survival in time.
Baas, J.; van Houte, B.P.P.; van Gestel, C.A.M.; Kooijman, S.A.L.M.
2007-01-01
In general, effects of mixtures are difficult to describe, and most of the models in use are descriptive in nature and lack a strong mechanistic basis. The aim of this experiment was to develop a process-based model for the interpretation of mixture toxicity measurements, with effects of binary
Poisson Mixture Regression Models for Heart Disease Prediction
Erol, Hamza
2016-01-01
Early heart disease control can be achieved by high disease prediction and diagnosis efficiency. This paper focuses on the use of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. Analysis and application of Poisson mixture regression models is here addressed under two different classes: standard and concomitant variable mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression model due to its low Bayesian Information Criterion value. Furthermore, a Zero Inflated Poisson Mixture Regression model turned out to be the best model for heart disease prediction over all models as it both clusters individuals into high or low risk category and predicts rate to heart disease componentwise given clusters available. It is deduced that heart disease prediction can be effectively done by identifying the major risks componentwise using Poisson mixture regression model. PMID:27999611
Modeling of Multicomponent Mixture Separation Processes Using Hollow fiber Membrane
Energy Technology Data Exchange (ETDEWEB)
Kim, Sin-Ah; Kim, Jin-Kuk; Lee, Young Moo; Yeo, Yeong-Koo [Hanyang University, Seoul (Korea, Republic of)
2015-02-15
So far, most research on modeling membrane separation processes has focused on binary feed mixtures. In actual separation operations, however, binary feeds are rare and most separation processes involve multicomponent feed mixtures. In this work, models for membrane separation processes treating multicomponent feed mixtures are developed. Various model types are investigated, and the validity of the proposed models is analysed based on experimental data obtained using hollow-fiber membranes. The proposed separation models show quick convergence and exhibit good tracking performance.
Integer Solutions of Binomial Coefficients
Gilbertson, Nicholas J.
2016-01-01
A good formula is like a good story, rich in description, powerful in communication, and eye-opening to readers. The formula presented in this article for determining the coefficients of the binomial expansion of (x + y)^n is one such "good read." The beauty of this formula is in its simplicity--both describing a quantitative situation…
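The abstract does not state the formula itself. As an illustration only, one standard way to obtain the coefficients of (x + y)^n is the multiplicative recurrence C(n, k+1) = C(n, k) * (n - k) / (k + 1). The function name below is hypothetical, and this is not claimed to be the article's formula.

```python
def binomial_row(n: int) -> list[int]:
    """Return the coefficients C(n, 0), ..., C(n, n) of (x + y)**n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    row = [1]
    for k in range(n):
        # C(n, k+1) = C(n, k) * (n - k) / (k + 1); the integer division is exact.
        row.append(row[-1] * (n - k) // (k + 1))
    return row
```

For example, binomial_row(4) gives the coefficients of x^4 + 4x^3y + 6x^2y^2 + 4xy^3 + y^4.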
Wei, Feng; Lovegrove, Gordon
2013-12-01
Today, North American governments are more willing to consider compact neighborhoods with increased use of sustainable transportation modes. Bicycling, one of the most effective modes for short trips with distances less than 5km is being encouraged. However, as vulnerable road users (VRUs), cyclists are more likely to be injured when involved in collisions. In order to create a safe road environment for them, evaluating cyclists' road safety at a macro level in a proactive way is necessary. In this paper, different generalized linear regression methods for collision prediction model (CPM) development are reviewed and previous studies on micro-level and macro-level bicycle-related CPMs are summarized. On the basis of insights gained in the exploration stage, this paper also reports on efforts to develop negative binomial models for bicycle-auto collisions at a community-based, macro-level. Data came from the Central Okanagan Regional District (CORD), of British Columbia, Canada. The model results revealed two types of statistical associations between collisions and each explanatory variable: (1) An increase in bicycle-auto collisions is associated with an increase in total lane kilometers (TLKM), bicycle lane kilometers (BLKM), bus stops (BS), traffic signals (SIG), intersection density (INTD), and arterial-local intersection percentage (IALP). (2) A decrease in bicycle collisions was found to be associated with an increase in the number of drive commuters (DRIVE), and in the percentage of drive commuters (DRP). These results support our hypothesis that in North America, with its current low levels of bicycle use (macro-level CPMs. Copyright © 2012. Published by Elsevier Ltd.
Modelling interactions in grass-clover mixtures
Nassiri Mahallati, M.
1998-01-01
The study described in this thesis focuses on a quantitative understanding of the complex interactions in binary mixtures of perennial ryegrass (Lolium perenne L.) and white clover (Trifolium repens L.) under cutting. The first part of the study describes the dynamics of growth, production
Thermodynamic modeling of CO2 mixtures
DEFF Research Database (Denmark)
Bjørner, Martin Gamel
Knowledge of the thermodynamic properties and phase equilibria of mixtures containing carbon dioxide (CO2) is important in several industrial processes such as enhanced oil recovery, carbon capture and storage, and supercritical extractions, where CO2 is used as a solvent. Despite this importance...
Detecting non-binomial sex allocation when developmental mortality operates.
Wilkinson, Richard D; Kapranas, Apostolos; Hardy, Ian C W
2016-11-07
Optimal sex allocation theory is one of the most intricately developed areas of evolutionary ecology. Under a range of conditions, particularly under population sub-division, selection favours sex being allocated to offspring non-randomly, generating non-binomial variances of offspring group sex ratios. Detecting non-binomial sex allocation is complicated by stochastic developmental mortality, as offspring sex can often only be identified on maturity with the sex of non-maturing offspring remaining unknown. We show that current approaches for detecting non-binomiality have limited ability to detect non-binomial sex allocation when developmental mortality has occurred. We present a new procedure using an explicit model of sex allocation and mortality and develop a Bayesian model selection approach (available as an R package). We use the double and multiplicative binomial distributions to model over- and under-dispersed sex allocation and show how to calculate Bayes factors for comparing these alternative models to the null hypothesis of binomial sex allocation. The ability to detect non-binomial sex allocation is greatly increased, particularly in cases where mortality is common. The use of Bayesian methods allows for the quantification of the evidence in favour of each hypothesis, and our modelling approach provides an improved descriptive capability over existing approaches. We use a simulation study to demonstrate substantial improvements in power for detecting non-binomial sex allocation in situations where current methods fail, and we illustrate the approach in real scenarios using empirically obtained datasets on the sexual composition of groups of gregarious parasitoid wasps. Copyright © 2016 Elsevier Ltd. All rights reserved.
Negative Binomial Distribution and the multiplicity moments at the LHC
International Nuclear Information System (INIS)
Praszalowicz, Michal
2011-01-01
In this work we show that the latest LHC data on multiplicity moments C2-C5 are well described by a two-step model in the form of a convolution of the Poisson distribution with an energy-dependent source function. For the source function we take Γ Negative Binomial Distribution. No unexpected behavior of the Negative Binomial Distribution parameter k is found. We also give predictions for the higher energies of 10 and 14 TeV.
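To make the quantities in this abstract concrete, the sketch below evaluates normalized multiplicity moments for a plain negative binomial distribution. It assumes the usual definition C_q = <n^q> / <n>^q, which the abstract does not spell out, and the parameter values are purely illustrative. It does not reproduce the two-step model of the paper.

```python
from math import exp, lgamma, log


def nbd_pmf(n, mean, k):
    """Negative binomial probability of multiplicity n, with mean `mean` and shape k."""
    log_p = (lgamma(n + k) - lgamma(k) - lgamma(n + 1)
             + n * log(mean / (mean + k)) + k * log(k / (mean + k)))
    return exp(log_p)


def normalized_moment(q, mean, k, n_max=2000):
    """C_q = <n^q> / <n>^q, computed by direct summation up to n_max."""
    probs = [nbd_pmf(n, mean, k) for n in range(n_max + 1)]
    first = sum(n * p for n, p in enumerate(probs))
    qth = sum(n**q * p for n, p in enumerate(probs))
    return qth / first**q
```

For this distribution the second moment has the closed form C_2 = 1 + 1/<n> + 1/k, which gives a quick check on the summation.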
Communication: Modeling electrolyte mixtures with concentration dependent dielectric permittivity
Chen, Hsieh; Panagiotopoulos, Athanassios Z.
2018-01-01
We report a new implicit-solvent simulation model for electrolyte mixtures based on the concept of concentration dependent dielectric permittivity. A combining rule is found to predict the dielectric permittivity of electrolyte mixtures based on the experimentally measured dielectric permittivity for pure electrolytes as well as the mole fractions of the electrolytes in mixtures. Using grand canonical Monte Carlo simulations, we demonstrate that this approach allows us to accurately reproduce the mean ionic activity coefficients of NaCl in NaCl-CaCl2 mixtures at ionic strengths up to I = 3M. These results are important for thermodynamic studies of geologically relevant brines and physiological fluids.
A Dirichlet process mixture model for brain MRI tissue classification.
Ferreira da Silva, Adelino R
2007-04-01
Accurate classification of magnetic resonance images according to tissue type or region of interest has become a critical requirement in diagnosis, treatment planning, and cognitive neuroscience. Several authors have shown that finite mixture models give excellent results in the automated segmentation of MR images of the human normal brain. However, performance and robustness of finite mixture models deteriorate when the models have to deal with a variety of anatomical structures. In this paper, we propose a nonparametric Bayesian model for tissue classification of MR images of the brain. The model, known as Dirichlet process mixture model, uses Dirichlet process priors to overcome the limitations of current parametric finite mixture models. To validate the accuracy and robustness of our method we present the results of experiments carried out on simulated MR brain scans, as well as on real MR image data. The results are compared with similar results from other well-known MRI segmentation methods.
Smoothness in Binomial Edge Ideals
Directory of Open Access Journals (Sweden)
Hamid Damadi
2016-06-01
In this paper we study some geometric properties of the algebraic set associated to the binomial edge ideal of a graph. We study the singularity and smoothness of the algebraic set associated to the binomial edge ideal of a graph. Some of these algebraic sets are irreducible and some of them are reducible. If every irreducible component of the algebraic set is smooth we call the graph an edge smooth graph, otherwise it is called an edge singular graph. We show that complete graphs are edge smooth and introduce two conditions such that the graph G is edge singular if and only if it satisfies these conditions. Then, it is shown that cycles and most trees are edge singular. In addition, it is proved that complete bipartite graphs are edge smooth.
Li, Liang; Mao, Huzhang; Ishwaran, Hemant; Rajeswaran, Jeevanantham; Ehrlinger, John; Blackstone, Eugene H.
2016-01-01
Atrial fibrillation (AF) is an abnormal heart rhythm characterized by rapid and irregular heartbeat, with or without perceivable symptoms. In clinical practice, the electrocardiogram (ECG) is often used for diagnosis of AF. Since AF often arrives as recurrent episodes of varying frequency and duration and only the episodes that occur at the time of ECG can be detected, AF is often underdiagnosed when a limited number of repeated ECGs are used. In studies evaluating the efficacy of AF ablation surgery, each patient undergoes multiple ECGs and the AF status at the time of ECG is recorded. The objective of this paper is to estimate the marginal proportions of patients with or without AF in a population, which are important measures of the efficacy of the treatment. The underdiagnosis problem is addressed by a three-class mixture regression model in which a patient’s probability of having no AF, paroxysmal AF, and permanent AF is modeled by auxiliary baseline covariates in a nested logistic regression. A binomial regression model is specified conditional on a subject being in the paroxysmal AF group. The model parameters are estimated by the EM algorithm. These parameters are themselves nuisance parameters for the purpose of this research, but the estimators of the marginal proportions of interest can be expressed as functions of the data and these nuisance parameters and their variances can be estimated by the sandwich method. We examine the performance of the proposed methodology in simulations and two real data applications. PMID:27983754
Schifferstein, H.N.J.
1996-01-01
The Equiratio Mixture Model predicts the psychophysical function for an equiratio mixture type on the basis of the psychophysical functions for the unmixed components. The model reliably estimates the sweetness of mixtures of sugars and sugar alcohols, but is unable to predict intensity for
Beta Regression Finite Mixture Models of Polarization and Priming
Smithson, Michael; Merkle, Edgar C.; Verkuilen, Jay
2011-01-01
This paper describes the application of finite-mixture general linear models based on the beta distribution to modeling response styles, polarization, anchoring, and priming effects in probability judgments. These models, in turn, enhance our capacity for explicitly testing models and theories regarding the aforementioned phenomena. The mixture…
New models for predicting thermophysical properties of ionic liquid mixtures.
Huang, Ying; Zhang, Xiangping; Zhao, Yongsheng; Zeng, Shaojuan; Dong, Haifeng; Zhang, Suojiang
2015-10-28
Potential applications of ionic liquids (ILs) require knowledge of the physicochemical properties of IL mixtures. In this work, a series of semi-empirical models were developed to predict the density, surface tension, heat capacity and thermal conductivity of IL mixtures. Each semi-empirical model only contains one new characteristic parameter, which can be determined using one experimental data point. In addition, as another effective tool, artificial neural network (ANN) models were also established. The two kinds of models were verified by a total of 2304 experimental data points for binary mixtures of ILs and molecular compounds. The overall average absolute relative deviations (AARDs) of both the semi-empirical and ANN models are less than 2%. Compared to previously reported models, these new semi-empirical models require fewer adjustable parameters and can be applied in a wider range of applications.
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Kontogeorgis, Georgios; Michelsen, Michael Locht
2011-01-01
In Part I of this series of articles, the study of H2S mixtures has been presented with CPA. In this study the phase behavior of CO2 containing mixtures is modeled. Binary mixtures with water, alcohols, glycols and hydrocarbons are investigated. Both phase equilibria (vapor–liquid and liquid–liqu...
Sample size calculation for comparing two negative binomial rates.
Zhu, Haiyuan; Lakkis, Hassan
2014-02-10
The negative binomial model has been increasingly used to model count data in recent clinical trials. It is frequently chosen over the Poisson model in cases of overdispersed count data, which are commonly seen in clinical trials. One of the challenges of applying the negative binomial model in clinical trial design is sample size estimation. In practice, simulation methods have been frequently used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on the approach used to estimate the variance under the null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate the dispersion parameter and exposure time. The performance of the formula with each variation is assessed using simulations. Copyright © 2013 John Wiley & Sons, Ltd.
Detecting Housing Submarkets using Unsupervised Learning of Finite Mixture Models
DEFF Research Database (Denmark)
Ntantamis, Christos
association between prices that can be attributed, among others, to unobserved neighborhood effects. In this paper, a model of spatial association for housing markets is introduced. Spatial association is treated in the context of spatial heterogeneity, which is explicitly modeled in both a global and a local....... The identified mixtures are considered as the different spatial housing submarkets. The main advantage of the approach is that submarkets are recovered by the housing prices data compared to submarkets imposed by administrative or geographical criteria. The Finite Mixture Model is estimated using the Figueiredo...
Chromosome aberration analysis based on a beta-binomial distribution
International Nuclear Information System (INIS)
Otake, Masanori; Prentice, R.L.
1983-10-01
Analyses carried out here generalized on earlier studies of chromosomal aberrations in the populations of Hiroshima and Nagasaki, by allowing extra-binomial variation in aberrant cell counts corresponding to within-subject correlations in cell aberrations. Strong within-subject correlations were detected with corresponding standard errors for the average number of aberrant cells that were often substantially larger than was previously assumed. The extra-binomial variation is accommodated in the analysis in the present report, as described in the section on dose-response models, by using a beta-binomial (B-B) variance structure. It is emphasized that we have generally satisfactory agreement between the observed and the B-B fitted frequencies by city-dose category. The chromosomal aberration data considered here are not extensive enough to allow a precise discrimination between competing dose-response models. A quadratic gamma ray and linear neutron model, however, most closely fits the chromosome data. (author)
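The beta-binomial variance structure mentioned in this abstract can be sketched in general terms. The code assumes the standard parameterization with shape parameters a and b, so that the mean proportion is p = a / (a + b) and the within-subject correlation is rho = 1 / (a + b + 1). The names are hypothetical, and nothing here is fitted to the chromosome data.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """Probability of k aberrant cells among n when the cell-level rate is Beta(a, b)."""
    return comb(n, k) * exp(log_beta(k + a, n - k + b) - log_beta(a, b))


def beta_binomial_variance(n, a, b):
    """Binomial variance n*p*(1-p) inflated by the factor 1 + (n - 1) * rho."""
    p = a / (a + b)
    rho = 1.0 / (a + b + 1.0)
    return n * p * (1.0 - p) * (1.0 + (n - 1) * rho)
```

The factor 1 + (n - 1) * rho is the extra-binomial variation: it equals 1 when cells within a subject are uncorrelated and grows with both the correlation and the number of cells scored.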
Bayesian Plackett-Luce Mixture Models for Partially Ranked Data.
Mollica, Cristina; Tardella, Luca
2017-06-01
The elicitation of an ordinal judgment on multiple alternatives is often required in many psychological and behavioral experiments to investigate preference/choice orientation of a specific population. The Plackett-Luce model is one of the most popular and frequently applied parametric distributions to analyze rankings of a finite set of items. The present work introduces a Bayesian finite mixture of Plackett-Luce models to account for unobserved sample heterogeneity of partially ranked data. We describe an efficient way to incorporate the latent group structure in the data augmentation approach and the derivation of existing maximum likelihood procedures as special instances of the proposed Bayesian method. Inference can be conducted with the combination of the Expectation-Maximization algorithm for maximum a posteriori estimation and the Gibbs sampling iterative procedure. We additionally investigate several Bayesian criteria for selecting the optimal mixture configuration and describe diagnostic tools for assessing the fitness of ranking distributions conditionally and unconditionally on the number of ranked items. The utility of the novel Bayesian parametric Plackett-Luce mixture for characterizing sample heterogeneity is illustrated with several applications to simulated and real preference ranked data. We compare our method with the frequentist approach and a Bayesian nonparametric mixture model both assuming the Plackett-Luce model as a mixture component. Our analysis on real datasets reveals the importance of an accurate diagnostic check for an appropriate in-depth understanding of the heterogeneous nature of the partial ranking data.
Supervised Gaussian mixture model based remote sensing image ...
African Journals Online (AJOL)
Using the supervised classification technique, both simulated and empirical satellite remote sensing data are used to train and test the Gaussian mixture model algorithm. For the purpose of validating the experiment, the resulting classified satellite image is compared with the ground truth data. For the simulated modelling, ...
Zero-truncated negative binomial - Erlang distribution
Bodhisuwan, Winai; Pudprommarat, Chookait; Bodhisuwan, Rujira; Saothayanun, Luckhana
2017-11-01
The zero-truncated negative binomial-Erlang distribution is introduced. It is developed from the negative binomial-Erlang distribution. In this work, the probability mass function is derived and some properties are included. The parameters of the zero-truncated negative binomial-Erlang distribution are estimated by maximum likelihood. Finally, the proposed distribution is applied to real data, the number of methamphetamine in Bangkok, Thailand. The results show that the zero-truncated negative binomial-Erlang distribution provided a better fit than the zero-truncated Poisson, zero-truncated negative binomial, zero-truncated generalized negative-binomial and zero-truncated Poisson-Lindley distributions for this data.
Identifying Clusters with Mixture Models that Include Radial Velocity Observations
Czarnatowicz, Alexis; Ybarra, Jason E.
2018-01-01
The study of stellar clusters plays an integral role in the study of star formation. We present a cluster mixture model that considers radial velocity data in addition to spatial data. Maximum likelihood estimation through the Expectation-Maximization (EM) algorithm is used for parameter estimation. Our mixture model analysis can be used to distinguish adjacent or overlapping clusters, and estimate properties for each cluster. Work supported by awards from the Virginia Foundation for Independent Colleges (VFIC) Undergraduate Science Research Fellowship and The Research Experience @Bridgewater (TREB).
Application of Negative Binomial Regression for Assessing Public ...
African Journals Online (AJOL)
Because the variance was nearly two times greater than the mean, the negative binomial regression model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumed that the mean and variance are the same. The level of education and race were found
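The variance-to-mean comparison that motivates the negative binomial model in this abstract can be sketched as a quick diagnostic. The code assumes the NB2 form Var = m + m^2 / k and uses a method-of-moments estimate of k. The function name and the sample counts are hypothetical and are not the study's data.

```python
from statistics import mean, variance


def dispersion_summary(counts):
    """Compare sample mean and variance of counts; suggest an NB size parameter k."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; dispersion is undefined")
    v = variance(counts)            # sample variance (n - 1 denominator)
    index = v / m                   # close to 1 if a Poisson model is adequate
    # NB2: Var = m + m**2 / k, so k = m**2 / (v - m) whenever v exceeds m.
    k = m * m / (v - m) if v > m else float("inf")
    return {"mean": m, "variance": v, "dispersion_index": index, "nb_size_k": k}
```

A dispersion index near 2, as described in the abstract, means the Poisson assumption of equal mean and variance is violated, and a finite k quantifies how much extra variance the negative binomial term has to absorb.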
Models for the computation of opacity of mixtures
International Nuclear Information System (INIS)
Klapisch, Marcel; Busquet, Michel
2013-01-01
We compare four models for the partial densities of the components of mixtures. These models yield different opacities as shown on polystyrene, acrylic and polyimide in local thermodynamical equilibrium (LTE). Two of these models, the ‘whole volume partial pressure’ model (M1) and its modification (M2) are not thermodynamically consistent (TC). The other two models are TC and minimize free energy. M3, the ‘partial volume equal pressure’ model, uses equality of chemical potential. M4 uses commonality of free electron density. The latter two give essentially identical results in LTE, but M4’s convergence is slower. M4 is easily generalized to non-LTE conditions. Non-LTE effects are shown by the variation of the Planck mean opacity of the mixtures with temperature and density. (paper)
Copula Based Factorization in Bayesian Multivariate Infinite Mixture Models
Martin Burda; Artem Prokhorov
2012-01-01
Bayesian nonparametric models based on infinite mixtures of density kernels have been recently gaining in popularity due to their flexibility and feasibility of implementation even in complicated modeling scenarios. In economics, they have been particularly useful in estimating nonparametric distributions of latent variables. However, these models have been rarely applied in more than one dimension. Indeed, the multivariate case suffers from the curse of dimensionality, with a rapidly increas...
The R Package bgmm : Mixture Modeling with Uncertain Knowledge
Directory of Open Access Journals (Sweden)
Przemysław Biecek
2012-04-01
Classical supervised learning enjoys the luxury of accessing the true known labels for the observations in a modeled dataset. Real life, however, poses an abundance of problems, where the labels are only partially defined, i.e., are uncertain and given only for a subset of observations. Such partial labels can occur regardless of the knowledge source. For example, an experimental assessment of labels may have limited capacity and is prone to measurement errors. Also expert knowledge is often restricted to a specialized area and is thus unlikely to provide trustworthy labels for all observations in the dataset. Partially supervised mixture modeling is able to process such sparse and imprecise input. Here, we present an R package called bgmm, which implements two partially supervised mixture modeling methods: soft-label and belief-based modeling. For completeness, we equipped the package also with the functionality of unsupervised, semi- and fully supervised mixture modeling. On real data we present the usage of bgmm for basic model-fitting in all modeling variants. The package can be applied also to selection of the best-fitting from a set of models with different component numbers or constraints on their structures. This functionality is presented on an artificial dataset, which can be simulated in bgmm from a distribution defined by a given model.
Option Pricing with Asymmetric Heteroskedastic Normal Mixture Models
DEFF Research Database (Denmark)
Rombouts, Jeroen V. K; Stentoft, Lars
2015-01-01
We propose an asymmetric GARCH in mean mixture model and provide a feasible method for option pricing within this general framework by deriving the appropriate risk neutral dynamics. We forecast the out-of-sample prices of a large sample of options on the S&P 500 index from January 2006 to December...
Application of association models to mixtures containing alkanolamines
DEFF Research Database (Denmark)
Avlund, Ane Søgaard; Eriksen, Daniel Kunisch; Kontogeorgis, Georgios
2011-01-01
Two association models, the CPA and sPC-SAFT equations of state, are applied to binary mixtures containing alkanolamines and hydrocarbons or water. CPA is applied to mixtures of MEA and DEA, while sPC-SAFT is applied to MEA–n-heptane liquid–liquid equilibria and MEA–water vapor–liquid equilibria. T...
The Semiparametric Normal Variance-Mean Mixture Model
DEFF Research Database (Denmark)
Korsholm, Lars
1997-01-01
We discuss the normal variance-mean mixture model from a semiparametric point of view, i.e. we let the mixing distribution belong to a nonparametric family. The main results are consistency of the nonparametric maximum likelihood estimator in this case, and construction of an asymptotically normal and efficient estimator.
Evaluation of Distance Measures Between Gaussian Mixture Models of MFCCs
DEFF Research Database (Denmark)
Jensen, Jesper Højvang; Ellis, Dan P. W.; Christensen, Mads Græsbøll
2007-01-01
In music similarity and in the related task of genre classification, a distance measure between Gaussian mixture models is frequently needed. We present a comparison of the Kullback-Leibler distance, the earth mover's distance and the normalized L2 distance for this application. Although...
Parameter Estimation and Model Selection for Mixtures of Truncated Exponentials
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2010-01-01
Bayesian networks with mixtures of truncated exponentials (MTEs) support efficient inference algorithms and provide a flexible way of modeling hybrid domains (domains containing both discrete and continuous variables). On the other hand, estimating an MTE from data has turned out to be a difficul...
Detecting Math Anxiety with a Mixture Partial Credit Model
Ölmez, Ibrahim Burak; Cohen, Allan S.
2017-01-01
The purpose of this study was to investigate a new methodology for detection of differences in middle grades students' math anxiety. A mixture partial credit model analysis revealed two distinct latent classes based on homogeneities in response patterns within each latent class. Students in Class 1 had less anxiety about apprehension of math…
Using the beta-binomial distribution to characterize forest health
Energy Technology Data Exchange (ETDEWEB)
Zarnoch, S. J. [USDA Forest Service, Southern Research Station, Athens, GA (United States); Anderson, R.L.; Sheffield, R. M. [USDA Forest Service, Southern Research Station, Asheville, NC (United States)
1995-03-01
Forest health monitoring programs often use dichotomous base variables (i.e., alive/dead, damaged/undamaged) to describe the health of trees. Typical sampling designs usually consist of randomly or systematically chosen clusters of trees for observation. It was claimed that the contagiousness of diseases, for example, may result in non-uniformity of affected trees, so that the distribution of the proportions, rather than simply the mean proportion, becomes important. The use of the beta-binomial model was suggested for such cases, and its application in forest health analyses was described. Data on dogwood anthracnose (caused by Discula destructiva), a disease of flowering dogwood (Cornus florida L.), were used to illustrate the utility of the model. The beta-binomial model allowed the detection of different distributional patterns of dogwood anthracnose over time and space. Results led to further speculation regarding the cause of the patterns. Traditional proportion analyses like ANOVA would not have detected the trends found using the beta-binomial model until more distinct patterns had evolved at a later date. The model was said to be flexible and to require no special weighting or transformations of data. Another advantage claimed was its ability to handle unequal sample sizes.
Mixture models with entropy regularization for community detection in networks
Chang, Zhenhai; Yin, Xianjun; Jia, Caiyan; Wang, Xiaoyang
2018-04-01
Community detection is a key exploratory tool in network analysis and has received much attention in recent years. NMM (Newman's mixture model) is one of the best models for exploring a range of network structures including community structure, bipartite and core-periphery structures, etc. However, NMM needs to know the number of communities in advance. Therefore, in this study, we have proposed an entropy regularized mixture model (called EMM), which is capable of inferring the number of communities and identifying the network structure contained in a network simultaneously. In the model, by minimizing the entropy of the mixing coefficients of NMM using an EM (expectation-maximization) solution, small clusters that contain little information can be discarded step by step. The empirical study on both synthetic networks and real networks has shown that the proposed model EMM is superior to the state-of-the-art methods.
Gilthorpe, M S; Dahly, D L; Tu, Y K; Kubzansky, L D; Goodman, E
2014-06-01
Lifecourse trajectories of clinical or anthropological attributes are useful for identifying how our early-life experiences influence later-life morbidity and mortality. Researchers often use growth mixture models (GMMs) to estimate such phenomena. It is common to place constraints on the random part of the GMM to improve parsimony or to aid convergence, but this can lead to an autoregressive structure that distorts the nature of the mixtures and subsequent model interpretation. This is especially true if changes in the outcome within individuals are gradual compared with the magnitude of differences between individuals. This is not widely appreciated, nor is its impact well understood. Using repeat measures of body mass index (BMI) for 1528 US adolescents, we estimated GMMs that required variance-covariance constraints to attain convergence. We contrasted constrained models with and without an autocorrelation structure to assess the impact this had on the ideal number of latent classes, their size and composition. We also contrasted model options using simulations. When the GMM variance-covariance structure was constrained, a within-class autocorrelation structure emerged. When not modelled explicitly, this led to poorer model fit and models that differed substantially in the ideal number of latent classes, as well as class size and composition. Failure to carefully consider the random structure of data within a GMM framework may lead to erroneous model inferences, especially for outcomes with greater within-person than between-person homogeneity, such as BMI. It is crucial to reflect on the underlying data generation processes when building such models.
Problems on Divisibility of Binomial Coefficients
Osler, Thomas J.; Smoak, James
2004-01-01
Twelve unusual problems involving divisibility of the binomial coefficients are represented in this article. The problems are listed in "The Problems" section. All twelve problems have short solutions which are listed in "The Solutions" section. These problems could be assigned to students in any course in which the binomial theorem and Pascal's…
System-Reliability Cumulative-Binomial Program
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, NEWTONP, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. NEWTONP, CUMBIN (NPO-17555), and CROSSER (NPO-17557), used independently of one another. Program finds probability required to yield given system reliability. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
A class of orthogonal nonrecursive binomial filters.
Haddad, R. A.
1971-01-01
The time- and frequency-domain properties of the orthogonal binomial sequences are presented. It is shown that these sequences, or digital filters based on them, can be generated using adders and delay elements only. The frequency-domain behavior of these nonrecursive binomial filters suggests a number of applications as low-pass Gaussian filters or as inexpensive bandpass filters.
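The "adders and delay elements only" construction can be illustrated with a minimal Python sketch. It covers only the simplest low-pass member of the family, obtained by cascading stages of the form y[n] = x[n] + x[n-1]; it is not a reconstruction of the orthogonal sequences in the paper, and the function names `binomial_filter` and `apply_fir` are hypothetical.

```python
def binomial_filter(order):
    """Impulse response of `order` cascaded (1 + z^-1) stages.

    Each stage uses one delay element and one adder, so the taps
    are the binomial coefficients of the given order.
    """
    taps = [1]
    for _ in range(order):
        delayed = [0] + taps      # output of the delay element
        current = taps + [0]      # undelayed path
        taps = [a + b for a, b in zip(current, delayed)]  # the adder
    return taps


def apply_fir(taps, signal):
    """Direct-form FIR filtering, normalised to unit gain at DC."""
    gain = sum(taps)
    out = []
    for n in range(len(signal)):
        acc = 0
        for k, t in enumerate(taps):
            if n - k >= 0:        # samples before the start are taken as zero
                acc += t * signal[n - k]
        out.append(acc / gain)
    return out


taps = binomial_filter(4)                      # [1, 4, 6, 4, 1]
smoothed = apply_fir(taps, [0, 0, 1, 0, 0, 0, 0])
```

As the order grows, the normalised taps approach a sampled Gaussian, which is consistent with the low-pass Gaussian application mentioned in the abstract.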
Common-Reliability Cumulative-Binomial Program
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CROSSER, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CROSSER, CUMBIN (NPO-17555), and NEWTONP (NPO-17556), used independently of one another. Point of equality between reliability of system and common reliability of components found. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Program written in C.
Khan, Iftekhar; Morris, Stephen
2014-11-12
The performance of the Beta Binomial (BB) model is compared with several existing models for mapping the EORTC QLQ-C30 (QLQ-C30) on to the EQ-5D-3L using data from lung cancer trials. Data from 2 separate non small cell lung cancer clinical trials (TOPICAL and SOCCAR) are used to develop and validate the BB model. Comparisons with Linear, TOBIT, Quantile, Quadratic and CLAD models are carried out. The mean prediction error, R(2), proportion predicted outside the valid range, clinical interpretation of coefficients, model fit and estimation of Quality Adjusted Life Years (QALY) are reported and compared. Monte-Carlo simulation is also used. The Beta-Binomial regression model performed 'best' among all models. For TOPICAL and SOCCAR trials, respectively, residual mean square error (RMSE) was 0.09 and 0.11; R(2) was 0.75 and 0.71; observed vs. predicted means were 0.612 vs. 0.608 and 0.750 vs. 0.749. Mean difference in QALY's (observed vs. predicted) were 0.051 vs. 0.053 and 0.164 vs. 0.162 for TOPICAL and SOCCAR respectively. Models tested on independent data show simulated 95% confidence from the BB model containing the observed mean more often (77% and 59% for TOPICAL and SOCCAR respectively) compared to the other models. All algorithms over-predict at poorer health states but the BB model was relatively better, particularly for the SOCCAR data. The BB model may offer superior predictive properties amongst mapping algorithms considered and may be more useful when predicting EQ-5D-3L at poorer health states. We recommend the algorithm derived from the TOPICAL data due to better predictive properties and less uncertainty.
Improved Denoising via Poisson Mixture Modeling of Image Sensor Noise.
Zhang, Jiachao; Hirakawa, Keigo
2017-04-01
This paper describes a study aimed at comparing the real image sensor noise distribution to the models of noise often assumed in image denoising designs. A quantile analysis in pixel, wavelet transform, and variance stabilization domains reveals that the tails of Poisson, signal-dependent Gaussian, and Poisson-Gaussian models are too short to capture real sensor noise behavior. A new Poisson mixture noise model is proposed to correct the mismatch of tail behavior. Based on the fact that noise model mismatch results in image denoising that undersmooths real sensor data, we propose a mixture of Poisson denoising method to remove the denoising artifacts without affecting image details, such as edges and textures. Experiments with real sensor data verify that denoising for real image sensor data is indeed improved by this new technique.
Currency lookback options and observation frequency: A binomial approach
T.H.F. Cheuk; A.C.F. Vorst (Ton)
1997-01-01
In the last decade, interest in exotic options has been growing, especially in the over-the-counter currency market. In this paper we consider lookback currency options, which are path-dependent. We show that a one-state variable binomial model for currency lookback options can
Using the β-binomial distribution to characterize forest health
S.J. Zarnoch; R.L. Anderson; R.M. Sheffield
1995-01-01
The β-binomial distribution is suggested as a model for describing and analyzing the dichotomous data obtained from programs monitoring the health of forests in the United States. Maximum likelihood estimation of the parameters is given as well as asymptotic likelihood ratio tests. The procedure is illustrated with data on dogwood anthracnose infection (caused...
Statistical inference for a class of multivariate negative binomial distributions
DEFF Research Database (Denmark)
Rubak, Ege Holger; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...
Phylogenetic mixtures and linear invariants for equal input models.
Casanellas, Marta; Steel, Mike
2017-04-01
The reconstruction of phylogenetic trees from molecular sequence data relies on modelling site substitutions by a Markov process, or a mixture of such processes. In general, allowing mixed processes can result in different tree topologies becoming indistinguishable from the data, even for infinitely long sequences. However, when the underlying Markov process supports linear phylogenetic invariants, then provided these are sufficiently informative, the identifiability of the tree topology can be restored. In this paper, we investigate a class of processes that support linear invariants once the stationary distribution is fixed, the 'equal input model'. This model generalizes the 'Felsenstein 1981' model (and thereby the Jukes-Cantor model) from four states to an arbitrary number of states (finite or infinite), and it can also be described by a 'random cluster' process. We describe the structure and dimension of the vector spaces of phylogenetic mixtures and of linear invariants for any fixed phylogenetic tree (and for all trees-the so called 'model invariants'), on any number n of leaves. We also provide a precise description of the space of mixtures and linear invariants for the special case of [Formula: see text] leaves. By combining techniques from discrete random processes and (multi-) linear algebra, our results build on a classic result that was first established by James Lake (Mol Biol Evol 4:167-191, 1987).
A nonparametric mixture model for cure rate estimation.
Peng, Y; Dear, K B
2000-03-01
Nonparametric methods have attracted less attention than their parametric counterparts for cure rate analysis. In this paper, we study a general nonparametric mixture model. The proportional hazards assumption is employed in modeling the effect of covariates on the failure time of patients who are not cured. The EM algorithm, the marginal likelihood approach, and multiple imputations are employed to estimate parameters of interest in the model. This model extends models and improves estimation methods proposed by other researchers. It also extends Cox's proportional hazards regression model by allowing a proportion of event-free patients and investigating covariate effects on that proportion. The model and its estimation method are investigated by simulations. An application to breast cancer data, including comparisons with previous analyses using a parametric model and an existing nonparametric model by other researchers, confirms the conclusions from the parametric model but not those from the existing nonparametric model.
Modeling adsorption of binary and ternary mixtures on microporous media
DEFF Research Database (Denmark)
Monsalvo, Matias Alfonso; Shapiro, Alexander
2007-01-01
The goal of this work is to analyze the adsorption of binary and ternary mixtures on the basis of the multicomponent potential theory of adsorption (MPTA). In the MPTA, the adsorbate is considered as a segregated mixture in the external potential field emitted by the solid adsorbent. This makes it possible using the same equation of state to describe the thermodynamic properties of the segregated and the bulk phases. For comparison, we also used the ideal adsorbed solution theory (IAST) to describe adsorption equilibria. The main advantage of these two models is their capabilities to predict multicomponent adsorption equilibria on the basis of single-component adsorption data. We compare the MPTA and IAST models to a large set of experimental data, obtaining reasonable good agreement with experimental data and high degree of predictability. Some limitations of both models are also discussed.
Hydrogenic ionization model for mixtures in non-LTE plasmas
International Nuclear Information System (INIS)
Djaoui, A.
1999-01-01
The Hydrogenic Ionization Model for Mixtures (HIMM) is a non-Local Thermodynamic Equilibrium (non-LTE), time-dependent ionization model for laser-produced plasmas containing mixtures of elements (species). In this version, both collisional and radiative rates are taken into account. An ionization distribution for each species which is consistent with the ambient electron density is obtained by use of an iterative procedure in a single calculation for all species. Energy levels for each shell having a given principal quantum number and for each ion stage of each species in the mixture are calculated using screening constants. Steady-state non-LTE as well as LTE solutions are also provided. The non-LTE rate equations converge to the LTE solution at sufficiently high densities or as the radiation temperature approaches the electron temperature. The model is particularly useful at low temperatures, where convergence problems were usually encountered in our previous models. We apply our model to typical situations in x-ray laser research, laser-produced plasmas and inertial confinement fusion. Our results compare well with previously published results for a selenium plasma. (author)
Color Texture Segmentation by Decomposition of Gaussian Mixture Model
Czech Academy of Sciences Publication Activity Database
Grim, Jiří; Somol, Petr; Haindl, Michal; Pudil, Pavel
2006-01-01
Vol. 19, No. 4225 (2006), pp. 287-296 ISSN 0302-9743. [Iberoamerican Congress on Pattern Recognition. CIARP 2006 /11./. Cancun, 14.11.2006-17.11.2006] R&D Projects: GA AV ČR 1ET400750407; GA MŠk 1M0572; GA MŠk 2C06019 EU Projects: European Commission(XE) 507752 - MUSCLE Institutional research plan: CEZ:AV0Z10750506 Keywords : texture segmentation * gaussian mixture model * EM algorithm Subject RIV: IN - Informatics, Computer Science Impact factor: 0.402, year: 2005 http://library.utia.cas.cz/separaty/historie/grim-color texture segmentation by decomposition of gaussian mixture model.pdf
Estimating negative binomial parameters from occurrence data with detection times.
Hwang, Wen-Han; Huggins, Richard; Stoklosa, Jakub
2016-11-01
The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or through modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to the area of the region, that both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (, Am. Nat. 176, 96-98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, the effect can be simply adjusted for provided that the mean and variance of misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
XDGMM: eXtreme Deconvolution Gaussian Mixture Modeling
Holoien, Thomas W.-S.; Marshall, Philip J.; Wechsler, Risa H.
2017-08-01
XDGMM uses Gaussian mixtures to do density estimation of noisy, heterogeneous, and incomplete data using extreme deconvolution (XD) algorithms, and is compatible with the scikit-learn machine learning methods. It implements both the astroML and Bovy et al. (2011) algorithms, and extends the BaseEstimator class from scikit-learn so that cross-validation methods work. It allows the user to produce a conditioned model if values of some parameters are known.
Option Pricing with Asymmetric Heteroskedastic Normal Mixture Models
DEFF Research Database (Denmark)
Rombouts, Jeroen V.K.; Stentoft, Lars
This paper uses asymmetric heteroskedastic normal mixture models to fit return data and to price options. The models can be estimated straightforwardly by maximum likelihood, have high statistical fit when used on S&P 500 index return data, and allow for substantial negative skewness and time varying higher order moments of the risk neutral distribution. When forecasting out-of-sample a large set of index options between 1996 and 2009, substantial improvements are found compared to several benchmark models in terms of dollar losses and the ability to explain the smirk in implied volatilities...
Variable selection for mixture and promotion time cure rate models.
Masud, Abdullah; Tu, Wanzhu; Yu, Zhangsheng
2016-11-16
Failure-time data with cured patients are common in clinical studies. Data from these studies are typically analyzed with cure rate models. Variable selection methods have not been well developed for cure rate models. In this research, we propose two least absolute shrinkage and selection operators based methods, for variable selection in mixture and promotion time cure models with parametric or nonparametric baseline hazards. We conduct an extensive simulation study to assess the operating characteristics of the proposed methods. We illustrate the use of the methods using data from a study of childhood wheezing. © The Author(s) 2016.
Convergence of Estimators in Mixture Models Based on Missing Data [Konvergensi Estimator dalam Model Mixture Berbasis Missing Data]
Directory of Open Access Journals (Sweden)
N Dwidayati
2014-06-01
Full Text Available A mixture model can estimate the proportion of cured patients and the survival function of uncured patients. In this study, the mixture model is developed for cure rate analysis based on missing data. Several methods are available for analysing missing data; one of them is the EM algorithm, which is based on two steps: (1) the Expectation step and (2) the Maximization step. The EM algorithm is an iterative approach to learning a model from data with missing values through four steps: (1) choose an initial set of parameters for the model, (2) determine the expected values of the missing data, (3) induce new model parameters from the combination of the expected values and the original data, and (4) if the parameters have not converged, repeat step 2 using the new model. The study shows that, under the EM algorithm, the log-likelihood for the missing data increases after every iteration. Consequently, the sequence of likelihoods produced by the EM algorithm converges if the likelihood is bounded above.
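The four EM steps listed in this abstract, and the claim that the log-likelihood does not decrease from one iteration to the next, can be checked on a toy problem. The sketch below is not the cure rate model of the paper: it assumes a two-component Gaussian mixture with unit variances, where the "missing data" are the unobserved component labels. The data values and the names `em` and `log_likelihood` are illustrative only.

```python
import math


def normal_pdf(x, mu):
    """Density of N(mu, 1) at x."""
    return math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)


def log_likelihood(data, w, mu1, mu2):
    """Observed-data log-likelihood of the two-component mixture."""
    return sum(
        math.log(w * normal_pdf(x, mu1) + (1 - w) * normal_pdf(x, mu2))
        for x in data
    )


def em(data, w, mu1, mu2, tol=1e-8, max_iter=200):
    # Step 1: the caller supplies the initial parameters (w, mu1, mu2).
    history = [log_likelihood(data, w, mu1, mu2)]
    for _ in range(max_iter):
        # Step 2 (E-step): expected value of each missing label.
        resp = []
        for x in data:
            a = w * normal_pdf(x, mu1)
            b = (1 - w) * normal_pdf(x, mu2)
            resp.append(a / (a + b))
        # Step 3 (M-step): new parameters from expectations plus data.
        n1 = sum(resp)
        n2 = len(data) - n1
        w = n1 / len(data)
        mu1 = sum(r * x for r, x in zip(resp, data)) / n1
        mu2 = sum((1 - r) * x for r, x in zip(resp, data)) / n2
        history.append(log_likelihood(data, w, mu1, mu2))
        # Step 4: stop once the improvement is negligible, else repeat.
        if history[-1] - history[-2] < tol:
            break
    return w, mu1, mu2, history


# Made-up sample with two visible clusters.
data = [-2.1, -1.9, -2.4, -1.6, -2.0, 1.8, 2.2, 2.5, 1.7, 2.0, 2.3]
w, mu1, mu2, history = em(data, w=0.5, mu1=-1.0, mu2=1.0)
```

Inspecting `history` shows a non-decreasing sequence, which is the monotonicity property the abstract relies on for its convergence argument.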
Bayesian mixture models for source separation in MEG
International Nuclear Information System (INIS)
Calvetti, Daniela; Homa, Laura; Somersalo, Erkki
2011-01-01
This paper discusses the problem of imaging electromagnetic brain activity from measurements of the induced magnetic field outside the head. This imaging modality, magnetoencephalography (MEG), is known to be severely ill posed, and in order to obtain useful estimates for the activity map, complementary information needs to be used to regularize the problem. In this paper, a particular emphasis is on finding non-superficial focal sources that induce a magnetic field that may be confused with noise due to external sources and with distributed brain noise. The data are assumed to come from a mixture of a focal source and a spatially distributed possibly virtual source; hence, to differentiate between those two components, the problem is solved within a Bayesian framework, with a mixture model prior encoding the information that different sources may be concurrently active. The mixture model prior combines one density that favors strongly focal sources and another that favors spatially distributed sources, interpreted as clutter in the source estimation. Furthermore, to address the challenge of localizing deep focal sources, a novel depth sounding algorithm is suggested, and it is shown with simulated data that the method is able to distinguish between a signal arising from a deep focal source and a clutter signal. (paper)
Newton Binomial Formulas in Schubert Calculus
Cordovez, Jorge; Gatto, Letterio; Santiago, Taise
2008-01-01
We prove Newton's binomial formulas for Schubert Calculus to determine numbers of base point free linear series on the projective line with prescribed ramification divisor supported at given distinct points.
On pricing futures options on random binomial tree
International Nuclear Information System (INIS)
Bayram, Kamola; Ganikhodjaev, Nasir
2013-01-01
The discrete-time approach to real option valuation has typically been implemented in the finance literature using a binomial tree framework. Instead we develop a new model by randomizing the environment and call such a model a random binomial tree. Whereas the usual model has only one environment (u, d), where the price of the underlying asset can move by u times up and d times down and the pair (u, d) is constant over the life of the underlying asset, in our new model the underlying security moves in two environments, namely (u_1, d_1) and (u_2, d_2). Thus we obtain two volatilities σ_1 and σ_2. This new approach enables calculations that reflect the real market, since it considers two market states, normal and extraordinary. In this paper we define and study futures options for such models.
The Binomial Coefficient for Negative Arguments
Kronenburg, M. J.
2011-01-01
The definition of the binomial coefficient in terms of gamma functions also allows non-integer arguments. For nonnegative integer arguments the gamma functions reduce to factorials, leading to the well-known Pascal triangle. Using a symmetry formula for the gamma function, this definition is extended to negative integer arguments, making the symmetry identity for binomial coefficients valid for all integer arguments. The agreement of this definition with some other identities and with the bin...
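The gamma-function definition mentioned in this abstract, C(x, y) = Γ(x+1) / (Γ(y+1) Γ(x−y+1)), can be tried out directly. The sketch below covers only arguments for which no gamma function hits a pole; the extension to negative integer arguments that the paper develops is not implemented here, and the name `binom_gamma` is hypothetical.

```python
import math


def binom_gamma(x, y):
    """Binomial coefficient from gamma functions.

    Valid when x + 1, y + 1 and x - y + 1 are not zero or negative
    integers; math.gamma raises ValueError at those poles.
    """
    return math.gamma(x + 1) / (math.gamma(y + 1) * math.gamma(x - y + 1))


# Nonnegative integers reproduce a row of Pascal's triangle.
row4 = [round(binom_gamma(4, k)) for k in range(5)]   # [1, 4, 6, 4, 1]

# A non-integer upper argument: C(1/2, 2) = (1/2)(-1/2)/2 = -1/8.
half_choose_two = binom_gamma(0.5, 2)
```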
Calculating Cumulative Binomial-Distribution Probabilities
Scheuer, Ernest M.; Bowerman, Paul N.
1989-01-01
Cumulative-binomial computer program, CUMBIN, one of set of three programs, calculates cumulative binomial probability distributions for arbitrary inputs. CUMBIN, NEWTONP (NPO-17556), and CROSSER (NPO-17557), used independently of one another. Reliabilities and availabilities of k-out-of-n systems analyzed. Used by statisticians and users of statistical procedures, test planners, designers, and numerical analysts. Used for calculations of reliability and availability. Program written in C.
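For readers unfamiliar with the k-out-of-n calculation behind these three programs, here is a minimal Python sketch. It assumes n independent components with a common reliability p and a system that works when at least k components work. It is not the NASA code: the function names are hypothetical, and plain bisection is used for the inverse problem as an assumption, whatever root-finding method the original programs use.

```python
from math import comb


def k_out_of_n_reliability(k, n, p):
    """P(at least k of n independent components work), each with reliability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def required_component_reliability(k, n, target, iterations=100):
    """Component reliability p giving system reliability `target`.

    Uses bisection on [0, 1]; this works because the system
    reliability is increasing in p for 1 <= k <= n.
    """
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if k_out_of_n_reliability(k, n, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


system = k_out_of_n_reliability(2, 3, 0.9)            # 3(0.81)(0.1) + 0.729
needed = required_component_reliability(2, 3, 0.972)  # close to 0.9
```

The first function corresponds to the forward calculation described for CUMBIN, and the second to the inverse problem described for NEWTONP ("probability required to yield given system reliability").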
Determining of migraine prognosis using latent growth mixture models.
Tasdelen, Bahar; Ozge, Aynur; Kaleagasi, Hakan; Erdogan, Semra; Mengi, Tufan
2011-04-01
This paper presents a retrospective study to classify patients into subtypes of the treatment according to baseline and longitudinally observed values considering heterogeneity in migraine prognosis. In the classical prospective clinical studies, participants are classified with respect to baseline status and followed within a certain time period. However, latent growth mixture model is the most suitable method, which considers the population heterogeneity and is not affected by drop-outs if they are missing at random. Hence, we planned this comprehensive study to identify prognostic factors in migraine. The study data have been based on a 10-year computer-based follow-up data of Mersin University Headache Outpatient Department. The developmental trajectories within subgroups were described for the severity, frequency, and duration of headache separately and the probabilities of each subgroup were estimated by using latent growth mixture models. SAS PROC TRAJ procedures, semiparametric and group-based mixture modeling approach, were applied to define the developmental trajectories. While the three-group model for the severity (mild, moderate, severe) and frequency (low, medium, high) of headache appeared to be appropriate, the four-group model for the duration (low, medium, high, extremely high) was more suitable. The severity of headache increased in the patients with nausea, vomiting, photophobia and phonophobia. The frequency of headache was especially related with increasing age and unilateral pain. Nausea and photophobia were also related with headache duration. Nausea, vomiting and photophobia were the most significant factors to identify developmental trajectories. The remission time was not the same for the severity, frequency, and duration of headache.
Spatially adaptive mixture modeling for analysis of FMRI time series.
Vincent, Thomas; Risser, Laurent; Ciuciu, Philippe
2010-04-01
Within-subject analysis in fMRI essentially addresses two problems, the detection of brain regions eliciting evoked activity and the estimation of the underlying dynamics. In Makni et al., 2005 and Makni et al., 2008, a detection-estimation framework has been proposed to tackle these problems jointly, since they are connected to one another. In the Bayesian formalism, detection is achieved by modeling activating and nonactivating voxels through independent mixture models (IMM) within each region while hemodynamic response estimation is performed at a regional scale in a nonparametric way. Instead of IMMs, in this paper we take advantage of spatial mixture models (SMM) for their nonlinear spatial regularizing properties. The proposed method is unsupervised and spatially adaptive in the sense that the amount of spatial correlation is automatically tuned from the data and this setting automatically varies across brain regions. In addition, the level of regularization is specific to each experimental condition since both the signal-to-noise ratio and the activation pattern may vary across stimulus types in a given brain region. These aspects require the precise estimation of multiple partition functions of underlying Ising fields. This is addressed efficiently using first path sampling for a small subset of fields and then using a recently developed fast extrapolation technique for the large remaining set. Simulation results emphasize that detection relying on supervised SMM outperforms its IMM counterpart and that unsupervised spatial mixture models achieve similar results without any hand-tuning of the correlation parameter. On real datasets, the gain is illustrated in a localizer fMRI experiment: brain activations appear more spatially resolved using SMM in comparison with classical general linear model (GLM)-based approaches, while estimating a specific parcel-based HRF shape. Our approach therefore validates the treatment of unsmoothed fMRI data without fixed GLM
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Kontogeorgis, Georgios; Michelsen, Michael Locht
2010-01-01
The Cubic-Plus-Association (CPA) equation of state is applied to a large variety of mixtures containing H2S, which are of interest in the oil and gas industry. Binary H2S mixtures with alkanes, CO2, water, methanol, and glycols are first considered. The interactions of H2S with polar compounds (water, methanol, and glycols) are modeled assuming presence or not of cross-association interactions. Such interactions are accounted for using either a combining rule or a cross-solvation energy obtained from spectroscopic data. Using the parameters obtained from the binary systems, one ternary and three quaternary mixtures are considered. It is shown that overall excellent correlation for binary mixtures and satisfactory prediction results for multicomponent systems are obtained. There are significant differences between the various modeling approaches and the best results are obtained when...
Effective dielectric mixture model for characterization of diesel contaminated soil
International Nuclear Information System (INIS)
Al-Mattarneh, H.M.A.
2007-01-01
Human exposure to contaminated soil by diesel isomers can have serious health consequences like neurological diseases or cancer. The potential of dielectric measuring techniques for electromagnetic characterization of contaminated soils was investigated in this paper. The purpose of the research was to develop an empirical dielectric mixture model for soil hydrocarbon contamination application. The paper described the basic theory and elaborated in dielectric mixture theory. The analytical and empirical models were explained in simple algebraic formulas. The experimental study was then described with reference to materials, properties and experimental results. The results of the analytical models were also mathematically explained. The proposed semi-empirical model was also presented. According to the result of the electromagnetic properties of dry soil contaminated with diesel, the diesel presence had no significant effect on the electromagnetic properties of dry soil. It was concluded that diesel had no contribution to the soil electrical conductivity, which confirmed the nonconductive character of diesel. The results of diesel-contaminated soil at saturation condition indicated that both dielectric constant and loss factors of soil were decreased with increasing diesel content. 15 refs., 2 tabs., 9 figs
Buckland, S.; Cole, N.C.; Aguirre-Gutiérrez, J.; Gallagher, L.E.; Henshaw, S.M.; Besnard, A.; Tucker, R.M.; Bachraz, V.; Ruhomaun, K.; Harris, S.
2014-01-01
The invasion of the giant Madagascar day gecko Phelsuma grandis has increased the threats to the four endemic Mauritian day geckos (Phelsuma spp.) that have survived on mainland Mauritius. We had two main aims: (i) to predict the spatial distribution and overlap of P. grandis and the endemic geckos at a landscape level; and (ii) to investigate the effects of P. grandis on the abundance and risks of extinction of the endemic geckos at a local scale. An ensemble forecasting approach was used to...
Experiments with Mixtures Designs, Models, and the Analysis of Mixture Data
Cornell, John A
2011-01-01
The most comprehensive, single-volume guide to conducting experiments with mixtures. "If one is involved, or heavily interested, in experiments on mixtures of ingredients, one must obtain this book. It is, as was the first edition, the definitive work." -Short Book Reviews (Publication of the International Statistical Institute) "The text contains many examples with worked solutions and with its extensive coverage of the subject matter will prove invaluable to those in the industrial and educational sectors whose work involves the design and analysis of mixture experiments." -Journal of the Royal S
On population size estimators in the Poisson mixture model.
Mao, Chang Xuan; Yang, Nan; Zhong, Jinhua
2013-09-01
Estimating population sizes via capture-recapture experiments has enormous applications. The Poisson mixture model can be adopted for those applications with a single list in which individuals appear one or more times. We compare several nonparametric estimators, including the Chao estimator, the Zelterman estimator, two jackknife estimators and the bootstrap estimator. The target parameter of the Chao estimator is a lower bound of the population size. Those of the other four estimators are not lower bounds, and they may produce lower confidence limits for the population size with poor coverage probabilities. A simulation study is reported and two examples are investigated. © 2013, The International Biometric Society.
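Two of the estimators compared in this abstract have simple closed forms based on the frequency counts f1 (individuals seen exactly once) and f2 (seen exactly twice). The sketch below uses the textbook forms, Chao's lower bound n + f1²/(2 f2) and Zelterman's n / (1 − exp(−2 f2/f1)), which may differ in detail from the variants studied in the paper. The capture counts are invented for illustration, and both functions assume f1 > 0 and f2 > 0.

```python
import math
from collections import Counter


def chao_lower_bound(capture_counts):
    """Chao estimator: n + f1^2 / (2 * f2)."""
    f = Counter(capture_counts)
    n = len(capture_counts)          # number of distinct individuals observed
    return n + f[1] ** 2 / (2 * f[2])


def zelterman(capture_counts):
    """Zelterman estimator: n / (1 - exp(-lambda)), lambda = 2 * f2 / f1."""
    f = Counter(capture_counts)
    n = len(capture_counts)
    lam = 2 * f[2] / f[1]            # Poisson rate estimated from f1 and f2 only
    return n / (1 - math.exp(-lam))


# Hypothetical single-list data: times each observed individual appeared.
counts = [1, 1, 1, 1, 1, 1, 2, 2, 2, 3]    # n = 10, f1 = 6, f2 = 3

chao = chao_lower_bound(counts)            # 10 + 36 / 6 = 16.0
zelt = zelterman(counts)                   # 10 / (1 - exp(-1))
```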
A mixture model for robust registration in Kinect sensor
Peng, Li; Zhou, Huabing; Zhu, Shengguo
2018-03-01
The Microsoft Kinect sensor has been widely used in many applications, but it suffers from the drawback of low registration precision between color image and depth image. In this paper, we present a robust method to improve the registration precision by a mixture model that can handle multiple images with the nonparametric model. We impose non-parametric geometrical constraints on the correspondence, as a prior distribution, in a reproducing kernel Hilbert space (RKHS). The estimation is performed by the EM algorithm, which, by also estimating the variance of the prior model, is able to obtain good estimates. We illustrate the proposed method on a publicly available dataset. The experimental results show that our approach outperforms the baseline methods.
Fast Bayesian Inference in Dirichlet Process Mixture Models.
Wang, Lianming; Dunson, David B
2011-01-01
There has been increasing interest in applying Bayesian nonparametric methods in large samples and high dimensions. As Markov chain Monte Carlo (MCMC) algorithms are often infeasible, there is a pressing need for much faster algorithms. This article proposes a fast approach for inference in Dirichlet process mixture (DPM) models. Viewing the partitioning of subjects into clusters as a model selection problem, we propose a sequential greedy search algorithm for selecting the partition. Then, when conjugate priors are chosen, the resulting posterior conditionally on the selected partition is available in closed form. This approach allows testing of parametric models versus nonparametric alternatives based on Bayes factors. We evaluate the approach using simulation studies and compare it with four other fast nonparametric methods in the literature. We apply the proposed approach to three datasets including one from a large epidemiologic study. Matlab codes for the simulation and data analyses using the proposed approach are available online in the supplemental materials.
PENERAPAN REGRESI BINOMIAL NEGATIF UNTUK MENGATASI OVERDISPERSI PADA REGRESI POISSON (Application of Negative Binomial Regression to Overcome Overdispersion in Poisson Regression)
Directory of Open Access Journals (Sweden)
PUTU SUSAN PRADAWATI
2013-09-01
Poisson regression is used to analyze count data that are Poisson distributed. Poisson regression analysis requires equidispersion, in which the mean of the response variable equals its variance. However, deviations occur in which the variance of the response variable is greater than the mean. This is called overdispersion. If overdispersion is present and Poisson regression is used anyway, the standard errors will be underestimated. Negative binomial regression can handle overdispersion because it contains a dispersion parameter. For simulated data exhibiting overdispersion under the Poisson regression model, negative binomial regression was found to perform better than Poisson regression.
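The equidispersion requirement described above can be checked numerically. The following is a minimal sketch, not the authors' simulation: it assumes a gamma-Poisson mixture as the source of negative binomial counts, and the mean, shape, sample size and function names are all hypothetical choices for illustration.

```python
import math
import random
import statistics

def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def dispersion_index(counts):
    """Variance-to-mean ratio: about 1 under equidispersion, above 1 under overdispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

rng = random.Random(0)      # fixed seed so the sketch is reproducible
mean, shape = 4.0, 2.0      # hypothetical values, not taken from the paper

# Plain Poisson counts: variance should be close to the mean.
poisson_counts = [poisson_sample(rng, mean) for _ in range(5000)]

# Gamma-Poisson mixture: each count gets its own rate, which yields
# negative binomial counts with variance = mean + mean**2 / shape.
nb_counts = [poisson_sample(rng, rng.gammavariate(shape, mean / shape))
             for _ in range(5000)]

print("Poisson dispersion index:", dispersion_index(poisson_counts))
print("Negative binomial dispersion index:", dispersion_index(nb_counts))
```

A dispersion index well above 1 is the symptom that motivates switching from a Poisson to a negative binomial model.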
e+-e- hadronic multiplicity distributions: negative binomial or Poisson
International Nuclear Information System (INIS)
Carruthers, P.; Shih, C.C.
1986-01-01
On the basis of fits to the multiplicity distributions for variable rapidity windows and the forward-backward correlation for the 2-jet subset of e+e- data, it is impossible to distinguish between a global negative binomial and its generalization, the partially coherent distribution. It is suggested that intensity interferometry, especially the Bose-Einstein correlation, gives information which will discriminate among dynamical models. 16 refs
Hits per trial: Basic analysis of binomial data
International Nuclear Information System (INIS)
Atwood, C.L.
1994-09-01
This report presents simple statistical methods for analyzing binomial data, such as the number of failures in some number of demands. It gives point estimates, confidence intervals, and Bayesian intervals for the failure probability. It shows how to compare subsets of the data, both graphically and by statistical tests, and how to look for trends in time. It presents a compound model when the failure probability varies randomly. Examples and SAS programs are given.
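As an illustration of the kind of quantities listed above, here is a small Python sketch for failures-per-demand data. It is not the report's SAS code: the Wilson score interval and the Jeffreys Beta(0.5, 0.5) prior are choices assumed here, and the function names and the 3-failures-in-100-demands example are hypothetical.

```python
import math

def point_estimate(failures, demands):
    """Maximum likelihood estimate of the failure probability."""
    return failures / demands

def wilson_interval(failures, demands, z=1.959963984540054):
    """Approximate two-sided 95% confidence interval (Wilson score interval)."""
    p_hat = failures / demands
    denom = 1.0 + z * z / demands
    centre = (p_hat + z * z / (2.0 * demands)) / denom
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / demands
                         + z * z / (4.0 * demands ** 2)) / denom
    return centre - half, centre + half

def jeffreys_posterior_mean(failures, demands):
    """Posterior mean under a Beta(0.5, 0.5) prior, i.e. Beta(x + 0.5, n - x + 0.5)."""
    return (failures + 0.5) / (demands + 1.0)

low, high = wilson_interval(3, 100)   # hypothetical data: 3 failures in 100 demands
print(point_estimate(3, 100), low, high, jeffreys_posterior_mean(3, 100))
```

A full Bayesian interval would need quantiles of the Beta posterior, which the Python standard library does not provide, so only the posterior mean is shown.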
Statistical Inference for a Class of Multivariate Negative Binomial Distributions
DEFF Research Database (Denmark)
Rubak, Ege H.; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been studied in the literature, while this is the first statistical paper on α-permanental random fields. The focus is on maximum likelihood estimation, maximum quasi-likelihood estimation and on maximum composite likelihood estimation based on uni- and bivariate distributions. Furthermore, new results...
New Flexible Models and Design Construction Algorithms for Mixtures and Binary Dependent Variables
A. Ruseckaite (Aiste)
2017-01-01
This thesis discusses new mixture(-amount) models, choice models and the optimal design of experiments. Two chapters of the thesis relate to the so-called mixture, which is a product or service whose ingredients’ proportions sum to one. The thesis begins by introducing mixture
Tractography segmentation using a hierarchical Dirichlet processes mixture model.
Wang, Xiaogang; Grimson, W Eric L; Westin, Carl-Fredrik
2011-01-01
In this paper, we propose a new nonparametric Bayesian framework to cluster white matter fiber tracts into bundles using a hierarchical Dirichlet processes mixture (HDPM) model. The number of clusters is automatically learned driven by data with a Dirichlet process (DP) prior instead of being manually specified. After the models of bundles have been learned from training data without supervision, they can be used as priors to cluster/classify fibers of new subjects for comparison across subjects. When clustering fibers of new subjects, new clusters can be created for structures not observed in the training data. Our approach does not require computing pairwise distances between fibers and can cluster a huge set of fibers across multiple subjects. We present results on several data sets, the largest of which has more than 120,000 fibers. Copyright © 2010 Elsevier Inc. All rights reserved.
Clustering disaggregated load profiles using a Dirichlet process mixture model
International Nuclear Information System (INIS)
Granell, Ramon; Axon, Colin J.; Wallom, David C.H.
2015-01-01
Highlights: • We show that the Dirichlet process mixture model is scalable. • Our model does not require the number of clusters as an input. • Our model creates clusters only by the features of the demand profiles. • We have used both residential and commercial data sets. - Abstract: The increasing availability of substantial quantities of power-use data in both the residential and commercial sectors raises the possibility of mining the data to the advantage of both consumers and network operations. We present a Bayesian non-parametric model to cluster load profiles from households and business premises. Evaluators show that our model performs as well as other popular clustering methods, but unlike most other methods it does not require the number of clusters to be predetermined by the user. We used the so-called ‘Chinese restaurant process’ method to solve the model, making use of the Dirichlet-multinomial distribution. The number of clusters grew logarithmically with the quantity of data, making the technique suitable for scaling to large data sets. We were able to show that the model could distinguish features such as the nationality, household size, and type of dwelling between the cluster memberships.
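The logarithmic growth in the number of clusters mentioned above is a general property of the Chinese restaurant process prior. The sketch below simulates only that prior, without any load-profile likelihood; the concentration value, seed and function name are assumptions made for illustration and are not taken from the paper.

```python
import random

def chinese_restaurant_process(n_customers, alpha, rng):
    """Return the table (cluster) sizes after seating n_customers.

    Customer i + 1 joins an existing table with probability proportional to
    its size, or opens a new table with probability proportional to alpha.
    """
    tables = []
    for i in range(n_customers):
        r = rng.random() * (i + alpha)   # total weight: i seated customers plus alpha
        cumulative = 0.0
        for k, size in enumerate(tables):
            cumulative += size
            if r < cumulative:
                tables[k] += 1
                break
        else:
            tables.append(1)             # no existing table chosen: open a new one
    return tables

rng = random.Random(1)                   # hypothetical seed for reproducibility
for n in (100, 1000, 10000):
    sizes = chinese_restaurant_process(n, alpha=1.0, rng=rng)
    print(n, "customers ->", len(sizes), "clusters")
```

Under this prior the expected number of clusters is roughly alpha * ln(1 + n / alpha), which is why the cluster count rises slowly as the data set grows.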
Bayesian nonparametric meta-analysis using Polya tree mixture models.
Branscum, Adam J; Hanson, Timothy E
2008-09-01
Summary. A common goal in meta-analysis is estimation of a single effect measure using data from several studies that are each designed to address the same scientific inquiry. Because studies are typically conducted in geographically disperse locations, recent developments in the statistical analysis of meta-analytic data involve the use of random effects models that account for study-to-study variability attributable to differences in environments, demographics, genetics, and other sources that lead to heterogeneity in populations. Stemming from asymptotic theory, study-specific summary statistics are modeled according to normal distributions with means representing latent true effect measures. A parametric approach subsequently models these latent measures using a normal distribution, which is strictly a convenient modeling assumption absent of theoretical justification. To eliminate the influence of overly restrictive parametric models on inferences, we consider a broader class of random effects distributions. We develop a novel hierarchical Bayesian nonparametric Polya tree mixture (PTM) model. We present methodology for testing the PTM versus a normal random effects model. These methods provide researchers a straightforward approach for conducting a sensitivity analysis of the normality assumption for random effects. An application involving meta-analysis of epidemiologic studies designed to characterize the association between alcohol consumption and breast cancer is presented, which together with results from simulated data highlight the performance of PTMs in the presence of nonnormality of effect measures in the source population.
Semiparametric Mixtures of Regressions with Single-index for Model Based Clustering
Xiang, Sijia; Yao, Weixin
2017-01-01
In this article, we propose two classes of semiparametric mixture regression models with single-index for model based clustering. Unlike many semiparametric/nonparametric mixture regression models that can only be applied to low dimensional predictors, the new semiparametric models can easily incorporate high dimensional predictors into the nonparametric components. The proposed models are very general, and many of the recently proposed semiparametric/nonparametric mixture regression models a...
Parameter estimation of the zero inflated negative binomial beta exponential distribution
Sirichantra, Chutima; Bodhisuwan, Winai
2017-11-01
The zero-inflated negative binomial-beta exponential (ZINB-BE) distribution is developed as an alternative distribution for count data with excess zeros and overdispersion. The ZINB-BE distribution is a mixture of two distributions, the Bernoulli and the negative binomial-beta exponential distributions. In this work, some characteristics of the proposed distribution, such as its mean and variance, are presented. Maximum likelihood estimation is applied to estimate the parameters of the proposed distribution. Finally, results of a Monte Carlo simulation study suggest that the estimation is highly efficient when the sample size is large.
Energy Technology Data Exchange (ETDEWEB)
Thienpont, Benedicte; Barata, Carlos [Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA, CSIC), Jordi Girona, 18-26, 08034 Barcelona (Spain); Raldúa, Demetrio, E-mail: drpqam@cid.csic.es [Department of Environmental Chemistry, Institute of Environmental Assessment and Water Research (IDAEA, CSIC), Jordi Girona, 18-26, 08034 Barcelona (Spain); Maladies Rares: Génétique et Métabolisme (MRGM), University of Bordeaux, EA 4576, F-33400 Talence (France)
2013-06-01
Maternal thyroxine (T4) plays an essential role in fetal brain development, and even mild and transitory deficits in free-T4 in pregnant women can produce irreversible neurological effects in their offspring. Women of childbearing age are daily exposed to mixtures of chemicals disrupting the thyroid gland function (TGFDs) through the diet, drinking water, air and pharmaceuticals, which has raised the highest concern for the potential additive or synergic effects on the development of mild hypothyroxinemia during early pregnancy. Recently we demonstrated that zebrafish eleutheroembryos provide a suitable alternative model for screening chemicals impairing the thyroid hormone synthesis. The present study used the intrafollicular T4-content (IT4C) of zebrafish eleutheroembryos as integrative endpoint for testing the hypotheses that the effect of mixtures of TGFDs with a similar mode of action [inhibition of thyroid peroxidase (TPO)] was well predicted by a concentration addition concept (CA) model, whereas the response addition concept (RA) model predicted better the effect of dissimilarly acting binary mixtures of TGFDs [TPO-inhibitors and sodium-iodide symporter (NIS)-inhibitors]. However, the CA model provided better prediction of joint effects than RA in five out of the six tested mixtures. The exception was the mixture MMI (TPO-inhibitor)-KClO4 (NIS-inhibitor) dosed at a fixed ratio of EC10, which provided similar CA and RA predictions, so that no conclusive result could be obtained. These results support the phenomenological similarity criterion stating that the concept of concentration addition could be extended to mixture constituents having common apical endpoints or common adverse outcomes. - Highlights: • Potential synergic or additive effect of mixtures of chemicals on thyroid function. • Zebrafish as alternative model for testing the effect of mixtures of goitrogens. • Concentration addition seems to predict better the effect of
International Nuclear Information System (INIS)
Thienpont, Benedicte; Barata, Carlos; Raldúa, Demetrio
2013-01-01
Maternal thyroxine (T4) plays an essential role in fetal brain development, and even mild and transitory deficits in free-T4 in pregnant women can produce irreversible neurological effects in their offspring. Women of childbearing age are daily exposed to mixtures of chemicals disrupting the thyroid gland function (TGFDs) through the diet, drinking water, air and pharmaceuticals, which has raised the highest concern for the potential additive or synergic effects on the development of mild hypothyroxinemia during early pregnancy. Recently we demonstrated that zebrafish eleutheroembryos provide a suitable alternative model for screening chemicals impairing the thyroid hormone synthesis. The present study used the intrafollicular T4-content (IT4C) of zebrafish eleutheroembryos as integrative endpoint for testing the hypotheses that the effect of mixtures of TGFDs with a similar mode of action [inhibition of thyroid peroxidase (TPO)] was well predicted by a concentration addition concept (CA) model, whereas the response addition concept (RA) model predicted better the effect of dissimilarly acting binary mixtures of TGFDs [TPO-inhibitors and sodium-iodide symporter (NIS)-inhibitors]. However, the CA model provided better prediction of joint effects than RA in five out of the six tested mixtures. The exception was the mixture MMI (TPO-inhibitor)-KClO4 (NIS-inhibitor) dosed at a fixed ratio of EC10, which provided similar CA and RA predictions, so that no conclusive result could be obtained. These results support the phenomenological similarity criterion stating that the concept of concentration addition could be extended to mixture constituents having common apical endpoints or common adverse outcomes. - Highlights: • Potential synergic or additive effect of mixtures of chemicals on thyroid function. • Zebrafish as alternative model for testing the effect of mixtures of goitrogens. • Concentration addition seems to predict better the effect of mixtures of
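The two reference models compared in this abstract have widely used textbook forms, sketched below in Python. This is only an illustration under assumptions: the function names and the example numbers are hypothetical, and the study's own dose-response fitting is not reproduced.

```python
def concentration_addition_ecx(fractions, ecx_values):
    """Concentration addition (CA): mixture ECx from the components' ECx values.

    fractions  -- share of each component in the mixture (should sum to 1)
    ecx_values -- concentration of each component alone that causes effect x
    """
    return 1.0 / sum(p / ec for p, ec in zip(fractions, ecx_values))

def response_addition_effect(effects):
    """Response addition (RA, independent action): combined fractional effect.

    effects -- fractional effect (0..1) of each component at its concentration
               in the mixture, assuming the components act independently.
    """
    unaffected = 1.0
    for effect in effects:
        unaffected *= 1.0 - effect
    return 1.0 - unaffected

# Hypothetical binary mixture: equal shares, EC10 values of 2.0 and 8.0 (arbitrary units).
print(concentration_addition_ecx([0.5, 0.5], [2.0, 8.0]))
# Hypothetical case: each component alone produces a 10% effect.
print(response_addition_effect([0.10, 0.10]))
```

When each component is dosed at a low-effect level such as EC10, the two predictions can lie close together, which is consistent with the inconclusive MMI-KClO4 case reported above.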
Modeling dynamic functional connectivity using a wishart mixture model
DEFF Research Database (Denmark)
Nielsen, Søren Føns Vind; Madsen, Kristoffer Hougaard; Schmidt, Mikkel Nørgaard
2017-01-01
framework provides model selection by quantifying models generalization to new data. We use this to quantify the number of states within a prespecified window length. We further propose a heuristic procedure for choosing the window length based on contrasting for each window length the predictive...... together whereas short windows are more unstable and influenced by noise and we find that our heuristic correctly identifies an adequate level of complexity. On single subject resting state fMRI data we find that dynamic models generally outperform static models and using the proposed heuristic points...
Multiple Response Regression for Gaussian Mixture Models with Known Labels.
Lee, Wonyul; Du, Ying; Sun, Wei; Hayes, D Neil; Liu, Yufeng
2012-12-01
Multiple response regression is a useful regression technique to model multiple response variables using the same set of predictor variables. Most existing methods for multiple response regression are designed for modeling homogeneous data. In many applications, however, one may have heterogeneous data where the samples are divided into multiple groups. Our motivating example is a cancer dataset where the samples belong to multiple cancer subtypes. In this paper, we consider modeling the data coming from a mixture of several Gaussian distributions with known group labels. A naive approach is to split the data into several groups according to the labels and model each group separately. Although it is simple, this approach ignores potential common structures across different groups. We propose new penalized methods to model all groups jointly in which the common and unique structures can be identified. The proposed methods estimate the regression coefficient matrix, as well as the conditional inverse covariance matrix of response variables. Asymptotic properties of the proposed methods are explored. Through numerical examples, we demonstrate that both estimation and prediction can be improved by modeling all groups jointly using the proposed methods. An application to a glioblastoma cancer dataset reveals some interesting common and unique gene relationships across different cancer subtypes.
Distribution-free Inference of Zero-inflated Binomial Data for Longitudinal Studies.
He, H; Wang, W J; Hu, J; Gallop, R; Crits-Christoph, P; Xia, Y L
2015-10-01
Count responses with structural zeros are very common in medical and psychosocial research, especially in alcohol and HIV research, and the zero-inflated Poisson (ZIP) and zero-inflated negative binomial (ZINB) models are widely used for modeling such outcomes. However, as alcohol drinking outcomes such as days of drinking are counts within a given period, their distributions are bounded above by an upper limit (total days in the period) and thus inherently follow a binomial or zero-inflated binomial (ZIB) distribution, rather than a Poisson or zero-inflated Poisson (ZIP) distribution, in the presence of structural zeros. In this paper, we develop a new semiparametric approach for modeling zero-inflated binomial (ZIB)-like count responses for cross-sectional as well as longitudinal data. We illustrate this approach with both simulated and real study data.
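For reference, the parametric zero-inflated binomial distribution that the abstract contrasts with ZIP can be written down in a few lines. The sketch below shows only that reference distribution, not the paper's semiparametric (distribution-free) estimator; the function name and the 30-day example values are hypothetical.

```python
import math

def zib_pmf(y, n, p, pi_zero):
    """Probability of y events out of n under a zero-inflated binomial.

    With probability pi_zero the outcome is a structural zero; otherwise
    it is drawn from Binomial(n, p), which can also produce zeros.
    """
    binomial = math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)
    structural = pi_zero if y == 0 else 0.0
    return structural + (1.0 - pi_zero) * binomial

# Hypothetical example: drinking days in a 30-day period, 40% structural abstainers.
n_days, p_day, pi_abstain = 30, 0.2, 0.4
probabilities = [zib_pmf(y, n_days, p_day, pi_abstain) for y in range(n_days + 1)]
print("P(0 days):", probabilities[0])
print("Total probability:", sum(probabilities))
```

Unlike a Poisson-based model, this distribution puts no mass above n, which is the bounded-count property the abstract emphasizes.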
A smooth mixture of Tobits model for healthcare expenditure.
Keane, Michael; Stavrunova, Olena
2011-09-01
This paper develops a smooth mixture of Tobits (SMTobit) model for healthcare expenditure. The model is a generalization of the smoothly mixing regressions framework of Geweke and Keane (J Econometrics 2007; 138: 257-290) to the case of a Tobit-type limited dependent variable. A Markov chain Monte Carlo algorithm with data augmentation is developed to obtain the posterior distribution of model parameters. The model is applied to the US Medicare Current Beneficiary Survey data on total medical expenditure. The results suggest that the model can capture the overall shape of the expenditure distribution very well, and also provide a good fit to a number of characteristics of the conditional (on covariates) distribution of expenditure, such as the conditional mean, variance and probability of extreme outcomes, as well as the 50th, 90th, and 95th percentiles. We find that healthier individuals face an expenditure distribution with lower mean, variance and probability of extreme outcomes, compared with their counterparts in a worse state of health. Males have an expenditure distribution with higher mean, variance and probability of an extreme outcome, compared with their female counterparts. The results also suggest that heart and cardiovascular diseases affect the expenditure of males more than that of females. Copyright © 2011 John Wiley & Sons, Ltd.
Modeling of columnar and equiaxed solidification of binary mixtures
International Nuclear Information System (INIS)
Roux, P.
2005-12-01
This work deals with the modelling of dendritic solidification in binary mixtures. Large scale phenomena are represented by volume averaging of the local conservation equations. This method allows one to rigorously derive the partial differential equations of averaged fields and the closure problems associated with the deviations. Such problems can be solved numerically on periodic cells, representative of dendritic structures, in order to give a precise evaluation of macroscopic transfer coefficients (drag coefficients, exchange coefficients, diffusion-dispersion tensors...). The method had already been applied to a model of the columnar dendritic mushy zone, and it is extended to the case of equiaxed dendritic solidification, where solid grains can move. The two-phase flow is modelled with an Eulerian-Eulerian approach, and the novelty is to account for the dispersion of solid velocity through the kinetic agitation of the particles. A coupling of the two models is proposed thanks to an original adaptation of the columnar model, allowing for undercooling calculation: a solid-liquid interfacial area density is introduced and calculated. Finally, direct numerical simulations of crystal growth are proposed with a diffuse interface method for a representation of local phenomena. (author)
Binomial distribution for the charge asymmetry parameter
International Nuclear Information System (INIS)
Chou, T.T.; Yang, C.N.
1984-01-01
It is suggested that for high energy collisions the distribution with respect to the charge asymmetry z = n_F - n_B is binomial, where n_F and n_B are the forward and backward charge multiplicities. (orig.)
Adaptive estimation of binomial probabilities under misclassification
Albers, Willem/Wim; Veldman, H.J.
1984-01-01
If misclassification occurs the standard binomial estimator is usually seriously biased. It is known that an improvement can be achieved by using more than one observer in classifying the sample elements. Here it will be investigated which number of observers is optimal given the total number of
Binomial vs poisson statistics in radiation studies
International Nuclear Information System (INIS)
Foster, J.; Kouris, K.; Spyrou, N.M.; Matthews, I.P.; Welsh National School of Medicine, Cardiff
1983-01-01
The processes of radioactive decay, decay and growth of radioactive species in a radioactive chain, prompt emission(s) from nuclear reactions, conventional activation and cyclic activation are discussed with respect to their underlying statistical density function. By considering the transformation(s) that each nucleus may undergo it is shown that all these processes are fundamentally binomial. Formally, when the number of experiments N is large and the probability of success p is close to zero, the binomial is closely approximated by the Poisson density function. In radiation and nuclear physics, N is always large: each experiment can be conceived of as the observation of the fate of each of the N nuclei initially present. Whether p, the probability that a given nucleus undergoes a prescribed transformation, is close to zero depends on the process and nuclide(s) concerned. Hence, although a binomial description is always valid, the Poisson approximation is not always adequate. Therefore further clarification is provided as to when the binomial distribution must be used in the statistical treatment of detected events. (orig.)
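The condition stated above (N large and p close to zero) can be made concrete by measuring how far the Poisson approximation is from the exact binomial. The following sketch computes the total variation distance between Binomial(N, p) and Poisson(Np); the function names and the example values of N and p are assumptions chosen for illustration, not values from the paper.

```python
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability of k successes, computed in log space (0 < p < 1)."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def poisson_pmf(k, lam):
    """Poisson(lam) probability of k events, computed in log space (lam > 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def total_variation(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    difference, poisson_mass = 0.0, 0.0
    for k in range(n + 1):
        q = poisson_pmf(k, lam)
        difference += abs(binom_pmf(k, n, p) - q)
        poisson_mass += q
    tail = max(0.0, 1.0 - poisson_mass)   # Poisson mass above n, where the binomial is zero
    return 0.5 * (difference + tail)

# Same (hypothetical) number of nuclei, two different transformation probabilities.
print("p = 0.001:", total_variation(1000, 0.001))
print("p = 0.5:  ", total_variation(1000, 0.5))
```

With N fixed, the distance stays small only while p is small, which mirrors the abstract's point that a large N alone does not justify the Poisson approximation.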
Abstract knowledge versus direct experience in processing of binomial expressions.
Morgan, Emily; Levy, Roger
2016-12-01
We ask whether word order preferences for binomial expressions of the form A and B (e.g. bread and butter) are driven by abstract linguistic knowledge of ordering constraints referencing the semantic, phonological, and lexical properties of the constituent words, or by prior direct experience with the specific items in question. Using forced-choice and self-paced reading tasks, we demonstrate that online processing of never-before-seen binomials is influenced by abstract knowledge of ordering constraints, which we estimate with a probabilistic model. In contrast, online processing of highly frequent binomials is primarily driven by direct experience, which we estimate from corpus frequency counts. We propose a trade-off wherein processing of novel expressions relies upon abstract knowledge, while reliance upon direct experience increases with increased exposure to an expression. Our findings support theories of language processing in which both compositional generation and direct, holistic reuse of multi-word expressions play crucial roles. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Oprisiu Ioana
2013-01-01
The Online Chemical Modeling Environment (OCHEM, http://ochem.eu) is a web-based platform that provides tools for automation of typical steps necessary to create a predictive QSAR/QSPR model. The platform consists of two major subsystems: a database of experimental measurements and a modeling framework. So far, OCHEM has been limited to the processing of individual compounds. In this work, we extended OCHEM with a new ability to store and model properties of binary non-additive mixtures. The developed system is publicly accessible, meaning that any user on the Web can store new data for binary mixtures and develop models to predict their non-additive properties. The database already contains almost 10,000 data points for the density, bubble point, and azeotropic behavior of binary mixtures. For these data, we developed models for both qualitative (azeotrope/zeotrope) and quantitative (density and bubble point) endpoints using different learning methods and specially developed descriptors for mixtures. The prediction performance of the models was similar to or more accurate than results reported in previous studies. Thus, we have developed and made publicly available a powerful system for modeling mixtures of chemical compounds on the Web.
Flexible Mixture-Amount Models for Business and Industry Using Gaussian Processes
A. Ruseckaite (Aiste); D. Fok (Dennis); P.P. Goos (Peter)
2016-01-01
Many products and services can be described as mixtures of ingredients whose proportions sum to one. Specialized models have been developed for linking the mixture proportions to outcome variables, such as preference, quality and liking. In many scenarios, only the mixture
Induced polarization of clay-sand mixtures: experiments and modeling
International Nuclear Information System (INIS)
Okay, G.; Leroy, P.; Tournassat, C.; Ghorbani, A.; Jougnot, D.; Cosenza, P.; Camerlynck, C.; Cabrera, J.; Florsch, N.; Revil, A.
2012-01-01
were performed with a cylindrical four-electrode sample holder (a PVC cylinder 30 cm in length and 19 cm in diameter) associated with a SIP-Fuchs II impedance meter and non-polarizing Cu/CuSO4 electrodes. These electrodes were installed at 10 cm from the base of the sample holder and regularly spaced (every 90 degrees). The results illustrate the strong impact of the Cationic Exchange Capacity (CEC) of the clay minerals upon the complex conductivity. The amplitude of the in-phase conductivity of the kaolinite-clay samples is strongly dependent on the saturating fluid salinity for all volumetric clay fractions, whereas the in-phase conductivity of the smectite-clay samples is quite independent of the salinity, except at low clay content (5% and 1% of clay in volume). This is due to the strong and constant surface conductivity of smectite associated with its very high CEC. The quadrature conductivity increases steadily with the CEC and the clay content. We observe that the frequency dependence of the quadrature conductivity is stronger for sand-kaolinite mixtures than for sand-bentonite mixtures. For both types of clay, the quadrature conductivity seems to be fairly independent of the pore fluid salinity except at very low clay contents (1% in volume of kaolinite-clay). This is due to the constant surface site density of Na counter-ions in the Stern layer of clay materials. At the lowest clay content (1%), the magnitude of the quadrature conductivity increases with the salinity, as expected for silica sands. In this case, the surface site density of Na counter-ions in the Stern layer increases with salinity. The experimental data show good agreement with predicted values given by our Spectral Induced Polarization (SIP) model. This complex conductivity model considers the electrochemical polarization of the Stern layer coating the clay particles and the Maxwell-Wagner polarization.
We use the differential effective medium theory to calculate the complex
Modelling the effect of mixture components on permeation through skin.
Ghafourian, T; Samaras, E G; Brooks, J D; Riviere, J E
2010-10-15
A vehicle influences the concentration of penetrant within the membrane, affecting its diffusivity in the skin and rate of transport. Despite the huge amount of effort made for the understanding and modelling of the skin absorption of chemicals, a reliable estimation of the skin penetration potential from formulations remains a challenging objective. In this investigation, quantitative structure-activity relationship (QSAR) was employed to relate the skin permeation of compounds to the chemical properties of the mixture ingredients and the molecular structures of the penetrants. The skin permeability dataset consisted of permeability coefficients of 12 different penetrants each blended in 24 different solvent mixtures measured from finite-dose diffusion cell studies using porcine skin. Stepwise regression analysis resulted in a QSAR employing two penetrant descriptors and one solvent property. The penetrant descriptors were octanol/water partition coefficient, logP and the ninth order path molecular connectivity index, and the solvent property was the difference between boiling and melting points. The negative relationship between skin permeability coefficient and logP was attributed to the fact that most of the drugs in this particular dataset are extremely lipophilic in comparison with the compounds in the common skin permeability datasets used in QSAR. The findings show that compounds formulated in vehicles with small boiling and melting point gaps will be expected to have higher permeation through skin. The QSAR was validated internally, using a leave-many-out procedure, giving a mean absolute error of 0.396. The chemical space of the dataset was compared with that of the known skin permeability datasets and gaps were identified for future skin permeability measurements. Copyright 2010 Elsevier B.V. All rights reserved.
Toxicological risk assessment of complex mixtures through the Wtox model
Directory of Open Access Journals (Sweden)
William Gerson Matias
2015-01-01
Mathematical models are important tools for environmental management and risk assessment. Predictions about the toxicity of chemical mixtures must be enhanced due to the complexity of effects that can be caused to the living species. In this work, the environmental risk was assessed addressing the need to study the relationship between the organism and xenobiotics. Therefore, five toxicological endpoints were applied through the WTox Model, and with this methodology we obtained the risk classification of potentially toxic substances. Acute and chronic toxicity, cytotoxicity and genotoxicity were observed in the organisms Daphnia magna, Vibrio fischeri and Oreochromis niloticus. A case study was conducted with solid wastes from textile, metal-mechanic and pulp and paper industries. The results have shown that several industrial wastes induced mortality, reproductive effects, micronucleus formation and increases in the rate of lipid peroxidation and DNA methylation of the organisms tested. These results, analyzed together through the WTox Model, allowed the classification of the environmental risk of industrial wastes. The evaluation showed that the toxicological environmental risk of the samples analyzed can be classified as significant or critical.
Polymer mixtures in confined geometries: Model systems to explore ...
Indian Academy of Sciences (India)
to mean field behavior for very long chains, the critical behavior of mixtures confined into thin film geometry falls in the 2d Ising class irrespective of chain length. ..... AB interface does not approach the wall; (b) corresponds to a temperature .... Very recently, these theoretical studies have been extended to polymer mixtures.
Expansion around half-integer values, binomial sums, and inverse binomial sums
International Nuclear Information System (INIS)
Weinzierl, Stefan
2004-01-01
I consider the expansion of transcendental functions in a small parameter around rational numbers. This includes in particular the expansion around half-integer values. I present algorithms which are suitable for an implementation within a symbolic computer algebra system. The method is an extension of the technique of nested sums. The algorithms allow in addition the evaluation of binomial sums, inverse binomial sums and generalizations thereof
Perbandingan Metode Binomial dan Metode Black-Scholes Dalam Penentuan Harga Opsi
Directory of Open Access Journals (Sweden)
Surya Amami Pramuditya
2016-04-01
Full Text Available An option is a contract between the holder (buyer) and the writer (seller) in which the writer gives the holder the right (not the obligation) to buy or sell an asset of the writer at a specified price (the strike or exercise price) and at a specified time in the future (the expiry date or maturity time). There are several ways to determine the price of an option, including the Black-Scholes method and the binomial method. The binomial method comes from a model of stock price movement that divides the time interval [0, T] into n subintervals of equal length. In the Black-Scholes method, the stock price movement is modeled as a stochastic process. The larger the number of time partitions n in the binomial method, the closer the option value converges to the option value of the Black-Scholes method. Key words: options, binomial, Black-Scholes.
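The convergence described in this abstract can be illustrated with a minimal sketch. It assumes a European call and the Cox-Ross-Rubinstein parameterisation of the binomial tree; the abstract does not say which tree or which option type the authors used, so the function names and parameters below are illustrative only.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes_call(s0, k, r, sigma, t):
    """Black-Scholes price of a European call (no dividends)."""
    d1 = (math.log(s0 / k) + (r + 0.5 * sigma ** 2) * t) / (sigma * math.sqrt(t))
    d2 = d1 - sigma * math.sqrt(t)
    return s0 * norm_cdf(d1) - k * math.exp(-r * t) * norm_cdf(d2)


def binomial_call(s0, k, r, sigma, t, n):
    """European call on an n-step tree (Cox-Ross-Rubinstein up/down factors, an assumption)."""
    dt = t / n                                # [0, T] split into n equal steps
    u = math.exp(sigma * math.sqrt(dt))       # up factor
    d = 1.0 / u                               # down factor
    p = (math.exp(r * dt) - d) / (u - d)      # risk-neutral up probability
    disc = math.exp(-r * dt)
    # Payoffs at maturity; j counts the number of up moves.
    values = [max(s0 * u ** j * d ** (n - j) - k, 0.0) for j in range(n + 1)]
    # Backward induction: discount the expected value one step at a time.
    for step in range(n, 0, -1):
        values = [disc * (p * values[j + 1] + (1.0 - p) * values[j])
                  for j in range(step)]
    return values[0]
```

Calling `binomial_call(100, 100, 0.05, 0.2, 1.0, n)` for increasing `n` and comparing with `black_scholes_call(100, 100, 0.05, 0.2, 1.0)` shows the gap between the two prices shrinking, although not monotonically for every `n`.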
A study of finite mixture model: Bayesian approach on financial time series data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-07-01
Recently, statisticians have emphasized fitting finite mixture models using Bayesian methods. A finite mixture model represents a statistical distribution as a mixture of component distributions, while the Bayesian method is a statistical method used to fit the mixture model. The Bayesian method is widely used because it has asymptotic properties that provide remarkable results. In addition, the Bayesian method shows a consistency characteristic, meaning that the parameter estimates are close to the predictive distributions. In the present paper, the number of components for the mixture model is studied using the Bayesian Information Criterion. Identifying the number of components is important because a wrong choice may lead to an invalid result. The Bayesian method is then used to fit the k-component mixture model in order to explore the relationship between rubber price and stock market price for Malaysia, Thailand, the Philippines and Indonesia. Lastly, the results show a negative effect between rubber price and stock market price for all selected countries.
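Choosing the number of components with the Bayesian Information Criterion can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional Gaussian mixture fitted by plain EM (not the Bayesian fitting procedure of the paper), with hypothetical function names and the convention that a lower BIC is better.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-0.5 * (x - mu) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iters=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return the final log-likelihood."""
    n = len(data)
    srt = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared overall variance.
    mus = [srt[int((i + 0.5) * n / k)] for i in range(k)]
    mean = sum(data) / n
    variances = [sum((x - mean) ** 2 for x in data) / n] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update weights, means and variances (with a small variance floor).
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, data)) / nj, 1e-6)
    return sum(
        math.log(sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, mus, variances)))
        for x in data)


def bic(loglik, k, n):
    """BIC = -2 log L + (free parameters) * log n, with 3k - 1 free parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def select_components(data, max_k=3):
    """Return the k with the lowest BIC together with all scores."""
    scores = {k: bic(fit_gmm_1d(data, k), k, len(data)) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores
```

On clearly bimodal data, the penalty term `(3k - 1) * log n` is small compared with the likelihood gain from moving from one component to two, so the two-component model scores better than the single Gaussian.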
Nonparametric Identification and Estimation of Finite Mixture Models of Dynamic Discrete Choices
Hiroyuki Kasahara; Katsumi Shimotsu
2006-01-01
In dynamic discrete choice analysis, controlling for unobserved heterogeneity is an important issue, and finite mixture models provide flexible ways to account for unobserved heterogeneity. This paper studies nonparametric identifiability of type probabilities and type-specific component distributions in finite mixture models of dynamic discrete choices. We derive sufficient conditions for nonparametric identification for various finite mixture models of dynamic discrete choices used in appli...
Wright, Aidan G C; Hallquist, Michael N
2014-01-01
Studying personality and its pathology as it changes, develops, or remains stable over time offers exciting insight into the nature of individual differences. Researchers interested in examining personal characteristics over time have a number of time-honored analytic approaches at their disposal. In recent years there have also been considerable advances in person-oriented analytic approaches, particularly longitudinal mixture models. In this methodological primer we focus on mixture modeling approaches to the study of normative and individual change in the form of growth mixture models and ipsative change in the form of latent transition analysis. We describe the conceptual underpinnings of each of these models, outline approaches for their implementation, and provide accessible examples for researchers studying personality and its assessment.
Tomography of binomial states of the radiation field
Bazrafkan, MR; Man'ko
2004-01-01
The symplectic, optical, and photon-number tomographic symbols of binomial states of the radiation field are studied. Explicit relations for all tomograms of the binomial states are obtained. Two measures for nonclassical properties of these states are discussed.
Aly, Sharif S; Zhao, Jianyang; Li, Ben; Jiang, Jiming
2014-01-01
The Intraclass Correlation Coefficient (ICC) is commonly used to estimate the similarity between quantitative measures obtained from different sources. Overdispersed data is traditionally transformed so that linear mixed model (LMM) based ICC can be estimated. A common transformation used is the natural logarithm. The reliability of environmental sampling of fecal slurry on freestall pens has been estimated for Mycobacterium avium subsp. paratuberculosis using the natural logarithm transformed culture results. Recently, the negative binomial ICC was defined based on a generalized linear mixed model for negative binomial distributed data. The current study reports on the negative binomial ICC estimate which includes fixed effects using culture results of environmental samples. Simulations using a wide variety of inputs and negative binomial distribution parameters (r; p) showed better performance of the new negative binomial ICC compared to the ICC based on LMM, even when the negative binomial data were logarithm- and square-root-transformed. A second comparison that targeted a wider range of ICC values showed that the mean of estimated ICC closely approximated the true ICC.
Modelling of spark to ignition transition in gas mixtures
Energy Technology Data Exchange (ETDEWEB)
Akram, M.
1996-10-01
This thesis pertains to the models for studying sparking in chemically inert gases. The processes taking place in a spark to flame transition can be segregated into physical and chemical processes, and this study is focused on physical processes. The plasma is regarded as a single-substance material. One and two-dimensional models are developed. The transfer of electrical energy into thermal energy of the gas and its redistribution in space and time along with the evolution of a plasma kernel is studied in the time domain ranging from 10 ns to 40 μs. In the case of ultra-fast sparks, the propagation of the shock and its reflection from a rigid wall is presented. The influence of electrode shape and the gap size on the flow structure development is found to be a dominating factor. It is observed that the flow structure that has developed in the early stage more or less prevails at later stages and strongly influences the shape and evolution of the hot kernel. The electrode geometry and configuration are responsible for the development of the flow structure. The strength of the vortices generated in the flow field is influenced by the power input to the gap and their location of emergence is dictated by the electrode shape and configuration. The heat transfer after 2 μs in the case of ultra-fast sparks is dominated by convection and diffusion. The strong mixing produced by hydrodynamic effects and the electrode geometry give the indication that the magnetic pinch effect might be negligible. Finally, a model for a multicomponent gas mixture is presented. The chemical kinetics mechanism for dissociation and ionization is introduced. 56 refs
Nonlinear Structured Growth Mixture Models in Mplus and OpenMx
Grimm, Kevin J.; Ram, Nilam; Estabrook, Ryne
2010-01-01
Growth mixture models (GMMs; B. O. Muthen & Muthen, 2000; B. O. Muthen & Shedden, 1999) are a combination of latent curve models (LCMs) and finite mixture models to examine the existence of latent classes that follow distinct developmental patterns. GMMs are often fit with linear, latent basis, multiphase, or polynomial change models…
Energy Technology Data Exchange (ETDEWEB)
Kim, Beong Gwon; Roh, Myung Sub [KEPCO International Nuclear Graduate School, Ulsan (Korea, Republic of)
2014-10-15
The real options approach (ROA) is suitable for the evaluation of large-scale investment projects with great uncertainties. Takizawa and Omori (2001) introduced a real option approach to calculate electricity price for economic feasibility. Rothwell (2006) modeled the net present value (NPV) of building an ABWR in Texas using ROA to determine the risk premium associated with net revenue uncertainty. W.C Yoon (2006) evaluated nuclear power plant construction value using DCF and ROA with sensitivity analysis. Value evaluations involving nuclear power are very uncertain. This is because of the long period of construction as well as the cost uncertainties of decommissioning and nuclear waste management. Even more elements should be considered in new nuclear power valuation, including the uncertainty from the technology, operating costs, the potential risk of radiation, electricity mechanism and climate policy. In this respect, a traditional method such as discounted cash flow (DCF) cannot fully capture the impacts of these uncertainties on nuclear power investment. It is therefore necessary to develop a proper method to handle such uncertainties when evaluating the new deployment of nuclear power plants. Meanwhile, overseas construction projects that require capital investment and localization by target countries are increasing. These elements may also influence the uncertainty of a project.
Real time tracking by LOPF algorithm with mixture model
Meng, Bo; Zhu, Ming; Han, Guangliang; Wu, Zhiguo
2007-11-01
A new particle filter, the Local Optimum Particle Filter (LOPF) algorithm, is presented for tracking objects accurately and steadily in visual sequences in real time, which is a challenging task in the computer vision field. In order to use the particles efficiently, we first use the Sobel algorithm to extract the profile of the object. Then, we employ a new Local Optimum algorithm to auto-initialize a certain number of particles, using these edge points as the centres of the particles. The main advantage of doing this instead of selecting particles randomly, as in the conventional particle filter, is that we can pay more attention to the more important optimum candidates and reduce unnecessary calculation on the negligible ones; in addition, we can partly overcome the conventional degeneracy phenomenon and decrease the computational costs. The threshold is also a key factor that strongly affects the results, so we adopt an adaptive threshold selection method to get the optimal Sobel result. The dissimilarities between the target model and the target candidates are expressed by a metric derived from the Bhattacharyya coefficient. Here, we use both the contour cue to select the particles and the color cue to describe the targets as the mixture target model. The effectiveness of our scheme is demonstrated by real visual tracking experiments. Results from simulations and experiments with real video data show the improved performance of the proposed algorithm when compared with that of the standard particle filter. The superior performance is evident when the target encounters occlusion in real video, where the standard particle filter usually fails.
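The dissimilarity measure mentioned in this abstract can be sketched with colour histograms. The abstract does not give the exact metric, so the form below, d = sqrt(1 - BC), which is common in histogram-based tracking, is an assumption, as are the function names.

```python
import math


def normalise(hist):
    """Scale a histogram of non-negative counts so that it sums to 1."""
    total = sum(hist)
    return [h / total for h in hist]


def bhattacharyya_coefficient(p, q):
    """BC = sum_i sqrt(p_i * q_i) for two normalised histograms of equal length."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def bhattacharyya_distance(p, q):
    """Dissimilarity d = sqrt(1 - BC); 0 for identical histograms, 1 for disjoint ones."""
    bc = bhattacharyya_coefficient(p, q)
    return math.sqrt(max(0.0, 1.0 - bc))  # clamp guards against rounding slightly above 1
```

In a particle filter, each candidate region's histogram would be compared with the target histogram in this way, and particles with a smaller distance would receive a larger weight.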
A Note on the Use of Mixture Models for Individual Prediction.
Cole, Veronica T; Bauer, Daniel J
Mixture models capture heterogeneity in data by decomposing the population into latent subgroups, each of which is governed by its own subgroup-specific set of parameters. Despite the flexibility and widespread use of these models, most applications have focused solely on making inferences for whole or sub-populations, rather than individual cases. The current article presents a general framework for computing marginal and conditional predicted values for individuals using mixture model results. These predicted values can be used to characterize covariate effects, examine the fit of the model for specific individuals, or forecast future observations from previous ones. Two empirical examples are provided to demonstrate the usefulness of individual predicted values in applications of mixture models. The first example examines the relative timing of initiation of substance use using a multiple event process survival mixture model whereas the second example evaluates changes in depressive symptoms over adolescence using a growth mixture model.
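The difference between marginal and conditional predicted values can be sketched with a toy two-class linear growth mixture. Everything below is a hypothetical illustration, not the article's framework: the class parameters are made up, and observations are assumed independent given class membership (no random effects).

```python
import math

# Hypothetical class-specific parameters of a two-class linear growth mixture.
CLASSES = [
    {"weight": 0.7, "intercept": 1.0, "slope": 0.0, "sd": 1.0},
    {"weight": 0.3, "intercept": 1.0, "slope": 2.0, "sd": 1.0},
]


def normal_pdf(y, mu, sd):
    return math.exp(-0.5 * ((y - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def class_mean(c, t):
    """Expected outcome at time t for a member of class c."""
    return c["intercept"] + c["slope"] * t


def marginal_prediction(t):
    """Prediction before any data on the individual: weight by class proportions."""
    return sum(c["weight"] * class_mean(c, t) for c in CLASSES)


def posterior_weights(times, ys):
    """Posterior class probabilities given the individual's earlier observations."""
    joint = []
    for c in CLASSES:
        lik = c["weight"]
        for t, y in zip(times, ys):
            lik *= normal_pdf(y, class_mean(c, t), c["sd"])
        joint.append(lik)
    total = sum(joint)
    return [j / total for j in joint]


def conditional_prediction(t_new, times, ys):
    """Prediction at t_new, weighting class means by posterior class probabilities."""
    post = posterior_weights(times, ys)
    return sum(p * class_mean(c, t_new) for p, c in zip(post, CLASSES))
```

For an individual whose earlier observations follow the steeper trajectory, the conditional prediction moves toward that class's mean, while the marginal prediction stays at the population-weighted average.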
Background based Gaussian mixture model lesion segmentation in PET
Energy Technology Data Exchange (ETDEWEB)
Soffientini, Chiara Dolores, E-mail: chiaradolores.soffientini@polimi.it; Baselli, Giuseppe [DEIB, Department of Electronics, Information, and Bioengineering, Politecnico di Milano, Piazza Leonardo da Vinci 32, Milan 20133 (Italy); De Bernardi, Elisabetta [Department of Medicine and Surgery, Tecnomed Foundation, University of Milano—Bicocca, Monza 20900 (Italy); Zito, Felicia; Castellani, Massimo [Nuclear Medicine Department, Fondazione IRCCS Ca’ Granda Ospedale Maggiore Policlinico, via Francesco Sforza 35, Milan 20122 (Italy)
2016-05-15
Purpose: Quantitative ¹⁸F-fluorodeoxyglucose positron emission tomography is limited by the uncertainty in lesion delineation due to poor SNR, low resolution, and partial volume effects, subsequently impacting oncological assessment, treatment planning, and follow-up. The present work develops and validates a segmentation algorithm based on statistical clustering. The introduction of constraints based on background features and contiguity priors is expected to improve robustness vs clinical image characteristics such as lesion dimension, noise, and contrast level. Methods: An eight-class Gaussian mixture model (GMM) clustering algorithm was modified by constraining the mean and variance parameters of four background classes according to the previous analysis of a lesion-free background volume of interest (background modeling). Hence, expectation maximization operated only on the four classes dedicated to lesion detection. To favor the segmentation of connected objects, a further variant was introduced by inserting priors relevant to the classification of neighbors. The algorithm was applied to simulated datasets and acquired phantom data. Feasibility and robustness toward initialization were assessed on a clinical dataset manually contoured by two expert clinicians. Comparisons were performed with respect to a standard eight-class GMM algorithm and to four different state-of-the-art methods in terms of volume error (VE), Dice index, classification error (CE), and Hausdorff distance (HD). Results: The proposed GMM segmentation with background modeling outperformed standard GMM and all the other tested methods. Medians of accuracy indexes were VE <3%, Dice >0.88, CE <0.25, and HD <1.2 in simulations; VE <23%, Dice >0.74, CE <0.43, and HD <1.77 in phantom data. Robustness toward image statistic changes (±15%) was shown by the low index changes: <26% for VE, <17% for Dice, and <15% for CE. Finally, robustness toward the user-dependent volume initialization was
International Nuclear Information System (INIS)
Amitabh, J.; Vaccaro, J.A.; Hill, K.E.
1998-01-01
We study the recently defined number-phase Wigner function S_NP(n,θ) for a single-mode field considered to be in binomial and negative binomial states. These states interpolate between Fock and coherent states and coherent and quasi thermal states, respectively, and thus provide a set of states with properties ranging from uncertain phase and sharp photon number to sharp phase and uncertain photon number. The distribution function S_NP(n,θ) gives a graphical representation of the complementary nature of the number and phase properties of these states. We highlight important differences between Wigner's quasi probability function, which is associated with the position and momentum observables, and S_NP(n,θ), which is associated directly with the photon number and phase observables. We also discuss the number-phase entropic uncertainty relation for the binomial and negative binomial states and we show that negative binomial states give a lower phase entropy than states which minimize the phase variance.
Estimation of adjusted rate differences using additive negative binomial regression.
Donoghoe, Mark W; Marschner, Ian C
2016-08-15
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation-conditional maximisation-either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.
Infinite von Mises-Fisher Mixture Modeling of Whole Brain fMRI Data
DEFF Research Database (Denmark)
Røge, Rasmus; Madsen, Kristoffer Hougaard; Schmidt, Mikkel Nørgaard
2017-01-01
Cluster analysis of functional magnetic resonance imaging (fMRI) data is often performed using gaussian mixture models, but when the time series are standardized such that the data reside on a hypersphere, this modeling assumption is questionable. The consequences of ignoring the underlying spherical manifold are rarely analyzed, in part due to the computational challenges imposed by directional statistics. In this letter, we discuss a Bayesian von Mises-Fisher (vMF) mixture model for data on the unit hypersphere and present an efficient inference procedure based on collapsed Markov chain Monte Carlo sampling. Comparing the vMF and gaussian mixture models on synthetic data, we demonstrate that the vMF model has a slight advantage inferring the true underlying clustering when compared to gaussian-based models on data generated from both a mixture of vMFs and a mixture of gaussians...
Automatic categorization of web pages and user clustering with mixtures of hidden Markov models
Ypma, A.; Heskes, T.M.; Zaiane, O.R.; Srivastav, J.
2003-01-01
We propose mixtures of hidden Markov models for modelling clickstreams of web surfers. Hence, the page categorization is learned from the data without the need for a (possibly cumbersome) manual categorization. We provide an EM algorithm for training a mixture of HMMs and show that additional static
de Jong, Martijn G.; Steenkamp, Jan-Benedict E. M.
2010-01-01
We present a class of finite mixture multilevel multidimensional ordinal IRT models for large scale cross-cultural research. Our model is proposed for confirmatory research settings. Our prior for item parameters is a mixture distribution to accommodate situations where different groups of countries have different measurement operations, while…
Genetic Analysis of Somatic Cell Score in Danish Holsteins Using a Liability-Normal Mixture Model
DEFF Research Database (Denmark)
Madsen, P; Shariati, M M; Ødegård, J
2008-01-01
Mixture models are appealing for identifying hidden structures affecting somatic cell score (SCS) data, such as unrecorded cases of subclinical mastitis. Thus, liability-normal mixture (LNM) models were used for genetic analysis of SCS data, with the aim of predicting breeding values for such cas...
Combinatorial bounds on the α-divergence of univariate mixture models
Nielsen, Frank
2017-06-20
We derive lower- and upper-bounds of α-divergence between univariate mixture models with components in the exponential family. Three pairs of bounds are presented in order with increasing quality and increasing computational cost. They are verified empirically through simulated Gaussian mixture models. The presented methodology generalizes to other divergence families relying on Hellinger-type integrals.
ODE constrained mixture modelling: a method for unraveling subpopulation structures and dynamics.
Directory of Open Access Journals (Sweden)
Jan Hasenauer
2014-07-01
Full Text Available Functional cell-to-cell variability is ubiquitous in multicellular organisms as well as bacterial populations. Even genetically identical cells of the same cell type can respond differently to identical stimuli. Methods have been developed to analyse heterogeneous populations, e.g., mixture models and stochastic population models. The available methods are, however, either incapable of simultaneously analysing different experimental conditions or are computationally demanding and difficult to apply. Furthermore, they do not account for biological information available in the literature. To overcome disadvantages of existing methods, we combine mixture models and ordinary differential equation (ODE) models. The ODE models provide a mechanistic description of the underlying processes while mixture models provide an easy way to capture variability. In a simulation study, we show that the class of ODE constrained mixture models can unravel the subpopulation structure and determine the sources of cell-to-cell variability. In addition, the method provides reliable estimates for kinetic rates and subpopulation characteristics. We use ODE constrained mixture modelling to study NGF-induced Erk1/2 phosphorylation in primary sensory neurones, a process relevant in inflammatory and neuropathic pain. We propose a mechanistic pathway model for this process and reconstructed static and dynamical subpopulation characteristics across experimental conditions. We validate the model predictions experimentally, which verifies the capabilities of ODE constrained mixture models. These results illustrate that ODE constrained mixture models can reveal novel mechanistic insights and possess a high sensitivity.
ODE constrained mixture modelling: a method for unraveling subpopulation structures and dynamics.
Hasenauer, Jan; Hasenauer, Christine; Hucho, Tim; Theis, Fabian J
2014-07-01
Functional cell-to-cell variability is ubiquitous in multicellular organisms as well as bacterial populations. Even genetically identical cells of the same cell type can respond differently to identical stimuli. Methods have been developed to analyse heterogeneous populations, e.g., mixture models and stochastic population models. The available methods are, however, either incapable of simultaneously analysing different experimental conditions or are computationally demanding and difficult to apply. Furthermore, they do not account for biological information available in the literature. To overcome disadvantages of existing methods, we combine mixture models and ordinary differential equation (ODE) models. The ODE models provide a mechanistic description of the underlying processes while mixture models provide an easy way to capture variability. In a simulation study, we show that the class of ODE constrained mixture models can unravel the subpopulation structure and determine the sources of cell-to-cell variability. In addition, the method provides reliable estimates for kinetic rates and subpopulation characteristics. We use ODE constrained mixture modelling to study NGF-induced Erk1/2 phosphorylation in primary sensory neurones, a process relevant in inflammatory and neuropathic pain. We propose a mechanistic pathway model for this process and reconstructed static and dynamical subpopulation characteristics across experimental conditions. We validate the model predictions experimentally, which verifies the capabilities of ODE constrained mixture models. These results illustrate that ODE constrained mixture models can reveal novel mechanistic insights and possess a high sensitivity.
Modelling of phase equilibria of glycol ethers mixtures using an association model
DEFF Research Database (Denmark)
Garrido, Nuno M.; Folas, Georgios; Kontogeorgis, Georgios
2008-01-01
Vapor-liquid and liquid-liquid equilibria of glycol ethers (surfactant) mixtures with hydrocarbons, polar compounds and water are calculated using an association model, the Cubic-Plus-Association Equation of State. Parameters are estimated for several non-ionic surfactants of the polyoxyethylene ...
Zhu, Xiaoshu
2013-01-01
The current study introduced a general modeling framework, multilevel mixture IRT (MMIRT) which detects and describes characteristics of population heterogeneity, while accommodating the hierarchical data structure. In addition to introducing both continuous and discrete approaches to MMIRT, the main focus of the current study was to distinguish…
Depaoli, Sarah; van de Schoot, Rens; van Loey, Nancy; Sijbrandij, Marit
2015-01-01
BACKGROUND: After traumatic events, such as disaster, war trauma, and injuries including burns (which is the focus here), the risk to develop posttraumatic stress disorder (PTSD) is approximately 10% (Breslau & Davis, 1992). Latent Growth Mixture Modeling can be used to classify individuals into
Censored Hurdle Negative Binomial Regression (Case Study: Neonatorum Tetanus Case in Indonesia)
Yuli Rusdiana, Riza; Zain, Ismaini; Wulan Purnami, Santi
2017-06-01
Hurdle negative binomial regression is a method that can be used for a discrete dependent variable with excess zeros and under- or overdispersion. It uses a two-part approach. The first part, the zero hurdle model, estimates the zero elements of the dependent variable, and the second part, the truncated negative binomial model, estimates the nonzero elements (positive integers) of the dependent variable. The discrete dependent variable in such cases is censored for some values. The type of censoring studied in this research is right censoring. This study aims to obtain the parameter estimators of hurdle negative binomial regression for a right-censored dependent variable. Parameters are estimated by maximum likelihood estimation (MLE). The hurdle negative binomial regression model for a right-censored dependent variable is applied to the number of neonatorum tetanus cases in Indonesia. The data are count data that contain zero values in some observations and a variety of other values. This study also aims to obtain the parameter estimators and test statistics of the censored hurdle negative binomial model. Based on the regression results, the factors that influence neonatorum tetanus cases in Indonesia are the percentage of baby health care coverage and neonatal visits.
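The two-part structure and the right-censoring contribution to the likelihood can be sketched as follows. This is a simplified illustration without covariates: in the regression setting the zero probability and the negative binomial mean would be linked to predictors, and the parameter names (`pi0`, `r`, `p`) are assumptions rather than the paper's notation.

```python
import math


def nb_pmf(y, r, p):
    """Negative binomial pmf: Gamma(y + r) / (Gamma(r) y!) * p^r * (1 - p)^y."""
    log_pmf = (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
               + r * math.log(p) + y * math.log(1.0 - p))
    return math.exp(log_pmf)


def hurdle_nb_pmf(y, pi0, r, p):
    """Hurdle model: P(Y = 0) = pi0; positive counts follow a zero-truncated NB."""
    if y == 0:
        return pi0
    return (1.0 - pi0) * nb_pmf(y, r, p) / (1.0 - nb_pmf(0, r, p))


def censored_loglik(observations, pi0, r, p):
    """Log-likelihood for (y, censored) pairs; censored=True means the true count is >= y."""
    total = 0.0
    for y, censored in observations:
        if censored:
            # Right-censored observation contributes P(Y >= y).
            prob = 1.0 - sum(hurdle_nb_pmf(v, pi0, r, p) for v in range(y))
        else:
            prob = hurdle_nb_pmf(y, pi0, r, p)
        total += math.log(prob)
    return total
```

Maximising `censored_loglik` over `pi0`, `r` and `p` with a numerical optimiser would give maximum likelihood estimates for this covariate-free version of the model.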
Gassmann Modeling of Acoustic Properties of Sand-clay Mixtures
Gurevich, B.; Carcione, J. M.
The feasibility of modeling elastic properties of a fluid-saturated sand-clay mixture rock is analyzed by assuming that the rock is composed of macroscopic regions of sand and clay. The elastic properties of such a composite rock are computed using two alternative schemes. The first scheme, which we call the composite Gassmann (CG) scheme, uses Gassmann equations to compute elastic moduli of the saturated sand and clay from their respective dry moduli. The effective elastic moduli of the fluid-saturated composite rock are then computed by applying one of the mixing laws commonly used to estimate elastic properties of composite materials. In the second scheme, which we call the Berryman-Milton (BM) scheme, the elastic moduli of the dry composite rock matrix are computed from the moduli of dry sand and clay matrices using the same composite mixing law used in the first scheme. Next, the saturated composite rock moduli are computed using the equations of Brown and Korringa, which, together with the expressions for the coefficients derived by Berryman and Milton, provide an extension of Gassmann equations to rocks with a heterogeneous solid matrix. For both schemes, the moduli of the dry homogeneous sand and clay matrices are assumed to obey Krief's velocity-porosity relationship. As a mixing law we use the self-consistent coherent potential approximation proposed by Berryman. The calculated dependence of compressional and shear velocities on porosity and clay content for a given set of parameters using the two schemes depends on the distribution of total porosity between the sand and clay regions. If the distribution of total porosity between sand and clay is relatively uniform, the predictions of the two schemes in the porosity range up to 0.3 are very similar to each other. For higher porosities and medium-to-large clay content the elastic moduli predicted by CG scheme are significantly higher than those predicted by the BM scheme. This difference is explained by the fact
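The Gassmann step that both schemes build on can be sketched for a single homogeneous region. The dry moduli are taken as inputs here rather than derived from Krief's relationship, and the mixing law and the Brown-Korringa extension are not reproduced; the function names and the unit convention (moduli in GPa, density in g/cm³, velocities in km/s) are assumptions.

```python
import math


def gassmann_saturated_bulk(k_dry, k_mineral, k_fluid, porosity):
    """Gassmann equation: bulk modulus of the fluid-saturated rock from its dry modulus."""
    numerator = (1.0 - k_dry / k_mineral) ** 2
    denominator = (porosity / k_fluid
                   + (1.0 - porosity) / k_mineral
                   - k_dry / k_mineral ** 2)
    return k_dry + numerator / denominator


def velocities(k_sat, shear_modulus, density):
    """Compressional and shear velocities; Gassmann leaves the shear modulus unchanged."""
    vp = math.sqrt((k_sat + 4.0 * shear_modulus / 3.0) / density)
    vs = math.sqrt(shear_modulus / density)
    return vp, vs
```

For example, `gassmann_saturated_bulk(12.0, 37.0, 2.25, 0.2)` gives the saturated bulk modulus of a quartz-like sand with water-like pore fluid; in a composite scheme this would be computed separately for the sand and clay regions before mixing.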
Self-organising mixture autoregressive model for non-stationary time series modelling.
Ni, He; Yin, Hujun
2008-12-01
Modelling non-stationary time series has been a difficult task for both parametric and nonparametric methods. One promising solution is to combine the flexibility of nonparametric models with the simplicity of parametric models. In this paper, the self-organising mixture autoregressive (SOMAR) network is adopted as such a mixture model. It breaks time series into underlying segments and at the same time fits local linear regressive models to the clusters of segments. In such a way, a global non-stationary time series is represented by a dynamic set of local linear regressive models. Neural gas is used for a more flexible structure of the mixture model. Furthermore, a new similarity measure has been introduced in the self-organising network to better quantify the similarity of time series segments. The network can be used naturally in modelling and forecasting non-stationary time series. Experiments on artificial, benchmark time series (e.g. Mackey-Glass) and real-world data (e.g. numbers of sunspots and Forex rates) are presented and the results show that the proposed SOMAR network is effective and superior to other similar approaches.
Diffusion models for mixtures using a stiff dissipative hyperbolic formalism
Boudin, Laurent; Grec, Bérénice; Pavan, Vincent
2018-01-01
In this article, we are interested in a system of fluid equations for mixtures with a stiff relaxation term of Maxwell-Stefan diffusion type. We use the formalism developed by Chen, Levermore, Liu in [4] to obtain a limit system of Fick type where the species velocities tend to align to a bulk velocity when the relaxation parameter remains small.
Sound speed models for a noncondensible gas-steam-water mixture
International Nuclear Information System (INIS)
Ransom, V.H.; Trapp, J.A.
1984-01-01
An analytical expression is derived for the homogeneous equilibrium speed of sound in a mixture of noncondensible gas, steam, and water. The expression is based on the Gibbs free energy interphase equilibrium condition for a Gibbs-Dalton mixture in contact with a pure liquid phase. Several simplified models are discussed including the homogeneous frozen model. These idealized models can be used as a reference for data comparison and also serve as a basis for empirically corrected nonhomogeneous and nonequilibrium models
Model-based experimental design for assessing effects of mixtures of chemicals
Energy Technology Data Exchange (ETDEWEB)
Baas, Jan, E-mail: jan.baas@falw.vu.n [Vrije Universiteit of Amsterdam, Dept of Theoretical Biology, De Boelelaan 1085, 1081 HV Amsterdam (Netherlands); Stefanowicz, Anna M., E-mail: anna.stefanowicz@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Klimek, Beata, E-mail: beata.klimek@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Laskowski, Ryszard, E-mail: ryszard.laskowski@uj.edu.p [Institute of Environmental Sciences, Jagiellonian University, Gronostajowa 7, 30-387 Krakow (Poland); Kooijman, Sebastiaan A.L.M., E-mail: bas@bio.vu.n [Vrije Universiteit of Amsterdam, Dept of Theoretical Biology, De Boelelaan 1085, 1081 HV Amsterdam (Netherlands)
2010-01-15
We exposed flour beetles (Tribolium castaneum) to a mixture of four poly aromatic hydrocarbons (PAHs). The experimental setup was chosen such that the emphasis was on assessing partial effects. We interpreted the effects of the mixture by a process-based model, with a threshold concentration for effects on survival. The behavior of the threshold concentration was one of the key features of this research. We showed that the threshold concentration is shared by toxicants with the same mode of action, which gives a mechanistic explanation for the observation that toxic effects in mixtures may occur in concentration ranges where the individual components do not show effects. Our approach gives reliable predictions of partial effects on survival and allows for a reduction of experimental effort in assessing effects of mixtures, extrapolations to other mixtures, other points in time, or in a wider perspective to other organisms. - We show a mechanistic approach to assess effects of mixtures in low concentrations.
Model-based experimental design for assessing effects of mixtures of chemicals
International Nuclear Information System (INIS)
Baas, Jan; Stefanowicz, Anna M.; Klimek, Beata; Laskowski, Ryszard; Kooijman, Sebastiaan A.L.M.
2010-01-01
We exposed flour beetles (Tribolium castaneum) to a mixture of four poly aromatic hydrocarbons (PAHs). The experimental setup was chosen such that the emphasis was on assessing partial effects. We interpreted the effects of the mixture by a process-based model, with a threshold concentration for effects on survival. The behavior of the threshold concentration was one of the key features of this research. We showed that the threshold concentration is shared by toxicants with the same mode of action, which gives a mechanistic explanation for the observation that toxic effects in mixtures may occur in concentration ranges where the individual components do not show effects. Our approach gives reliable predictions of partial effects on survival and allows for a reduction of experimental effort in assessing effects of mixtures, extrapolations to other mixtures, other points in time, or in a wider perspective to other organisms. - We show a mechanistic approach to assess effects of mixtures in low concentrations.
On Two Mixture-Based Clustering Approaches Used in Modeling an Insurance Portfolio
Directory of Open Access Journals (Sweden)
Tatjana Miljkovic
2018-05-01
We review two complementary mixture-based clustering approaches for modeling unobserved heterogeneity in an insurance portfolio: the generalized linear mixed cluster-weighted model (CWM) and mixture-based clustering for an ordered stereotype model (OSM). The latter is for modeling ordinal variables, and the former is for modeling losses as a function of mixed-type covariates. The article extends the idea of mixture modeling to a multivariate classification for the purpose of testing unobserved heterogeneity in an insurance portfolio. The application of both methods is illustrated on a well-known French automobile portfolio, in which the model fitting is performed using the expectation-maximization (EM) algorithm. Our findings show that these mixture-based clustering methods can be used to further test unobserved heterogeneity in an insurance portfolio and as such may be considered in insurance pricing, underwriting, and risk management.
A general mixture model for mapping quantitative trait loci by using molecular markers
Jansen, R.C.
1992-01-01
In a segregating population a quantitative trait may be considered to follow a mixture of (normal) distributions, the mixing proportions being based on Mendelian segregation rules. A general and flexible mixture model is proposed for mapping quantitative trait loci (QTLs) by using molecular markers.
Metal Mixture Modeling Evaluation project: 2. Comparison of four modeling approaches
Farley, Kevin J.; Meyer, Joe; Balistrieri, Laurie S.; DeSchamphelaere, Karl; Iwasaki, Yuichi; Janssen, Colin; Kamo, Masashi; Lofts, Steve; Mebane, Christopher A.; Naito, Wataru; Ryan, Adam C.; Santore, Robert C.; Tipping, Edward
2015-01-01
As part of the Metal Mixture Modeling Evaluation (MMME) project, models were developed by the National Institute of Advanced Industrial Science and Technology (Japan), the U.S. Geological Survey (USA), HDR|HydroQual, Inc. (USA), and the Centre for Ecology and Hydrology (UK) to address the effects of metal mixtures on biological responses of aquatic organisms. A comparison of the 4 models, as they were presented at the MMME Workshop in Brussels, Belgium (May 2012), is provided herein. Overall, the models were found to be similar in structure (free ion activities computed by WHAM; specific or non-specific binding of metals/cations in or on the organism; specification of metal potency factors and/or toxicity response functions to relate metal accumulation to biological response). Major differences in modeling approaches are attributed to various modeling assumptions (e.g., single versus multiple types of binding site on the organism) and specific calibration strategies that affected the selection of model parameters. The models provided a reasonable description of additive (or nearly additive) toxicity for a number of individual toxicity test results. Less-than-additive toxicity was more difficult to describe with the available models. Because of limitations in the available datasets and the strong inter-relationships among the model parameters (log KM values, potency factors, toxicity response parameters), further evaluation of specific model assumptions and calibration strategies is needed.
Directory of Open Access Journals (Sweden)
Lusi Eka Afri
2017-03-01
Negative binomial regression and Conway-Maxwell-Poisson regression are remedies for overdispersion in Poisson regression. Both models are extensions of the Poisson regression model. According to Hinde and Demetrio (2007), overdispersion in Poisson regression can arise in several ways: variability among individuals that the model does not explain, correlation between individual responses, clustering in the population, and omitted observed variables. As a consequence, standard errors are estimated too low and parameter estimates are biased downward (underestimated). This study compares the negative binomial regression model and the Conway-Maxwell-Poisson (COM-Poisson) regression model for handling overdispersion in Poisson-distributed data, based on the deviance test statistic. The data come from two sources: simulated data and an applied case. The simulated data were obtained by generating overdispersed Poisson-distributed data in the R programming language according to data characteristics including the probability of a zero value (p) and the sample size (n). The generated data were used to obtain estimates of the parameter coefficients of the negative binomial and COM-Poisson regressions. Keywords: overdispersion, negative binomial regression, Conway-Maxwell-Poisson regression
Park, Yoon Soo; Lee, Young-Sun; Xing, Kuan
2016-01-01
This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results also showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effects on item parameters and examinee ability.
Directory of Open Access Journals (Sweden)
Yoon Soo Park
2016-02-01
This study investigates the impact of item parameter drift (IPD) on parameter and ability estimation when the underlying measurement model fits a mixture distribution, thereby violating the item invariance property of unidimensional item response theory (IRT) models. An empirical study was conducted to demonstrate the occurrence of both IPD and an underlying mixture distribution using real-world data. Twenty-one trended anchor items from the 1999, 2003, and 2007 administrations of Trends in International Mathematics and Science Study (TIMSS) were analyzed using unidimensional and mixture IRT models. TIMSS treats trended anchor items as invariant over testing administrations and uses pre-calibrated item parameters based on unidimensional IRT. However, empirical results showed evidence of two latent subgroups with IPD. Results showed changes in the distribution of examinee ability between latent classes over the three administrations. A simulation study was conducted to examine the impact of IPD on the estimation of ability and item parameters, when data have underlying mixture distributions. Simulations used data generated from a mixture IRT model and estimated using unidimensional IRT. Results showed that data reflecting IPD using mixture IRT model led to IPD in the unidimensional IRT model. Changes in the distribution of examinee ability also affected item parameters. Moreover, drift with respect to item discrimination and distribution of examinee ability affected estimates of examinee ability. These findings demonstrate the need to caution and evaluate IPD using a mixture IRT framework to understand its effect on item parameters and examinee ability.
Directory of Open Access Journals (Sweden)
Orlov Alexey
2016-01-01
This article presents the development of a mathematical model of nonstationary separation processes occurring in gas centrifuge cascades for the separation of multicomponent isotope mixtures. The model was used to calculate the parameters of a gas centrifuge cascade for the separation of germanium isotopes. Comparison of the obtained values with the results of other authors showed that the developed model adequately describes nonstationary separation processes in gas centrifuge cascades for the separation of multicomponent isotope mixtures.
Influence of high power ultrasound on rheological and foaming properties of model ice-cream mixtures
Directory of Open Access Journals (Sweden)
Verica Batur
2010-03-01
This paper presents research on the effect of high power ultrasound on the rheological and foaming properties of ice cream model mixtures. The model mixtures were prepared according to specific recipes and then subjected to different homogenization techniques: mechanical mixing, ultrasound treatment, and a combination of mechanical and ultrasound treatment. An ultrasound probe with a tip diameter of 12.7 mm was used for the ultrasound treatment, which lasted 5 minutes at 100 percent amplitude. Rheological parameters were determined using a rotational rheometer and expressed as flow index, consistency coefficient and apparent viscosity. The results show that all model mixtures have non-Newtonian, dilatant behavior. The highest viscosities were observed for model mixtures homogenized by mechanical mixing, and significantly lower viscosities were observed for the ultrasound-treated ones. Foaming properties are expressed as percentage increase in foam volume, foam stability index and minimal viscosity. Ice cream model mixtures treated only with ultrasound had the smallest increase in foam volume, while the highest increase in foam volume was observed for the mixture treated with the combination of mechanical and ultrasound treatment. Ice cream mixtures with a higher protein content showed higher foam stability. The optimal treatment time was determined to be 10 minutes.
Extending the Binomial Checkpointing Technique for Resilience
Energy Technology Data Exchange (ETDEWEB)
Walther, Andrea; Narayanan, Sri Hari Krishna
2016-10-10
In terms of computing time, adjoint methods offer a very attractive alternative to compute gradient information, required, e.g., for optimization purposes. However, together with this very favorable temporal complexity result comes a memory requirement that is in essence proportional to the operation count of the underlying function, e.g., if algorithmic differentiation is used to provide the adjoints. For this reason, checkpointing approaches in many variants have become popular. This paper analyzes an extension of the so-called binomial approach to cover also possible failures of the computing systems. Such a measure of precaution is of special interest for massive parallel simulations and adjoint calculations where the mean time between failure of the large scale computing system is smaller than the time needed to complete the calculation of the adjoint information. We describe the extensions of standard checkpointing approaches required for such resilience, provide a corresponding implementation and discuss numerical results.
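For orientation, the binomial approach takes its name from a standard result on checkpointing schedules that the abstract above does not state: with c checkpoints and at most r repeated forward sweeps, an adjoint over up to C(c + r, c) time steps can be computed. The following is a minimal Python sketch of that bound only, not of the resilient extension the paper analyzes; the function names are hypothetical.

```python
from math import comb


def max_steps(checkpoints: int, repetitions: int) -> int:
    """Binomial bound: most time steps reversible with the given
    number of checkpoints and repeated forward sweeps."""
    return comb(checkpoints + repetitions, checkpoints)


def min_repetitions(steps: int, checkpoints: int) -> int:
    """Smallest repetition count whose binomial bound covers `steps`."""
    if checkpoints < 1:
        raise ValueError("at least one checkpoint is required")
    r = 0
    while max_steps(checkpoints, r) < steps:
        r += 1
    return r


if __name__ == "__main__":
    # With 10 checkpoints, how many repeated sweeps cover 1000 steps?
    print(min_repetitions(1000, 10))
```

The bound grows quickly in both arguments, which is why a small number of checkpoints can cover long time integrations at a modest recomputation cost.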
Forward selection two sample binomial test
Wong, Kam-Fai; Wong, Weng-Kee; Lin, Miao-Shan
2016-01-01
Fisher’s exact test (FET) is a conditional method that is frequently used to analyze data in a 2 × 2 table for small samples. This test is conservative and attempts have been made to modify the test to make it less conservative. For example, Crans and Shuster (2008) proposed adding more points in the rejection region to make the test more powerful. We provide another way to modify the test to make it less conservative by using two independent binomial distributions as the reference distribution for the test statistic. We compare our new test with several methods and show that our test has advantages over existing methods in terms of control of the type 1 and type 2 errors. We reanalyze results from an oncology trial using our proposed method and our software which is freely available to the reader. PMID:27335577
Constrained Dynamic Optimality and Binomial Terminal Wealth
DEFF Research Database (Denmark)
Pedersen, J. L.; Peskir, G.
2018-01-01
with interest rate $r \in \mathbb{R}$). Letting $P_{t,x}$ denote a probability measure under which $X^u$ takes value $x$ at time $t$, we study the dynamic version of the nonlinear optimal control problem $\inf_u\, \mathrm{Var}_{t,X_t^u}(X_T^u)$ where the infimum is taken over admissible controls $u$ subject to $X_t^u \ge e...... a martingale method combined with Lagrange multipliers, we derive the dynamically optimal control $u_*^d$ in closed form and prove that the dynamically optimal terminal wealth $X_T^d$ can only take two values $g$ and $\beta$. This binomial nature of the dynamically optimal strategy stands in sharp contrast...... with other known portfolio selection strategies encountered in the literature. A direct comparison shows that the dynamically optimal (time-consistent) strategy outperforms the statically optimal (time-inconsistent) strategy in the problem....
Combinatorial bounds on the α-divergence of univariate mixture models
Nielsen, Frank; Sun, Ke
2017-01-01
We derive lower- and upper-bounds of α-divergence between univariate mixture models with components in the exponential family. Three pairs of bounds are presented in order with increasing quality and increasing computational cost. They are verified
Statistical imitation system using relational interest points and Gaussian mixture models
CSIR Research Space (South Africa)
Claassens, J
2009-11-01
The author proposes an imitation system that uses relational interest points (RIPs) and Gaussian mixture models (GMMs) to characterize a behaviour. The system's structure is inspired by the Robot Programming by Demonstration (RDP) paradigm...
Reynolds, Gavin K; Campbell, Jacqueline I; Roberts, Ron J
2017-10-05
A new model to predict the compressibility and compactability of mixtures of pharmaceutical powders has been developed. The key aspect of the model is consideration of the volumetric occupancy of each powder under an applied compaction pressure and the respective contribution it then makes to the mixture properties. The compressibility and compactability of three pharmaceutical powders: microcrystalline cellulose, mannitol and anhydrous dicalcium phosphate have been characterised. Binary and ternary mixtures of these excipients have been tested and used to demonstrate the predictive capability of the model. Furthermore, the model is shown to be uniquely able to capture a broad range of mixture behaviours, including neutral, negative and positive deviations, illustrating its utility for formulation design.
Modeling Hydrodynamic State of Oil and Gas Condensate Mixture in a Pipeline
Directory of Open Access Journals (Sweden)
Dudin Sergey
2016-01-01
Based on the developed model a calculation method was obtained which is used to analyze hydrodynamic state and composition of hydrocarbon mixture in each ith section of the pipeline when temperature-pressure and hydraulic conditions change.
A predictive model of natural gas mixture combustion in internal combustion engines
Directory of Open Access Journals (Sweden)
Henry Espinoza
2007-05-01
This study shows the development of a predictive natural gas mixture combustion model for conventional combustion (ignition) engines. The model was based on resolving two zones: one containing the unburned mixture and another containing the combustion products. Energy and mass conservation equations were solved for each zone at each crankshaft angle. The nonlinear differential equations for the energy of each phase (compression, combustion and expansion) were solved with the fourth-order Runge-Kutta method. The model also made it possible to study different natural gas compositions and to evaluate combustion in the presence of dry and humid air. Validation against experimental data is shown, demonstrating the precision and accuracy of the software's results. The results include cylinder pressure, unburned and burned mixture temperature, burned mass fraction and combustion reaction heat for the modelled engine running on a natural gas mixture.
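The fourth-order Runge-Kutta method named in the abstract above can be sketched generically. The Python below integrates a scalar ODE y' = f(t, y); the test equation y' = y is a stand-in chosen only for illustration and is not the engine energy equation, and the function names are hypothetical.

```python
import math


def rk4_step(f, t, y, h):
    """Advance y' = f(t, y) by one classical fourth-order Runge-Kutta step."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(f, t0, y0, t1, steps):
    """Integrate from t0 to t1 with a fixed number of equal steps."""
    h = (t1 - t0) / steps
    t, y = t0, y0
    for _ in range(steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y


if __name__ == "__main__":
    # Stand-in problem: y' = y, y(0) = 1, whose exact solution is exp(t).
    approx = integrate(lambda t, y: y, 0.0, 1.0, 1.0, 100)
    print(approx, math.e)
```

In an engine model the independent variable would be the crank angle rather than time, and the state would be a vector of zone temperatures and pressure, but the stepping scheme is the same.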
Mixture estimation with state-space components and Markov model of switching
Czech Academy of Sciences Publication Activity Database
Nagy, Ivan; Suzdaleva, Evgenia
2013-01-01
Vol. 37, No. 24 (2013), pp. 9970-9984 ISSN 0307-904X R&D Projects: GA TA ČR TA01030123 Institutional support: RVO:67985556 Keywords: probabilistic dynamic mixtures * probability density function * state-space models * recursive mixture estimation * Bayesian dynamic decision making under uncertainty * Kerridge inaccuracy Subject RIV: BC - Control Systems Theory Impact factor: 2.158, year: 2013 http://library.utia.cas.cz/separaty/2013/AS/nagy-mixture estimation with state-space components and markov model of switching.pdf
Estimation of value at risk and conditional value at risk using normal mixture distributions model
Kamaruzzaman, Zetty Ain; Isa, Zaidi
2013-04-01
The normal mixture distributions model has been successfully applied in financial time series analysis. In this paper, we estimate the return distribution, value at risk (VaR) and conditional value at risk (CVaR) for monthly and weekly rates of return of the FTSE Bursa Malaysia Kuala Lumpur Composite Index (FBMKLCI) from July 1990 until July 2010 using the two-component univariate normal mixture distributions model. First, we present the application of the model in empirical finance, where we fit it to our real data. Second, we present its application in risk analysis, where we use it to evaluate VaR and CVaR, with model validation for both risk measures. The empirical results provide evidence that the two-component normal mixture distributions model fits the data well and performs better in estimating VaR and CVaR because it captures the stylized facts of non-normality and leptokurtosis in the returns distribution.
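To make the two risk measures concrete, the Python sketch below evaluates VaR and CVaR for a univariate normal mixture of returns. It assumes losses are reported as positive numbers and that alpha is the lower-tail probability; the mixture parameters in the usage example are invented for illustration and are not the fitted FBMKLCI values, and all function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def mixture_cdf(x, weights, means, sds):
    """CDF of a univariate normal mixture at x."""
    return sum(w * NormalDist(m, s).cdf(x)
               for w, m, s in zip(weights, means, sds))


def mixture_var(alpha, weights, means, sds, tol=1e-10):
    """VaR at lower-tail probability alpha, as a positive loss.
    The alpha-quantile of the mixture is found by bisection."""
    lo = min(m - 10 * s for m, s in zip(means, sds))
    hi = max(m + 10 * s for m, s in zip(means, sds))
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mixture_cdf(mid, weights, means, sds) < alpha:
            lo = mid
        else:
            hi = mid
    return -(lo + hi) / 2


def mixture_cvar(alpha, weights, means, sds):
    """CVaR: expected loss given that the return is below the alpha-quantile.
    Uses E[X; X <= q] = mu * Phi(z) - sigma * phi(z) for each component."""
    q = -mixture_var(alpha, weights, means, sds)
    partial_mean = 0.0
    for w, m, s in zip(weights, means, sds):
        z = (q - m) / s
        partial_mean += w * (m * STD_NORMAL.cdf(z) - s * STD_NORMAL.pdf(z))
    return -partial_mean / alpha


if __name__ == "__main__":
    # Hypothetical calm and turbulent regimes for monthly returns.
    w, mu, sd = [0.8, 0.2], [0.01, -0.02], [0.03, 0.08]
    print(mixture_var(0.05, w, mu, sd), mixture_cvar(0.05, w, mu, sd))
```

A wide, low-weight second component fattens the lower tail, so CVaR moves away from VaR more than it would under a single normal distribution with the same overall variance.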
Use of the negative binomial-truncated Poisson distribution in thunderstorm prediction
Cohen, A. C.
1971-01-01
A probability model is presented for the distribution of thunderstorms over a small area given that thunderstorm events (1 or more thunderstorms) are occurring over a larger area. The model incorporates the negative binomial and truncated Poisson distributions. Probability tables for Cape Kennedy for spring, summer, and fall months and seasons are presented. The computer program used to compute these probabilities is appended.
A quantitative trait locus mixture model that avoids spurious LOD score peaks.
Feenstra, Bjarke; Skovgaard, Ib M
2004-06-01
In standard interval mapping of quantitative trait loci (QTL), the QTL effect is described by a normal mixture model. At any given location in the genome, the evidence of a putative QTL is measured by the likelihood ratio of the mixture model compared to a single normal distribution (the LOD score). This approach can occasionally produce spurious LOD score peaks in regions of low genotype information (e.g., widely spaced markers), especially if the phenotype distribution deviates markedly from a normal distribution. Such peaks are not indicative of a QTL effect; rather, they are caused by the fact that a mixture of normals always produces a better fit than a single normal distribution. In this study, a mixture model for QTL mapping that avoids the problems of such spurious LOD score peaks is presented.
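For reference, the LOD score described above can be written out for standard interval mapping. The notation is an assumption made here rather than taken from the paper: $y_i$ is the phenotype of individual $i$, $\pi_{ig}$ is the probability of QTL genotype $g$ given the flanking markers, and $\phi$ is the normal density with a common variance.

```latex
\mathrm{LOD}
  = \log_{10}
    \frac{\max_{\mu_g,\,\sigma^2} \prod_{i=1}^{n} \sum_{g} \pi_{ig}\, \phi\!\left(y_i;\, \mu_g,\, \sigma^2\right)}
         {\max_{\mu,\,\sigma^2} \prod_{i=1}^{n} \phi\!\left(y_i;\, \mu,\, \sigma^2\right)}
```

Where markers are widely spaced the $\pi_{ig}$ carry little genotype information, so the numerator behaves like a free mixture fit to the phenotypes. That is the mechanism behind the spurious peaks the abstract describes.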
A Dirichlet process mixture of generalized Dirichlet distributions for proportional data modeling.
Bouguila, Nizar; Ziou, Djemel
2010-01-01
In this paper, we propose a clustering algorithm based on both Dirichlet processes and generalized Dirichlet distribution which has been shown to be very flexible for proportional data modeling. Our approach can be viewed as an extension of the finite generalized Dirichlet mixture model to the infinite case. The extension is based on nonparametric Bayesian analysis. This clustering algorithm does not require the specification of the number of mixture components to be given in advance and estimates it in a principled manner. Our approach is Bayesian and relies on the estimation of the posterior distribution of clusterings using Gibbs sampler. Through some applications involving real-data classification and image databases categorization using visual words, we show that clustering via infinite mixture models offers a more powerful and robust performance than classic finite mixtures.
International Nuclear Information System (INIS)
Maevskii, K. K.; Kinelovskii, S. A.
2015-01-01
Numerical results of modeling the shock wave loading of mixtures with an SiO2 component are presented. The TEC (thermodynamic equilibrium component) model is employed to describe the behavior of solid and porous multicomponent mixtures and alloys under shock wave loading. Equations of state of Mie–Grüneisen type are used to describe the behavior of the condensed phases, taking into account the temperature dependence of the Grüneisen coefficient; the gas in the pores is treated as one of the components of the medium. The model is based on the assumption that all components of the mixture under shock-wave loading are in thermodynamic equilibrium. The calculation results are compared with experimental data obtained by various authors. The behavior of mixtures containing components that undergo a phase transition under high dynamic loads is described.
Latent Transition Analysis with a Mixture Item Response Theory Measurement Model
Cho, Sun-Joo; Cohen, Allan S.; Kim, Seock-Ho; Bottge, Brian
2010-01-01
A latent transition analysis (LTA) model was described with a mixture Rasch model (MRM) as the measurement model. Unlike the LTA, which was developed with a latent class measurement model, the LTA-MRM permits within-class variability on the latent variable, making it more useful for measuring treatment effects within latent classes. A simulation…
DEFF Research Database (Denmark)
Feng, Huan; Pettinari, Matteo; Stang, Henrik
2016-01-01
modulus. Three different approaches have been used and compared for calibrating the Burger's contact model. Values of the dynamic modulus and phase angle of asphalt mixtures were predicted by conducting DE simulation under dynamic strain control loading. The excellent agreement between the predicted......In this paper the viscoelastic behavior of asphalt mixture was investigated by employing a three-dimensional discrete element method. Combined with Burger's model, three contact models were used for the construction of constitutive asphalt mixture model with viscoelastic properties...
Wigner Function of Density Operator for Negative Binomial Distribution
International Nuclear Information System (INIS)
Xu Xinglei; Li Hongqi
2008-01-01
By using the technique of integration within an ordered product (IWOP) of operators, we derive the Wigner function of the density operator for the negative binomial distribution of the radiation field in the mixed-state case. We then derive the Wigner function of the squeezed number state, which yields the negative binomial distribution by virtue of the entangled state representation and the entangled Wigner operator.
Optimized Binomial Quantum States of Complex Oscillators with Real Spectrum
International Nuclear Information System (INIS)
Zelaya, K D; Rosas-Ortiz, O
2016-01-01
Classical and nonclassical states of quantum complex oscillators with real spectrum are presented. Such states are bi-orthonormal superpositions of n + 1 energy eigenvectors of the system with binomial-like coefficients. For large values of n these optimized binomial states behave as photon added coherent states when the imaginary part of the potential is cancelled.
Discovering Binomial Identities with PascGaloisJE
Evans, Tyler J.
2008-01-01
We describe exercises in which students use PascGaloisJE to formulate conjectures about certain binomial identities which hold when the binomial coefficients are interpreted as elements in the cyclic group Z_p of integers modulo a prime integer p. In addition to having an appealing visual component, these exercises are open-ended and…
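The kind of pattern such exercises rely on can be reproduced without the software. This Python sketch builds Pascal's triangle with entries reduced modulo a prime p; it is an independent illustration and does not use or imitate the PascGaloisJE interface, and the function name is hypothetical.

```python
def pascal_mod(p: int, rows: int) -> list[list[int]]:
    """First `rows` rows of Pascal's triangle with entries reduced mod p."""
    triangle = [[1 % p]]
    for _ in range(rows - 1):
        prev = triangle[-1]
        # Each interior entry is the sum of the two entries above it, mod p.
        inner = [(prev[i] + prev[i + 1]) % p for i in range(len(prev) - 1)]
        triangle.append([1 % p] + inner + [1 % p])
    return triangle


if __name__ == "__main__":
    for row in pascal_mod(5, 6):
        print(row)
```

Row p of the output has zeros in every interior position, which is the well-known fact that p divides C(p, k) for 0 < k < p and a natural first conjecture for students to find.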
Directory of Open Access Journals (Sweden)
Abdenaceur Boudlal
2010-01-01
This article investigates a new method of motion estimation based on a block matching criterion, in which image blocks are modeled by a mixture of two or three Gaussian distributions. The mixture parameters (weights, mean vectors, and covariance matrices) are estimated by the Expectation Maximization (EM) algorithm, which maximizes the log-likelihood criterion. The similarity between a block in the current image and the most similar block in a search window on the reference image is measured by minimizing the Extended Mahalanobis distance between the clusters of the mixtures. Experiments on sequences of real images gave good results, and PSNR reached 3 dB.
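The EM estimation step mentioned in the abstract above can be illustrated in one dimension. The Python sketch below fits a Gaussian mixture to scalar data, whereas the paper works with mean vectors and covariance matrices of image blocks; the data, starting values and function names are all hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, iterations=50):
    """EM for a 1-D Gaussian mixture. Returns the fitted parameters and
    the log-likelihood recorded at the start of each iteration."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    loglik_trace = []
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp, loglik = [], 0.0
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j])
                     for j in range(k)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([v / total for v in joint])
        loglik_trace.append(loglik)
        # M-step: responsibility-weighted updates of the parameters.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # guard against collapse
    return weights, means, variances, loglik_trace


if __name__ == "__main__":
    data = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]
    print(em_gmm_1d(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])[:3])
```

The recorded log-likelihood should not decrease from one iteration to the next, which is a convenient sanity check when extending the sketch to vectors and full covariance matrices.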
Thermodiffusion in Multicomponent Mixtures Thermodynamic, Algebraic, and Neuro-Computing Models
Srinivasan, Seshasai
2013-01-01
Thermodiffusion in Multicomponent Mixtures presents the computational approaches that are employed in the study of thermodiffusion in various types of mixtures, namely, hydrocarbons, polymers, water-alcohol, molten metals, and so forth. We present a detailed formalism of these methods, which are based on non-equilibrium thermodynamics, algebraic correlations, or the principles of artificial neural networks. The book will serve as a single complete reference for understanding the theoretical derivations of thermodiffusion models and their application to different types of multicomponent mixtures. An exhaustive discussion of these is used to give a complete perspective of the principles and the key factors that govern the thermodiffusion process.
Effects of Test Conditions on APA Rutting and Prediction Modeling for Asphalt Mixtures
Directory of Open Access Journals (Sweden)
Hui Wang
2017-01-01
APA rutting tests were conducted on six kinds of asphalt mixtures under air-dry and immersed conditions. The influences of test conditions, including load, temperature, air voids, and moisture, on APA rutting depth were analyzed using the grey correlation method, and an APA rutting depth prediction model was established. Results show that the modified asphalt mixtures have bigger rutting depth ratios of air-dry to immersed conditions, indicating that the modified asphalt mixtures have better antirutting properties and water stability than the matrix asphalt mixtures. The grey correlation degrees of temperature, load, air void, and immersion with APA rutting depth decrease in that order, which means that temperature is the most significant influencing factor. The proposed indoor APA rutting prediction model has good prediction accuracy, and the correlation coefficient between the predicted and the measured rutting depths is 96.3%.
Study of the Internal Mechanical response of an asphalt mixture by 3-D Discrete Element Modeling
DEFF Research Database (Denmark)
Feng, Huan; Pettinari, Matteo; Hofko, Bernhard
2015-01-01
and the reliability of which have been validated. The dynamic modulus of asphalt mixtures were predicted by conducting Discrete Element simulation under dynamic strain control loading. In order to reduce the calculation time, a method based on frequency–temperature superposition principle has been implemented......In this paper the viscoelastic behavior of asphalt mixture was investigated by employing a three-dimensional Discrete Element Method (DEM). The cylinder model was filled with cubic array of spheres with a specified radius, and was considered as a whole mixture with uniform contact properties....... The ball density effect on the internal stress distribution of the asphalt mixture model has been studied when using this method. Furthermore, the internal stresses under dynamic loading have been studied. The agreement between the predicted and the laboratory test results of the complex modulus shows...
Excess Properties of Aqueous Mixtures of Methanol: Simple Models Versus Experiment
Czech Academy of Sciences Publication Activity Database
Vlček, Lukáš; Nezbeda, Ivo
Vol. 131-132 (2007), pp. 158-162. ISSN 0167-7322. [International Conference on Solution Chemistry /29./. Portorož, 21.08.2005-25.08.2005] R&D Projects: GA AV ČR(CZ) IAA4072303; GA AV ČR(CZ) 1ET400720409 Institutional research plan: CEZ:AV0Z40720504 Keywords: aqueous mixtures * primitive models * water-alcohol mixtures Subject RIV: CF - Physical; Theoretical Chemistry Impact factor: 0.982, year: 2007
Directory of Open Access Journals (Sweden)
F. C. PEIXOTO
1999-09-01
Full Text Available Fragmentation kinetics is employed to model a continuous reactive mixture of alkanes under catalytic cracking conditions. Standard moment analysis techniques are employed, and a dynamic system for the time evolution of moments of the mixture's dimensionless concentration distribution function (DCDF) is found. The time behavior of the DCDF is recovered with successive estimations of scaled gamma distributions using the moments' time data.
Determining order-up-to levels under periodic review for compound binomial (intermittent) demand
Teunter, R. H.; Syntetos, A. A.; Babai, M. Z.
2010-01-01
We propose a new method for determining order-up-to levels for intermittent demand items in a periodic review system. Contrary to existing methods, we exploit the intermittent character of demand by modelling lead time demand as a compound binomial process. In an extensive numerical study using
Piecewise Linear-Linear Latent Growth Mixture Models with Unknown Knots
Kohli, Nidhi; Harring, Jeffrey R.; Hancock, Gregory R.
2013-01-01
Latent growth curve models with piecewise functions are flexible and useful analytic models for investigating individual behaviors that exhibit distinct phases of development in observed variables. As an extension of this framework, this study considers a piecewise linear-linear latent growth mixture model (LGMM) for describing segmented change of…
A globally accurate theory for a class of binary mixture models
Dickman, Adriana G.; Stell, G.
The self-consistent Ornstein-Zernike approximation results for the 3D Ising model are used to obtain phase diagrams for binary mixtures described by decorated models, yielding the plait point, binodals, and closed-loop coexistence curves for the models proposed by Widom, Clark, Neece, and Wheeler. The results are in good agreement with series expansions and experiments.
Structure-reactivity modeling using mixture-based representation of chemical reactions.
Polishchuk, Pavel; Madzhidov, Timur; Gimadiev, Timur; Bodrov, Andrey; Nugmanov, Ramil; Varnek, Alexandre
2017-09-01
We describe a novel approach to reaction representation as a combination of two mixtures: a mixture of reactants and a mixture of products. In turn, each mixture can be encoded using an earlier reported approach involving simplex descriptors (SiRMS). The feature vector representing these two mixtures results from either concatenated product and reactant descriptors or the difference between descriptors of products and reactants. This reaction representation does not need explicit labeling of a reaction center. A rigorous "product-out" cross-validation (CV) strategy is suggested. Unlike the naïve "reaction-out" CV approach based on a random selection of items, the proposed strategy provides a more realistic estimate of prediction accuracy for reactions resulting in novel products. The new methodology has been applied to model rate constants of E2 reactions. It has been demonstrated that the use of the fragment control domain applicability approach significantly increases the prediction accuracy of the models. The models obtained with the new "mixture" approach performed better than those requiring either explicit (Condensed Graph of Reaction) or implicit (reaction fingerprints) reaction center labeling.
Karagiannis, Georgios; Lin, Guang
2017-08-01
For many real systems, several computer models may exist with different physics and predictive abilities. To achieve more accurate simulations/predictions, it is desirable for these models to be properly combined and calibrated. We propose the Bayesian calibration of computer model mixture method, which relies on the idea of representing the real system output as a mixture of the available computer model outputs with unknown input-dependent weight functions. The method builds a fully Bayesian predictive model as an emulator for the real system output by combining, weighting, and calibrating the available models in the Bayesian framework. Moreover, it fits a mixture of calibrated computer models that can be used by the domain scientist as a means to combine the available computer models, in a flexible and principled manner, and perform reliable simulations. It can address realistic cases where one model may be more accurate than the others at different input values because the mixture weights, indicating the contribution of each model, are functions of the input. Inference on the calibration parameters can consider multiple computer models associated with different physics. The method does not require knowledge of the fidelity order of the models. We provide a technique, suited to the mixture model framework, that mitigates the computational overhead of considering multiple computer models. We implement the proposed method in a real-world application involving the Weather Research and Forecasting large-scale climate model.
Monitoring and modeling of ultrasonic wave propagation in crystallizing mixtures
Marshall, T.; Challis, R. E.; Tebbutt, J. S.
2002-05-01
The utility of ultrasonic compression wave techniques for monitoring crystallization processes is investigated in a study of the seeded crystallization of copper II sulfate pentahydrate from aqueous solution. Simple models are applied to predict crystal yield, crystal size distribution and the changing nature of the continuous phase. A scattering model is used to predict the ultrasonic attenuation as crystallization proceeds. Experiments confirm that modeled attenuation is in agreement with measured results.
Molenaar, Dylan; de Boeck, Paul
2018-06-01
In item response theory modeling of responses and response times, it is commonly assumed that the item responses have the same characteristics across the response times. However, heterogeneity might arise in the data if subjects resort to different response processes when solving the test items. These differences may be within-subject effects, that is, a subject might use a certain process on some of the items and a different process with different item characteristics on the other items. If the probability of using one process over the other process depends on the subject's response time, within-subject heterogeneity of the item characteristics across the response times arises. In this paper, the method of response mixture modeling is presented to account for such heterogeneity. Contrary to traditional mixture modeling where the full response vectors are classified, response mixture modeling involves classification of the individual elements in the response vector. In a simulation study, the response mixture model is shown to be viable in terms of parameter recovery. In addition, the response mixture model is applied to a real dataset to illustrate its use in investigating within-subject heterogeneity in the item characteristics across response times.
Latent Partially Ordered Classification Models and Normal Mixtures
Tatsuoka, Curtis; Varadi, Ferenc; Jaeger, Judith
2013-01-01
Latent partially ordered sets (posets) can be employed in modeling cognitive functioning, such as in the analysis of neuropsychological (NP) and educational test data. Posets are cognitively diagnostic in the sense that classification states in these models are associated with detailed profiles of cognitive functioning. These profiles allow for…
Safaei, Farinaz; Castorena, Cassie; Kim, Y. Richard
2016-08-01
Fatigue cracking is a major form of distress in asphalt pavements. Asphalt binder is the weakest asphalt concrete constituent and, thus, plays a critical role in determining the fatigue resistance of pavements. Therefore, the ability to characterize and model the inherent fatigue performance of an asphalt binder is a necessary first step to design mixtures and pavements that are not susceptible to premature fatigue failure. The simplified viscoelastic continuum damage (S-VECD) model has been used successfully by researchers to predict the damage evolution in asphalt mixtures for various traffic and climatic conditions using limited uniaxial test data. In this study, the S-VECD model, developed for asphalt mixtures, is adapted for asphalt binders tested under cyclic torsion in a dynamic shear rheometer. Derivation of the model framework is presented. The model is verified by producing damage characteristic curves that are independent of both temperature and loading history based on time sweep tests, given that the effects of plasticity and adhesion loss on the material behavior are minimal. The applicability of the S-VECD model to the accelerated loading that is inherent in the linear amplitude sweep test is demonstrated, which reveals reasonable performance predictions, but with some loss in accuracy compared to time sweep tests due to the confounding effects of nonlinearity imposed by the high strain amplitudes included in the test. The asphalt binder S-VECD model is validated through comparisons to asphalt mixture S-VECD model results derived from cyclic direct tension tests and Accelerated Loading Facility performance tests. The results demonstrate good agreement between the asphalt binder and mixture test results and pavement performance, indicating that the developed model framework is able to capture the asphalt binder's contribution to mixture fatigue and pavement fatigue cracking performance.
Analysis of real-time mixture cytotoxicity data following repeated exposure using BK/TD models
International Nuclear Information System (INIS)
Teng, S.; Tebby, C.; Barcellini-Couget, S.; De Sousa, G.; Brochot, C.; Rahmani, R.; Pery, A.R.R.
2016-01-01
Cosmetic products generally consist of multiple ingredients. Thus, cosmetic risk assessment has to deal with mixture toxicity on a long-term scale, which means it has to be assessed in the context of repeated exposure. Given that animal testing has been banned for cosmetics risk assessment, in vitro assays allowing long-term repeated exposure and adapted for in vitro – in vivo extrapolation need to be developed. However, most in vitro tests only assess short-term effects and consider static endpoints, which hinders extrapolation to realistic human exposure scenarios where concentration in target organs varies over time. Thanks to impedance metrics, real-time cell viability monitoring for repeated exposure has become possible. We recently constructed biokinetic/toxicodynamic models (BK/TD) to analyze such data (Teng et al., 2015) for three hepatotoxic cosmetic ingredients: coumarin, isoeugenol and benzophenone-2. In the present study, we aim to apply these models to analyze the dynamics of mixture impedance data using the concepts of concentration addition and independent action. Metabolic interactions between the mixture components were investigated, characterized and implemented in the models, as they impacted the actual cellular exposure. Indeed, cellular metabolism following mixture exposure induced a quick disappearance of the compounds from the exposure system. We showed that isoeugenol substantially decreased the metabolism of benzophenone-2, reducing the disappearance of this compound and enhancing its in vitro toxicity. Apart from this metabolic interaction, no mixtures showed any interaction, and all binary mixtures were successfully modeled by at least one model based on exposure to the individual compounds. - Highlights: • We could predict cell response over repeated exposure to mixtures of cosmetics. • Compounds acted independently on the cells. • Metabolic interactions impacted exposure concentrations to the compounds.
Analysis of real-time mixture cytotoxicity data following repeated exposure using BK/TD models
Energy Technology Data Exchange (ETDEWEB)
Teng, S.; Tebby, C. [Models for Toxicology and Ecotoxicology Unit, INERIS, Parc Technologique Alata, BP 2, 60550 Verneuil-en-Halatte (France); Barcellini-Couget, S. [ODESIA Neosciences, Sophia Antipolis, 400 route des chappes, 06903 Sophia Antipolis (France); De Sousa, G. [INRA, ToxAlim, 400 route des Chappes, BP, 167 06903 Sophia Antipolis, Cedex (France); Brochot, C. [Models for Toxicology and Ecotoxicology Unit, INERIS, Parc Technologique Alata, BP 2, 60550 Verneuil-en-Halatte (France); Rahmani, R. [INRA, ToxAlim, 400 route des Chappes, BP, 167 06903 Sophia Antipolis, Cedex (France); Pery, A.R.R., E-mail: alexandre.pery@agroparistech.fr [AgroParisTech, UMR 1402 INRA-AgroParisTech Ecosys, 78850 Thiverval Grignon (France); INRA, UMR 1402 INRA-AgroParisTech Ecosys, 78850 Thiverval Grignon (France)
2016-08-15
Cosmetic products generally consist of multiple ingredients. Thus, cosmetic risk assessment has to deal with mixture toxicity on a long-term scale, which means it has to be assessed in the context of repeated exposure. Given that animal testing has been banned for cosmetics risk assessment, in vitro assays allowing long-term repeated exposure and adapted for in vitro – in vivo extrapolation need to be developed. However, most in vitro tests only assess short-term effects and consider static endpoints, which hinders extrapolation to realistic human exposure scenarios where concentration in target organs varies over time. Thanks to impedance metrics, real-time cell viability monitoring for repeated exposure has become possible. We recently constructed biokinetic/toxicodynamic models (BK/TD) to analyze such data (Teng et al., 2015) for three hepatotoxic cosmetic ingredients: coumarin, isoeugenol and benzophenone-2. In the present study, we aim to apply these models to analyze the dynamics of mixture impedance data using the concepts of concentration addition and independent action. Metabolic interactions between the mixture components were investigated, characterized and implemented in the models, as they impacted the actual cellular exposure. Indeed, cellular metabolism following mixture exposure induced a quick disappearance of the compounds from the exposure system. We showed that isoeugenol substantially decreased the metabolism of benzophenone-2, reducing the disappearance of this compound and enhancing its in vitro toxicity. Apart from this metabolic interaction, no mixtures showed any interaction, and all binary mixtures were successfully modeled by at least one model based on exposure to the individual compounds. - Highlights: • We could predict cell response over repeated exposure to mixtures of cosmetics. • Compounds acted independently on the cells. • Metabolic interactions impacted exposure concentrations to the compounds.
Modelling time course gene expression data with finite mixtures of linear additive models.
Grün, Bettina; Scharl, Theresa; Leisch, Friedrich
2012-01-15
A model class of finite mixtures of linear additive models is presented. The component-specific parameters in the regression models are estimated using regularized likelihood methods. The advantages of the regularization are that (i) the pre-specified maximum degrees of freedom for the splines is less crucial than for unregularized estimation and that (ii) for each component individually a suitable degree of freedom is selected in an automatic way. The performance is evaluated in a simulation study with artificial data as well as on a yeast cell cycle dataset of gene expression levels over time. The latest release version of the R package flexmix is available from CRAN (http://cran.r-project.org/).
Robust non-rigid point set registration using student's-t mixture model.
Directory of Open Access Journals (Sweden)
Zhiyong Zhou
Full Text Available The Student's-t mixture model, which is heavy-tailed and more robust than the Gaussian mixture model, has recently received great attention in image processing. In this paper, we propose a robust non-rigid point set registration algorithm using the Student's-t mixture model. Specifically, first, we consider the alignment of two point sets as a probability density estimation problem and treat one point set as Student's-t mixture model centroids. Then, we fit the Student's-t mixture model centroids to the other point set, which is treated as data. Finally, we get the closed-form solutions of registration parameters, leading to a computationally efficient registration algorithm. The proposed algorithm is especially effective for addressing the non-rigid point set registration problem when significant amounts of noise and outliers are present. Moreover, fewer registration parameters have to be set manually for our algorithm compared to the popular coherent point drift (CPD) algorithm. We have compared our algorithm with other state-of-the-art registration algorithms on both 2D and 3D data with noise and outliers, where our non-rigid registration algorithm showed accurate results and outperformed the other algorithms.
Introduction to the special section on mixture modeling in personality assessment.
Wright, Aidan G C; Hallquist, Michael N
2014-01-01
Latent variable models offer a conceptual and statistical framework for evaluating the underlying structure of psychological constructs, including personality and psychopathology. Complex structures that combine or compare categorical and dimensional latent variables can be accommodated using mixture modeling approaches, which provide a powerful framework for testing nuanced theories about psychological structure. This special series includes introductory primers on cross-sectional and longitudinal mixture modeling, in addition to empirical examples applying these techniques to real-world data collected in clinical settings. This group of articles is designed to introduce personality assessment scientists and practitioners to a general latent variable framework that we hope will stimulate new research and application of mixture models to the assessment of personality and its pathology.
A BGK model for reactive mixtures of polyatomic gases with continuous internal energy
Bisi, M.; Monaco, R.; Soares, A. J.
2018-03-01
In this paper we derive a BGK relaxation model for a mixture of polyatomic gases with a continuous structure of internal energies. The emphasis of the paper is on the case of a quaternary mixture undergoing a reversible chemical reaction of bimolecular type. For such a mixture we prove an H-theorem and characterize the equilibrium solutions with the related mass action law of chemical kinetics. Further, a Chapman-Enskog asymptotic analysis is performed in view of computing the first-order non-equilibrium corrections to the distribution functions and investigating the transport properties of the reactive mixture. The chemical reaction rate is explicitly derived at the first order and the balance equations for the constituent number densities are derived at the Euler level.
The phase behavior of a hard sphere chain model of a binary n-alkane mixture
International Nuclear Information System (INIS)
Malanoski, A. P.; Monson, P. A.
2000-01-01
Monte Carlo computer simulations have been used to study the solid and fluid phase properties as well as phase equilibrium in a flexible, united atom, hard sphere chain model of n-heptane/n-octane mixtures. We describe a methodology for calculating the chemical potentials for the components in the mixture based on a technique used previously for atomic mixtures. The mixture was found to conform accurately to ideal solution behavior in the fluid phase. However, much greater nonidealities were seen in the solid phase. Phase equilibrium calculations indicate a phase diagram with solid-fluid phase equilibrium and a eutectic point. The components are only miscible in the solid phase for dilute solutions of the shorter chains in the longer chains. (c) 2000 American Institute of Physics
Directory of Open Access Journals (Sweden)
Jacek Namieśnik
2013-04-01
Full Text Available The paper presents the potential of an electronic nose technique in the field of fast diagnostics of patients suspected of Chronic Obstructive Pulmonary Disease (COPD). The investigations were performed using a simple electronic nose prototype equipped with a set of six semiconductor sensors manufactured by FIGARO Co. They were aimed at verifying the possibility of differentiating between model reference mixtures with potential COPD markers (N,N-dimethylformamide and N,N-dimethylacetamide). These mixtures contained volatile organic compounds (VOCs) such as acetone, isoprene, carbon disulphide, propan-2-ol, formamide, benzene, toluene, acetonitrile, acetic acid, dimethyl ether, dimethyl sulphide, acrolein, furan, propanol and pyridine, recognized as the components of exhaled air. The model reference mixtures were prepared at three concentration levels (10 ppb, 25 ppb, and 50 ppb v/v) of each component, except for the COPD markers. Concentration of the COPD markers in the mixtures was from 0 ppb to 100 ppb v/v. The obtained data were interpreted using principal component analysis (PCA). The investigations revealed the usefulness of the electronic device only in the case when the concentration of the COPD markers was twice as high as the concentration of the remaining components of the mixture and for a limited number of basic mixture components.
A numerical model for boiling heat transfer coefficient of zeotropic mixtures
Barraza Vicencio, Rodrigo; Caviedes Aedo, Eduardo
2017-12-01
Zeotropic mixtures never have the same liquid and vapor composition in liquid-vapor equilibrium. Also, the bubble and dew points are separated; this gap is called the glide temperature (Tglide). These characteristics have made such mixtures suitable for cryogenic Joule-Thomson (JT) refrigeration cycles. Using zeotropic mixtures as the working fluid in JT cycles improves their performance by an order of magnitude. Optimization of JT cycles has gained substantial importance for cryogenic applications (e.g., gas liquefaction, cryosurgery probes, cooling of infrared sensors, cryopreservation, and biomedical samples). Heat exchanger design in those cycles is a critical point; consequently, the heat transfer coefficient and pressure drop of two-phase zeotropic mixtures are relevant. In this work, a methodology is applied to calculate the local convective heat transfer coefficients based on the law of the wall approach for turbulent flows. The flow and heat transfer characteristics of zeotropic mixtures in a heated horizontal tube are investigated numerically. The temperature profile and heat transfer coefficient for zeotropic mixtures of different bulk compositions are analysed. The numerical model has been developed and locally applied to fully developed, two-phase annular flow in a duct with a constant wall temperature. Numerical results have been obtained using this model, taking into account the continuity, momentum, and energy equations. Local heat transfer coefficient results are compared with available experimental data published by Barraza et al. (2016), and they show good agreement.
A Bayesian approach to the analysis of quantal bioassay studies using nonparametric mixture models.
Fronczyk, Kassandra; Kottas, Athanasios
2014-03-01
We develop a Bayesian nonparametric mixture modeling framework for quantal bioassay settings. The approach is built upon modeling dose-dependent response distributions. We adopt a structured nonparametric prior mixture model, which induces a monotonicity restriction for the dose-response curve. Particular emphasis is placed on the key risk assessment goal of calibration for the dose level that corresponds to a specified response. The proposed methodology yields flexible inference for the dose-response relationship as well as for other inferential objectives, as illustrated with two data sets from the literature. © 2013, The International Biometric Society.
Konishi, C.
2014-12-01
A gravel-sand-clay mixture model is proposed, particularly for unconsolidated sediments, to predict permeability and velocity from the volume fractions of the three components (i.e. gravel, sand, and clay). A well-known sand-clay mixture model, or bimodal mixture model, treats the clay content as the volume fraction of the small particle and considers the rest of the volume as that of the large particle. This simple approach has been commonly accepted and has been validated by many previous studies. However, a collection of laboratory measurements of permeability and grain size distribution for unconsolidated samples shows the impact of the presence of another large particle; i.e. only a few percent of gravel particles increases the permeability of the sample significantly. This observation cannot be explained by the bimodal mixture model, and it suggests the necessity of considering the gravel-sand-clay mixture model. In the proposed model, I consider the three volume fractions of each component instead of using only the clay content. Sand becomes either the larger or the smaller particle in the three-component mixture model, whereas it is always the large particle in the bimodal mixture model. The total porosity of the two cases, one in which the sand is the smaller particle and the other in which the sand is the larger particle, can be modeled independently from the sand volume fraction in the same fashion as in the bimodal model. However, the two cases can co-exist in one sample; thus, the total porosity of the mixed sample is calculated as a weighted average of the two cases, weighted by the volume fractions of gravel and clay. The effective porosity is distinguished from the total porosity by assuming that the porosity associated with clay contributes zero effective porosity. In addition, the effective grain size can be computed from the volume fractions and representative grain sizes for each component. Using the effective porosity and the effective grain size, the permeability is predicted by the Kozeny-Carman equation
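The final step described in the abstract above, predicting permeability from an effective porosity and an effective grain size, can be sketched as follows. The abstract does not state which form of the Kozeny-Carman equation is used, so the commonly quoted packed-sphere form k = d² φ³ / (180 (1 − φ)²) is assumed here; the function name and the constant 180 are illustrative assumptions, not taken from the source.

```python
def kozeny_carman_permeability(effective_porosity, effective_grain_size_m, constant=180.0):
    """Permeability in m^2 from a common form of the Kozeny-Carman equation.

    Assumed form (packed spheres): k = d^2 * phi^3 / (C * (1 - phi)^2),
    with C = 180 by default. The source abstract does not specify C.
    """
    phi = effective_porosity
    d = effective_grain_size_m
    if not 0.0 < phi < 1.0:
        raise ValueError("effective porosity must lie strictly between 0 and 1")
    if d <= 0.0:
        raise ValueError("effective grain size must be positive")
    return d ** 2 * phi ** 3 / (constant * (1.0 - phi) ** 2)


# Example: 30 % effective porosity, 0.1 mm effective grain size.
k_example = kozeny_carman_permeability(0.30, 1.0e-4)
```

Because permeability scales with the square of grain size and steeply with porosity in this form, a modest change in either effective quantity shifts the prediction substantially, which is consistent with the sensitivity to gravel content that the abstract reports.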
A general mixture model and its application to coastal sandbar migration simulation
Liang, Lixin; Yu, Xiping
2017-04-01
A mixture model for the general description of sediment-laden flows is developed and then applied to coastal sandbar migration simulation. First, the mixture model is derived based on the Eulerian-Eulerian approach of the complete two-phase flow theory. The basic equations of the model include the mass and momentum conservation equations for the water-sediment mixture and the continuity equation for sediment concentration. The turbulent motion of the mixture is formulated for the fluid and the particles respectively. A modified k-ɛ model is used to describe the fluid turbulence while an algebraic model is adopted for the particles. A general formulation for the relative velocity between the two phases in sediment-laden flows, which is derived by manipulating the momentum equations of the enhanced two-phase flow model, is incorporated into the mixture model. A finite difference method based on the SMAC scheme is utilized for numerical solutions. The model is validated against suspended sediment motion in steady open channel flows, in both equilibrium and non-equilibrium states, and in oscillatory flows as well. The computed sediment concentrations, horizontal velocity and turbulence kinetic energy of the mixture are all shown to be in good agreement with experimental data. The mixture model is then applied to the study of sediment suspension and sandbar migration in surf zones under a vertical 2D framework. The VOF method for the description of the water-air free surface is coupled with a topography reaction model. The bed load transport rate and suspended load entrainment rate are both determined by the sea bed shear stress, which is obtained from the boundary-layer-resolved mixture model. The simulation results indicated that, under small-amplitude regular waves, erosion occurred on the sandbar slope against the wave propagation direction, while deposition dominated on the slope towards wave propagation, indicating an onshore migration tendency. The computation results also show that
Gao, Yongfei; Feng, Jianfeng; Kang, Lili; Xu, Xin; Zhu, Lin
2018-01-01
The joint toxicity of chemical mixtures has emerged as a popular topic, particularly on the additive and potential synergistic actions of environmental mixtures. We investigated the 24h toxicity of Cu-Zn, Cu-Cd, and Cu-Pb and 96h toxicity of Cd-Pb binary mixtures on the survival of zebrafish larvae. Joint toxicity was predicted and compared using the concentration addition (CA) and independent action (IA) models with different assumptions in the toxic action mode in toxicodynamic processes through single and binary metal mixture tests. Results showed that the CA and IA models presented varying predictive abilities for different metal combinations. For the Cu-Cd and Cd-Pb mixtures, the CA model simulated the observed survival rates better than the IA model. By contrast, the IA model simulated the observed survival rates better than the CA model for the Cu-Zn and Cu-Pb mixtures. These findings revealed that the toxic action mode may depend on the combinations and concentrations of tested metal mixtures. Statistical analysis of the antagonistic or synergistic interactions indicated that synergistic interactions were observed for the Cu-Cd and Cu-Pb mixtures, non-interactions were observed for the Cd-Pb mixtures, and slight antagonistic interactions for the Cu-Zn mixtures. These results illustrated that the CA and IA models are consistent in specifying the interaction patterns of binary metal mixtures. Copyright © 2017 Elsevier B.V. All rights reserved.
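The two reference models compared in the abstract above can be sketched in their textbook forms. The abstract does not give the equations the authors used, so this is an assumption: independent action (IA) is written for a survival endpoint as the product of single-substance survival probabilities, and concentration addition (CA) as the harmonic combination of single-substance effect concentrations. Function and parameter names are hypothetical.

```python
def independent_action_survival(single_survivals):
    """IA prediction: mixture survival is the product of the survival
    probabilities observed for each component alone (textbook form)."""
    result = 1.0
    for s in single_survivals:
        if not 0.0 <= s <= 1.0:
            raise ValueError("survival probabilities must lie in [0, 1]")
        result *= s
    return result


def concentration_addition_ecx(fractions, single_ecx):
    """CA prediction of the mixture ECx (textbook form):

        ECx_mix = 1 / sum(p_i / ECx_i)

    where p_i is the fraction of component i in the mixture and ECx_i
    is its effect concentration when tested alone.
    """
    if len(fractions) != len(single_ecx):
        raise ValueError("fractions and single_ecx must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mixture fractions must sum to 1")
    return 1.0 / sum(p / ec for p, ec in zip(fractions, single_ecx))


# Example with made-up numbers (not data from the study):
ia_prediction = independent_action_survival([0.8, 0.5])
ca_prediction = concentration_addition_ecx([0.5, 0.5], [10.0, 20.0])
```

Observed mixture toxicity above the model prediction is usually read as synergism and below it as antagonism, which is how interaction patterns such as those reported for Cu-Cd or Cu-Zn are typically classified against these baselines.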
Modeling of nanoscale liquid mixture transport by density functional hydrodynamics
Dinariev, Oleg Yu.; Evseev, Nikolay V.
2017-06-01
Modeling of multiphase compositional hydrodynamics at nanoscale is performed by means of density functional hydrodynamics (DFH). DFH is the method based on density functional theory and continuum mechanics. This method has been developed by the authors over 20 years and used for modeling in various multiphase hydrodynamic applications. In this paper, DFH was further extended to encompass phenomena inherent in liquids at nanoscale. The new DFH extension is based on the introduction of external potentials for chemical components. These potentials are localized in the vicinity of solid surfaces and take account of the van der Waals forces. A set of numerical examples, including disjoining pressure, film precursors, anomalous rheology, liquid in contact with heterogeneous surface, capillary condensation, and forward and reverse osmosis, is presented to demonstrate modeling capabilities.
A Frank mixture copula family for modeling higher-order correlations of neural spike counts
International Nuclear Information System (INIS)
Onken, Arno; Obermayer, Klaus
2009-01-01
In order to evaluate the importance of higher-order correlations in neural spike count codes, flexible statistical models of dependent multivariate spike counts are required. Copula families, parametric multivariate distributions that represent dependencies, can be applied to construct such models. We introduce the Frank mixture family as a new copula family that has separate parameters for all pairwise and higher-order correlations. In contrast to the Farlie-Gumbel-Morgenstern copula family that shares this property, the Frank mixture copula can model strong correlations. We apply spike count models based on the Frank mixture copula to data generated by a network of leaky integrate-and-fire neurons and compare the goodness of fit to distributions based on the Farlie-Gumbel-Morgenstern family. Finally, we evaluate the importance of using proper single neuron spike count distributions on the Shannon information. We find notable deviations in the entropy that increase with decreasing firing rates. Moreover, we find that the Frank mixture family increases the log likelihood of the fit significantly compared to the Farlie-Gumbel-Morgenstern family. This shows that the Frank mixture copula is a useful tool to assess the importance of higher-order correlations in spike count codes.
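For orientation, the standard bivariate Frank copula, which is a textbook building block and not the Frank mixture family introduced in the paper, can be written down directly. The function name and the parameter values below are illustrative assumptions.

```python
import math


def frank_cdf(u, v, theta):
    """Standard bivariate Frank copula C(u, v); theta = 0 is the independence limit."""
    if abs(theta) < 1e-12:
        return u * v
    # expm1 and log1p keep the evaluation stable for small arguments.
    numerator = math.expm1(-theta * u) * math.expm1(-theta * v)
    denominator = math.expm1(-theta)
    return -math.log1p(numerator / denominator) / theta


# Positive theta yields positive dependence: C(u, v) exceeds u * v.
print(frank_cdf(0.5, 0.5, 5.0), 0.5 * 0.5)
```

A copula's margins are uniform, so `frank_cdf(u, 1.0, theta)` returns `u` for any `theta`; this is a quick sanity check on the formula.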
A nonlinear isobologram model with Box-Cox transformation to both sides for chemical mixtures.
Chen, D G; Pounds, J G
1998-12-01
The linear logistical isobologram is a commonly used and powerful graphical and statistical tool for analyzing the combined effects of simple chemical mixtures. In this paper a nonlinear isobologram model is proposed to analyze the joint action of chemical mixtures for quantitative dose-response relationships. This nonlinear isobologram model incorporates two additional parameters, Ymin and Ymax, to facilitate analysis of response data that are not constrained between 0 and 1, where Ymin and Ymax represent the minimal and the maximal observed toxic response. This nonlinear isobologram model for binary mixtures can be expressed as [formula: see text] In addition, a Box-Cox transformation to both sides is introduced to improve the goodness of fit and to provide a more robust model for achieving homogeneity and normality of the residuals. Finally, a confidence band is proposed for selected isobols, e.g., the median effective dose, to facilitate graphical and statistical analysis of the isobologram. The versatility of this approach is demonstrated using published data describing the toxicity of binary mixtures of citrinin and ochratoxin as well as new experimental data from our laboratory for mixtures of mercury and cadmium.
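The transform-both-sides idea mentioned here applies the same Box-Cox transformation to the observed response and to the model prediction before residuals are formed. The sketch below shows only that step; the isobologram model itself is not reproduced in the abstract, so the predictions are passed in as plain numbers, and the function names are assumptions.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value; lam = 0 is the log limit."""
    if y <= 0:
        raise ValueError("Box-Cox requires a positive value")
    if abs(lam) < 1e-12:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def tbs_residuals(observed, predicted, lam):
    """Residuals after transforming both the data and the model prediction."""
    return [box_cox(o, lam) - box_cox(p, lam) for o, p in zip(observed, predicted)]


# Hypothetical responses and model predictions.
print(tbs_residuals([4.0, 9.0], [3.5, 10.0], 0.5))
```

In practice `lam` is estimated together with the model parameters so that the transformed residuals look homoscedastic and approximately normal.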
NBLDA: negative binomial linear discriminant analysis for RNA-Seq data.
Dong, Kai; Zhao, Hongyu; Tong, Tiejun; Wan, Xiang
2016-09-13
RNA-sequencing (RNA-Seq) has become a powerful technology to characterize gene expression profiles because it is more accurate and comprehensive than microarrays. Although statistical methods that have been developed for microarray data can be applied to RNA-Seq data, they are not ideal due to the discrete nature of RNA-Seq data. The Poisson distribution and negative binomial distribution are commonly used to model count data. Recently, Witten (Annals Appl Stat 5:2493-2518, 2011) proposed a Poisson linear discriminant analysis for RNA-Seq data. The Poisson assumption may not be as appropriate as the negative binomial distribution when biological replicates are available and in the presence of overdispersion (i.e., when the variance is larger than or equal to the mean). However, it is more complicated to model negative binomial variables because they involve a dispersion parameter that needs to be estimated. In this paper, we propose a negative binomial linear discriminant analysis for RNA-Seq data. By Bayes' rule, we construct the classifier by fitting a negative binomial model, and propose some plug-in rules to estimate the unknown parameters in the classifier. The relationship between the negative binomial classifier and the Poisson classifier is explored, with a numerical investigation of the impact of dispersion on the discriminant score. Simulation results show the superiority of our proposed method. We also analyze two real RNA-Seq data sets to demonstrate the advantages of our method in real-world applications. We have developed a new classifier using the negative binomial model for RNA-seq data classification. Our simulation results show that our proposed classifier has a better performance than existing works. The proposed classifier can serve as an effective tool for classifying RNA-seq data. Based on the comparison results, we have provided some guidelines for scientists to decide which method should be used in the discriminant analysis of RNA-Seq data
Modeling diffusion coefficients in binary mixtures of polar and non-polar compounds
DEFF Research Database (Denmark)
Medvedev, Oleg; Shapiro, Alexander
2005-01-01
The theory of transport coefficients in liquids, developed previously, is tested on a description of the diffusion coefficients in binary polar/non-polar mixtures, by applying advanced thermodynamic models. Comparison to a large set of experimental data shows good performance of the model. Only f...
Johnson, David L.; Jansen, Ritsert C.; Arendonk, Johan A.M. van
1999-01-01
A mixture model approach is employed for the mapping of quantitative trait loci (QTL) for the situation where individuals, in an outbred population, are selectively genotyped. Maximum likelihood estimation of model parameters is obtained from an Expectation-Maximization (EM) algorithm facilitated by
Growth Kinetics and Modeling of Direct Oxynitride Growth with NO-O2 Gas Mixtures
Energy Technology Data Exchange (ETDEWEB)
Everist, Sarah; Nelson, Jerry; Sharangpani, Rahul; Smith, Paul Martin; Tay, Sing-Pin; Thakur, Randhir
1999-05-03
We have modeled growth kinetics of oxynitrides grown in NO-O2 gas mixtures from first principles using modified Deal-Grove equations. Retardation of oxygen diffusion through the nitrided dielectric was assumed to be the dominant growth-limiting step. The model was validated against experimentally obtained curves with good agreement. Excellent uniformity, which exceeded expected values, was observed.
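As background, the unmodified Deal-Grove relation x² + A·x = B·(t + τ) has a closed-form solution for the film thickness. The sketch below implements only this standard form; the authors' modifications for nitrided dielectrics are not given in the abstract, and the coefficient values are placeholders.

```python
import math


def oxide_thickness(t, A, B, tau=0.0):
    """Positive root of x**2 + A*x = B*(t + tau), the standard Deal-Grove relation."""
    return 0.5 * A * (math.sqrt(1.0 + 4.0 * B * (t + tau) / A ** 2) - 1.0)


# Placeholder coefficients in arbitrary consistent units.
A, B, tau = 0.5, 0.1, 0.3
for t in (0.5, 2.0, 8.0):
    print(t, oxide_thickness(t, A, B, tau))
```

For short times the solution grows roughly linearly with rate B/A, and for long times it approaches the parabolic law x² ≈ B·t.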
DEFF Research Database (Denmark)
Kogelman, Lisette; Trabzuni, Daniah; Bonder, Marc Jan
effects of the interactions between tissues and probes using BLUP (Best Linear Unbiased Prediction) linear models correcting for gender, which were subsequently used in a finite mixture model to detect DE genes in each tissue. This approach evades the multiple-testing problem and is able to detect...
Smoothed particle hydrodynamics model for phase separating fluid mixtures. I. General equations
Thieulot, C; Janssen, LPBM; Espanol, P
We present a thermodynamically consistent discrete fluid particle model for the simulation of a recently proposed set of hydrodynamic equations for a phase separating van der Waals fluid mixture [P. Espanol and C.A.P. Thieulot, J. Chem. Phys. 118, 9109 (2003)]. The discrete model is formulated by
Duarte, João C.; Schellart, Wouter P.; Cruden, Alexander R.
2014-01-01
Paraffins have been widely used in analogue modelling of geological processes. Petrolatum and paraffin oil are commonly used to lubricate model boundaries and to simulate weak layers. In this paper, we present rheological tests of petrolatum, paraffin oil and several homogeneous mixtures of the two.
Validity of the negative binomial distribution in particle production
International Nuclear Information System (INIS)
Cugnon, J.; Harouna, O.
1987-01-01
Some aspects of the clan picture for particle production in nuclear and in high-energy processes are examined. In particular, it is shown that the requirement of a logarithmic distribution for the number of particles within a clan in order to generate a negative binomial should not be taken strictly. Large departures are allowed without distorting the negative binomial too much. The question of undetected particles is also studied. It is shown that, under reasonable circumstances, the latter do not affect the negative binomial character of the multiplicity distribution.
Maximum likelihood pixel labeling using a spatially variant finite mixture model
International Nuclear Information System (INIS)
Gopal, S.S.; Hebert, T.J.
1996-01-01
We propose a spatially-variant mixture model for pixel labeling. Based on this spatially-variant mixture model we derive an expectation maximization algorithm for maximum likelihood estimation of the pixel labels. While most algorithms using mixture models entail the subsequent use of a Bayes classifier for pixel labeling, the proposed algorithm yields maximum likelihood estimates of the labels themselves and results in unambiguous pixel labels. The proposed algorithm is fast, robust, easy to implement, flexible in that it can be applied to any arbitrary image data where the number of classes is known and, most importantly, obviates the need for an explicit labeling rule. The algorithm is evaluated both quantitatively and qualitatively on simulated data and on clinical magnetic resonance images of the human brain
Modelling of phase equilibria for associating mixtures using an equation of state
International Nuclear Information System (INIS)
Ferreira, Olga; Brignole, Esteban A.; Macedo, Eugenia A.
2004-01-01
In the present work, the group contribution with association equation of state (GCA-EoS) is extended to represent phase equilibria in mixtures containing acids, esters, and ketones, with water, alcohols, and any number of inert components. Association effects are represented by a group-contribution approach. Self- and cross-association between the associating groups present in these mixtures are considered. The GCA-EoS model is compared to the group-contribution method MHV2, which does not explicitly take association effects into account. The results obtained with the GCA-EoS model are, in general, more accurate than those achieved by the MHV2 equation, and they require fewer parameters. Model predictions are presented for binary self- and cross-associating mixtures.
HTCC - a heat transfer model for gas-steam mixtures
International Nuclear Information System (INIS)
Papadimitriou, P.
1983-01-01
The mathematical model HTCC (Heat Transfer Coefficient in Containment) has been developed for RALOC after a loss-of-coolant accident in order to determine the local heat transfer coefficients for transfer between the containment atmosphere and the walls of the reactor building. The model considers the current values of room and wall temperature, the concentration of steam and non-condensible gases, geometry data and those of fluid dynamics together with thermodynamic parameters and from these determines the heat transfer mechanisms due to convection, radiation and condensation. The HTCC is implemented in the RALOC program. Comparative analyses of computed temperature profiles, for HEDL Standard problems A and B on hydrogen distribution, and of computed temperature profiles determined during the heat-up phase in the CSE-A5 experiment show a good agreement with experimental data.
Factoring variations in natural images with deep Gaussian mixture models
van den Oord, Aäron; Schrauwen, Benjamin
2014-01-01
Generative models can be seen as the Swiss army knives of machine learning, as many problems can be written probabilistically in terms of the distribution of the data, including prediction, reconstruction, imputation and simulation. One of the most promising directions for unsupervised learning may lie in Deep Learning methods, given their success in supervised learning. However, one of the current problems with deep unsupervised learning methods is that they often are harder to scale. As ...
Nonlinear Structured Growth Mixture Models in Mplus and OpenMx
Grimm, Kevin J.; Ram, Nilam; Estabrook, Ryne
2014-01-01
Growth mixture models (GMMs; Muthén & Muthén, 2000; Muthén & Shedden, 1999) are a combination of latent curve models (LCMs) and finite mixture models to examine the existence of latent classes that follow distinct developmental patterns. GMMs are often fit with linear, latent basis, multiphase, or polynomial change models because of their common use, flexibility in modeling many types of change patterns, the availability of statistical programs to fit such models, and the ease of programming. In this paper, we present additional ways of modeling nonlinear change patterns with GMMs. Specifically, we show how LCMs that follow specific nonlinear functions can be extended to examine the presence of multiple latent classes using the Mplus and OpenMx computer programs. These models are fit to longitudinal reading data from the Early Childhood Longitudinal Study-Kindergarten Cohort to illustrate their use. PMID:25419006
Dynamic classification of fetal heart rates by hierarchical Dirichlet process mixture models.
Directory of Open Access Journals (Sweden)
Kezi Yu
In this paper, we propose an application of non-parametric Bayesian (NPB) models for classification of fetal heart rate (FHR) recordings. More specifically, we propose models that are used to differentiate between FHR recordings that are from fetuses with or without adverse outcomes. In our work, we rely on models based on hierarchical Dirichlet processes (HDP) and the Chinese restaurant process with finite capacity (CRFC). Two mixture models were inferred from real recordings, one that represents healthy and another, non-healthy fetuses. The models were then used to classify new recordings and provide the probability of the fetus being healthy. First, we compared the classification performance of the HDP models with that of support vector machines on real data and concluded that the HDP models achieved better performance. Then we demonstrated the use of mixture models based on CRFC for dynamic classification of the performance of FHR recordings in a real-time setting.
A Mixture Innovation Heterogeneous Autoregressive Model for Structural Breaks and Long Memory
DEFF Research Database (Denmark)
Nonejad, Nima
We propose a flexible model to describe nonlinearities and long-range dependence in time series dynamics. Our model is an extension of the heterogeneous autoregressive model. Structural breaks occur through mixture distributions in state innovations of linear Gaussian state space models. Monte...... Carlo simulations evaluate the properties of the estimation procedures. Results show that the proposed model is viable and flexible for purposes of forecasting volatility. Model uncertainty is accounted for by employing Bayesian model averaging. Bayesian model averaging provides very competitive...... forecasts compared to any single model specification. It provides further improvements when we average over nonlinear specifications....
Analysis of overdispersed count data by mixtures of Poisson variables and Poisson processes.
Hougaard, P; Lee, M L; Whitmore, G A
1997-12-01
Count data often show overdispersion compared to the Poisson distribution. Overdispersion is typically modeled by a random effect for the mean, based on the gamma distribution, leading to the negative binomial distribution for the count. This paper considers a larger family of mixture distributions, including the inverse Gaussian mixture distribution. It is demonstrated that it gives a significantly better fit for a data set on the frequency of epileptic seizures. The same approach can be used to generate counting processes from Poisson processes, where the rate or the time is random. A random rate corresponds to variation between patients, whereas a random time corresponds to variation within patients.
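The gamma random effect described here has a well-known closed form: a Poisson count whose mean is gamma distributed with shape k and mean μ follows a negative binomial distribution with variance μ + μ²/k. The sketch below checks that variance numerically from the probability mass function; the function names and the parameter values are illustrative assumptions, and the inverse Gaussian mixture studied in the paper is not covered.

```python
import math


def neg_binomial_pmf(y, mean, shape):
    """PMF of the gamma-mixed Poisson (negative binomial) with the given mean and gamma shape."""
    log_p = (
        math.lgamma(y + shape) - math.lgamma(shape) - math.lgamma(y + 1)
        + shape * math.log(shape / (shape + mean))
        + y * math.log(mean / (shape + mean))
    )
    return math.exp(log_p)


def moments(mean, shape, upper=2000):
    """Mean and variance computed by summing the PMF over 0..upper-1."""
    probs = [neg_binomial_pmf(y, mean, shape) for y in range(upper)]
    m = sum(y * p for y, p in enumerate(probs))
    v = sum((y - m) ** 2 * p for y, p in enumerate(probs))
    return m, v


m, v = moments(3.0, 2.0)
print(m, v, 3.0 + 3.0 ** 2 / 2.0)  # the variance exceeds the Poisson value of 3.0
```

As the shape parameter grows, the extra term μ²/k vanishes and the distribution approaches the Poisson.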
Evaluation of thermodynamic properties of fluid mixtures by PC-SAFT model
International Nuclear Information System (INIS)
Almasi, Mohammad
2014-01-01
Experimental and calculated partial molar volumes ($\bar{V}_{m,1}$) of MIK with (♦) 2-PrOH, (♢) 2-BuOH, (●) 2-PenOH at T = 298.15 K. (—) PC-SAFT model. - Highlights: • Densities and viscosities of the mixtures (MIK + 2-alkanols) were measured. • PC-SAFT model was applied to correlate the volumetric properties of binary mixtures. • Agreement between experimental data and calculated values by PC-SAFT model is good. - Abstract: Densities and viscosities of binary mixtures of methyl isobutyl ketone (MIK) with the polar solvents 2-propanol, 2-butanol and 2-pentanol were measured at 7 temperatures (293.15–323.15 K) over the entire range of composition. Using the experimental data, excess molar volumes $V_m^E$, isobaric thermal expansivity $\alpha_p$, partial molar volumes $\bar{V}_{m,i}$ and viscosity deviations $\Delta\eta$ have been calculated due to their importance in the study of specific molecular interactions. The observed negative and positive values of deviation/excess parameters were explained on the basis of the intermolecular interactions occurring in these mixtures. The Perturbed Chain Statistical Associating Fluid Theory (PC-SAFT) has been used to correlate the volumetric behavior of the mixtures.
Evaluation of thermodynamic properties of fluid mixtures by PC-SAFT model
Energy Technology Data Exchange (ETDEWEB)
Almasi, Mohammad, E-mail: m.almasi@khouzestan.srbiau.ac.ir
2014-09-10
Experimental and calculated partial molar volumes ($\bar{V}_{m,1}$) of MIK with (♦) 2-PrOH, (♢) 2-BuOH, (●) 2-PenOH at T = 298.15 K. (—) PC-SAFT model. - Highlights: • Densities and viscosities of the mixtures (MIK + 2-alkanols) were measured. • PC-SAFT model was applied to correlate the volumetric properties of binary mixtures. • Agreement between experimental data and calculated values by PC-SAFT model is good. - Abstract: Densities and viscosities of binary mixtures of methyl isobutyl ketone (MIK) with the polar solvents 2-propanol, 2-butanol and 2-pentanol were measured at 7 temperatures (293.15–323.15 K) over the entire range of composition. Using the experimental data, excess molar volumes $V_m^E$, isobaric thermal expansivity $\alpha_p$, partial molar volumes $\bar{V}_{m,i}$ and viscosity deviations $\Delta\eta$ have been calculated due to their importance in the study of specific molecular interactions. The observed negative and positive values of deviation/excess parameters were explained on the basis of the intermolecular interactions occurring in these mixtures. The Perturbed Chain Statistical Associating Fluid Theory (PC-SAFT) has been used to correlate the volumetric behavior of the mixtures.
Directory of Open Access Journals (Sweden)
Niels Hadrup
Humans are concomitantly exposed to numerous chemicals. An infinite number of combinations and doses thereof can be imagined. For toxicological risk assessment the mathematical prediction of mixture effects, using knowledge on single chemicals, is therefore desirable. We investigated pros and cons of the concentration addition (CA), independent action (IA) and generalized concentration addition (GCA) models. First we measured effects of single chemicals and mixtures thereof on steroid synthesis in H295R cells. Then single chemical data were applied to the models; predictions of mixture effects were calculated and compared to the experimental mixture data. Mixture 1 contained environmental chemicals adjusted in ratio according to human exposure levels. Mixture 2 was a potency adjusted mixture containing five pesticides. Prediction of testosterone effects coincided with the experimental Mixture 1 data. In contrast, antagonism was observed for effects of Mixture 2 on this hormone. The mixtures contained chemicals exerting only limited maximal effects. This hampered prediction by the CA and IA models, whereas the GCA model could be used to predict a full dose response curve. Regarding effects on progesterone and estradiol, some chemicals were having stimulatory effects whereas others had inhibitory effects. The three models were not applicable in this situation and no predictions could be performed. Finally, the expected contributions of single chemicals to the mixture effects were calculated. Prochloraz was the predominant but not sole driver of the mixtures, suggesting that one chemical alone was not responsible for the mixture effects. In conclusion, the GCA model seemed to be superior to the CA and IA models for the prediction of testosterone effects. A situation with chemicals exerting opposing effects, for which the models could not be applied, was identified. In addition, the data indicate that in non-potency adjusted mixtures the effects cannot
Application of Parameter Estimation for Diffusions and Mixture Models
DEFF Research Database (Denmark)
Nolsøe, Kim
The first part of this thesis proposes a method to determine the preferred number of structures, their proportions and the corresponding geometrical shapes of an m-membered ring molecule. This is obtained by formulating a statistical model for the data and constructing an algorithm which samples...... with the posterior score function. From an application point of view this methodology is easy to apply, since the optimal estimating function $G(\theta; X_{t_1}, \ldots, X_{t_n})$ is equal to the classical optimal estimating function, plus a correction term which takes into account the prior information. The methodology is particularly...
Szulczyński, Bartosz; Gębicki, Jacek; Namieśnik, Jacek
2018-01-01
The paper presents the possibility of applying fuzzy logic to determine the odour intensity of model ternary gas mixtures (α-pinene, toluene and triethylamine) using an electronic nose prototype. The results obtained using fuzzy logic algorithms were compared with the values obtained using a multiple linear regression (MLR) model and sensory analysis. The studies show that the electronic nose prototype, together with the fuzzy logic pattern recognition system, can be successfully used to estimate the odour intensity of the tested gas mixtures. The correctness of the results obtained using fuzzy logic was equal to 68%.
A Mixture Model of Consumers' Intended Purchase Decisions for Genetically Modified Foods
Kristine M. Grimsrud; Robert P. Berrens; Ron C. Mittelhammer
2006-01-01
A finite probability mixture model is used to analyze the existence of multiple market segments for a pre-market good. The approach has at least two principal benefits. First, the model is capable of identifying likely market segments and their differentiating characteristics. Second, the model can be used to estimate the discount different consumer groups require to purchase the good. The model is illustrated using stated preference survey data collected on consumer responses to the potentia...
International Nuclear Information System (INIS)
Joshi, A.; Lawande, S.V.
1990-01-01
A systematic study of squeezing obtained from a k-photon anharmonic oscillator (with interaction Hamiltonian of the form $(a^\dagger)^k$, $k \ge 2$) interacting with light whose statistics can be varied from sub-Poissonian to Poissonian via the binomial state of the field and from super-Poissonian to Poissonian via the negative binomial state of the field is presented. The authors predict that for all values of k squeezing tends to increase with increased sub-Poissonian character of the field, while the reverse is true for a super-Poissonian field. They also present non-classical behavior of the first order coherence function explicitly for the k = 2 case (i.e., for the two-photon anharmonic oscillator model used for a Kerr-like medium) with variation in the statistics of the input light.
Analysis of real-time mixture cytotoxicity data following repeated exposure using BK/TD models.
Teng, S; Tebby, C; Barcellini-Couget, S; De Sousa, G; Brochot, C; Rahmani, R; Pery, A R R
2016-08-15
Cosmetic products generally consist of multiple ingredients. Thus, cosmetic risk assessment has to deal with mixture toxicity on a long-term scale, which means it has to be assessed in the context of repeated exposure. Given that animal testing has been banned for cosmetics risk assessment, in vitro assays allowing long-term repeated exposure and adapted for in vitro - in vivo extrapolation need to be developed. However, most in vitro tests only assess short-term effects and consider static endpoints, which hinders extrapolation to realistic human exposure scenarios where concentration in target organs varies over time. Thanks to impedance metrics, real-time cell viability monitoring for repeated exposure has become possible. We recently constructed biokinetic/toxicodynamic models (BK/TD) to analyze such data (Teng et al., 2015) for three hepatotoxic cosmetic ingredients: coumarin, isoeugenol and benzophenone-2. In the present study, we aim to apply these models to analyze the dynamics of mixture impedance data using the concepts of concentration addition and independent action. Metabolic interactions between the mixture components were investigated, characterized and implemented in the models, as they impacted the actual cellular exposure. Indeed, cellular metabolism following mixture exposure induced a quick disappearance of the compounds from the exposure system. We showed that isoeugenol substantially decreased the metabolism of benzophenone-2, reducing the disappearance of this compound and enhancing its in vitro toxicity. Apart from this metabolic interaction, no mixtures showed any interaction, and all binary mixtures were successfully modeled by at least one model based on exposure to the individual compounds. Copyright © 2016 Elsevier Inc. All rights reserved.
Pythagoras, Binomial, and de Moivre Revisited Through Differential Equations
Singh, Jitender; Bajaj, Renu
2018-01-01
The classical Pythagoras theorem, binomial theorem, de Moivre's formula, and numerous other deductions are made using the uniqueness theorem for the initial value problems in linear ordinary differential equations.
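One way such an argument can run is sketched below for de Moivre's formula. This is a reconstruction under the stated uniqueness theorem and is not necessarily the authors' own derivation; the symbols $z$, $w_1$ and $w_2$ are introduced here for illustration.

```latex
\begin{align*}
  &\text{Let } z(\theta) = \cos\theta + i\sin\theta,
    \text{ so that } z'(\theta) = i\,z(\theta), \quad z(0) = 1. \\
  &\text{For an integer } n, \text{ set } w_1(\theta) = z(\theta)^n
    \text{ and } w_2(\theta) = \cos n\theta + i\sin n\theta. \\
  &w_1' = n\,z^{n-1} z' = i n\,w_1, \qquad
    w_2' = -n\sin n\theta + i n\cos n\theta = i n\,w_2, \qquad
    w_1(0) = w_2(0) = 1. \\
  &\text{Both solve } w' = i n\,w,\ w(0) = 1, \text{ so uniqueness gives }
    (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta.
\end{align*}
```

The trigonometric form of the Pythagoras theorem follows the same pattern: $f(\theta) = \cos^2\theta + \sin^2\theta$ satisfies $f' = 0$ with $f(0) = 1$, so $f \equiv 1$.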
Ion swarm data for electrical discharge modeling in air and flue gas mixtures
International Nuclear Information System (INIS)
Nelson, D.; Benhenni, M.; Eichwald, O.; Yousfi, M.
2003-01-01
The first step of this work is the determination of the elastic and inelastic ion-molecule collision cross sections for the main ions (N2+, O2+, CO2+, H2O+ and O−) usually present in either air or flue gas discharges. The obtained cross section sets, given for ion kinetic energies not exceeding 100 eV, correspond to the interactions of each ion with its parent molecule (symmetric case) or nonparent molecule (asymmetric case). Then by using these different cross section sets, it is possible to obtain the ion swarm data for the different gas mixtures involving N2, CO2, H2O and O2 molecules whatever their relative proportions. These ion swarm data are obtained from an optimized Monte Carlo method well adapted for the ion transport in gas mixtures. This also allows us to clearly show that the classical linear approximations usually applied for the ion swarm data in mixtures, such as Blanc's law, are far from valid. Then, the ion swarm data are given in three cases of gas mixtures: a dry air (80% N2, 20% O2), a ternary gas mixture (82% N2, 12% CO2, 6% O2) and a typical flue gas (76% N2, 12% CO2, 6% O2, 6% H2O). From these reliable ion swarm data, electrical discharge modeling for a wire to plane electrode configuration has been carried out in these three mixtures at atmospheric pressure for different applied voltages. Under the same discharge conditions, large discrepancies in the streamer formation and propagation have been observed in these three mixture cases. They are due to the deviations existing not only between the different effective electron-molecule ionization rates but also between the ion transport properties, mainly because of the presence of a highly polar molecule such as H2O. This emphasizes the necessity to properly consider the ion transport in the discharge modeling.
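For reference, the linear approximation that this abstract argues against is simple to state: Blanc's law combines the mobilities of an ion in each pure gas as a mole-fraction-weighted harmonic mean. The sketch below computes only that approximation; the mobility values are hypothetical placeholders, and the Monte Carlo treatment used in the paper is not reproduced.

```python
def blanc_mobility(fractions, mobilities):
    """Blanc's law: 1/K_mix = sum(x_i / K_i) over the components of the gas mixture."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    return 1.0 / sum(x / k for x, k in zip(fractions, mobilities))


# Hypothetical mobilities of one ion species in two pure gases (arbitrary units).
print(blanc_mobility([0.8, 0.2], [2.0, 4.0]))
```

Comparing such a value with mobilities computed for the actual mixture is one way to quantify how far the linear approximation deviates.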
Application of binomial-edited CPMG to shale characterization.
Washburn, Kathryn E; Birdwell, Justin E
2014-09-01
Unconventional shale resources may contain a significant amount of hydrogen in organic solids such as kerogen, but it is not possible to directly detect these solids with many NMR systems. Binomial-edited pulse sequences capitalize on magnetization transfer between solids, semi-solids, and liquids to provide an indirect method of detecting solid organic materials in shales. When the organic solids can be directly measured, binomial-editing helps distinguish between different phases. We applied a binomial-edited CPMG pulse sequence to a range of natural and experimentally-altered shale samples. The most substantial signal loss is seen in shales rich in organic solids while fluids associated with inorganic pores seem essentially unaffected. This suggests that binomial-editing is a potential method for determining fluid locations, solid organic content, and kerogen-bitumen discrimination. Copyright © 2014 Elsevier Inc. All rights reserved.
Multifractal structure of multiplicity distributions and negative binomials
International Nuclear Information System (INIS)
Malik, S.; Delhi, Univ.
1997-01-01
The paper presents experimental results of the multifractal structure analysis in proton-emulsion interactions at 800 GeV. The multiplicity moments have a power law dependence on the mean multiplicity in varying bin sizes of pseudorapidity. The values of the generalised dimensions are calculated from the slope value. The multifractal characteristics are also examined in the light of negative binomials. The observed multiplicity moments and those derived from the negative-binomial fits agree well with each other. Also, the values of $D_q$, both observed and derived from the negative-binomial fits, not only decrease with q, typifying multifractality, but also agree well with each other, showing consistency with the negative-binomial form.
Entanglement of Generalized Two-Mode Binomial States and Teleportation
International Nuclear Information System (INIS)
Wang Dongmei; Yu Youhong
2009-01-01
The entanglement of the generalized two-mode binomial states in the phase damping channel is studied by making use of the relative entropy of entanglement. It is shown that the factors q and p play crucial roles in controlling the relative entropy of entanglement. Furthermore, we propose a scheme of teleporting an unknown state via the generalized two-mode binomial states, and calculate the mean fidelity of the scheme.
Negative binomial properties and clan structure in multiplicity distributions
International Nuclear Information System (INIS)
Giovannini, A.; Van Hove, L.
1988-01-01
We review the negative binomial properties measured recently for many multiplicity distributions of high energy hadronic, semi-leptonic reactions in selected rapidity intervals. We analyse them in terms of the "clan" structure which can be defined for any negative binomial distribution. By comparing reactions we exhibit a number of regularities for the average number $\bar{N}$ of clans and the average charged multiplicity $\bar{n}_c$ per clan. 22 refs., 6 figs.
On some binomial [Formula: see text]-difference sequence spaces.
Meng, Jian; Song, Meimei
2017-01-01
In this paper, we introduce the binomial sequence spaces [Formula: see text], [Formula: see text] and [Formula: see text] by combining the binomial transformation and difference operator. We prove the BK-property and some inclusion relations. Furthermore, we obtain Schauder bases and compute the α-, β- and γ-duals of these sequence spaces. Finally, we characterize matrix transformations on the sequence space [Formula: see text].
Dynamic prediction of cumulative incidence functions by direct binomial regression.
Grand, Mia K; de Witte, Theo J M; Putter, Hein
2018-03-25
In recent years there have been a series of advances in the field of dynamic prediction. Among those is the development of methods for dynamic prediction of the cumulative incidence function in a competing risk setting. These models enable the predictions to be updated as time progresses and more information becomes available, for example when a patient comes back for a follow-up visit after completing a year of treatment, the risk of death, and adverse events may have changed since treatment initiation. One approach to model the cumulative incidence function in competing risks is by direct binomial regression, where right censoring of the event times is handled by inverse probability of censoring weights. We extend the approach by combining it with landmarking to enable dynamic prediction of the cumulative incidence function. The proposed models are very flexible, as they allow the covariates to have complex time-varying effects, and we illustrate how to investigate possible time-varying structures using Wald tests. The models are fitted using generalized estimating equations. The method is applied to bone marrow transplant data and the performance is investigated in a simulation study. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Okada, Kensuke; Vandekerckhove, Joachim; Lee, Michael D
2018-02-01
People often interact with environments that can provide only a finite number of items as resources. Eventually a book contains no more chapters, there are no more albums available from a band, and every Pokémon has been caught. When interacting with these sorts of environments, people either actively choose to quit collecting new items, or they are forced to quit when the items are exhausted. Modeling the distribution of how many items people collect before they quit involves untangling these two possibilities. We propose that censored geometric models are a useful basic technique for modeling the quitting distribution, and show how, by implementing these models in a hierarchical and latent-mixture framework through Bayesian methods, they can be extended to capture the additional features of specific situations. We demonstrate this approach by developing and testing a series of models in two case studies involving real-world data. One case study deals with people choosing jokes from a recommender system, and the other deals with people completing items in a personality survey.
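A minimal sketch of a censored geometric likelihood may help make the "quit voluntarily versus forced to quit" distinction concrete. It assumes one particular parameterisation that the abstract does not spell out: after each item a person quits with probability `theta`, and anyone who reaches the last available item (`max_items`) is forced to stop there. All names are hypothetical, and the hierarchical and latent-mixture extensions are not shown.

```python
import math


def censored_geometric_pmf(x, theta, max_items):
    """P(X = x) for the number of items collected, 1 <= x <= max_items.

    Assumed model: quit after each item with probability `theta`;
    reaching item `max_items` forces a stop (right censoring).
    """
    if not 1 <= x <= max_items:
        return 0.0
    if x < max_items:
        # Voluntary quit after exactly x items.
        return (1.0 - theta) ** (x - 1) * theta
    # Forced stop: survived the first max_items - 1 quitting decisions.
    return (1.0 - theta) ** (max_items - 1)


def censored_geometric_loglik(counts, theta, max_items):
    """Log-likelihood of observed item counts under the assumed model."""
    return sum(math.log(censored_geometric_pmf(x, theta, max_items))
               for x in counts)
```

The probability mass that an uncensored geometric distribution would place beyond `max_items` is collected at `max_items`, so the probabilities still sum to one.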
Energy Technology Data Exchange (ETDEWEB)
Karanam, Aditya; Sharma, Pavan K.; Ganju, Sunil; Singh, Ram Kumar [Bhabha Atomic Research Centre (BARC), Mumbai (India). Reactor Safety Div.
2016-12-15
During postulated accident sequences in nuclear reactors, hydrogen may get released from the core and form a flammable mixture in the surrounding containment structure. Ignition of such mixtures and the subsequent pressure rise are an imminent threat for safe and sustainable operation of nuclear reactors. Methods for evaluating post ignition characteristics are important for determining the design safety margins in such scenarios. This study presents two thermo-chemical models for determining the post ignition state. The first model is based on internal energy balance while the second model uses the concept of element potentials to minimize the free energy of the system with internal energy imposed as a constraint. Predictions from both the models have been compared against published data over a wide range of mixture compositions. Important differences in the regions close to flammability limits and for stoichiometric mixtures have been identified and explained. The equilibrium model has been validated for varied temperatures and pressures representative of initial conditions that may be present in the containment during accidents. Special emphasis has been given to the understanding of the role of dissociation and its effect on equilibrium pressure, temperature and species concentrations.
Semiparametric accelerated failure time cure rate mixture models with competing risks.
Choi, Sangbum; Zhu, Liang; Huang, Xuelin
2018-01-15
Modern medical treatments have substantially improved survival rates for many chronic diseases and have generated considerable interest in developing cure fraction models for survival data with a non-ignorable cured proportion. Statistical analysis of such data may be further complicated by competing risks that involve multiple types of endpoints. Regression analysis of competing risks is typically undertaken via a proportional hazards model adapted on cause-specific hazard or subdistribution hazard. In this article, we propose an alternative approach that treats competing events as distinct outcomes in a mixture. We consider semiparametric accelerated failure time models for the cause-conditional survival function that are combined through a multinomial logistic model within the cure-mixture modeling framework. The cure-mixture approach to competing risks provides a means to determine the overall effect of a treatment and insights into how this treatment modifies the components of the mixture in the presence of a cure fraction. The regression and nonparametric parameters are estimated by a nonparametric kernel-based maximum likelihood estimation method. Variance estimation is achieved through resampling methods for the kernel-smoothed likelihood function. Simulation studies show that the procedures work well in practical settings. Application to a sarcoma study demonstrates the use of the proposed method for competing risk data with a cure fraction. Copyright © 2017 John Wiley & Sons, Ltd.
Son, Heesook; Friedmann, Erika; Thomas, Sue A
2012-01-01
Longitudinal studies are used in nursing research to examine changes over time in health indicators. Traditional approaches to longitudinal analysis of means, such as analysis of variance with repeated measures, are limited to analyzing complete cases. This limitation can lead to biased results due to withdrawal or data omission bias or to imputation of missing data, which can lead to bias toward the null if data are not missing completely at random. Pattern mixture models are useful to evaluate the informativeness of missing data and to adjust linear mixed model (LMM) analyses if missing data are informative. The aim of this study was to provide an example of statistical procedures for applying a pattern mixture model to evaluate the informativeness of missing data and conduct analyses of data with informative missingness in longitudinal studies using SPSS. The data set from the Patients' and Families' Psychological Response to Home Automated External Defibrillator Trial was used as an example to examine informativeness of missing data with pattern mixture models and to use a missing data pattern in analysis of longitudinal data. Prevention of withdrawal bias, omitted data bias, and bias toward the null in longitudinal LMMs requires the assessment of the informativeness of the occurrence of missing data. Missing data patterns can be incorporated as fixed effects into LMMs to evaluate the contribution of the presence of informative missingness to and control for the effects of missingness on outcomes. Pattern mixture models are a useful method to address the presence and effect of informative missingness in longitudinal studies.
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Kontogeorgis, Georgios M.
2015-01-01
of CPA for ternary and multicomponent CO2 mixtures containing alcohols (methanol, ethanol or propanol) water and hydrocarbons. This work belongs to a series of studies aiming to arrive in a single "engineering approach" for applying CPA to acid gas mixtures, without introducing significant changes...... to the model. In this direction, CPA results were obtained using various approaches, i.e. different association schemes for pure CO2 (assuming that it is a non-associating compound, or that it is a self-associating fluid with two, three or four association sites) and different possibilities for modelling...... mixtures of CO2 with water and alcohols (only use of one interaction parameter kij or assuming cross-association interactions and obtaining the relevant parameters either via a combining rule or using an experimental value for the cross-association energy). It is concluded that CPA is a powerful model...
Orlov Alexey; Ushakov Anton; Sovach Victor
2016-01-01
This article presents the results of developing a mathematical model of nonstationary separation processes occurring in gas centrifuge cascades for the separation of multicomponent isotope mixtures. The model was used to calculate the parameters of a gas centrifuge cascade for the separation of germanium isotopes. Comparison of the obtained values with the results of other authors showed that the developed mathematical model adequately describes nonstationary separation processes in gas centrifuge casca...
Orlov, Aleksey Alekseevich; Ushakov, Anton; Sovach, Victor
2017-01-01
The article presents the results of developing a mathematical model of nonstationary hydraulic processes in a gas centrifuge cascade for the separation of multicomponent isotope mixtures. The model was used to calculate the parameters of a gas centrifuge cascade for the separation of silicon isotopes. Comparison of the obtained values with the results of other authors showed that the developed mathematical model adequately describes nonstationary hydraulic processes in gas centrifuge cascades for separation...
DEFF Research Database (Denmark)
Tsivintzelis, Ioannis; Ali, Shahid; Kontogeorgis, Georgios
2015-01-01
The thermodynamic properties of pure gaseous, liquid or supercritical CO2 and CO2 mixtures with hydrocarbons and other compounds such as water, alcohols, and glycols are very important in many processes in the oil and gas industry. Design of such processes requires use of accurate thermodynamic...... models, capable of predicting the complex phase behavior of multicomponent mixtures as well as their volumetric properties. In this direction, over the last several years, the cubic-plus-association (CPA) thermodynamic model has been successfully used for describing volumetric properties and phase...
Shiyko, Mariya P.; Li, Yuelin; Rindskopf, David
2012-01-01
Intensive longitudinal data (ILD) have become increasingly common in the social and behavioral sciences; count variables, such as the number of daily smoked cigarettes, are frequently used outcomes in many ILD studies. We demonstrate a generalized extension of growth mixture modeling (GMM) to Poisson-distributed ILD for identifying qualitatively…
Soot modeling of counterflow diffusion flames of ethylene-based binary mixture fuels
Wang, Yu; Raj, Abhijeet Dhayal; Chung, Suk-Ho
2015-01-01
of ethylene and its binary mixtures with methane, ethane and propane based on the method of moments. The soot model has 36 soot nucleation reactions from 8 PAH molecules including pyrene and larger PAHs. Soot surface growth reactions were based on a modified
Densities of Pure Ionic Liquids and Mixtures: Modeling and Data Analysis
DEFF Research Database (Denmark)
Abildskov, Jens; O’Connell, John P.
2015-01-01
Our two-parameter corresponding states model for liquid densities and compressibilities has been extended to more pure ionic liquids and to their mixtures with one or two solvents. A total of 19 new group contributions (5 new cations and 14 new anions) have been obtained for predicting pressure...
Estimating Lion Abundance using N-mixture Models for Social Species.
Belant, Jerrold L; Bled, Florent; Wilton, Clay M; Fyumagwa, Robert; Mwampeta, Stanslaus B; Beyer, Dean E
2016-10-27
Declining populations of large carnivores worldwide, and the complexities of managing human-carnivore conflicts, require accurate population estimates of large carnivores to promote their long-term persistence through well-informed management. We used N-mixture models to estimate lion (Panthera leo) abundance from call-in and track surveys in southeastern Serengeti National Park, Tanzania. Because of potential habituation to broadcasted calls and social behavior, we developed a hierarchical observation process within the N-mixture model conditioning lion detectability on their group response to call-ins and individual detection probabilities. We estimated 270 lions (95% credible interval = 170-551) using call-ins but were unable to estimate lion abundance from track data. We found a weak negative relationship between predicted track density and predicted lion abundance from the call-in surveys. Luminosity was negatively correlated with individual detection probability during call-in surveys. Lion abundance and track density were influenced by landcover, but the direction of the corresponding effects was undetermined. N-mixture models allowed us to incorporate multiple parameters (e.g., landcover, luminosity, observer effect) influencing lion abundance and probability of detection directly into abundance estimates. We suggest that N-mixture models employing a hierarchical observation process can be used to estimate abundance of other social, herding, and grouping species.
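For readers unfamiliar with N-mixture models, the sketch below shows the likelihood of the basic binomial N-mixture model: latent abundance at each site is Poisson with mean `lam`, and each repeated count is binomial given that abundance with detection probability `p`. This is only the textbook starting point, not the hierarchical group-response observation process developed in the study; the function names and the truncation bound `n_max` are assumptions made for illustration.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of n, computed on the log scale for stability."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binomial_pmf(y, n, p):
    """Binomial probability of y detections out of n individuals."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site.

    Sums over the unknown abundance N from max(counts) up to `n_max`,
    a truncation bound that must be large relative to `lam`.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        detection = 1.0
        for y in counts:
            detection *= binomial_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detection
    return total


def n_mixture_loglik(sites, lam, p, n_max=200):
    """Log-likelihood over sites; `sites` is a list of per-site count lists."""
    return sum(math.log(site_likelihood(c, lam, p, n_max)) for c in sites)
```

With a single visit per site the marginal count distribution reduces to a Poisson with mean `lam * p`, which is why repeated visits are needed to separate abundance from detection.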
The Support Reduction Algorithm for Computing Non-Parametric Function Estimates in Mixture Models
GROENEBOOM, PIET; JONGBLOED, GEURT; WELLNER, JON A.
2008-01-01
In this paper, we study an algorithm (which we call the support reduction algorithm) that can be used to compute non-parametric M-estimators in mixture models. The algorithm is compared with natural competitors in the context of convex regression and the ‘Aspect problem’ in quantum physics.
Modelling and simulation of an energy transport phenomenon in a solid-fluid mixture
International Nuclear Information System (INIS)
Costa, M.L.M.; Sampaio, R.; Gama, R.M.S. da.
1989-08-01
In the present work a model for a local description of the energy transfer phenomenon in a binary (solid-fluid) saturated mixture is proposed. The heat transfer in a saturated flow (through a porous medium) between two parallel plates is simulated by using the Finite Volumes Method. (author) [pt
Zhang, Danhui; Orrill, Chandra; Campbell, Todd
2015-01-01
The purpose of this study was to investigate whether mixture Rasch models followed by qualitative item-by-item analysis of selected Programme for International Student Assessment (PISA) mathematics and science items offered insight into knowledge students invoke in mathematics and science separately and combined. The researchers administered an…
Market segment derivation and profiling via a finite mixture model framework
Wedel, M; Desarbo, WS
The marketing literature has shown how difficult it is to profile market segments derived with finite mixture models, especially using traditional descriptor variables (e.g., demographics). Such profiling is critical for the proper implementation of segmentation strategy. We propose a new finite
Finite mixture models for sub-pixel coastal land cover classification
CSIR Research Space (South Africa)
Ritchie, Michaela C
2017-05-01
Full Text Available Finite Mixture Models for Sub-pixel Coastal Land Cover Classification. M. Ritchie, Dr. M. Lück-Vogel, Dr. P. Debba, Dr. V. Goodall. ISRSE-37, Tshwane, South Africa, 10 May 2017. Study area: False Bay, South Africa (Strand, Gordon's Bay); WorldView-2 image... Land cover classes: .../Urban (1, 10, 10); Herbaceous Vegetation (1, 5, 5); Shadow (1, 8, 8); Sparse Vegetation (1, 3, 3); Water (1, 10, 10); Woody Vegetation (1, 5, 5). Methods: Maximum Likelihood Classification (MLC); Gaussian Mixture Discriminant Analysis (GMDA); t-distribution Mixture Discriminant...
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies
DEFF Research Database (Denmark)
Thompson, Wesley K.; Wang, Yunpeng; Schork, Andrew J.
2015-01-01
-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via...... analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn’s disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While...... minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local...
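The two-component scale mixture described in this abstract can be illustrated with a short calculation of the posterior probability that a test statistic is non-null. The sketch assumes the mixture parameters are already known; it does not reproduce the fitting procedure of the paper, and the parameter names (`pi0`, `sd_null`, `sd_non_null`) are hypothetical.

```python
import math


def normal_pdf(z, sd):
    """Density at z of a zero-mean normal with standard deviation `sd`."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_non_null(z, pi0, sd_null, sd_non_null):
    """P(non-null | z) under a scale mixture of two zero-mean normals.

    pi0          assumed proportion of null associations
    sd_null      standard deviation of null test statistics
    sd_non_null  larger standard deviation of non-null test statistics
    """
    null_part = pi0 * normal_pdf(z, sd_null)
    non_null_part = (1.0 - pi0) * normal_pdf(z, sd_non_null)
    return non_null_part / (null_part + non_null_part)
```

Because both components are centred at zero, the posterior depends only on the magnitude of the statistic and rises as |z| grows, which is the mechanism by which large statistics are attributed to the wider component.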
Modelling of associating mixtures for applications in the oil & gas and chemical industries
DEFF Research Database (Denmark)
Kontogeorgis, Georgios; Folas, Georgios; Muro Sunè, Nuria
2007-01-01
Thermodynamic properties and phase equilibria of associating mixtures cannot often be satisfactorily modelled using conventional models such as cubic equations of state. CPA (cubic-plus-association) is an equation of state (EoS), which combines the SRK EoS with the association term of SAFT. For non......-alcohol (glycol)-alkanes and certain acid and amine-containing mixtures. Recent results include glycol-aromatic hydrocarbons including multiphase, multicomponent equilibria and gas hydrate calculations in combination with the van der Waals-Platteeuw model. This article will outline some new applications...... thermodynamic models especially those combining cubic EoS with local composition activity coefficient models are included. (C) 2007 Elsevier B.V. All rights reserved....
Validation of a mixture-averaged thermal diffusion model for premixed lean hydrogen flames
Schlup, Jason; Blanquart, Guillaume
2018-03-01
The mixture-averaged thermal diffusion model originally proposed by Chapman and Cowling is validated using multiple flame configurations. Simulations using detailed hydrogen chemistry are done on one-, two-, and three-dimensional flames. The analysis spans flat and stretched, steady and unsteady, and laminar and turbulent flames. Quantitative and qualitative results using the thermal diffusion model compare very well with the more complex multicomponent diffusion model. Comparisons are made using flame speeds, surface areas, species profiles, and chemical source terms. Once validated, this model is applied to three-dimensional laminar and turbulent flames. For these cases, thermal diffusion causes an increase in the propagation speed of the flames as well as increased product chemical source terms in regions of high positive curvature. The results illustrate the necessity for including thermal diffusion, and the accuracy and computational efficiency of the mixture-averaged thermal diffusion model.
Dynamic mean field theory for lattice gas models of fluid mixtures confined in mesoporous materials.
Edison, J R; Monson, P A
2013-11-12
We present the extension of dynamic mean field theory (DMFT) for fluids in porous materials (Monson, P. A. J. Chem. Phys. 2008, 128, 084701) to the case of mixtures. The theory can be used to describe the relaxation processes in the approach to equilibrium or metastable equilibrium states for fluids in pores after a change in the bulk pressure or composition. It is especially useful for studying systems where there are capillary condensation or evaporation transitions. Nucleation processes associated with these transitions are emergent features of the theory and can be visualized via the time dependence of the density distribution and composition distribution in the system. For mixtures an important component of the dynamics is relaxation of the composition distribution in the system, especially in the neighborhood of vapor-liquid interfaces. We consider two different types of mixtures, modeling hydrocarbon adsorption in carbon-like slit pores. We first present results on bulk phase equilibria of the mixtures and then the equilibrium (stable/metastable) behavior of these mixtures in a finite slit pore and an inkbottle pore. We then use DMFT to describe the evolution of the density and composition in the pore in the approach to equilibrium after changing the state of the bulk fluid via composition or pressure changes.
Measurement and modelling of hydrogen bonding in 1-alkanol plus n-alkane binary mixtures
DEFF Research Database (Denmark)
von Solms, Nicolas; Jensen, Lars; Kofod, Jonas L.
2007-01-01
Two equations of state (simplified PC-SAFT and CPA) are used to predict the monomer fraction of 1-alkanols in binary mixtures with n-alkanes. It is found that the choice of parameters and association schemes significantly affects the ability of a model to predict hydrogen bonding in mixtures, eve...... studies, which is clarified in the present work. New hydrogen bonding data based on infrared spectroscopy are reported for seven binary mixtures of alcohols and alkanes. (C) 2007 Elsevier B.V. All rights reserved....... though pure-component liquid densities and vapour pressures are predicted equally accurately for the associating compound. As was the case in the study of pure components, there exists some confusion in the literature about the correct interpretation and comparison of experimental data and theoretical...
Maximum likelihood estimation of semiparametric mixture component models for competing risks data.
Choi, Sangbum; Huang, Xuelin
2014-09-01
In the analysis of competing risks data, the cumulative incidence function is a useful quantity to characterize the crude risk of failure from a specific event type. In this article, we consider an efficient semiparametric analysis of mixture component models on cumulative incidence functions. Under the proposed mixture model, latency survival regressions given the event type are performed through a class of semiparametric models that encompasses the proportional hazards model and the proportional odds model, allowing for time-dependent covariates. The marginal proportions of the occurrences of cause-specific events are assessed by a multinomial logistic model. Our mixture modeling approach is advantageous in that it makes a joint estimation of model parameters associated with all competing risks under consideration, satisfying the constraint that the cumulative probability of failing from any cause adds up to one given any covariates. We develop a novel maximum likelihood scheme based on semiparametric regression analysis that facilitates efficient and reliable estimation. Statistical inferences can be conveniently made from the inverse of the observed information matrix. We establish the consistency and asymptotic normality of the proposed estimators. We validate small sample properties with simulations and demonstrate the methodology with a data set from a study of follicular lymphoma. © 2014, The International Biometric Society.
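The mixture decomposition behind this abstract can be written out as a worked equation. The notation below ($T$ for the event time, $D$ for the event type, $\gamma_k$ for the multinomial logistic coefficients, $K$ for the number of causes) is assumed for illustration and is not taken from the article.

```latex
% Cumulative incidence of cause k as "marginal proportion x latency distribution"
F_k(t \mid x) = P(T \le t,\, D = k \mid x)
             = \pi_k(x)\, P(T \le t \mid D = k,\, x),
\qquad
\pi_k(x) = \frac{\exp(\gamma_k^{\top} x)}{\sum_{l=1}^{K} \exp(\gamma_l^{\top} x)} .

% Because the multinomial logistic proportions sum to one, so do the limits:
\sum_{k=1}^{K} \lim_{t \to \infty} F_k(t \mid x) = \sum_{k=1}^{K} \pi_k(x) = 1 .
```

The second line is the constraint the abstract refers to: the probability of eventually failing from any cause adds up to one for every covariate value.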
Finite mixture models for the computation of isotope ratios in mixed isotopic samples
Koffler, Daniel; Laaha, Gregor; Leisch, Friedrich; Kappel, Stefanie; Prohaska, Thomas
2013-04-01
Finite mixture models have been used for more than 100 years, but have seen a real boost in popularity over the last two decades due to the tremendous increase in available computing power. The areas of application of mixture models range from biology and medicine to physics, economics and marketing. These models can be applied to data where observations originate from various groups and where group affiliations are not known, as is the case for multiple isotope ratios present in mixed isotopic samples. Recently, the potential of finite mixture models for the computation of 235U/238U isotope ratios from transient signals measured in individual (sub-)µm-sized particles by laser ablation - multi-collector - inductively coupled plasma mass spectrometry (LA-MC-ICPMS) was demonstrated by Kappel et al. [1]. The particles, which were deposited on the same substrate, were certified with respect to their isotopic compositions. Here, we focus on the statistical model and its application to isotope data in ecogeochemistry. Commonly applied evaluation approaches for mixed isotopic samples are time-consuming and are dependent on the judgement of the analyst. Thus, isotopic compositions may be overlooked due to the presence of more dominant constituents. Evaluation using finite mixture models can be accomplished unsupervised and automatically. The models try to fit several linear models (regression lines) to subgroups of data, taking the respective slope as the estimate of the isotope ratio. The finite mixture models are parameterised by: (i) the number of different ratios; (ii) the number of points belonging to each ratio group; (iii) the ratios (i.e. slopes) of each group. Fitting of the parameters is done by maximising the log-likelihood function using an iterative expectation-maximisation (EM) algorithm. In each iteration step, groups of size smaller than a control parameter are dropped; thereby the number of different ratios is determined. The analyst only influences some control
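The EM fitting of several regression lines described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: each group is a line through the origin with Gaussian noise, the number of groups is fixed by the length of `init_slopes`, and the step that drops small groups is omitted. The function name and all parameters are hypothetical.

```python
import math


def fit_ratio_mixture(x, y, init_slopes, n_iter=200, init_var=1.0,
                      var_floor=1e-12):
    """EM for a mixture of lines through the origin, y = slope_k * x + noise.

    Returns (slopes, variances, weights); each slope plays the role of one
    isotope ratio. Assumes Gaussian noise with a separate variance per group.
    """
    k, n = len(init_slopes), len(x)
    slopes = list(init_slopes)
    variances = [init_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space to avoid underflow.
        resp = []
        for xi, yi in zip(x, y):
            logs = [math.log(weights[j])
                    - 0.5 * math.log(2.0 * math.pi * variances[j])
                    - 0.5 * (yi - slopes[j] * xi) ** 2 / variances[j]
                    for j in range(k)]
            top = max(logs)
            unnorm = [math.exp(v - top) for v in logs]
            norm = sum(unnorm)
            resp.append([u / norm for u in unnorm])
        # M-step: weighted least squares per group.
        for j in range(k):
            rj = [r[j] for r in resp]
            total = sum(rj)
            if total < 1e-12:
                continue  # effectively empty group: leave it unchanged
            sxx = sum(r * xi * xi for r, xi in zip(rj, x))
            sxy = sum(r * xi * yi for r, xi, yi in zip(rj, x, y))
            slopes[j] = sxy / sxx
            sse = sum(r * (yi - slopes[j] * xi) ** 2
                      for r, xi, yi in zip(rj, x, y))
            variances[j] = max(var_floor, sse / total)
            weights[j] = total / n
    return slopes, variances, weights
```

As with any EM fit, the result depends on the starting slopes, so in practice several initialisations would be tried and the one with the highest log-likelihood kept.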
Directory of Open Access Journals (Sweden)
Bastien Boussau
2009-06-01
Full Text Available Homologous recombination is a pervasive biological process that affects sequences in all living organisms and viruses. In the presence of recombination, the evolutionary history of an alignment of homologous sequences cannot be properly depicted by a single bifurcating tree: some sites have evolved along a specific phylogenetic tree, others have followed another path. Methods available to analyse recombination in sequences usually involve an analysis of the alignment through sliding-windows, or are particularly demanding in computational resources, and are often limited to nucleotide sequences. In this article, we propose and implement a Mixture Model on trees and a phylogenetic Hidden Markov Model to reveal recombination breakpoints while searching for the various evolutionary histories that are present in an alignment known to have undergone homologous recombination. These models are sufficiently efficient to be applied to dozens of sequences on a single desktop computer, and can handle equivalently nucleotide or protein sequences. We estimate their accuracy on simulated sequences and test them on real data.
Deposition behaviour of model biofuel ash in mixtures with quartz sand. Part 1: Experimental data
Energy Technology Data Exchange (ETDEWEB)
Mischa Theis; Christian Mueller; Bengt-Johan Skrifvars; Mikko Hupa; Honghi Tran [Aabo Akademi Process Chemistry Centre, Aabo (Finland). Combustion and Materials Chemistry
2006-10-15
Model biofuel ash of well-defined size and melting properties was fed into an entrained flow reactor (EFR) to simulate the deposition behaviour of commercially applied biofuel mixtures in large-scale boilers. The aim was to obtain consistent experimental data that can be used for validation of computational fluid dynamics (CFD)-based deposition models. The results showed that while up to 80 wt% of the feed was lost to the EFR wall, the composition of the model ash particles collected at the reactor exit did not change. When model ashes were fed into the reactor individually, the ash particles were found to be sticky when they contained more than 15 wt% molten phase. When model ashes were fed in mixtures with silica sand, it was found that only a small amount of sand particles was captured in the deposits; the majority rebounded upon impact. The presence of sand in the feed mixture reduced the deposit buildup by more than could be expected from linear interpolation between the model ash and the sand. The results suggested that sand addition to model ash may prevent deposit buildup through erosion. 22 refs., 6 figs., 3 tabs.
Directory of Open Access Journals (Sweden)
Shanshan Wang
2017-12-01
Full Text Available Identifying the determinants of carbon dioxide emissions in Chinese cities is a pressing issue for urban policy-making. The common approach is the STIRPAT model, whose coefficients represent the strength of each determinant's influence on carbon emissions. However, little work has examined estimation accuracy, especially under non-normally distributed errors and heterogeneity in emissions among cities. To improve estimation accuracy, this paper employs a new method to estimate the STIRPAT model. The method uses a mixture of asymmetric Laplace distributions (ALDs) to approximate the true distribution of the error term, and a specially designed two-layer EM algorithm is used to obtain the estimators. We test robustness by comparing the results of five different models and find that the ALD mixture model is more reliable than the others. Further, a significant Kuznets curve relationship is identified in China.
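For orientation, the generic STIRPAT specification and the asymmetric Laplace density can be written as follows. These are the standard textbook forms, given here as an assumption; the abstract does not state the exact covariates or parameterisation the paper uses, and the symbols ($I$ impact, $P$ population, $A$ affluence, $T$ technology, $\tau$ skewness, $\sigma$ scale, $w_m$ mixture weights) are chosen for illustration.

```latex
% STIRPAT in multiplicative and log-linear form
I = a\, P^{b} A^{c} T^{d} e
\quad\Longrightarrow\quad
\ln I = \ln a + b \ln P + c \ln A + d \ln T + \varepsilon, \qquad \varepsilon = \ln e .

% Asymmetric Laplace density for the error term, with check function \rho_\tau
f(\varepsilon \mid \mu, \sigma, \tau)
  = \frac{\tau (1 - \tau)}{\sigma}
    \exp\!\left\{ -\rho_\tau\!\left( \frac{\varepsilon - \mu}{\sigma} \right) \right\},
\qquad
\rho_\tau(u) = u \left( \tau - \mathbf{1}\{ u < 0 \} \right).

% A finite mixture of M such densities approximates the error distribution
g(\varepsilon) = \sum_{m=1}^{M} w_m\, f(\varepsilon \mid \mu_m, \sigma_m, \tau_m),
\qquad \sum_{m=1}^{M} w_m = 1 .
```

In the log-linear form the coefficients $b$, $c$ and $d$ are elasticities, which is why they are read as the influence intensity of each determinant.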
Sinha, B K; Pal, Manisha; Das, P
2014-01-01
The book dwells mainly on the optimality aspects of mixture designs. As mixture models are a special case of regression models, a general discussion on regression designs has been presented, which includes topics like continuous designs, de la Garza phenomenon, Loewner order domination, Equivalence theorems for different optimality criteria and standard optimality results for single variable polynomial regression and multivariate linear and quadratic regression models. This is followed by a review of the available literature on estimation of parameters in mixture models. Based on recent research findings, the volume also introduces optimal mixture designs for estimation of optimum mixing proportions in different mixture models, which include Scheffé’s quadratic model, Darroch-Waller model, log-contrast model, mixture-amount models, random coefficient models and multi-response model. Robust mixture designs and mixture designs in blocks have also been reviewed. Moreover, some applications of mixture desig...
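As a reminder of what distinguishes a mixture model from an ordinary regression model, Scheffé's quadratic model for $q$ components can be written as below. The notation ($x_i$ for mixing proportions, $\beta_i$ and $\beta_{ij}$ for coefficients) is the conventional one and is assumed here rather than quoted from the book.

```latex
% Scheffe's quadratic (canonical) mixture model on the simplex
E(y) = \sum_{i=1}^{q} \beta_i x_i
     + \sum_{1 \le i < j \le q} \beta_{ij}\, x_i x_j ,
\qquad
x_i \ge 0, \quad \sum_{i=1}^{q} x_i = 1 .
```

The constraint that the proportions sum to one is why the model has no separate intercept or pure quadratic terms: both are absorbed into the linear and cross-product terms.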
Directory of Open Access Journals (Sweden)
Sarah Depaoli
2015-03-01
Full Text Available Background: After traumatic events, such as disaster, war trauma, and injuries including burns (which is the focus here), the risk to develop posttraumatic stress disorder (PTSD) is approximately 10% (Breslau & Davis, 1992). Latent Growth Mixture Modeling can be used to classify individuals into distinct groups exhibiting different patterns of PTSD (Galatzer-Levy, 2015). Currently, empirical evidence points to four distinct trajectories of PTSD patterns in those who have experienced burn trauma. These trajectories are labeled as: resilient, recovery, chronic, and delayed onset trajectories (e.g., Bonanno, 2004; Bonanno, Brewin, Kaniasty, & Greca, 2010; Maercker, Gäbler, O'Neil, Schützwohl, & Müller, 2013; Pietrzak et al., 2013). The delayed onset trajectory affects only a small group of individuals, that is, about 4–5% (O'Donnell, Elliott, Lau, & Creamer, 2007). In addition to its low frequency, the later onset of this trajectory may contribute to the fact that these individuals can be easily overlooked by professionals. In this special symposium on Estimating PTSD trajectories (Van de Schoot, 2015a), we illustrate how to properly identify this small group of individuals through the Bayesian estimation framework using previous knowledge through priors (see, e.g., Depaoli & Boyajian, 2014; Van de Schoot, Broere, Perryck, Zondervan-Zwijnenburg, & Van Loey, 2015). Method: We used latent growth mixture modeling (LGMM) (Van de Schoot, 2015b) to estimate PTSD trajectories across 4 years that followed a traumatic burn. We demonstrate and compare results from traditional (maximum likelihood) and Bayesian estimation using priors (see Depaoli, 2012, 2013). Further, we discuss where priors come from and how to define them in the estimation process. Results: We demonstrate that only the Bayesian approach results in the desired theory-driven solution of PTSD trajectories. Since the priors are chosen subjectively, we also present a sensitivity analysis of the
International Nuclear Information System (INIS)
Belmonte-Beitia, Juan; Perez-Garcia, Victor M.; Vekslerchik, Vadym
2007-01-01
In this paper, we study a system of coupled nonlinear Schroedinger equations modelling a quantum degenerate mixture of bosons and fermions. We analyze the stability of plane waves, give precise conditions for the existence of solitons and write explicit solutions in the form of periodic waves. We also check that the solitons observed previously in numerical simulations of the model correspond exactly to our explicit solutions and see how plane waves destabilize to form periodic waves
On Partial Defaults in Portfolio Credit Risk : A Poisson Mixture Model Approach
Weißbach, Rafael; von Lieres und Wilkau, Carsten
2005-01-01
Most credit portfolio models exclusively calculate the loss distribution for a portfolio of performing counterparts. Conservative default definitions cause considerable insecurity about the loss for a long time after the default. We present three approaches to account for defaulted counterparts in the calculation of the economic capital. Two of the approaches are based on the Poisson mixture model CreditRisk+ and derive a loss distribution for an integrated portfolio. The third method treats ...
Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong
2015-01-01
INTRODUCTION Emission sources of volatile organic compounds (VOCs) are numerous and widespread in both indoor and outdoor environments. Concentrations of VOCs indoors typically exceed outdoor levels, and most people spend nearly 90% of their time indoors. Thus, indoor sources generally contribute the majority of VOC exposures for most people. VOC exposure has been associated with a wide range of acute and chronic health effects; for example, asthma, respiratory diseases, liver and kidney dysfunction, neurologic impairment, and cancer. Although exposures to most VOCs for most persons fall below health-based guidelines, and long-term trends show decreases in ambient emissions and concentrations, a subset of individuals experience much higher exposures that exceed guidelines. Thus, exposure to VOCs remains an important environmental health concern. The present understanding of VOC exposures is incomplete. With the exception of a few compounds, concentration and especially exposure data are limited; and like other environmental data, VOC exposure data can show multiple modes, low and high extreme values, and sometimes a large portion of data below method detection limits (MDLs). Field data also show considerable spatial or interpersonal variability, and although evidence is limited, temporal variability seems high. These characteristics can complicate modeling and other analyses aimed at risk assessment, policy actions, and exposure management. In addition to these analytic and statistical issues, exposure typically occurs as a mixture, and mixture components may interact or jointly contribute to adverse effects. However most pollutant regulations, guidelines, and studies remain focused on single compounds, and thus may underestimate cumulative exposures and risks arising from coexposures. In addition, the composition of VOC mixtures has not been thoroughly investigated, and mixture components show varying and complex dependencies. Finally, although many factors are
Extra-binomial variation approach for analysis of pooled DNA sequencing data
Wallace, Chris
2012-01-01
Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: chris.wallace@cimr.cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
An odor interaction model of binary odorant mixtures by a partial differential equation method.
Yan, Luchun; Liu, Jiemin; Wang, Guihua; Wu, Chuandong
2014-07-09
A novel odor interaction model was proposed for binary mixtures of benzene and substituted benzenes by a partial differential equation (PDE) method. Based on the measurement method (tangent-intercept method) of partial molar volume, the original parameters of the corresponding formulas were replaced by perceptual measures. By these substitutions, it was possible to relate a mixture's odor intensity to the individual odorant's relative odor activity value (OAV). Several binary mixtures of benzene and substituted benzenes were tested to establish the PDE models. The results showed that the PDE model provided an easily interpretable method relating individual components to their joint odor intensity. In addition, both the predictive performance and the feasibility of the PDE model were confirmed through a series of odor intensity matching tests. If the PDE model is combined with portable gas detectors or on-line monitoring systems, olfactory evaluation of odor intensity could be achieved by instruments instead of odor assessors. Many disadvantages (e.g., the expense of a fixed number of odor assessors) would also be avoided. Thus, the PDE model is expected to be helpful for the monitoring and management of odor pollution.
Fomin, P. A.
2018-03-01
Two-step approximate models of chemical kinetics of detonation combustion of (i) one hydrocarbon fuel CnHm (for example, methane, propane, cyclohexane, etc.) and (ii) multi-fuel gaseous mixtures (∑aiCniHmi) (for example, a mixture of methane and propane, synthesis gas, benzene and kerosene) are presented for the first time. The models can be used for any stoichiometry, including fuel-rich mixtures, when reaction products contain molecules of carbon. Owing to their simplicity and high accuracy, the models can be used in multi-dimensional numerical calculations of detonation waves in the corresponding gaseous mixtures. The models are consistent with the second law of thermodynamics and Le Chatelier's principle. The constants of the models have a clear physical meaning. The models can be used for calculating thermodynamic parameters of the mixture in a state of chemical equilibrium.
Duarte, Adam; Adams, Michael J.; Peterson, James T.
2018-01-01
Monitoring animal populations is central to wildlife and fisheries management, and the use of N-mixture models toward these efforts has markedly increased in recent years. Nevertheless, relatively little work has evaluated estimator performance when basic assumptions are violated. Moreover, diagnostics to identify when bias in parameter estimates from N-mixture models is likely are largely unexplored. We simulated count data sets using 837 combinations of detection probability, number of sample units, number of survey occasions, and type and extent of heterogeneity in abundance or detectability. We fit Poisson N-mixture models to these data, quantified the bias associated with each combination, and evaluated whether the parametric bootstrap goodness-of-fit (GOF) test can be used to indicate bias in parameter estimates. We also explored whether assumption violations can be diagnosed prior to fitting N-mixture models. In doing so, we propose a new model diagnostic, which we term the quasi-coefficient of variation (QCV). N-mixture models performed well when assumptions were met and detection probabilities were moderate (i.e., ≥0.3), and the performance of the estimator improved with increasing survey occasions and sample units. However, the magnitude of bias in estimated mean abundance with even slight amounts of unmodeled heterogeneity was substantial. The parametric bootstrap GOF test did not perform well as a diagnostic for bias in parameter estimates when detectability and sample sizes were low. The results indicate the QCV is useful to diagnose potential bias and that potential bias associated with unidirectional trends in abundance or detectability can be diagnosed using Poisson regression. This study represents the most thorough assessment to date of assumption violations and diagnostics when fitting N-mixture models using the most commonly implemented error distribution. Unbiased estimates of population state variables are needed to properly inform management decision
Predicting Cumulative Incidence Probability: Marginal and Cause-Specific Modelling
DEFF Research Database (Denmark)
Scheike, Thomas H.; Zhang, Mei-Jie
2005-01-01
cumulative incidence probability; cause-specific hazards; subdistribution hazard; binomial modelling
Batterman, Stuart; Su, Feng-Chiao; Li, Shi; Mukherjee, Bhramar; Jia, Chunrong
2014-06-01
Emission sources of volatile organic compounds (VOCs) are numerous and widespread in both indoor and outdoor environments. Concentrations of VOCs indoors typically exceed outdoor levels, and most people spend nearly 90% of their time indoors. Thus, indoor sources generally contribute the majority of VOC exposures for most people. VOC exposure has been associated with a wide range of acute and chronic health effects; for example, asthma, respiratory diseases, liver and kidney dysfunction, neurologic impairment, and cancer. Although exposures to most VOCs for most persons fall below health-based guidelines, and long-term trends show decreases in ambient emissions and concentrations, a subset of individuals experience much higher exposures that exceed guidelines. Thus, exposure to VOCs remains an important environmental health concern. The present understanding of VOC exposures is incomplete. With the exception of a few compounds, concentration and especially exposure data are limited; and like other environmental data, VOC exposure data can show multiple modes, low and high extreme values, and sometimes a large portion of data below method detection limits (MDLs). Field data also show considerable spatial or interpersonal variability, and although evidence is limited, temporal variability seems high. These characteristics can complicate modeling and other analyses aimed at risk assessment, policy actions, and exposure management. In addition to these analytic and statistical issues, exposure typically occurs as a mixture, and mixture components may interact or jointly contribute to adverse effects. However most pollutant regulations, guidelines, and studies remain focused on single compounds, and thus may underestimate cumulative exposures and risks arising from coexposures. In addition, the composition of VOC mixtures has not been thoroughly investigated, and mixture components show varying and complex dependencies. Finally, although many factors are known to
Adapting cultural mixture modeling for continuous measures of knowledge and memory fluency.
Tan, Yin-Yin Sarah; Mueller, Shane T
2016-09-01
Previous research (e.g., cultural consensus theory (Romney, Weller, & Batchelder, American Anthropologist, 88, 313-338, 1986); cultural mixture modeling (Mueller & Veinott, 2008)) has used overt response patterns (i.e., responses to questionnaires and surveys) to identify whether a group shares a single coherent attitude or belief set. Yet many domains in social science have focused on implicit attitudes that are not apparent in overt responses but still may be detected via response time patterns. We propose a method for modeling response times as a mixture of Gaussians, adapting the strong-consensus model of cultural mixture modeling to model this implicit measure of knowledge strength. We report the results of two behavioral experiments and one simulation experiment that establish the usefulness of the approach, as well as some of the boundary conditions under which distinct groups of shared agreement might be recovered, even when the group identity is not known. The results reveal that the ability to recover and identify shared-belief groups depends on (1) the level of noise in the measurement, (2) the differential signals for strong versus weak attitudes, and (3) the similarity between group attitudes. Consequently, the method shows promise for identifying latent groups among a population whose overt attitudes do not differ, but whose implicit or covert attitudes or knowledge may differ.
Schindler, Michael
2017-08-02
The classification of effects caused by mixtures of agents as synergistic, antagonistic or additive depends critically on the reference model of 'null interaction'. Two main approaches are currently in use, the Additive Dose (ADM) or concentration addition (CA) and the Multiplicative Survival (MSM) or independent action (IA) models. We compare several response surface models to a newly developed Hill response surface, obtained by solving a logistic partial differential equation (PDE). Assuming that a mixture of chemicals with individual Hill-type dose-response curves can be described by an n-dimensional logistic function, Hill's differential equation for pure agents is replaced by a PDE for mixtures whose solution provides Hill surfaces as 'null-interaction' models and relies on neither Bliss independence nor Loewe additivity, and does not use Chou's unified general theory. An n-dimensional logistic PDE describing the Hill-type response of n-component mixtures is solved. Appropriate boundary conditions ensure the correct asymptotic behaviour. Mathematica 11 (Wolfram, Mathematica Version 11.0, 2016) is used for the mathematics and graphics presented in this article. The Hill response surface ansatz can be applied to mixtures of compounds with arbitrary Hill parameters. Restrictions that are required when deriving analytical expressions for response surfaces from other principles are unnecessary. Many approaches based on Loewe additivity turn out to be special cases of the Hill approach, whose increased flexibility permits a better description of 'null-effect' responses. Missing sham-compliance of Bliss IA, known as Colby's model in agrochemistry, leads to incompatibility with the Hill surface ansatz. Examples of binary and ternary mixtures illustrate the differences between the approaches. For Hill-slopes close to one and doses below the half-maximum effect doses MSM (Colby, Bliss, Finney, Abbott) predicts synergistic effects where the Hill model indicates 'null
Multi-Step Time Series Forecasting with an Ensemble of Varied Length Mixture Models.
Ouyang, Yicun; Yin, Hujun
2018-05-01
Many real-world problems require modeling and forecasting of time series, such as weather temperature, electricity demand, stock prices and foreign exchange (FX) rates. Often, the tasks involve predicting over a long-term period, e.g. several weeks or months. Most existing time series models are inherently designed for one-step prediction, that is, predicting one time point ahead. Multi-step or long-term prediction is difficult and challenging due to the lack of information and to uncertainty or error accumulation. The main existing approaches, iterative and independent, either use a one-step model recursively or treat each multi-step task as an independent model. They generally perform poorly in practical applications. In this paper, as an extension of the self-organizing mixture autoregressive (AR) model, the varied length mixture (VLM) models are proposed to model and forecast time series over multiple steps. The key idea is to preserve the dependencies between the time points within the prediction horizon. Training data are segmented to various lengths corresponding to various forecasting horizons, and the VLM models are trained in a self-organizing fashion on these segments to capture these dependencies in their component AR models of various predicting horizons. The VLM models form a probabilistic mixture of these varied length models. A combination of short and long VLM models and an ensemble of them are proposed to further enhance the prediction performance. The effectiveness of the proposed methods and their marked improvements over the existing methods are demonstrated through a number of experiments on synthetic data, real-world FX rates and weather temperatures.
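The two baseline strategies named in this abstract, iterative (recursive) and independent (direct) multi-step prediction, can be illustrated with a minimal Python sketch. This is not the VLM model itself; the function names and the intercept-free single-lag regression are assumptions chosen for brevity.

```python
def fit_lag_coefficient(series, lag):
    """Least-squares slope (no intercept) for predicting x[t + lag] from x[t]."""
    pairs = range(len(series) - lag)
    num = sum(series[t] * series[t + lag] for t in pairs)
    den = sum(series[t] ** 2 for t in pairs)
    return num / den


def iterative_forecast(series, horizon):
    """Fit a one-step model and apply it recursively `horizon` times."""
    a = fit_lag_coefficient(series, 1)
    value = series[-1]
    for _ in range(horizon):
        value = a * value  # each step feeds on the previous prediction
    return value


def direct_forecast(series, horizon):
    """Fit a separate model that jumps `horizon` steps ahead in one go."""
    return fit_lag_coefficient(series, horizon) * series[-1]


# Hypothetical noise-free data: on a pure geometric decay both strategies agree.
series = [0.5 ** t for t in range(10)]
three_ahead_iterative = iterative_forecast(series, 3)
three_ahead_direct = direct_forecast(series, 3)
```

On noisy data the two forecasts generally differ: the iterative one compounds one-step errors, while the direct one ignores the dependence between intermediate time points.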
DEFF Research Database (Denmark)
Liang, Xiaodong; Aloupis, Georgios; Kontogeorgis, Georgios M.
2017-01-01
the performance of the CPA and sPC-SAFT EOS for modeling the fluid-phase equilibria of gas hydrate-related systems and will try to explore how the models can help in suggesting experimental measurements. These systems contain water, hydrocarbon (alkane or aromatic), and either methanol or monoethylene glycol...... parameter sets have been chosen for the sPC-SAFT EOS for a fair comparison. The comparisons are made for pure fluid properties, vapor liquid-equilibria, and liquid liquid equilibria of binary and ternary mixtures as well as vapor liquid liquid equilibria of quaternary mixtures. The results show, from...
New approach in modeling Cr(VI) sorption onto biomass from metal binary mixtures solutions
Energy Technology Data Exchange (ETDEWEB)
Liu, Chang [College of Environmental Science and Engineering, Anhui Normal University, South Jiuhua Road, 189, 241002 Wuhu (China); Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Fiol, Núria [Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Villaescusa, Isabel, E-mail: Isabel.Villaescusa@udg.edu [Chemical Engineering Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain); Poch, Jordi [Applied Mathematics Department, Escola Politècnica Superior, Universitat de Girona, Ma Aurèlia Capmany, 61, 17071 Girona (Spain)
2016-01-15
In the last decades Cr(VI) sorption equilibrium and kinetic studies have been carried out using several types of biomasses. However there are few researchers that consider all the simultaneous processes that take place during Cr(VI) sorption (i.e., sorption/reduction of Cr(VI) and simultaneous formation and binding of reduced Cr(III)) when formulating a model that describes the overall sorption process. On the other hand Cr(VI) scarcely exists alone in wastewaters, it is usually found in mixtures with divalent metals. Therefore, the simultaneous removal of Cr(VI) and divalent metals in binary mixtures and the interactive mechanism governing Cr(VI) elimination have gained more and more attention. In the present work, kinetics of Cr(VI) sorption onto exhausted coffee from Cr(VI)–Cu(II) binary mixtures has been studied in a stirred batch reactor. A model including Cr(VI) sorption and reduction, Cr(III) sorption and the effect of the presence of Cu(II) in these processes has been developed and validated. This study constitutes an important advance in modeling Cr(VI) sorption kinetics especially when chromium sorption is in part based on the sorbent capacity of reducing hexavalent chromium and a metal cation is present in the binary mixture. - Highlights: • A kinetic model including Cr(VI) reduction, Cr(VI) and Cr(III) sorption/desorption • Synergistic effect of Cu(II) on Cr(VI) elimination included in the model • Model validation by checking it against independent sets of data.
New approach in modeling Cr(VI) sorption onto biomass from metal binary mixtures solutions
International Nuclear Information System (INIS)
Liu, Chang; Fiol, Núria; Villaescusa, Isabel; Poch, Jordi
2016-01-01
In the last decades Cr(VI) sorption equilibrium and kinetic studies have been carried out using several types of biomasses. However there are few researchers that consider all the simultaneous processes that take place during Cr(VI) sorption (i.e., sorption/reduction of Cr(VI) and simultaneous formation and binding of reduced Cr(III)) when formulating a model that describes the overall sorption process. On the other hand Cr(VI) scarcely exists alone in wastewaters, it is usually found in mixtures with divalent metals. Therefore, the simultaneous removal of Cr(VI) and divalent metals in binary mixtures and the interactive mechanism governing Cr(VI) elimination have gained more and more attention. In the present work, kinetics of Cr(VI) sorption onto exhausted coffee from Cr(VI)–Cu(II) binary mixtures has been studied in a stirred batch reactor. A model including Cr(VI) sorption and reduction, Cr(III) sorption and the effect of the presence of Cu(II) in these processes has been developed and validated. This study constitutes an important advance in modeling Cr(VI) sorption kinetics especially when chromium sorption is in part based on the sorbent capacity of reducing hexavalent chromium and a metal cation is present in the binary mixture. - Highlights: • A kinetic model including Cr(VI) reduction, Cr(VI) and Cr(III) sorption/desorption • Synergistic effect of Cu(II) on Cr(VI) elimination included in the model • Model validation by checking it against independent sets of data
International Nuclear Information System (INIS)
Doneddu, F.
1982-01-01
Starting from the modeling of gaseous flow in a porous medium (flow in a capillary), we generalize the law of enrichment in an infinite cylindrical capillary, established for an isotropic linear mixture, to a multicomponent mixture. A generalization is given of the notions of separation yield and characteristic pressure classically used for separations of isotropic linear mixtures. We present formulas for diagonalizing the diffusion operator, a model of a multistage gaseous diffusion cascade, and a comparison with the experimental results of a drain cascade (N2-SF6-UF6 mixture). [fr]
Directory of Open Access Journals (Sweden)
Patricio Peña-Rehbein
Full Text Available This paper describes the frequency and number of Sphyrion laevigatum in the skin of Genypterus blacodes, an important economic resource in Chile. The analysis of a spatial distribution model indicated that the parasites tended to cluster. Variations in the number of parasites per host could be described by a negative binomial distribution. The maximum number of parasites observed per host was two.
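A quick check for the kind of clustering described here is the variance-to-mean ratio of parasite counts per host, together with a moment estimate of the negative binomial dispersion parameter k. The Python sketch below uses hypothetical counts (not data from the paper) and assumed function names.

```python
import statistics


def variance_to_mean_ratio(counts):
    """Ratio > 1 suggests aggregation (overdispersion relative to Poisson)."""
    return statistics.variance(counts) / statistics.mean(counts)


def moment_estimate_k(counts):
    """Method-of-moments estimate of the negative binomial k.

    Uses variance = mean + mean**2 / k. Returns None when the sample is not
    overdispersed, because the estimate is then undefined.
    """
    m = statistics.mean(counts)
    v = statistics.variance(counts)  # sample variance (n - 1 denominator)
    if v <= m:
        return None
    return m * m / (v - m)


# Hypothetical parasite counts for ten hosts (maximum of two per host).
counts = [0, 0, 0, 0, 1, 0, 0, 2, 0, 1]
ratio = variance_to_mean_ratio(counts)
k_hat = moment_estimate_k(counts)
```

Smaller values of k correspond to stronger aggregation; as k grows large the negative binomial approaches a Poisson distribution.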
Ajmani, Subhash; Rogers, Stephen C; Barley, Mark H; Burgess, Andrew N; Livingstone, David J
2010-09-17
In our earlier work, we demonstrated that it is possible to characterize binary mixtures using single component descriptors by applying various mixing rules. We also showed that these methods were successful in building predictive QSPR models to study various mixture properties of interest. Herein, we develop a QSPR model of an excess thermodynamic property of binary mixtures, i.e., excess molar volume (V(E)). In the present study, we use a set of mixture descriptors which we earlier designed to specifically account for intermolecular interactions between the components of a mixture and applied successfully to the prediction of infinite-dilution activity coefficients using neural networks (part 1 of this series). We obtain a significant QSPR model for the prediction of excess molar volume (V(E)) using consensus neural networks and five mixture descriptors. We find that hydrogen bond and thermodynamic descriptors are the most important in determining excess molar volume (V(E)), which is in line with the theory of intermolecular forces governing excess mixture properties. The results also suggest that the mixture descriptors utilized herein may be sufficient to model a wide variety of properties of binary and possibly even more complex mixtures. Copyright © 2010 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Astuti, Ani Budi; Iriawan, Nur; Irhamah, Kuswanto, Heri
2017-12-01
Bayesian mixture modeling requires a stage that identifies the most appropriate number of mixture components, so that the resulting mixture model fits the data in a data-driven way. Reversible Jump Markov Chain Monte Carlo (RJMCMC) combines the reversible jump (RJ) concept with the Markov Chain Monte Carlo (MCMC) concept and has been used by some researchers to identify the number of mixture components when that number is not known with certainty. In application, RJMCMC uses the birth/death and split-merge concepts with six types of move: w updating, θ updating, z updating, hyperparameter β updating, split-merge of components, and birth/death of empty components. The RJMCMC algorithm needs to be developed according to the case observed. The purpose of this study is to assess the performance of the developed RJMCMC algorithm in identifying an unknown number of mixture components in Bayesian mixture modeling of microarray data in Indonesia. The results show that the developed RJMCMC algorithm is able to properly identify the number of mixture components in the Bayesian normal mixture model when the number of components for the microarray data in Indonesia is not known with certainty.
Estimating animal abundance with N-mixture models using the R-INLA package for R
Meehan, Timothy D.
2017-05-03
Successful management of wildlife populations requires accurate estimates of abundance. Abundance estimates can be confounded by imperfect detection during wildlife surveys. N-mixture models enable quantification of detection probability and often produce abundance estimates that are less biased. The purpose of this study was to demonstrate the use of the R-INLA package to analyze N-mixture models and to compare performance of R-INLA to two other common approaches -- JAGS (via the runjags package), which uses Markov chain Monte Carlo and allows Bayesian inference, and unmarked, which uses Maximum Likelihood and allows frequentist inference. We show that R-INLA is an attractive option for analyzing N-mixture models when (1) familiar model syntax and data format (relative to other R packages) are desired, (2) survey level covariates of detection are not essential, (3) fast computing times are necessary (R-INLA is 10 times faster than unmarked, 300 times faster than JAGS), and (4) Bayesian inference is preferred.
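For readers unfamiliar with the model behind these packages, the basic Poisson N-mixture likelihood can be written down in a few lines. The Python sketch below is an illustration under stated assumptions (constant abundance rate `lam`, constant detection probability `p`, a finite summation bound `n_max`); it is not the R-INLA, JAGS, or unmarked implementation, and all names are hypothetical.

```python
import math


def site_log_likelihood(counts, lam, p, n_max=200):
    """Log marginal likelihood of repeated counts at one site.

    Latent abundance N ~ Poisson(lam); each count y ~ Binomial(N, p).
    N is summed out from max(counts) up to the truncation bound n_max.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        # log Poisson(n; lam)
        log_term = -lam + n * math.log(lam) - math.lgamma(n + 1)
        for y in counts:
            # log Binomial(y; n, p)
            log_term += (math.lgamma(n + 1) - math.lgamma(y + 1)
                         - math.lgamma(n - y + 1)
                         + y * math.log(p) + (n - y) * math.log(1.0 - p))
        total += math.exp(log_term)
    return math.log(total)


def total_log_likelihood(data, lam, p, n_max=200):
    """Sum of site log-likelihoods; `data` is a list of per-site count lists."""
    return sum(site_log_likelihood(counts, lam, p, n_max) for counts in data)


# Hypothetical survey: three sites, three visits each.
data = [[2, 3, 1], [0, 1, 0], [4, 2, 3]]
ll = total_log_likelihood(data, lam=5.0, p=0.4)
```

With a single visit per site the marginal count distribution reduces to Poisson(lam * p), so abundance and detection cannot be separated; the repeated visits are what make both parameters estimable.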
Hadronic multiplicity distributions: the negative binomial and its alternatives
International Nuclear Information System (INIS)
Carruthers, P.
1986-01-01
We review properties of the negative binomial distribution, along with its many possible statistical or dynamical origins. Considering the relation of the multiplicity distribution to the density matrix for boson systems, we re-introduce the partially coherent laser distribution, which allows for coherent as well as incoherent hadronic emission from the k fundamental cells, and provides equally good phenomenological fits to existing data. The broadening of non-single diffractive hadron-hadron distributions can be equally well due to the decrease of coherence with increasing energy as to the large (and rapidly decreasing) values of k deduced from negative binomial fits. Similarly the narrowness of the e+e- multiplicity distribution is due to nearly coherent (therefore nearly Poissonian) emission from a small number of jets, in contrast to the negative binomial with enormous values of k. 31 refs
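For reference, the negative binomial multiplicity distribution is commonly written in terms of the mean multiplicity and the cell parameter k as follows. The notation here (P_n, \bar{n}, D) is a standard convention and an assumption on our part, since the abstract does not state the formula.

```latex
P_n = \frac{\Gamma(n+k)}{\Gamma(n+1)\,\Gamma(k)}
      \left(\frac{\bar{n}/k}{1+\bar{n}/k}\right)^{n}
      \left(1+\frac{\bar{n}}{k}\right)^{-k},
\qquad
D^2 = \langle n^2\rangle - \bar{n}^2 = \bar{n} + \frac{\bar{n}^2}{k}.
```

As k grows without bound the variance tends to \bar{n} and the distribution approaches a Poisson distribution, which is why very large fitted values of k correspond to the narrow, nearly Poissonian case mentioned above.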
Modeling the surface tension of complex, reactive organic-inorganic mixtures
Schwier, A. N.; Viglione, G. A.; Li, Z.; McNeill, V. Faye
2013-11-01
Atmospheric aerosols can contain thousands of organic compounds which impact aerosol surface tension, affecting aerosol properties such as heterogeneous reactivity, ice nucleation, and cloud droplet formation. We present new experimental data for the surface tension of complex, reactive organic-inorganic aqueous mixtures mimicking tropospheric aerosols. Each solution contained 2-6 organic compounds, including methylglyoxal, glyoxal, formaldehyde, acetaldehyde, oxalic acid, succinic acid, leucine, alanine, glycine, and serine, with and without ammonium sulfate. We test two semi-empirical surface tension models and find that most reactive, complex, aqueous organic mixtures which do not contain salt are well described by a weighted Szyszkowski-Langmuir (S-L) model which was first presented by Henning et al. (2005). Two approaches for modeling the effects of salt were tested: (1) the Tuckermann approach (an extension of the Henning model with an additional explicit salt term), and (2) a new implicit method proposed here which employs experimental surface tension data obtained for each organic species in the presence of salt used with the Henning model. We recommend the use of method (2) for surface tension modeling of aerosol systems because the Henning model (using data obtained from organic-inorganic systems) and Tuckermann approach provide similar modeling results and goodness-of-fit (χ2) values, yet the Henning model is a simpler and more physical approach to modeling the effects of salt, requiring fewer empirically determined parameters.
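A weighted Szyszkowski-Langmuir form can be sketched in Python as below. The abstract does not give the equation, so the specific form used here (each organic's Szyszkowski-Langmuir depression weighted by its share of the total organic carbon) is an assumption, and the parameter values, units, and function name are hypothetical.

```python
import math


def weighted_surface_tension(sigma_water, temp_k, organics):
    """Surface tension of a mixture under an assumed weighted S-L form.

    sigma = sigma_water - sum_i chi_i * a_i * T * ln(1 + b_i * C_i)

    `organics` is a list of (a_i, b_i, C_i) tuples, where a_i and b_i are
    fitted S-L parameters and C_i is the carbon concentration of species i.
    chi_i = C_i / sum(C) is the assumed weighting.
    """
    total_c = sum(c for _, _, c in organics)
    if total_c == 0:
        return sigma_water  # no organics: pure-water value
    sigma = sigma_water
    for a, b, c in organics:
        chi = c / total_c
        sigma -= chi * a * temp_k * math.log(1.0 + b * c)
    return sigma


# Hypothetical two-component mixture at 298 K (illustrative numbers only).
sigma_mix = weighted_surface_tension(
    72.0, 298.0, [(0.001, 10.0, 0.5), (0.002, 5.0, 0.25)]
)
```

In this sketch the implicit salt treatment recommended in the abstract would amount to fitting a_i and b_i from measurements made with salt present, rather than adding a separate salt term.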
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies.
Directory of Open Access Journals (Sweden)
Wesley K Thompson
2015-12-01
Full Text Available Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn's disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the
An Empirical Bayes Mixture Model for Effect Size Distributions in Genome-Wide Association Studies.
Thompson, Wesley K; Wang, Yunpeng; Schork, Andrew J; Witoelar, Aree; Zuber, Verena; Xu, Shujing; Werge, Thomas; Holland, Dominic; Andreassen, Ole A; Dale, Anders M
2015-12-01
Characterizing the distribution of effects from genome-wide genotyping data is crucial for understanding important aspects of the genetic architecture of complex traits, such as number or proportion of non-null loci, average proportion of phenotypic variance explained per non-null effect, power for discovery, and polygenic risk prediction. To this end, previous work has used effect-size models based on various distributions, including the normal and normal mixture distributions, among others. In this paper we propose a scale mixture of two normals model for effect size distributions of genome-wide association study (GWAS) test statistics. Test statistics corresponding to null associations are modeled as random draws from a normal distribution with zero mean; test statistics corresponding to non-null associations are also modeled as normal with zero mean, but with larger variance. The model is fit via minimizing discrepancies between the parametric mixture model and resampling-based nonparametric estimates of replication effect sizes and variances. We describe in detail the implications of this model for estimation of the non-null proportion, the probability of replication in de novo samples, the local false discovery rate, and power for discovery of a specified proportion of phenotypic variance explained from additive effects of loci surpassing a given significance threshold. We also examine the crucial issue of the impact of linkage disequilibrium (LD) on effect sizes and parameter estimates, both analytically and in simulations. We apply this approach to meta-analysis test statistics from two large GWAS, one for Crohn's disease (CD) and the other for schizophrenia (SZ). A scale mixture of two normals distribution provides an excellent fit to the SZ nonparametric replication effect size estimates. While capturing the general behavior of the data, this mixture model underestimates the tails of the CD effect size distribution. We discuss the implications of
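The local false discovery rate mentioned in this abstract follows directly from the two-component form of the model. The sketch below is only an illustration under stated assumptions: the parameter values (`PI0`, `SD_NULL`, `SD_ALT`) and function names are hypothetical, and the authors' actual fitting procedure (matching resampling-based replication estimates) is not reproduced here.

```python
import math

def normal_pdf(z, sd):
    """Density at z of a zero-mean normal with standard deviation sd."""
    return math.exp(-0.5 * (z / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def local_fdr(z, pi0, sd_null, sd_alt):
    """Posterior probability that statistic z is null under
    pi0 * N(0, sd_null^2) + (1 - pi0) * N(0, sd_alt^2)."""
    null_part = pi0 * normal_pdf(z, sd_null)
    alt_part = (1.0 - pi0) * normal_pdf(z, sd_alt)
    return null_part / (null_part + alt_part)

# Hypothetical parameter values, chosen only for illustration.
PI0, SD_NULL, SD_ALT = 0.95, 1.0, 3.0
fdr_values = [local_fdr(z, PI0, SD_NULL, SD_ALT) for z in (0.0, 2.0, 4.0, 6.0)]
```

Because both components are centered at zero, the local false discovery rate depends only on |z| and falls monotonically as |z| grows; the larger non-null variance is what makes extreme statistics more likely to be non-null.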
Directory of Open Access Journals (Sweden)
M. F. Gayol
2017-06-01
Full Text Available A methodology is presented for predicting the thermodynamic and transport properties of a multi-component oily mixture, in which the different mixture components are grouped into a small number of pseudo-components. This prediction of properties is used in the mathematical modeling of molecular distillation, which consists of a system of partial differential equations, according to the principles of transport phenomena, and is solved by an implicit finite difference method using a computer code. The mathematical model was validated with experimental data, specifically the molecular distillation of a deodorizer distillate (DD) of sunflower oil. The results obtained were satisfactory, with errors of less than 10% with respect to the experimental data in a temperature range in which it is possible to apply the proposed method.
Finite mixture model: A maximum likelihood estimation approach on time series data
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statisticians have emphasized fitting finite mixture models by maximum likelihood estimation because of its asymptotic properties. In particular, the estimator is consistent as the sample size increases to infinity, which indicates that maximum likelihood estimation is asymptotically unbiased. Moreover, the parameter estimates obtained by maximum likelihood estimation have the smallest variance compared with other statistical methods as the sample size increases. Thus, maximum likelihood estimation is adopted in this paper to fit a two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, the Philippines and Indonesia. The results show a negative relationship between rubber price and exchange rate for all selected countries.
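A minimal sketch of maximum likelihood fitting for a two-component normal mixture is given here. It uses the expectation-maximization (EM) algorithm, which is one common way to maximize the mixture likelihood; the abstract does not state which optimizer the authors used, so this choice, the function names, and the synthetic data are all assumptions made for illustration.

```python
import math
import random

def normal_pdf(x, mean, sd):
    """Density at x of a normal with the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_component_mixture(data, n_iter=200):
    """EM for w*N(m1, s1^2) + (1-w)*N(m2, s2^2); returns (w, m1, s1, m2, s2)."""
    w, m1, m2 = 0.5, min(data), max(data)          # crude starting values
    s1 = s2 = (max(data) - min(data)) / 4.0
    n = len(data)
    for _ in range(n_iter):
        # E-step: responsibility of component 1 for each observation.
        resp = []
        for x in data:
            a = w * normal_pdf(x, m1, s1)
            b = (1.0 - w) * normal_pdf(x, m2, s2)
            resp.append(a / (a + b))
        # M-step: weighted maximum likelihood updates.
        n1 = sum(resp)
        n2 = n - n1
        m1 = sum(r * x for r, x in zip(resp, data)) / n1
        m2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n2
        s1 = max(math.sqrt(sum(r * (x - m1) ** 2 for r, x in zip(resp, data)) / n1), 1e-6)
        s2 = max(math.sqrt(sum((1.0 - r) * (x - m2) ** 2 for r, x in zip(resp, data)) / n2), 1e-6)
        w = n1 / n
    return w, m1, s1, m2, s2

# Synthetic data (not the rubber price series): two well-separated regimes.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(300)] + [rng.gauss(5.0, 1.0) for _ in range(300)]
weight, mean1, sd1, mean2, sd2 = fit_two_component_mixture(sample)
```

On well-separated synthetic data such as this, the fitted means land close to the generating values; on real financial series the likelihood surface can have several local maxima, so multiple starting values would be advisable.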
International Nuclear Information System (INIS)
Gayol, M.F.; Pramparo, M.C.; Miró Erdmann, S.M.
2017-01-01
A methodology is presented for predicting the thermodynamic and transport properties of a multi-component oily mixture, in which the different mixture components are grouped into a small number of pseudo-components. This prediction of properties is used in the mathematical modeling of molecular distillation, which consists of a system of partial differential equations, according to the principles of transport phenomena, and is solved by an implicit finite difference method using a computer code. The mathematical model was validated with experimental data, specifically the molecular distillation of a deodorizer distillate (DD) of sunflower oil. The results obtained were satisfactory, with errors of less than 10% with respect to the experimental data in a temperature range in which it is possible to apply the proposed method.
Modeling the flow of activated H2 + CH4 mixture by deposition of diamond nanostructures
Directory of Open Access Journals (Sweden)
Plotnikov Mikhail
2017-01-01
Full Text Available An algorithm of the direct simulation Monte Carlo method for the flow of a hydrogen and methane mixture in a cylindrical channel is developed. Heterogeneous reactions on the tungsten channel surfaces are included in the model, and their effects on the flow are analyzed. A one-dimensional approach based on the solution of equilibrium chemical kinetics equations is used to analyze gas-phase methane decomposition. The obtained results may be useful for optimizing gas-dynamic sources of activated gas for diamond synthesis.
Catalytically stabilized combustion of lean methane-air-mixtures: a numerical model
Energy Technology Data Exchange (ETDEWEB)
Dogwiler, U; Benz, P; Mantharas, I [Paul Scherrer Inst. (PSI), Villigen (Switzerland)
1997-06-01
The catalytically stabilized combustion of lean methane/air mixtures has been studied numerically under conditions closely resembling the ones prevailing in technical devices. A detailed numerical model has been developed for a laminar, stationary, 2-D channel flow with full heterogeneous and homogeneous reaction mechanisms. The computations provide direct information on the coupling between heterogeneous-homogeneous combustion and in particular on the means of homogeneous ignitions and stabilization. (author) 4 figs., 3 refs.
C-Vine copula mixture model for clustering of residential electrical load pattern data
Sun, M; Konstantelos, I; Strbac, G
2016-01-01
The ongoing deployment of residential smart meters in numerous jurisdictions has led to an influx of electricity consumption data. This information presents a valuable opportunity to suppliers for better understanding their customer base and designing more effective tariff structures. In the past, various clustering methods have been proposed for meaningful customer partitioning. This paper presents a novel finite mixture modeling framework based on C-vine copulas (CVMM) for carrying out cons...
Self-similarity of the negative binomial multiplicity distributions
International Nuclear Information System (INIS)
Calucci, G.; Treleani, D.
1998-01-01
The negative binomial distribution is self-similar: If the spectrum over the whole rapidity range gives rise to a negative binomial, in the absence of correlation and if the source is unique, also a partial range in rapidity gives rise to the same distribution. The property is not seen in experimental data, which are rather consistent with the presence of a number of independent sources. When multiplicities are very large, self-similarity might be used to isolate individual sources in a complex production process. copyright 1997 The American Physical Society
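The self-similarity property can be checked numerically. The sketch below assumes that "absence of correlation" means each particle independently falls inside the partial rapidity window with probability `q` (binomial thinning); under that assumption a negative binomial with shape `k` and mean `mean` should thin to a negative binomial with the same `k` and mean `q * mean`. The parameter values and function names are hypothetical.

```python
import math

def nb_pmf(n, k, mean):
    """Negative binomial probability of n, parameterized by shape k and mean."""
    log_p = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
             + n * math.log(mean / (mean + k)) + k * math.log(k / (mean + k)))
    return math.exp(log_p)

def thinned_pmf(m, k, mean, q, n_max=300):
    """Probability of observing m particles in the window when each of the
    n produced particles is kept independently with probability q."""
    return sum(nb_pmf(n, k, mean) * math.comb(n, m) * q ** m * (1.0 - q) ** (n - m)
               for n in range(m, n_max + 1))

# Hypothetical values: shape 2, full-range mean 10, window acceptance 0.3.
K, MEAN, Q = 2.0, 10.0, 0.3
max_gap = max(abs(thinned_pmf(m, K, MEAN, Q) - nb_pmf(m, K, Q * MEAN))
              for m in range(30))
```

The maximum gap between the thinned distribution and the negative binomial with rescaled mean is at the level of floating-point error, which matches the generating-function argument: substituting 1 - q + q s into (1 + (mean/k)(1 - s))^(-k) only rescales the mean.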
Interpretations and implications of negative binomial distributions of multiparticle productions
International Nuclear Information System (INIS)
Arisawa, Tetsuo
2006-01-01
The number of particles produced in high energy experiments is approximated by a negative binomial distribution. Deriving a representation of the distribution from a stochastic equation, conditions for the process to satisfy the distribution are clarified. Based on them, it is proposed that multiparticle production consists of spontaneous and induced production. The rate of the induced production is proportional to the number of existing particles. The ratio of the two production rates remains constant during the process. The "NBD space" is also defined, where the number of particles produced in its subspaces follows negative binomial distributions with different parameters.
Poisson and negative binomial item count techniques for surveys with sensitive question.
Tian, Guo-Liang; Tang, Man-Lai; Wu, Qin; Liu, Yin
2017-04-01
Although the item count technique is useful in surveys with sensitive questions, privacy of those respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and negative binomial item count technique) which replace several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide closed form variance estimate and confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents' privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.
DEFF Research Database (Denmark)
Baylaucq, A.; Boned, C.; Canet, X.
2005-01-01
Viscosity measurements of well-defined mixtures are useful in order to evaluate existing viscosity models. Recently, an extensive experimental study of the viscosity at pressures up to 140 MPa has been carried out for the binary systems methane + n-decane and methane + toluene, between 293.15 and 3...
Halty, Virginia; Valdés, Matías; Tejera, Mauricio; Picasso, Valentín; Fort, Hugo
2017-12-01
The contribution of plant species richness to productivity and ecosystem functioning is a longstanding issue in ecology, with relevant implications for both conservation and agriculture. Both experiments and quantitative modeling are fundamental to the design of sustainable agroecosystems and the optimization of crop production. We modeled communities of perennial crop mixtures by using a generalized Lotka-Volterra model, i.e., a model such that the interspecific interactions are more general than purely competitive. We estimated model parameters -carrying capacities and interaction coefficients- from, respectively, the observed biomass of monocultures and bicultures measured in a large diversity experiment of seven perennial forage species in Iowa, United States. The sign and absolute value of the interaction coefficients showed that the biological interactions between species pairs included amensalism, competition, and parasitism (asymmetric positive-negative interaction), with various degrees of intensity. We tested the model fit by simulating the combinations of more than two species and comparing them with the polycultures experimental data. Overall, theoretical predictions are in good agreement with the experiments. Using this model, we also simulated species combinations that were not sown. From all possible mixtures (sown and not sown) we identified which are the most productive species combinations. Our results demonstrate that a combination of experiments and modeling can contribute to the design of sustainable agricultural systems in general and to the optimization of crop production in particular. © 2017 by the Ecological Society of America.
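The estimation logic described in this abstract can be sketched with a small generalized Lotka-Volterra simulation. The equation form used here, dx_i/dt = r_i x_i (1 - (x_i + sum_j alpha_ij x_j) / K_i), its sign convention, the parameter values, and the assumption that biculture yields are equilibrium values are all illustrative choices, not details taken from the paper.

```python
def glv_step(x, r, K, alpha, dt):
    """One explicit Euler step of the generalized Lotka-Volterra equations."""
    n = len(x)
    new_x = []
    for i in range(n):
        # Positive alpha[i][j] means species j harms i; negative means it helps.
        pressure = x[i] + sum(alpha[i][j] * x[j] for j in range(n) if j != i)
        new_x.append(x[i] + dt * r[i] * x[i] * (1.0 - pressure / K[i]))
    return new_x

def simulate(x0, r, K, alpha, dt=0.01, steps=10000):
    """Integrate from x0 and return the final biomasses."""
    x = list(x0)
    for _ in range(steps):
        x = glv_step(x, r, K, alpha, dt)
    return x

def interaction_from_biculture(K_i, yield_i, yield_j):
    """alpha_ij implied by an equilibrium biculture: x_i + alpha_ij * x_j = K_i."""
    return (K_i - yield_i) / yield_j

# Hypothetical two-species example: species 2 harms species 1 (alpha > 0),
# while species 1 benefits species 2 (alpha < 0), an asymmetric +/- interaction.
R = [1.0, 1.0]
K = [10.0, 8.0]            # carrying capacities, as if taken from monoculture biomass
ALPHA = [[0.0, 0.5], [-0.2, 0.0]]
final = simulate([1.0, 1.0], R, K, ALPHA)
alpha_12 = interaction_from_biculture(K[0], final[0], final[1])
alpha_21 = interaction_from_biculture(K[1], final[1], final[0])
```

In this setup the simulated biculture settles at the solution of the linear system x1 + 0.5 x2 = 10, -0.2 x1 + x2 = 8, and the two interaction coefficients are recovered from the monoculture capacities and the biculture yields, which mirrors the two-stage estimation the abstract describes.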
Comparison of multinomial and binomial proportion methods for analysis of multinomial count data.
Galyean, M L; Wester, D B
2010-10-01
Simulation methods were used to generate 1,000 experiments, each with 3 treatments and 10 experimental units/treatment, in completely randomized (CRD) and randomized complete block designs. Data were counts in 3 ordered or 4 nominal categories from multinomial distributions. For the 3-category analyses, category probabilities were 0.6, 0.3, and 0.1, respectively, for 2 of the treatments, and 0.5, 0.35, and 0.15 for the third treatment. In the 4-category analysis (CRD only), probabilities were 0.3, 0.3, 0.2, and 0.2 for treatments 1 and 2 vs. 0.4, 0.4, 0.1, and 0.1 for treatment 3. The 3-category data were analyzed with generalized linear mixed models as an ordered multinomial distribution with a cumulative logit link or by regrouping the data (e.g., counts in 1 category/sum of counts in all categories), followed by analysis of single categories as binomial proportions. Similarly, the 4-category data were analyzed as a nominal multinomial distribution with a glogit link or by grouping data as binomial proportions. For the 3-category CRD analyses, empirically determined type I error rates based on pair-wise comparisons (F- and Wald χ² tests) did not differ between multinomial and individual binomial category analyses with 10 (P = 0.38 to 0.60) or 50 (P = 0.19 to 0.67) sampling units/experimental unit. When analyzed as binomial proportions, power estimates varied among categories, with analysis of the category with the greatest counts yielding power similar to the multinomial analysis. Agreement between methods (percentage of experiments with the same results for the overall test for treatment effects) varied considerably among categories analyzed and sampling unit scenarios for the 3-category CRD analyses. Power (F-test) was 24.3, 49.1, 66.9, 83.5, 86.8, and 99.7% for 10, 20, 30, 40, 50, and 100 sampling units/experimental unit for the 3-category multinomial CRD analyses. Results with randomized complete block design simulations were similar to those with the CRD
A mixture model-based approach to the clustering of microarray expression data.
McLachlan, G J; Bean, R W; Peel, D
2002-03-01
This paper introduces the software EMMIX-GENE that has been developed for the specific purpose of a model-based approach to the clustering of microarray expression data, in particular, of tissue samples on a very large number of genes. The latter is a nonstandard problem in parametric cluster analysis because the dimension of the feature space (the number of genes) is typically much greater than the number of tissues. A feasible approach is provided by first selecting a subset of the genes relevant for the clustering of the tissue samples by fitting mixtures of t distributions to rank the genes in order of increasing size of the likelihood ratio statistic for the test of one versus two components in the mixture model. The imposition of a threshold on the likelihood ratio statistic used in conjunction with a threshold on the size of a cluster allows the selection of a relevant set of genes. However, even this reduced set of genes will usually be too large for a normal mixture model to be fitted directly to the tissues, and so the use of mixtures of factor analyzers is exploited to reduce effectively the dimension of the feature space of genes. The usefulness of the EMMIX-GENE approach for the clustering of tissue samples is demonstrated on two well-known data sets on colon and leukaemia tissues. For both data sets, relevant subsets of the genes are able to be selected that reveal interesting clusterings of the tissues that are either consistent with the external classification of the tissues or with background and biological knowledge of these sets. EMMIX-GENE is available at http://www.maths.uq.edu.au/~gjm/emmix-gene/
López Martínez, Laura Elizabeth
2010-01-01
In this work, statistical inference is carried out for the Generalized Negative Binomial (GNB) distribution and the models nested within it, namely the Binomial, Negative Binomial, and Poisson. The problem of parameter estimation in the GNB distribution is addressed, and a generalized likelihood ratio test is proposed to determine whether a data set fits the Binomial, Negative Binomial, or Poisson model in particular. In addition, the study examines the powers and sizes of the test p...
Hubbard, Rebecca A; Johnson, Eric; Chubak, Jessica; Wernli, Karen J; Kamineni, Aruna; Bogart, Andy; Rutter, Carolyn M
2017-06-01
Exposures derived from electronic health records (EHR) may be misclassified, leading to biased estimates of their association with outcomes of interest. An example of this problem arises in the context of cancer screening where test indication, the purpose for which a test was performed, is often unavailable. This poses a challenge to understanding the effectiveness of screening tests because estimates of screening test effectiveness are biased if some diagnostic tests are misclassified as screening. Prediction models have been developed for a variety of exposure variables that can be derived from EHR, but no previous research has investigated appropriate methods for obtaining unbiased association estimates using these predicted probabilities. The full likelihood incorporating information on both the predicted probability of exposure-class membership and the association between the exposure and outcome of interest can be expressed using a finite mixture model. When the regression model of interest is a generalized linear model (GLM), the expectation-maximization algorithm can be used to estimate the parameters using standard software for GLMs. Using simulation studies, we compared the bias and efficiency of this mixture model approach to alternative approaches including multiple imputation and dichotomization of the predicted probabilities to create a proxy for the missing predictor. The mixture model was the only approach that was unbiased across all scenarios investigated. Finally, we explored the performance of these alternatives in a study of colorectal cancer screening with colonoscopy. These findings have broad applicability in studies using EHR data where gold-standard exposures are unavailable and prediction models have been developed for estimating proxies.
Using finite mixture models in thermal-hydraulics system code uncertainty analysis
Energy Technology Data Exchange (ETDEWEB)
Carlos, S., E-mail: scarlos@iqn.upv.es [Department d’Enginyeria Química i Nuclear, Universitat Politècnica de València, Camí de Vera s.n, 46022 València (Spain); Sánchez, A. [Department d’Estadística Aplicada i Qualitat, Universitat Politècnica de València, Camí de Vera s.n, 46022 València (Spain); Ginestar, D. [Department de Matemàtica Aplicada, Universitat Politècnica de València, Camí de Vera s.n, 46022 València (Spain); Martorell, S. [Department d’Enginyeria Química i Nuclear, Universitat Politècnica de València, Camí de Vera s.n, 46022 València (Spain)
2013-09-15
Highlights: • Best estimate codes simulation needs uncertainty quantification. • The output variables can present multimodal probability distributions. • The analysis of multimodal distribution is performed using finite mixture models. • Two methods to reconstruct output variable probability distribution are used. -- Abstract: Nuclear Power Plant safety analysis is mainly based on the use of best estimate (BE) codes that predict the plant behavior under normal or accidental conditions. As the BE codes introduce uncertainties due to uncertainty in input parameters and modeling, it is necessary to perform uncertainty assessment (UA), and eventually sensitivity analysis (SA), of the results obtained. These analyses are part of the appropriate treatment of uncertainties imposed by current regulation based on the adoption of the best estimate plus uncertainty (BEPU) approach. The most popular approach for uncertainty assessment, based on Wilks’ method, obtains a tolerance/confidence interval, but it does not completely characterize the output variable behavior, which is required for an extended UA and SA. However, the development of standard UA and SA imposes a high computational cost due to the large number of simulations needed. In order to obtain more information about the output variable and, at the same time, to keep computational cost as low as possible, there has been a recent shift toward developing metamodels (models of models), or surrogate models, that approximate or emulate complex computer codes. In this way, there exist different techniques to reconstruct the probability distribution using the information provided by a sample of values, for example, finite mixture models. In this paper, the Expectation Maximization and the k-means algorithms are used to obtain a finite mixture model that reconstructs the output variable probability distribution from data obtained with RELAP-5 simulations. Both methodologies have been applied to a separated
International Nuclear Information System (INIS)
Vladimir V Chudanov; Alexei A Leonov
2005-01-01
One of the mathematical models (of hyperbolic type) for describing the evolution of compressible two-phase mixtures was proposed in [1] to deal with the following applications: interfaces between compressible materials; shock waves in multiphase mixtures; evolution of homogeneous two-phase flows; cavitation in liquids. The basic difficulty of this model was connected to the discretization of the non-conservative equation terms. As a result, the class of problems concerned with the passage of shock waves through fields with a discontinuous volume-fraction profile was not described by this model. A class of schemes able to converge to the correct solution of such problems was obtained in [2] through a deeper analysis of the two-phase model. The technique proposed in [2] was implemented on an Eulerian grid via the Godunov scheme. In the present paper, an additional analysis of the two-phase model, taking into account the microstructure of the mixture topology, is carried out in Lagrangian mass coordinates. As a result, equations averaged over the set of all possible realizations of the two-phase mixture are obtained. The numerical solution is carried out with the PPM method [3] in two steps: first, the equations averaged over the mass variable are solved; second, the solution found in the previous step is remapped to a fixed Eulerian grid. Such an approach allows the proposed technique to be extended to the two-dimensional (three-dimensional) case, since in Lagrangian variables the Euler equation system splits into two (three) identical subsystems, each of which describes the evolution of the considered medium in a given direction. The accuracy and robustness of the described procedure are demonstrated on a sequence of numerical problems. References: (1). R. Saurel, R. Abgrall, A multiphase Godunov method for compressible multi-fluid and multiphase flows, J. Comput. Phys. 150 (1999) 425-467; (2). R. Saurel, R. Abgrall, Discrete equations for physical and
Discrete Element Method Modeling of the Rheological Properties of Coke/Pitch Mixtures
Directory of Open Access Journals (Sweden)
Behzad Majidi
2016-05-01
Full Text Available Rheological properties of pitch and pitch/coke mixtures at temperatures around 150 °C are of great interest for the carbon anode manufacturing process in the aluminum industry. In the present work, a cohesive viscoelastic contact model based on Burger’s model is developed using the discrete element method (DEM) on YADE, the open-source DEM software. A dynamic shear rheometer (DSR) is used to measure the viscoelastic properties of pitch at 150 °C. The experimental data obtained is then used to estimate the Burger’s model parameters and calibrate the DEM model. The DSR tests were then simulated by a three-dimensional model. Very good agreement was observed between the experimental data and simulation results. Coke aggregates were modeled by overlapping spheres in the DEM model. Coke/pitch mixtures were numerically created by adding 5, 10, 20, and 30 percent of coke aggregates of the size range of 0.297–0.595 mm (−30 + 50 mesh) to pitch. Adding up to 30% of coke aggregates to pitch can increase its complex shear modulus at 60 Hz from 273 Pa to 1557 Pa. Results also showed that adding coke particles increases both storage and loss moduli, while it does not have a meaningful effect on the phase angle of pitch.
Discrete Element Method Modeling of the Rheological Properties of Coke/Pitch Mixtures.
Majidi, Behzad; Taghavi, Seyed Mohammad; Fafard, Mario; Ziegler, Donald P; Alamdari, Houshang
2016-05-04
Rheological properties of pitch and pitch/coke mixtures at temperatures around 150 °C are of great interest for the carbon anode manufacturing process in the aluminum industry. In the present work, a cohesive viscoelastic contact model based on Burger's model is developed using the discrete element method (DEM) on the YADE, the open-source DEM software. A dynamic shear rheometer (DSR) is used to measure the viscoelastic properties of pitch at 150 °C. The experimental data obtained is then used to estimate the Burger's model parameters and calibrate the DEM model. The DSR tests were then simulated by a three-dimensional model. Very good agreement was observed between the experimental data and simulation results. Coke aggregates were modeled by overlapping spheres in the DEM model. Coke/pitch mixtures were numerically created by adding 5, 10, 20, and 30 percent of coke aggregates of the size range of 0.297-0.595 mm (-30 + 50 mesh) to pitch. Adding up to 30% of coke aggregates to pitch can increase its complex shear modulus at 60 Hz from 273 Pa to 1557 Pa. Results also showed that adding coke particles increases both storage and loss moduli, while it does not have a meaningful effect on the phase angle of pitch.
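The quantities reported in this abstract (complex shear modulus, storage and loss moduli, phase angle) can be related to Burger's model analytically. The sketch below assumes the standard four-element form, a Maxwell unit (spring `g_m`, dashpot `eta_m`) in series with a Kelvin-Voigt unit (spring `g_k` in parallel with dashpot `eta_k`), so the complex compliances add. The parameter values are hypothetical and are not the calibrated values from the paper; nothing here uses the YADE API.

```python
import cmath
import math

def burgers_complex_modulus(omega, g_m, eta_m, g_k, eta_k):
    """Complex shear modulus G*(omega) of a four-element Burgers model.

    Series connection: J* = 1/g_m + 1/(i*omega*eta_m) + 1/(g_k + i*omega*eta_k),
    and G* = 1 / J*.
    """
    compliance = (1.0 / g_m
                  + 1.0 / (1j * omega * eta_m)
                  + 1.0 / (g_k + 1j * omega * eta_k))
    return 1.0 / compliance

# Hypothetical parameters (Pa and Pa*s), evaluated at 60 Hz.
OMEGA = 2.0 * math.pi * 60.0
g_star = burgers_complex_modulus(OMEGA, g_m=1000.0, eta_m=10.0, g_k=500.0, eta_k=5.0)
storage_modulus = g_star.real                          # G'
loss_modulus = g_star.imag                             # G''
magnitude = abs(g_star)                                # |G*|
phase_angle_deg = math.degrees(cmath.phase(g_star))    # delta
```

Fitting the four parameters to rheometer data would amount to minimizing the mismatch between measured and predicted moduli over the tested frequencies; at very high frequency the model reduces to the Maxwell spring, which gives a quick sanity check on any fitted values.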
A Bayesian Approach to Model Selection in Hierarchical Mixtures-of-Experts Architectures.
Tanner, Martin A.; Peng, Fengchun; Jacobs, Robert A.
1997-03-01
There does not exist a statistical model that shows good performance on all tasks. Consequently, the model selection problem is unavoidable; investigators must decide which model is best at summarizing the data for each task of interest. This article presents an approach to the model selection problem in hierarchical mixtures-of-experts architectures. These architectures combine aspects of generalized linear models with those of finite mixture models in order to perform tasks via a recursive "divide-and-conquer" strategy. Markov chain Monte Carlo methodology is used to estimate the distribution of the architectures' parameters. One part of our approach to model selection attempts to estimate the worth of each component of an architecture so that relatively unused components can be pruned from the architecture's structure. A second part of this approach uses a Bayesian hypothesis testing procedure in order to differentiate inputs that carry useful information from nuisance inputs. Simulation results suggest that the approach presented here adheres to the dictum of Occam's razor; simple architectures that are adequate for summarizing the data are favored over more complex structures. Copyright 1997 Elsevier Science Ltd. All Rights Reserved.
DEFF Research Database (Denmark)
Feng, Huan; Pettinari, Matteo; Stang, Henrik
2015-01-01
In this paper, the viscoelastic behavior of asphalt mixture was studied by using discrete element method. The dynamic properties of asphalt mixture were captured by implementing Burger’s contact model. Different ways of taking into account of the normal and shear material properties of asphalt mi...
Biesheuvel, P.M.; Lindhoud, S.; Vries, de R.J.; Stuart, M.A.C.
2006-01-01
We study the phase behavior of mixtures of oppositely charged nanoparticles, both theoretically and experimentally. As an experimental model system we consider mixtures of lysozyme and lysozyme that has been chemically modified in such a way that its charge is nearly equal in magnitude but opposite
A semi-nonparametric mixture model for selecting functionally consistent proteins.
Yu, Lianbo; Doerge, Rw
2010-09-28
High-throughput technologies have led to a new era of proteomics. Although protein microarray experiments are becoming more commonplace, there are a variety of experimental and statistical issues that have yet to be addressed, and that will carry over to new high-throughput technologies unless they are investigated. One of the largest of these challenges is the selection of functionally consistent proteins. We present a novel semi-nonparametric mixture model for classifying proteins as consistent or inconsistent while controlling the false discovery rate and the false non-discovery rate. The performance of the proposed approach is compared to current methods via simulation under a variety of experimental conditions. We provide a statistical method for selecting functionally consistent proteins in the context of protein microarray experiments, but the proposed semi-nonparametric mixture model method can certainly be generalized to solve other mixture data problems. The main advantage of this approach is that it provides the posterior probability of consistency for each protein.
International Nuclear Information System (INIS)
Fouque, A.L.; Ciuciu, Ph.; Risser, L.
2009-01-01
In this paper, a novel statistical parcellation of intra-subject functional MRI (fMRI) data is proposed. The key idea is to identify functionally homogeneous regions of interest from their hemodynamic parameters. To this end, a non-parametric voxel-based estimation of the hemodynamic response function is performed as a prerequisite. Then, the extracted hemodynamic features are entered as the input data of a Multivariate Spatial Gaussian Mixture Model (MSGMM) to be fitted. The goal of the spatial aspect is to favor the recovery of connected components in the mixture. Our statistical clustering approach is original in the sense that it extends existing works done on univariate spatially regularized Gaussian mixtures. A specific Gibbs sampler is derived to account for different covariance structures in the feature space. On realistic artificial fMRI datasets, it is shown that our algorithm is helpful for identifying a parsimonious functional parcellation required in the context of joint detection estimation of brain activity. This allows us to overcome the classical assumption of spatial stationarity of the BOLD signal model. (authors)
Mixtures of endocrine disrupting contaminants modelled on human high end exposures
DEFF Research Database (Denmark)
Christiansen, Sofie; Kortenkamp, A.; Petersen, Marta Axelstad
2012-01-01
exceeding 1 is expected to lead to effects in the rat, a total dose more than 62 times higher than human exposures should lead to responses. Considering the high uncertainty of this estimate, experience on lowest‐observed‐adverse‐effect‐level (LOAEL)/NOAEL ratios and statistical power of rat studies, we...... expected that combined doses 150 times higher than high end human intake estimates should give no, or only borderline effects, whereas doses 450 times higher should produce significant responses. Experiments indeed showed clear developmental toxicity of the 450‐fold dose in terms of increased nipple...... though each individual chemical is present at low, ineffective doses, but the effects of mixtures modelled based on human intakes have not previously been investigated. To address this issue for the first time, we selected 13 chemicals for a developmental mixture toxicity study in rats where data about...
Two-component mixture model: Application to palm oil and exchange rate
Phoong, Seuk-Yen; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-12-01
Palm oil is a seed-crop product that is widely used in food and non-food products such as cookies, vegetable oil, cosmetics and household products. It is grown mainly in Malaysia and Indonesia. However, demand for palm oil has been growing over the years while supply is rapidly running short. This phenomenon leads to illegal logging and the destruction of natural habitat. Hence, the present paper investigates the relationship between exchange rate and palm oil price in Malaysia by using maximum likelihood estimation via the Newton-Raphson algorithm to fit a two-component mixture model. In addition, this paper proposes a mixture of normal distributions to accommodate the asymmetry and platykurtic characteristics of the time series data.
Phase equilibria for mixtures containing nonionic surfactant systems: Modeling and experiments
International Nuclear Information System (INIS)
Shin, Moon Sam; Kim, Hwayong
2008-01-01
Surfactants are important materials with numerous applications in the cosmetic, pharmaceutical, and food industries due to inter-associating and intra-associating bonds. We present a lattice fluid equation-of-state that combines the quasi-chemical nonrandom lattice fluid model with Veytsman statistics for (intra + inter) molecular association to calculate phase behavior for mixtures containing nonionic surfactants. We also measured binary (vapor + liquid) equilibrium data for {2-butoxyethanol (C4E1) + n-hexane} and {2-butoxyethanol (C4E1) + n-heptane} systems at temperatures ranging from (303.15 to 323.15) K. A static apparatus was used in this study. The presented equation-of-state correlated well with the measured and published data for mixtures containing nonionic surfactant systems.
Generation of the reciprocal-binomial state for optical fields
International Nuclear Information System (INIS)
Valverde, C.; Avelar, A.T.; Baseia, B.; Malbouisson, J.M.C.
2003-01-01
We compare the efficiencies of two interesting schemes to generate truncated states of the light field in running modes, namely the 'quantum scissors' and the 'beam-splitter array' schemes. The latter is applied to create the reciprocal-binomial state as a travelling wave, required to implement recent experimental proposals of phase-distribution determination and of quantum lithography
Improved binomial charts for monitoring high-quality processes
Albers, Willem/Wim
2009-01-01
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Improved binomial charts for high-quality processes
Albers, Willem/Wim
For processes concerning attribute data with (very) small failure rate p, often negative binomial control charts are used. The decision whether to stop or continue is made each time r failures have occurred, for some r≥1. Finding the optimal r for detecting a given increase of p first requires
Calculation of generalized secant integral using binomial coefficients
International Nuclear Information System (INIS)
Guseinov, I.I.; Mamedov, B.A.
2004-01-01
A single series expansion relation is derived for the generalized secant (GS) integral in terms of binomial coefficients, exponential integrals and incomplete gamma functions. The convergence of the series is tested by the concrete cases of parameters. The formulas given in this study for the evaluation of GS integral show good rate of convergence and numerical stability
A Neutrosophic Binomial Factorial Theorem with their Refrains
Directory of Open Access Journals (Sweden)
Huda E. Khalid
2016-12-01
Full Text Available The Neutrosophic Precalculus and the Neutrosophic Calculus can be developed in many ways, depending on the types of indeterminacy one has and on the method used to deal with such indeterminacy. This article is innovative since the form of neutrosophic binomial factorial theorem was constructed in addition to its refrains.
A BAYESIAN NONPARAMETRIC MIXTURE MODEL FOR SELECTING GENES AND GENE SUBNETWORKS.
Zhao, Yize; Kang, Jian; Yu, Tianwei
2014-06-01
It is very challenging to select informative features from tens of thousands of measured features in high-throughput data analysis. Recently, several parametric/regression models have been developed utilizing the gene network information to select genes or pathways strongly associated with a clinical/biological outcome. Alternatively, in this paper, we propose a nonparametric Bayesian model for gene selection incorporating network information. In addition to identifying genes that have a strong association with a clinical outcome, our model can select genes with particular expressional behavior, in which case the regression models are not directly applicable. We show that our proposed model is equivalent to an infinity mixture model for which we develop a posterior computation algorithm based on Markov chain Monte Carlo (MCMC) methods. We also propose two fast computing algorithms that approximate the posterior simulation with good accuracy but relatively low computational cost. We illustrate our methods on simulation studies and the analysis of Spellman yeast cell cycle microarray data.
Beyond GLMs: a generative mixture modeling approach to neural system identification.
Directory of Open Access Journals (Sweden)
Lucas Theis
Full Text Available Generalized linear models (GLMs) represent a popular choice for the probabilistic characterization of neural spike responses. While GLMs are attractive for their computational tractability, they also impose strong assumptions and thus only allow for a limited range of stimulus-response relationships to be discovered. Alternative approaches exist that make only very weak assumptions but scale poorly to high-dimensional stimulus spaces. Here we seek an approach which can gracefully interpolate between the two extremes. We extend two frequently used special cases of the GLM (a linear and a quadratic model) by assuming that the spike-triggered and non-spike-triggered distributions can be adequately represented using Gaussian mixtures. Because we derive the model from a generative perspective, its components are easy to interpret as they correspond to, for example, the spike-triggered distribution and the interspike interval distribution. The model is able to capture complex dependencies on high-dimensional stimuli with far fewer parameters than other approaches such as histogram-based methods. The added flexibility comes at the cost of a non-concave log-likelihood. We show that in practice this does not have to be an issue and the mixture-based model is able to outperform generalized linear and quadratic models.
Hess, Julian; Wang, Yongqi
2016-11-01
A new mixture model for granular-fluid flows, which is thermodynamically consistent with the entropy principle, is presented. The extra pore pressure described by a pressure diffusion equation and the hypoplastic material behavior obeying a transport equation are taken into account. The model is applied to granular-fluid flows, using a closing assumption in conjunction with the dynamic fluid pressure to describe the pressure-like residual unknowns, hereby overcoming previous uncertainties in the modeling process. Besides the thermodynamically consistent modeling, numerical simulations are carried out and demonstrate physically reasonable results, including simple shear flow in order to investigate the vertical distribution of the physical quantities, and a mixture flow down an inclined plane by means of the depth-integrated model. Results presented give insight in the ability of the deduced model to capture the key characteristics of granular-fluid flows. We acknowledge the support of the Deutsche Forschungsgemeinschaft (DFG) for this work within the Project Number WA 2610/3-1.
An Odor Interaction Model of Binary Odorant Mixtures by a Partial Differential Equation Method
Directory of Open Access Journals (Sweden)
Luchun Yan
2014-07-01
Full Text Available A novel odor interaction model was proposed for binary mixtures of benzene and substituted benzenes by a partial differential equation (PDE) method. Based on the measurement method (tangent-intercept method) of partial molar volume, original parameters of corresponding formulas were reasonably displaced by perceptual measures. By these substitutions, it was possible to relate a mixture’s odor intensity to the individual odorant’s relative odor activity value (OAV). Several binary mixtures of benzene and substituted benzenes were respectively tested to establish the PDE models. The obtained results showed that the PDE model provided an easily interpretable method relating individual components to their joint odor intensity. Besides, both predictive performance and feasibility of the PDE model were proved well through a series of odor intensity matching tests. If the PDE model is combined with portable gas detectors or on-line monitoring systems, olfactory evaluation of odor intensity will be achieved by instruments instead of odor assessors. Many disadvantages (e.g., expense on a fixed number of odor assessors) also will be successfully avoided. Thus, the PDE model is predicted to be helpful to the monitoring and management of odor pollution.
Estimating demographic parameters using a combination of known-fate and open N-mixture models.
Schmidt, Joshua H; Johnson, Devin S; Lindberg, Mark S; Adams, Layne G
2015-10-01
Accurate estimates of demographic parameters are required to infer appropriate ecological relationships and inform management actions. Known-fate data from marked individuals are commonly used to estimate survival rates, whereas N-mixture models use count data from unmarked individuals to estimate multiple demographic parameters. However, a joint approach combining the strengths of both analytical tools has not been developed. Here we develop an integrated model combining known-fate and open N-mixture models, allowing the estimation of detection probability, recruitment, and the joint estimation of survival. We demonstrate our approach through both simulations and an applied example using four years of known-fate and pack count data for wolves (Canis lupus). Simulation results indicated that the integrated model reliably recovered parameters with no evidence of bias, and survival estimates were more precise under the joint model. Results from the applied example indicated that the marked sample of wolves was biased toward individuals with higher apparent survival rates than the unmarked pack mates, suggesting that joint estimates may be more representative of the overall population. Our integrated model is a practical approach for reducing bias while increasing precision and the amount of information gained from mark-resight data sets. We provide implementations in both the BUGS language and an R package.
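To make the N-mixture idea concrete, the sketch below evaluates the marginal likelihood of repeated counts at one site under the basic closed-population form: a latent abundance N drawn from a Poisson distribution, with each count binomial given N and a detection probability p. This is an assumption-laden simplification, not the integrated known-fate and open N-mixture model of the paper; the function names and the truncation bound `n_max` are hypothetical.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of n events with mean lam (lam > 0)."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_pmf(y, n, p):
    """Probability of detecting y of n individuals, each with probability p."""
    return math.comb(n, y) * p**y * (1.0 - p) ** (n - y)

def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site.

    The latent abundance N is summed out from max(counts) up to n_max,
    an assumed truncation bound that must be large relative to lam.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binomial_pmf(y, n, p)
        total += term
    return total

# With a single visit the marginal count is Poisson with mean lam * p
# (Poisson thinning), which gives a quick sanity check.
single_visit = site_likelihood([3], lam=5.0, p=0.4)
thinned = poisson_pmf(3, 5.0 * 0.4)
```

With only one visit, abundance and detection are confounded through the product lam * p; repeated counts at the same site are what allow the two to be separated.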
Manual hierarchical clustering of regional geochemical data using a Bayesian finite mixture model
International Nuclear Information System (INIS)
Ellefsen, Karl J.; Smith, David B.
2016-01-01
Interpretation of regional scale, multivariate geochemical data is aided by a statistical technique called “clustering.” We investigate a particular clustering procedure by applying it to geochemical data collected in the State of Colorado, United States of America. The clustering procedure partitions the field samples for the entire survey area into two clusters. The field samples in each cluster are partitioned again to create two subclusters, and so on. This manual procedure generates a hierarchy of clusters, and the different levels of the hierarchy show geochemical and geological processes occurring at different spatial scales. Although there are many different clustering methods, we use Bayesian finite mixture modeling with two probability distributions, which yields two clusters. The model parameters are estimated with Hamiltonian Monte Carlo sampling of the posterior probability density function, which usually has multiple modes. Each mode has its own set of model parameters; each set is checked to ensure that it is consistent both with the data and with independent geologic knowledge. The set of model parameters that is most consistent with the independent geologic knowledge is selected for detailed interpretation and partitioning of the field samples. - Highlights: • We evaluate a clustering procedure by applying it to geochemical data. • The procedure generates a hierarchy of clusters. • Different levels of the hierarchy show geochemical processes at different spatial scales. • The clustering method is Bayesian finite mixture modeling. • Model parameters are estimated with Hamiltonian Monte Carlo sampling.
Discrete Element Method Modeling of the Rheological Properties of Coke/Pitch Mixtures
Majidi, Behzad; Taghavi, Seyed Mohammad; Fafard, Mario; Ziegler, Donald P.; Alamdari, Houshang
2016-01-01
Rheological properties of pitch and pitch/coke mixtures at temperatures around 150 °C are of great interest for the carbon anode manufacturing process in the aluminum industry. In the present work, a cohesive viscoelastic contact model based on Burger’s model is developed using the discrete element method (DEM) on the YADE, the open-source DEM software. A dynamic shear rheometer (DSR) is used to measure the viscoelastic properties of pitch at 150 °C. The experimental data obtained is then use...
Nguyen, Hien D; Ullmann, Jeremy F P; McLachlan, Geoffrey J; Voleti, Venkatakaushik; Li, Wenze; Hillman, Elizabeth M C; Reutens, David C; Janke, Andrew L
2018-02-01
Calcium is a ubiquitous messenger in neural signaling events. An increasing number of techniques are enabling visualization of neurological activity in animal models via luminescent proteins that bind to calcium ions. These techniques generate large volumes of spatially correlated time series. A model-based functional data analysis methodology via Gaussian mixtures is proposed for the clustering of data from such visualizations. The methodology is theoretically justified and a computationally efficient approach to estimation is suggested. An example analysis of a zebrafish imaging experiment is presented.
Genome-enabled predictions for binomial traits in sugar beet populations.
Biscarini, Filippo; Stevanato, Piergiorgio; Broccanello, Chiara; Stella, Alessandra; Saccomani, Massimo
2014-07-22
Genomic information can be used to predict not only continuous but also categorical (e.g. binomial) traits. Several traits of interest in human medicine and agriculture present a discrete distribution of phenotypes (e.g. disease status). Root vigor in sugar beet (B. vulgaris) is an example of binomial trait of agronomic importance. In this paper, a panel of 192 SNPs (single nucleotide polymorphisms) was used to genotype 124 sugar beet individual plants from 18 lines, and to classify them as showing "high" or "low" root vigor. A threshold model was used to fit the relationship between binomial root vigor and SNP genotypes, through the matrix of genomic relationships between individuals in a genomic BLUP (G-BLUP) approach. From a 5-fold cross-validation scheme, 500 testing subsets were generated. The estimated average cross-validation error rate was 0.000731 (0.073%). Only 9 out of 12326 test observations (500 replicates for an average test set size of 24.65) were misclassified. The estimated prediction accuracy was quite high. Such accurate predictions may be related to the high estimated heritability for root vigor (0.783) and to the few genes with large effect underlying the trait. Despite the sparse SNP panel, there was sufficient within-scaffold LD where SNPs with large effect on root vigor were located to allow for genome-enabled predictions to work.
Tollerup, Kris E; Marcum, Daniel; Wilson, Rob; Godfrey, Larry
2013-08-01
The two-spotted spider mite, Tetranychus urticae Koch, is an economic pest on peppermint [Mentha x piperita (L.), 'Black Mitcham'] grown in California. A sampling plan for T. urticae was developed under Pacific Northwest conditions in the early 1980s and has been used by California growers since approximately 1998. This sampling plan, however, is cumbersome and a poor predictor of T. urticae densities in California. Between June and August, the numbers of immature and adult T. urticae were counted on leaves at three commercial peppermint fields (sites) in 2010 and a single field in 2011. In each of seven locations per site, 45 leaves were sampled, that is, 9 leaves per five stems. Leaf samples were stratified by collecting three leaves from the top, middle, and bottom strata per stem. The on-plant distribution of T. urticae did not significantly differ among the stem strata through the growing season. Binomial and enumerative sampling plans were developed using generic Taylor's power law coefficient values. The best fit of our data for binomial sampling occurred using a tally threshold of T = 0. The optimum number of leaves required for T. urticae at the critical density of five mites per leaf was 20 for the binomial and 23 for the enumerative sampling plans, respectively. Sampling models were validated using Resampling for Validation of Sampling Plan Software.
International Nuclear Information System (INIS)
Finne, E.F.; Cooper, G.A.; Koop, B.F.; Hylland, K.; Tollefsen, K.E.
2007-01-01
As more salmon gene expression data has become available, the cDNA microarray platform has emerged as an appealing alternative in ecotoxicological screening of single chemicals and environmental samples relevant to the aquatic environment. This study was performed to validate biomarker gene responses of in vitro cultured rainbow trout (Oncorhynchus mykiss) hepatocytes exposed to model chemicals, and to investigate effects of mixture toxicity in a synthetic mixture. Chemicals used for 24 h single chemical- and mixture exposures were 10 nM 17α-ethinylestradiol (EE2), 0.75 nM 2,3,7,8-tetrachloro-di-benzodioxin (TCDD), 100 μM paraquat (PQ) and 0.75 μM 4-nitroquinoline-1-oxide (NQO). RNA was isolated from exposed cells, DNAse treated and quality controlled before cDNA synthesis, fluorescent labelling and hybridisation to a 16k salmonid microarray. The salmonid 16k cDNA array identified differential gene expression predictive of exposure, which could be verified by quantitative real time PCR. More precisely, the responses of biomarker genes such as cytochrome p4501A and UDP-glucuronosyl transferase to TCDD exposure, glutathione reductase and gammaglutamyl cysteine synthetase to paraquat exposure, as well as vitellogenin and vitelline envelope protein to EE2 exposure validated the use of microarray applied to RNA extracted from in vitro exposed hepatocytes. The mutagenic compound NQO did not result in any change in gene expression. Results from exposure to a synthetic mixture of the same four chemicals, using identical concentrations as for single chemical exposures, revealed combined effects that were not predicted by results for individual chemicals alone. In general, the response of exposure to this mixture led to an average loss of approximately 60% of the transcriptomic signature found for single chemical exposure. The present findings show that microarray analyses may contribute to our mechanistic understanding of single contaminant mode of action as well as
Energy Technology Data Exchange (ETDEWEB)
Finne, E.F. [Norwegian Institute for Water Research, Gaustadalleen 21, N-0349 Oslo (Norway) and University of Oslo, Department of Biology, P.O. Box 1066, Blindern, N-0316 Oslo (Norway)]. E-mail: eivind.finne@niva.no; Cooper, G.A. [Centre for Biomedical Research, University of Victoria, BC V8P5C2 (Canada); Koop, B.F. [Centre for Biomedical Research, University of Victoria, BC V8P5C2 (Canada); Hylland, K. [Norwegian Institute for Water Research, Gaustadalleen 21, N-0349 Oslo (Norway); University of Oslo, Department of Biology, P.O. Box 1066, Blindern, N-0316 Oslo (Norway); Tollefsen, K.E. [Norwegian Institute for Water Research, Gaustadalleen 21, N-0349 Oslo (Norway)
2007-03-10
As more salmon gene expression data has become available, the cDNA microarray platform has emerged as an appealing alternative in ecotoxicological screening of single chemicals and environmental samples relevant to the aquatic environment. This study was performed to validate biomarker gene responses of in vitro cultured rainbow trout (Oncorhynchus mykiss) hepatocytes exposed to model chemicals, and to investigate effects of mixture toxicity in a synthetic mixture. Chemicals used for 24 h single chemical- and mixture exposures were 10 nM 17α-ethinylestradiol (EE2), 0.75 nM 2,3,7,8-tetrachloro-di-benzodioxin (TCDD), 100 μM paraquat (PQ) and 0.75 μM 4-nitroquinoline-1-oxide (NQO). RNA was isolated from exposed cells, DNAse treated and quality controlled before cDNA synthesis, fluorescent labelling and hybridisation to a 16k salmonid microarray. The salmonid 16k cDNA array identified differential gene expression predictive of exposure, which could be verified by quantitative real time PCR. More precisely, the responses of biomarker genes such as cytochrome p4501A and UDP-glucuronosyl transferase to TCDD exposure, glutathione reductase and gammaglutamyl cysteine synthetase to paraquat exposure, as well as vitellogenin and vitelline envelope protein to EE2 exposure validated the use of microarray applied to RNA extracted from in vitro exposed hepatocytes. The mutagenic compound NQO did not result in any change in gene expression. Results from exposure to a synthetic mixture of the same four chemicals, using identical concentrations as for single chemical exposures, revealed combined effects that were not predicted by results for individual chemicals alone. In general, the response of exposure to this mixture led to an average loss of approximately 60% of the transcriptomic signature found for single chemical exposure. The present findings show that microarray analyses may contribute to our mechanistic understanding of single contaminant mode of action as
Vakanski, A; Ferguson, J M; Lee, S
2016-12-01
The objective of the proposed research is to develop a methodology for modeling and evaluation of human motions, which will potentially benefit patients undertaking a physical rehabilitation therapy (e.g., following a stroke or due to other medical conditions). The ultimate aim is to allow patients to perform home-based rehabilitation exercises using a sensory system for capturing the motions, where an algorithm will retrieve the trajectories of a patient's exercises, will perform data analysis by comparing the performed motions to a reference model of prescribed motions, and will send the analysis results to the patient's physician with recommendations for improvement. The modeling approach employs an artificial neural network, consisting of layers of recurrent neuron units and layers of neuron units for estimating a mixture density function over the spatio-temporal dependencies within the human motion sequences. Input data are sequences of motions related to a prescribed exercise by a physiotherapist to a patient, and recorded with a motion capture system. An autoencoder subnet is employed for reducing the dimensionality of captured sequences of human motions, complemented with a mixture density subnet for probabilistic modeling of the motion data using a mixture of Gaussian distributions. The proposed neural network architecture produced a model for sets of human motions represented with a mixture of Gaussian density functions. The mean log-likelihood of observed sequences was employed as a performance metric in evaluating the consistency of a subject's performance relative to the reference dataset of motions. A publically available dataset of human motions captured with Microsoft Kinect was used for validation of the proposed method. The article presents a novel approach for modeling and evaluation of human motions with a potential application in home-based physical therapy and rehabilitation. The described approach employs the recent progress in the field of
Stochastic analysis of complex reaction networks using binomial moment equations.
Barzel, Baruch; Biham, Ofer
2012-09-01
The stochastic analysis of complex reaction networks is a difficult problem because the number of microscopic states in such systems increases exponentially with the number of reactive species. Direct integration of the master equation is thus infeasible and is most often replaced by Monte Carlo simulations. While Monte Carlo simulations are a highly effective tool, equation-based formulations are more amenable to analytical treatment and may provide deeper insight into the dynamics of the network. Here, we present a highly efficient equation-based method for the analysis of stochastic reaction networks. The method is based on the recently introduced binomial moment equations [Barzel and Biham, Phys. Rev. Lett. 106, 150602 (2011)]. The binomial moments are linear combinations of the ordinary moments of the probability distribution function of the population sizes of the interacting species. They capture the essential combinatorics of the reaction processes reflecting their stoichiometric structure. This leads to a simple and transparent form of the equations, and allows a highly efficient and surprisingly simple truncation scheme. Unlike ordinary moment equations, in which the inclusion of high order moments is prohibitively complicated, the binomial moment equations can be easily constructed up to any desired order. The result is a set of equations that enables the stochastic analysis of complex reaction networks under a broad range of conditions. The number of equations is dramatically reduced from the exponential proliferation of the master equation to a polynomial (and often quadratic) dependence on the number of reactive species in the binomial moment equations. The aim of this paper is twofold: to present a complete derivation of the binomial moment equations; to demonstrate the applicability of the moment equations for a representative set of example networks, in which stochastic effects play an important role.
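The statement that binomial moments are linear combinations of ordinary moments can be checked numerically. The sketch below assumes the binomial moment of order k is the expectation of the binomial coefficient C(N, k) for a population size N, and uses a Poisson distribution purely as a convenient test case; the function names and the truncation bound `n_max` are hypothetical and not taken from the paper.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of population size n with mean lam (lam > 0)."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_moment(pmf, k, n_max=200):
    """E[C(N, k)] for a distribution given by pmf, truncated at n_max."""
    return sum(math.comb(n, k) * pmf(n) for n in range(k, n_max + 1))

def ordinary_moment(pmf, k, n_max=200):
    """E[N**k] for a distribution given by pmf, truncated at n_max."""
    return sum(n**k * pmf(n) for n in range(n_max + 1))

lam = 3.0
pmf = lambda n: poisson_pmf(n, lam)

# C(N, 2) = (N**2 - N) / 2, so the second binomial moment is a linear
# combination of the first two ordinary moments.
b2 = binomial_moment(pmf, 2)
b2_from_ordinary = (ordinary_moment(pmf, 2) - ordinary_moment(pmf, 1)) / 2.0

# For a Poisson distribution, E[C(N, k)] = lam**k / k!.
b3 = binomial_moment(pmf, 3)
b3_closed_form = lam**3 / math.factorial(3)
```

The same identity extends to higher orders, since C(N, k) is a polynomial of degree k in N.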
Lara, Jesus R; Hoddle, Mark S
2015-08-01
Oligonychus perseae Tuttle, Baker, & Abatiello is a foliar pest of 'Hass' avocados [Persea americana Miller (Lauraceae)]. The recommended action threshold is 50-100 motile mites per leaf, but this count range and other ecological factors associated with O. perseae infestations limit the application of enumerative sampling plans in the field. Consequently, a comprehensive modeling approach was implemented to compare the practical application of various binomial sampling models for decision-making of O. perseae in California. An initial set of sequential binomial sampling models were developed using three mean-proportion modeling techniques (i.e., Taylor's power law, maximum likelihood, and an empirical model) in combination with two-leaf infestation tally thresholds of either one or two mites. Model performance was evaluated using a robust mite count database consisting of >20,000 Hass avocado leaves infested with varying densities of O. perseae and collected from multiple locations. Operating characteristic and average sample number results for sequential binomial models were used as the basis to develop and validate a standardized fixed-size binomial sampling model with guidelines on sample tree and leaf selection within blocks of avocado trees. This final validated model requires a leaf sampling cost of 30 leaves and takes into account the spatial dynamics of O. perseae to make reliable mite density classifications for a 50-mite action threshold. Recommendations for implementing this fixed-size binomial sampling plan to assess densities of O. perseae in commercial California avocado orchards are discussed. © The Authors 2015. Published by Oxford University Press on behalf of Entomological Society of America. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To explore the application of negative binomial regression and modified Poisson regression analysis in analyzing the influential factors for injury frequency and the risk factors leading to the increase of injury frequency. 2917 primary and secondary school students were selected from Hefei by cluster random sampling method and surveyed by questionnaire. The data on the count event-based injuries were used to fit modified Poisson regression and negative binomial regression models. The risk factors incurring the increase of unintentional injury frequency for juvenile students were explored, so as to probe the efficiency of these two models in studying the influential factors for injury frequency. The Poisson model existed over-dispersion (P Poisson regression and negative binomial regression model, was fitted better. respectively. Both showed that male gender, younger age, father working outside of the hometown, the level of the guardian being above junior high school and smoking might result in higher injury frequencies. For injury event frequency data that tend to cluster, both the modified Poisson regression analysis and negative binomial regression analysis can be used. However, based on our data, the modified Poisson regression fitted better and this model could give a more accurate interpretation of relevant factors affecting the frequency of injury.
Sworn testimony of the model evidence: Gaussian Mixture Importance (GAME) sampling
Volpi, Elena; Schoups, Gerrit; Firmani, Giovanni; Vrugt, Jasper A.
2017-07-01
What is the "best" model? The answer to this question lies in part in the eyes of the beholder, nevertheless a good model must blend rigorous theory with redeeming qualities such as parsimony and quality of fit. Model selection is used to make inferences, via weighted averaging, from a set of K candidate models, $M_k$; $k = (1, \ldots, K)$, and help identify which model is most supported by the observed data, $\tilde{Y} = (\tilde{y}_1, \ldots, \tilde{y}_n)$. Here, we introduce a new and robust estimator of the model evidence, $p(\tilde{Y}|M_k)$, which acts as normalizing constant in the denominator of Bayes' theorem and provides a single quantitative measure of relative support for each hypothesis that integrates model accuracy, uncertainty, and complexity. However, $p(\tilde{Y}|M_k)$ is analytically intractable for most practical modeling problems. Our method, coined GAussian Mixture importancE (GAME) sampling, uses bridge sampling of a mixture distribution fitted to samples of the posterior model parameter distribution derived from MCMC simulation. We benchmark the accuracy and reliability of GAME sampling by application to a diverse set of multivariate target distributions (up to 100 dimensions) with known values of $p(\tilde{Y}|M_k)$ and to hypothesis testing using numerical modeling of the rainfall-runoff transformation of the Leaf River watershed in Mississippi, USA. These case studies demonstrate that GAME sampling provides robust and unbiased estimates of the evidence at a relatively small computational cost outperforming commonly used estimators. The GAME sampler is implemented in the MATLAB package of DREAM and simplifies considerably scientific inquiry through hypothesis testing and model selection.
Evaluation of Thermodynamic Models for Predicting Phase Equilibria of CO2 + Impurity Binary Mixture
Shin, Byeong Soo; Rho, Won Gu; You, Seong-Sik; Kang, Jeong Won; Lee, Chul Soo
2018-03-01
For the design and operation of CO2 capture and storage (CCS) processes, equation of state (EoS) models are used for phase equilibrium calculations. Reliability of an EoS model plays a crucial role, and many variations of EoS models have been reported and continue to be published. The prediction of phase equilibria for CO2 mixtures containing SO2, N2, NO, H2, O2, CH4, H2S, Ar, and H2O is important for CO2 transportation because the captured gas normally contains small amounts of impurities even though it is purified in advance. For the design of pipelines in deep sea or arctic conditions, flow assurance and safety are considered priority issues, and highly reliable calculations are required. In this work, predictive Soave-Redlich-Kwong, cubic plus association, Groupe Européen de Recherches Gazières (GERG-2008), perturbed-chain statistical associating fluid theory, and non-random lattice fluids hydrogen bond EoS models were compared regarding performance in calculating phase equilibria of CO2-impurity binary mixtures and with the collected literature data. No single EoS could cover the entire range of systems considered in this study. Weaknesses and strong points of each EoS model were analyzed, and recommendations are given as guidelines for safe design and operation of CCS processes.
Lee, Soojeong; Rajan, Sreeraman; Jeon, Gwanggil; Chang, Joon-Hyuk; Dajani, Hilmi R; Groza, Voicu Z
2017-06-01
Blood pressure (BP) is one of the most important vital indicators and plays a key role in determining the cardiovascular activity of patients. This paper proposes a hybrid approach consisting of nonparametric bootstrap (NPB) and machine learning techniques to obtain the characteristic ratios (CR) used in the blood pressure estimation algorithm, to improve the accuracy of systolic blood pressure (SBP) and diastolic blood pressure (DBP) estimates, and to obtain confidence intervals (CI). The NPB technique is used to circumvent the requirement for a large sample set for obtaining the CI. A mixture of Gaussian densities is assumed for the CRs, and a Gaussian mixture model (GMM) is chosen to estimate the SBP and DBP ratios. The K-means clustering technique is used to obtain the mixture order of the Gaussian densities. The proposed approach achieves grade "A" under the British Society of Hypertension testing protocol and is superior to the conventional approach based on the maximum amplitude algorithm (MAA), which uses fixed CRs. The proposed approach also yields a lower mean error (ME) and standard deviation of the error (SDE) in the estimates when compared to the conventional MAA method. In addition, CIs obtained through the proposed hybrid approach are also narrower with a lower SDE. The proposed approach combining the NPB technique with the GMM provides a methodology to derive individualized characteristic ratios. The results show that the proposed approach enhances the accuracy of SBP and DBP estimation and provides narrower confidence intervals for the estimates.
Use of finite mixture distribution models in the analysis of wind energy in the Canarian Archipelago
International Nuclear Information System (INIS)
Carta, Jose Antonio; Ramirez, Penelope
2007-01-01
The statistical characteristics of hourly mean wind speed data recorded at 16 weather stations located in the Canarian Archipelago are analyzed in this paper. As a result of this analysis we see that the typical two-parameter Weibull wind speed distribution (W-pdf) does not accurately represent all wind regimes observed in that region. However, a Singly Truncated from below Normal Weibull mixture distribution (TNW-pdf) and a two-component mixture Weibull distribution (WW-pdf) developed here do provide very good fits for both unimodal and bimodal wind speed frequency distributions observed in that region and yield smaller relative errors in determining the annual mean wind power density. The parameters of the distributions are estimated using the least squares method, which is solved in this paper using the Levenberg-Marquardt algorithm. The suitability of the distributions is judged from the probability plot correlation coefficient, $R^2$, adjusted for degrees of freedom. Based on the results obtained, we conclude that the two mixture distributions proposed here provide very flexible models for wind speed studies and can be applied in a widespread manner to represent the wind regimes in the Canarian archipelago and in other regions with similar characteristics. The TNW-pdf takes into account the frequency of null winds, whereas the WW-pdf and W-pdf do not. It can, therefore, better represent wind regimes with high percentages of null wind speeds. However, calculation of the TNW-pdf is markedly slower.
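As a rough illustration of how a two-component Weibull mixture (the WW-pdf above) feeds into the mean wind power density, the sketch below uses the standard relations $P = \tfrac{1}{2}\rho\,E[v^3]$ and, for a Weibull distribution with shape $k$ and scale $c$, $E[v^3] = c^3\,\Gamma(1 + 3/k)$. The parameter values, the air density of 1.225 kg/m³, and the function names are assumptions for illustration and are not taken from the paper.

```python
import math

AIR_DENSITY = 1.225  # kg/m^3, assumed standard sea-level value


def weibull_pdf(v, shape, scale):
    """Two-parameter Weibull density; this sketch returns 0 for v <= 0."""
    if v <= 0.0:
        return 0.0
    r = v / scale
    return (shape / scale) * r ** (shape - 1.0) * math.exp(-(r ** shape))


def mixture_pdf(v, weight, comp1, comp2):
    """Two-component Weibull mixture; comp = (shape, scale)."""
    return weight * weibull_pdf(v, *comp1) + (1.0 - weight) * weibull_pdf(v, *comp2)


def weibull_third_moment(shape, scale):
    """E[v^3] for a Weibull distribution: c^3 * Gamma(1 + 3/k)."""
    return scale ** 3 * math.gamma(1.0 + 3.0 / shape)


def mixture_power_density(weight, comp1, comp2, rho=AIR_DENSITY):
    """Mean wind power density (W/m^2) implied by the mixture."""
    third_moment = (weight * weibull_third_moment(*comp1)
                    + (1.0 - weight) * weibull_third_moment(*comp2))
    return 0.5 * rho * third_moment


# Hypothetical bimodal regime: a calm component and a windy component (m/s).
print(mixture_power_density(0.4, (2.0, 4.0), (3.5, 10.0)))
```

Because the third moment of a mixture is the weighted sum of the component moments, no numerical integration is needed once the mixture parameters have been fitted.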
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced $R^2$ from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
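One plausible reading of the per-family binomial analysis is an exact tail probability under the null that every family has the flora-wide proportion of medicinal species. The sketch below follows that reading; the authors' actual model may differ, and all counts and names are hypothetical.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k): evidence that a family is over-represented."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def lower_tail(k, n, p):
    """P(X <= k): evidence that a family is under-represented."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))


# Hypothetical flora: 3000 species, 600 of them medicinal.
flora_species = 3000
medicinal_species = 600
p_null = medicinal_species / flora_species

# Hypothetical family: 40 species, 15 used medicinally (8 expected under the null).
family_size = 40
family_medicinal = 15
p_over = upper_tail(family_medicinal, family_size, p_null)
print(p_over)
```

When many families are tested at once, as with the 115 angiosperm families mentioned above, some correction for multiple comparisons would normally be applied to these tail probabilities.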
Examining the cost efficiency of Chinese hydroelectric companies using a finite mixture model
International Nuclear Information System (INIS)
Barros, Carlos Pestana; Chen, Zhongfei; Managi, Shunsuke; Antunes, Olinda Sequeira
2013-01-01
This paper evaluates the operational activities of Chinese hydroelectric power companies over the period 2000–2010 using a finite mixture model that controls for unobserved heterogeneity. In so doing, a stochastic frontier latent class model, which allows for the existence of different technologies, is adopted to estimate cost frontiers. This procedure not only enables us to identify different groups among the hydro-power companies analysed, but also permits the analysis of their cost efficiency. The main result is that three groups are identified in the sample, each equipped with different technologies, suggesting that distinct business strategies need to be adapted to the characteristics of China's hydro-power companies. Some managerial implications are developed. - Highlights: ► This paper evaluates the operational activities of Chinese hydroelectric companies. ► This study analyses data from 2000 to 2010 using a finite mixture model. ► The model procedure identifies different groups among the Chinese hydroelectric companies analysed. ► Three groups are identified in the sample, each equipped with completely different “technologies”. ► This suggests that distinct business strategies need to be adapted to the characteristics of the hydroelectric companies.
A mixture model for robust point matching under multi-layer motion.
Directory of Open Access Journals (Sweden)
Jiayi Ma
This paper proposes an efficient mixture model for establishing robust point correspondences between two sets of points under multi-layer motion. Our algorithm starts by creating a set of putative correspondences which can contain a number of false correspondences, or outliers, in addition to the true correspondences (inliers). Next we solve for correspondence by interpolating a set of spatial transformations on the putative correspondence set based on a mixture model, which involves estimating a consensus of inlier points whose matching follows a non-parametric geometrical constraint. We formulate this as a maximum a posteriori (MAP) estimation of a Bayesian model with hidden/latent variables indicating whether matches in the putative set are outliers or inliers. We impose non-parametric geometrical constraints on the correspondence, as a prior distribution, in a reproducing kernel Hilbert space (RKHS). MAP estimation is performed by the EM algorithm which, by also estimating the variance of the prior model (initialized to a large value), is able to obtain good estimates very quickly (e.g., avoiding many of the local minima inherent in this formulation). We further provide a fast implementation based on sparse approximation which can achieve a significant speed-up without much performance degradation. We illustrate the proposed method on 2D and 3D real images for sparse feature correspondence, as well as a publicly available dataset for shape matching. The quantitative results demonstrate that our method is robust to non-rigid deformation and multi-layer/large discontinuous motion.
N-mix for fish: estimating riverine salmonid habitat selection via N-mixture models
Som, Nicholas A.; Perry, Russell W.; Jones, Edward C.; De Juilio, Kyle; Petros, Paul; Pinnix, William D.; Rupert, Derek L.
2018-01-01
Models that formulate mathematical linkages between fish use and habitat characteristics are applied for many purposes. For riverine fish, these linkages are often cast as resource selection functions with variables including depth and velocity of water and distance to nearest cover. Ecologists are now recognizing the role that detection plays in observing organisms, and failure to account for imperfect detection can lead to spurious inference. Herein, we present a flexible N-mixture model to associate habitat characteristics with the abundance of riverine salmonids that simultaneously estimates detection probability. Our formulation has the added benefits of accounting for demographic variation and generating probabilistic statements regarding intensity of habitat use. In addition to the conceptual benefits, model application to data from the Trinity River, California, yields interesting results. Detection was estimated to vary among surveyors, but there was little spatial or temporal variation. Additionally, a weaker effect of water depth on resource selection is estimated than that reported by previous studies not accounting for detection probability. N-mixture models show great promise for applications to riverine resource selection.
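The core of a basic N-mixture model can be sketched as follows, assuming the simplest textbook form: latent abundance at a site is Poisson with mean `lam`, and each repeated count is a binomial draw from that abundance with detection probability `p`. The covariate structure (depth, velocity, cover, surveyor effects) described in the abstract is omitted, and the counts, parameter values, and truncation bound `n_max` are hypothetical.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability, computed on the log scale for stability."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binom_pmf(y, n, p):
    """Probability of detecting y of n individuals with detection probability p."""
    return math.comb(n, y) * p ** y * (1.0 - p) ** (n - y)


def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site.

    The latent abundance N is summed out from max(counts) up to n_max,
    an assumed truncation bound that must sit well above plausible N.
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        detect = 1.0
        for y in counts:
            detect *= binom_pmf(y, n, p)
        total += poisson_pmf(n, lam) * detect
    return total


def log_likelihood(sites, lam, p):
    """Sum of log marginal likelihoods over independent sites."""
    return sum(math.log(site_likelihood(counts, lam, p)) for counts in sites)


# Hypothetical repeated snorkel counts at three sites.
sites = [[2, 3, 1], [0, 1, 0], [5, 4, 6]]
print(log_likelihood(sites, lam=6.0, p=0.5))
```

Repeated counts at the same site are what separate abundance from detection: with a single visit, only the product `lam * p` is identifiable, since a binomially thinned Poisson count is again Poisson.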
Spatial Mixture Modelling for Unobserved Point Processes: Examples in Immunofluorescence Histology.
Ji, Chunlin; Merl, Daniel; Kepler, Thomas B; West, Mike
2009-12-04
We discuss Bayesian modelling and computational methods in analysis of indirectly observed spatial point processes. The context involves noisy measurements on an underlying point process that provide indirect and noisy data on locations of point outcomes. We are interested in problems in which the spatial intensity function may be highly heterogeneous, and so is modelled via flexible nonparametric Bayesian mixture models. Analysis aims to estimate the underlying intensity function and the abundance of realized but unobserved points. Our motivating applications involve immunological studies of multiple fluorescent intensity images in sections of lymphatic tissue where the point processes represent geographical configurations of cells. We are interested in estimating intensity functions and cell abundance for each of a series of such data sets to facilitate comparisons of outcomes at different times and with respect to differing experimental conditions. The analysis is heavily computational, utilizing recently introduced MCMC approaches for spatial point process mixtures and extending them to the broader new context here of unobserved outcomes. Further, our example applications are problems in which the individual objects of interest are not simply points, but rather small groups of pixels; this implies a need to work at an aggregate pixel region level and we develop the resulting novel methodology for this. Two examples with immunofluorescence histology data demonstrate the models and computational methodology.
Regional SAR Image Segmentation Based on Fuzzy Clustering with Gamma Mixture Model
Li, X. L.; Zhao, Q. H.; Li, Y.
2017-09-01
Most stochastic fuzzy clustering algorithms are pixel-based and cannot effectively overcome the inherent speckle noise in SAR images. To deal with this problem, a regional SAR image segmentation algorithm based on fuzzy clustering with a Gamma mixture model is proposed in this paper. First, some generating points are initialized randomly on the image, and the image domain is divided into many sub-regions using the Voronoi tessellation technique. Each sub-region is regarded as a homogeneous area in which the pixels share the same cluster label. Then, the probability of each pixel is assumed to follow a Gamma mixture model with parameters corresponding to the cluster to which the pixel belongs. The negative logarithm of the probability represents the dissimilarity measure between the pixel and the cluster. The regional dissimilarity measure of a sub-region is defined as the sum of the measures of the pixels in the region. Furthermore, the Markov Random Field (MRF) model is extended from the pixel level to Voronoi sub-regions, and the regional objective function is then established under the framework of fuzzy clustering. The optimal segmentation results can be obtained by solving for the model parameters and generating points. Finally, the effectiveness of the proposed algorithm is demonstrated by qualitative and quantitative analysis of the segmentation results for simulated and real SAR images.
Segmentation and intensity estimation of microarray images using a gamma-t mixture model.
Baek, Jangsun; Son, Young Sook; McLachlan, Geoffrey J
2007-02-15
We present a new approach to the analysis of images for complementary DNA microarray experiments. The image segmentation and intensity estimation are performed simultaneously by adopting a two-component mixture model. One component of this mixture corresponds to the distribution of the background intensity, while the other corresponds to the distribution of the foreground intensity. The intensity measurement is a bivariate vector consisting of red and green intensities. The background intensity component is modeled by the bivariate gamma distribution, whose marginal densities for the red and green intensities are independent three-parameter gamma distributions with different parameters. The foreground intensity component is taken to be the bivariate t distribution, with the constraint that the mean of the foreground is greater than that of the background for each of the two colors. The degrees of freedom of this t distribution are inferred from the data but they could be specified in advance to reduce the computation time. Also, the covariance matrix is not restricted to being diagonal and so it allows for nonzero correlation between R and G foreground intensities. This gamma-t mixture model is fitted by maximum likelihood via the EM algorithm. A final step is executed whereby nonparametric (kernel) smoothing is undertaken of the posterior probabilities of component membership. The main advantages of this approach are: (1) it enjoys the well-known strengths of a mixture model, namely flexibility and adaptability to the data; (2) it considers the segmentation and intensity simultaneously and not separately as in commonly used existing software, and it also works with the red and green intensities in a bivariate framework as opposed to their separate estimation via univariate methods; (3) the use of the three-parameter gamma distribution for the background red and green intensities provides a much better fit than the normal (log normal) or t distributions; (4) the
International Nuclear Information System (INIS)
Fabrice Malet; Nathalie Lamoureux; Nabiha Djebaili-Chaumeix; Claude-Etienne Paillard; Pierre Pailhories; Jean-Pierre L'heriteau; Bernard Chaumont; Ahmed Bentaib
2005-01-01
In the case of a hypothetical severe accident in a light water nuclear reactor, hydrogen would be produced during reactor core degradation and released to the reactor building, which could subsequently raise a combustion hazard. A local ignition of the combustible mixture would initially give birth to a slow flame, which can be accelerated by turbulence. Depending on the geometry and the premixed combustible mixture composition, the flame can accelerate and, under some conditions, transition to detonation or be quenched after a certain distance. The flame acceleration is responsible for the generation of high pressure loads that could damage the reactor building. Moreover, geometrical configuration is a major factor leading to flame acceleration. Thus, recording experimental data, notably on mid-size installations, is required to validate numerical simulations before modelling realistic scales. The ENACCEF vertical facility is a 6 m high acceleration tube intended to represent a steam generator room leading to the containment dome. This setup can be equipped with obstacles of different blockage ratios and shapes in order to accelerate the flame. Depending on the geometrical characteristics of these obstacles, different regimes of flame propagation can be achieved. The influence of the mixture composition on flame velocity and acceleration has been investigated. Using a diluent with steam-like physical properties (40% He - 60% CO2), the influence of dilution on flame speed and acceleration has been investigated. The flame front has also been recorded with ultra-fast ombroscopy (shadowgraph) visualization, both in the tube and at the entrance to the dome. The flame propagation is computed using the TONUS code. Based on an Euler equation solver using structured finite volumes, it includes the CREBCOM flame model and simulates hydrogen/air turbulent flame propagation, taking into account 3D complex geometry and reactant concentration gradients. Since
Directory of Open Access Journals (Sweden)
Hainian Wang
2014-02-01
X-ray CT (computed tomography) was used to scan an asphalt mixture specimen to obtain high-resolution continuous cross-section images and the meso-structure. According to the theory of three-dimensional (3D) reconstruction, the 3D reconstruction algorithm was investigated in this paper. The key to the reconstruction technique is the acquisition of the voxel positions and the relationship between the pixel element and node. A three-dimensional numerical model of the asphalt mixture specimen was created by a self-developed program. A splitting test was conducted to predict the stress distributions of the asphalt mixture and verify the rationality of the 3D model.
Mixed Platoon Flow Dispersion Model Based on Speed-Truncated Gaussian Mixture Distribution
Directory of Open Access Journals (Sweden)
Weitiao Wu
2013-01-01
Urban arterials in China exhibit mixed traffic flow because of the large number of buses. Based on field data, a macroscopic mixed platoon flow dispersion model (MPFDM) was proposed to simulate the platoon dispersion process along the road section between two adjacent intersections from the flow point of view. To match field observations more closely, a truncated Gaussian mixture distribution was adopted as the speed density distribution for the mixed platoon. The expectation-maximization (EM) algorithm was used for parameter estimation. The relationship between the arriving flow distribution at the downstream intersection and the departing flow distribution at the upstream intersection was investigated using the proposed model. A comparison using virtual flow data was performed between the Robertson model and the MPFDM. The results confirmed the validity of the proposed model.
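A speed-truncated Gaussian mixture density of the kind named in the abstract can be sketched as below. The sketch assumes that every component is truncated to the same speed interval and renormalized on it, which may differ from the paper's exact formulation; the two components (cars and buses), their parameters, and the bounds are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def truncated_normal_pdf(v, mean, sd, lower, upper):
    """Normal density restricted to [lower, upper] and renormalized."""
    if v < lower or v > upper:
        return 0.0
    z = (v - mean) / sd
    density = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    mass = std_normal_cdf((upper - mean) / sd) - std_normal_cdf((lower - mean) / sd)
    return density / mass


def mixture_speed_pdf(v, components, lower, upper):
    """Mixture of truncated normals; components are (weight, mean, sd) tuples."""
    return sum(w * truncated_normal_pdf(v, m, s, lower, upper)
               for w, m, s in components)


# Hypothetical mixed platoon: 70% cars and 30% buses, speeds in km/h.
COMPONENTS = [(0.7, 45.0, 8.0), (0.3, 30.0, 5.0)]
V_MIN, V_MAX = 10.0, 70.0

print(mixture_speed_pdf(40.0, COMPONENTS, V_MIN, V_MAX))
```

Truncation keeps the density from assigning probability to negative or implausibly high speeds, so travel times derived from it stay finite and positive.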