Owen, Art B
2001-01-01
Empirical likelihood provides inferences whose validity does not depend on specifying a parametric model for the data. Because it uses a likelihood, the method has certain inherent advantages over resampling methods: it uses the data to determine the shape of the confidence regions, and it makes it easy to combine data from multiple sources. It also facilitates incorporating side information, and it simplifies accounting for censored, truncated, or biased sampling. One of the first books published on the subject, Empirical Likelihood offers an in-depth treatment of this method for constructing confidence regions and testing hypotheses. The author applies empirical likelihood to a range of problems, from those as simple as setting a confidence region for a univariate mean under IID sampling, to problems defined through smooth functions of means, regression models, generalized linear models, estimating equations, or kernel smooths, and to sampling with non-identically distributed data. Abundant figures offer vi...
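The simplest case mentioned above, a confidence region for a univariate mean under IID sampling, can be sketched in a few lines. The following is a minimal illustration, not code from the book: the function name `el_log_ratio_mean` and the bisection solver are assumptions made here for the example. It uses the standard profile form in which the weights are w_i = 1 / (n (1 + lam (x_i - mu))) and lam solves sum((x_i - mu) / (1 + lam (x_i - mu))) = 0.

```python
import math


def el_log_ratio_mean(x, mu, tol=1e-12, max_iter=200):
    """Return -2 log R(mu), the empirical likelihood ratio statistic for a mean.

    Hypothetical helper written for illustration; pure standard library.
    """
    n = len(x)
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    # mu must lie strictly inside the range of the data, otherwise R(mu) = 0.
    if not (z_min < 0.0 < z_max):
        return math.inf

    # Each weight must satisfy 0 < w_i <= 1, which bounds the multiplier lam.
    lo = (1.0 / n - 1.0) / z_max
    hi = (1.0 / n - 1.0) / z_min

    def g(lam):
        # Estimating equation for lam; strictly decreasing on [lo, hi].
        return sum(zi / (1.0 + lam * zi) for zi in z)

    # Bisection on the monotone function g.
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


data = [1.0, 2.0, 3.0, 4.0, 5.0]
stat_at_mean = el_log_ratio_mean(data, 3.0)
stat_off_mean = el_log_ratio_mean(data, 2.5)
```

Under the usual regularity conditions the statistic is compared with a chi-square quantile with one degree of freedom (about 3.84 for a 95% region), so the confidence region is the set of `mu` values whose statistic falls below that cutoff.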
An alternative empirical likelihood method in missing response problems and causal inference.
Ren, Kaili; Drummond, Christopher A; Brewster, Pamela S; Haller, Steven T; Tian, Jiang; Cooper, Christopher J; Zhang, Biao
2016-11-30
Missing responses are common problems in medical, social, and economic studies. When responses are missing at random, a complete case data analysis may result in biases. A popular debiasing method is inverse probability weighting, proposed by Horvitz and Thompson. To improve efficiency, Robins et al. proposed an augmented inverse probability weighting method. The augmented inverse probability weighting estimator has a double-robustness property and achieves the semiparametric efficiency lower bound when the regression model and propensity score model are both correctly specified. In this paper, we introduce an empirical likelihood-based estimator as an alternative to Qin and Zhang (2007). Our proposed estimator is also doubly robust and locally efficient. Simulation results show that the proposed estimator has better performance when the propensity score is correctly modeled. Moreover, the proposed method can be applied in the estimation of average treatment effect in observational causal inference. Finally, we apply our method to an observational study of smoking, using data from the Cardiovascular Outcomes in Renal Atherosclerotic Lesions clinical trial. Copyright © 2016 John Wiley & Sons, Ltd.
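The two baseline estimators named in this abstract can be written down directly. The sketch below is illustrative only and is not the empirical likelihood estimator the paper proposes: the function names are hypothetical, and the propensity scores and outcome-regression predictions are assumed to be supplied by models fitted elsewhere.

```python
def ipw_mean(y, observed, propensity):
    """Inverse probability weighted (Horvitz-Thompson) estimate of E[Y].

    y[i] may be None when the response is missing (observed[i] is falsy).
    """
    n = len(y)
    return sum(yi / pi for yi, ri, pi in zip(y, observed, propensity) if ri) / n


def aipw_mean(y, observed, propensity, outcome_pred):
    """Augmented IPW estimate of E[Y].

    outcome_pred[i] is a regression prediction of Y given the covariates of
    unit i; it is needed for every unit, observed or not.
    """
    n = len(y)
    total = 0.0
    for yi, ri, pi, mi in zip(y, observed, propensity, outcome_pred):
        r = 1.0 if ri else 0.0
        ipw_term = yi / pi if ri else 0.0
        # Augmentation term: has mean zero when the propensity model is right,
        # and removes the bias when the outcome model is right.
        total += ipw_term - (r - pi) / pi * mi
    return total / n


# Toy data: responses 2 and 4 are missing, every unit has propensity 0.5.
y = [1.0, None, 3.0, None]
observed = [1, 0, 1, 0]
propensity = [0.5, 0.5, 0.5, 0.5]
outcome_pred = [1.0, 2.0, 3.0, 4.0]

ipw_estimate = ipw_mean(y, observed, propensity)
aipw_estimate = aipw_mean(y, observed, propensity, outcome_pred)
```

In this toy example the outcome predictions happen to be exact, so the augmented estimator recovers the full-data mean while the plain weighted estimator does not, which is the practical meaning of the double-robustness property described above.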
Extended likelihood inference in reliability
International Nuclear Information System (INIS)
Martz, H.F. Jr.; Beckman, R.J.; Waller, R.A.
1978-10-01
Extended likelihood methods of inference are developed in which subjective information in the form of a prior distribution is combined with sampling results by means of an extended likelihood function. The extended likelihood function is standardized for use in obtaining extended likelihood intervals. Extended likelihood intervals are derived for the mean of a normal distribution with known variance, the failure-rate of an exponential distribution, and the parameter of a binomial distribution. Extended second-order likelihood methods are developed and used to solve several prediction problems associated with the exponential and binomial distributions. In particular, such quantities as the next failure-time, the number of failures in a given time period, and the time required to observe a given number of failures are predicted for the exponential model with a gamma prior distribution on the failure-rate. In addition, six types of life testing experiments are considered. For the binomial model with a beta prior distribution on the probability of nonsurvival, methods are obtained for predicting the number of nonsurvivors in a given sample size and for predicting the required sample size for observing a specified number of nonsurvivors. Examples illustrate each of the methods developed. Finally, comparisons are made with Bayesian intervals in those cases where these are known to exist
Likelihood inference for unions of interacting discs
DEFF Research Database (Denmark)
Møller, Jesper; Helisova, K.
2010-01-01
This is probably the first paper which discusses likelihood inference for a random set using a germ-grain model, where the individual grains are unobservable, edge effects occur and other complications appear. We consider the case where the grains form a disc process modelled by a marked point...... process, where the germs are the centres and the marks are the associated radii of the discs. We propose to use a recent parametric class of interacting disc process models, where the minimal sufficient statistic depends on various geometric properties of the random set, and the density is specified......-based maximum likelihood inference and the effect of specifying different reference Poisson models....
Essays on empirical likelihood in economics
Gao, Z.
2012-01-01
This thesis intends to exploit the roots of empirical likelihood and its related methods in mathematical programming and computation. The roots will be connected and the connections will induce new solutions for the problems of estimation, computation, and generalization of empirical likelihood.
Likelihood inference for unions of interacting discs
DEFF Research Database (Denmark)
Møller, Jesper; Helisová, Katarina
To the best of our knowledge, this is the first paper which discusses likelihood inference for a random set using a germ-grain model, where the individual grains are unobservable, edge effects occur, and other complications appear. We consider the case where the grains form a disc process modelled...... is specified with respect to a given marked Poisson model (i.e. a Boolean model). We show how edge effects and other complications can be handled by considering a certain conditional likelihood. Our methodology is illustrated by analyzing Peter Diggle's heather dataset, where we discuss the results...... of simulation-based maximum likelihood inference and the effect of specifying different reference Poisson models....
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Schweder, Tore
2006-01-01
The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...
Likelihood-based inference for clustered line transect data
DEFF Research Database (Denmark)
Waagepetersen, Rasmus Plenge; Schweder, Tore
The uncertainty in estimation of spatial animal density from line transect surveys depends on the degree of spatial clustering in the animal population. To quantify the clustering we model line transect data as independent thinnings of spatial shot-noise Cox processes. Likelihood-based inference...
Generalized empirical likelihood methods for analyzing longitudinal data
Wang, S.; Qian, L.; Carroll, R. J.
2010-01-01
Efficient estimation of parameters is a major objective in analyzing longitudinal data. We propose two generalized empirical likelihood based methods that take into consideration within-subject correlations. A nonparametric version of the Wilks
Block Empirical Likelihood for Longitudinal Single-Index Varying-Coefficient Model
Directory of Open Access Journals (Sweden)
Yunquan Song
2013-01-01
In this paper, we consider a single-index varying-coefficient model with application to longitudinal data. In order to accommodate the within-group correlation, we apply the block empirical likelihood procedure to the longitudinal single-index varying-coefficient model, and prove a nonparametric version of Wilks’ theorem which can be used to construct the block empirical likelihood confidence region with asymptotically correct coverage probability for the parametric component. In comparison with normal approximations, the proposed method does not require a consistent estimator for the asymptotic covariance matrix, making it easier to conduct inference for the model's parametric component. Simulations demonstrate how the proposed method works.
Generalized empirical likelihood methods for analyzing longitudinal data
Wang, S.
2010-02-16
Efficient estimation of parameters is a major objective in analyzing longitudinal data. We propose two generalized empirical likelihood based methods that take into consideration within-subject correlations. A nonparametric version of the Wilks theorem for the limiting distributions of the empirical likelihood ratios is derived. It is shown that one of the proposed methods is locally efficient among a class of within-subject variance-covariance matrices. A simulation study is conducted to investigate the finite sample properties of the proposed methods and compare them with the block empirical likelihood method by You et al. (2006) and the normal approximation with a correctly estimated variance-covariance. The results suggest that the proposed methods are generally more efficient than existing methods which ignore the correlation structure, and better in coverage compared to the normal approximation with correctly specified within-subject correlation. An application illustrating our methods and supporting the simulation study results is also presented.
Likelihood-Based Inference of B Cell Clonal Families.
Directory of Open Access Journals (Sweden)
Duncan K Ralph
2016-10-01
The human immune system depends on a highly diverse collection of antibody-making B cells. B cell receptor sequence diversity is generated by a random recombination process called "rearrangement" forming progenitor B cells, then a Darwinian process of lineage diversification and selection called "affinity maturation." The resulting receptors can be sequenced in high throughput for research and diagnostics. Such a collection of sequences contains a mixture of various lineages, each of which may be quite numerous, or may consist of only a single member. As a step to understanding the process and result of this diversification, one may wish to reconstruct lineage membership, i.e. to cluster sampled sequences according to which came from the same rearrangement events. We call this clustering problem "clonal family inference." In this paper we describe and validate a likelihood-based framework for clonal family inference based on a multi-hidden Markov Model (multi-HMM) framework for B cell receptor sequences. We describe an agglomerative algorithm to find a maximum likelihood clustering, two approximate algorithms with various trade-offs of speed versus accuracy, and a third, fast algorithm for finding specific lineages. We show that under simulation these algorithms greatly improve upon existing clonal family inference methods, and that they also give significantly different clusters than previous methods when applied to two real data sets.
Likelihood inference for a nonstationary fractional autoregressive model
DEFF Research Database (Denmark)
Johansen, Søren; Ørregård Nielsen, Morten
2010-01-01
This paper discusses model-based inference in an autoregressive model for fractional processes which allows the process to be fractional of order d or d-b. Fractional differencing involves infinitely many past values and because we are interested in nonstationary processes we model the data X1......,...,X_{T} given the initial values X_{-n}, n=0,1,..., as is usually done. The initial values are not modeled but assumed to be bounded. This represents a considerable generalization relative to all previous work where it is assumed that initial values are zero. For the statistical analysis we assume...... the conditional Gaussian likelihood and for the probability analysis we also condition on initial values but assume that the errors in the autoregressive model are i.i.d. with suitable moment conditions. We analyze the conditional likelihood and its derivatives as stochastic processes in the parameters, including...
Likelihood inference for a fractionally cointegrated vector autoregressive model
DEFF Research Database (Denmark)
Johansen, Søren; Ørregård Nielsen, Morten
2012-01-01
such that the process X_{t} is fractional of order d and cofractional of order d-b; that is, there exist vectors ß for which ß'X_{t} is fractional of order d-b, and no other fractionality order is possible. We define the statistical model by 0inference when the true values satisfy b0¿1/2 and d0-b0......We consider model based inference in a fractionally cointegrated (or cofractional) vector autoregressive model with a restricted constant term, ¿, based on the Gaussian likelihood conditional on initial values. The model nests the I(d) VAR model. We give conditions on the parameters...... process in the parameters when errors are i.i.d. with suitable moment conditions and initial values are bounded. When the limit is deterministic this implies uniform convergence in probability of the conditional likelihood function. If the true value b0>1/2, we prove that the limit distribution of (ß...
Moment Conditions Selection Based on Adaptive Penalized Empirical Likelihood
Directory of Open Access Journals (Sweden)
Yunquan Song
2014-01-01
Empirical likelihood is a very popular method and has been widely used in the fields of artificial intelligence (AI) and data mining as tablets, mobile applications, and social media dominate the technology landscape. This paper proposes an empirical likelihood shrinkage method to efficiently estimate unknown parameters and select correct moment conditions simultaneously, when the model is defined by moment restrictions in which some are possibly misspecified. We show that our method enjoys oracle-like properties; that is, it consistently selects the correct moment conditions and at the same time its estimator is as efficient as the empirical likelihood estimator obtained by all correct moment conditions. Moreover, unlike the GMM, our proposed method allows us to construct confidence regions for the parameters included in the model without estimating the covariances of the estimators. For empirical implementation, we provide some data-driven procedures for selecting the tuning parameter of the penalty function. The simulation results show that the method works remarkably well in terms of correct moment selection and the finite sample properties of the estimators. Also, a real-life example is carried out to illustrate the new methodology.
A Reliability Test of a Complex System Based on Empirical Likelihood
Zhou, Yan; Fu, Liya; Zhang, Jun; Hui, Yongchang
2016-01-01
To analyze the reliability of a complex system described by minimal paths, an empirical likelihood method is proposed to solve the reliability test problem when the subsystem distributions are unknown. Furthermore, we provide a reliability test statistic for the complex system and derive the limit distribution of the test statistic. We can therefore obtain a confidence interval for the reliability and make statistical inferences. Simulation studies also support the theoretical results.
Bayesian interpretation of Generalized empirical likelihood by maximum entropy
Rochet, Paul
2011-01-01
We study a parametric estimation problem related to moment condition models. As an alternative to the generalized empirical likelihood (GEL) and the generalized method of moments (GMM), a Bayesian approach to the problem can be adopted, extending the MEM procedure to parametric moment conditions. We show in particular that a large number of GEL estimators can be interpreted as a maximum entropy solution. Moreover, we provide a more general field of applications by proving the method to be rob...
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and
Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd
2018-01-01
The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID: 29177474
CONSTRUCTING A FLEXIBLE LIKELIHOOD FUNCTION FOR SPECTROSCOPIC INFERENCE
International Nuclear Information System (INIS)
Czekala, Ian; Andrews, Sean M.; Mandel, Kaisey S.; Green, Gregory M.; Hogg, David W.
2015-01-01
We present a modular, extensible likelihood framework for spectroscopic inference based on synthetic model spectra. The subtraction of an imperfect model from a continuously sampled spectrum introduces covariance between adjacent datapoints (pixels) into the residual spectrum. For the high signal-to-noise data with large spectral range that is commonly employed in stellar astrophysics, that covariant structure can lead to dramatically underestimated parameter uncertainties (and, in some cases, biases). We construct a likelihood function that accounts for the structure of the covariance matrix, utilizing the machinery of Gaussian process kernels. This framework specifically addresses the common problem of mismatches in model spectral line strengths (with respect to data) due to intrinsic model imperfections (e.g., in the atomic/molecular databases or opacity prescriptions) by developing a novel local covariance kernel formalism that identifies and self-consistently downweights pathological spectral line “outliers.” By fitting many spectra in a hierarchical manner, these local kernels provide a mechanism to learn about and build data-driven corrections to synthetic spectral libraries. An open-source software implementation of this approach is available at http://iancze.github.io/Starfish, including a sophisticated probabilistic scheme for spectral interpolation when using model libraries that are sparsely sampled in the stellar parameters. We demonstrate some salient features of the framework by fitting the high-resolution V-band spectrum of WASP-14, an F5 dwarf with a transiting exoplanet, and the moderate-resolution K-band spectrum of Gliese 51, an M5 field dwarf
Directory of Open Access Journals (Sweden)
Yuejiao Fu
2018-04-01
The Sharpe ratio is a widely used risk-adjusted performance measurement in economics and finance. Most of the known statistical inferential methods devoted to the Sharpe ratio are based on the assumption that the data are normally distributed. In this article, without making any distributional assumption on the data, we develop the adjusted empirical likelihood method to obtain inference for a parameter of interest in the presence of nuisance parameters. We show that the log adjusted empirical likelihood ratio statistic is asymptotically distributed as the chi-square distribution. The proposed method is applied to obtain inference for the Sharpe ratio. Simulation results illustrate that the proposed method is comparable to Jobson and Korkie’s method (1981) and outperforms the empirical likelihood method when the data are from a symmetric distribution. In addition, when the data are from a skewed distribution, the proposed method significantly outperforms all other existing methods. A real-data example is analyzed to exemplify the application of the proposed method.
Malle, Bertram F; Holbrook, Jess
2012-04-01
People interpret behavior by making inferences about agents' intentionality, mind, and personality. Past research studied such inferences 1 at a time; in real life, people make these inferences simultaneously. The present studies therefore examined whether 4 major inferences (intentionality, desire, belief, and personality), elicited simultaneously in response to an observed behavior, might be ordered in a hierarchy of likelihood and speed. To achieve generalizability, the studies included a wide range of stimulus behaviors, presented them verbally and as dynamic videos, and assessed inferences both in a retrieval paradigm (measuring the likelihood and speed of accessing inferences immediately after they were made) and in an online processing paradigm (measuring the speed of forming inferences during behavior observation). Five studies provide evidence for a hierarchy of social inferences-from intentionality and desire to belief to personality-that is stable across verbal and visual presentations and that parallels the order found in developmental and primate research. (c) 2012 APA, all rights reserved.
A Non-standard Empirical Likelihood for Time Series
DEFF Research Database (Denmark)
Nordman, Daniel J.; Bunzel, Helle; Lahiri, Soumendra N.
Standard blockwise empirical likelihood (BEL) for stationary, weakly dependent time series requires specifying a fixed block length as a tuning parameter for setting confidence regions. This aspect can be difficult and impacts coverage accuracy. As an alternative, this paper proposes a new version...... of BEL based on a simple, though non-standard, data-blocking rule which uses a data block of every possible length. Consequently, the method involves no block selection and is also anticipated to exhibit better coverage performance. Its non-standard blocking scheme, however, induces non......-standard asymptotics and requires a significantly different development compared to standard BEL. We establish the large-sample distribution of log-ratio statistics from the new BEL method for calibrating confidence regions for mean or smooth function parameters of time series. This limit law is not the usual chi...
Alsing, Justin; Wandelt, Benjamin; Feeney, Stephen
2018-03-01
Many statistical models in cosmology can be simulated forwards but have intractable likelihood functions. Likelihood-free inference methods allow us to perform Bayesian inference from these models using only forward simulations, free from any likelihood assumptions or approximations. Likelihood-free inference generically involves simulating mock data and comparing to the observed data; this comparison in data-space suffers from the curse of dimensionality and requires compression of the data to a small number of summary statistics to be tractable. In this paper we use massive asymptotically-optimal data compression to reduce the dimensionality of the data-space to just one number per parameter, providing a natural and optimal framework for summary statistic choice for likelihood-free inference. Secondly, we present the first cosmological application of Density Estimation Likelihood-Free Inference (DELFI), which learns a parameterized model for joint distribution of data and parameters, yielding both the parameter posterior and the model evidence. This approach is conceptually simple, requires less tuning than traditional Approximate Bayesian Computation approaches to likelihood-free inference and can give high-fidelity posteriors from orders of magnitude fewer forward simulations. As an additional bonus, it enables parameter inference and Bayesian model comparison simultaneously. We demonstrate Density Estimation Likelihood-Free Inference with massive data compression on an analysis of the joint light-curve analysis supernova data, as a simple validation case study. We show that high-fidelity posterior inference is possible for full-scale cosmological data analyses with as few as ∼10^4 simulations, with substantial scope for further improvement, demonstrating the scalability of likelihood-free inference to large and complex cosmological datasets.
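The generic recipe described in this abstract (simulate mock data, compress to a summary statistic, compare with the observed summary) can be illustrated with plain rejection ABC. This is the traditional baseline the abstract contrasts with DELFI, not DELFI itself, and everything in the sketch is an assumption chosen for illustration: a Gaussian toy model with unknown mean, a uniform prior, the sample mean as the single summary statistic, and the hypothetical function name `rejection_abc`.

```python
import random
import statistics


def rejection_abc(observed, simulate, summary, prior_sample, n_sims, eps, rng):
    """Keep prior draws whose simulated summary lands within eps of the observed one."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_sims):
        theta = prior_sample(rng)               # draw a parameter from the prior
        s_sim = summary(simulate(theta, rng))   # forward-simulate and compress
        if abs(s_sim - s_obs) < eps:            # compare in summary space
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = [0.2, 0.9, 1.0, 1.1, 1.8]  # toy data with sample mean 1.0

draws = rejection_abc(
    observed,
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(5)],
    summary=statistics.fmean,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    n_sims=2000,
    eps=0.5,
    rng=rng,
)
posterior_mean = statistics.fmean(draws)
```

The accepted draws approximate the posterior only as `eps` shrinks, and the acceptance rate shrinks with it, which is one reason the abstract emphasises compression to one summary per parameter and density-estimation alternatives that reuse every simulation.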
Maximum Likelihood Estimation and Inference With Examples in R, SAS and ADMB
Millar, Russell B
2011-01-01
This book takes a fresh look at the popular and well-established method of maximum likelihood for statistical estimation and inference. It begins with an intuitive introduction to the concepts and background of likelihood, and moves through to the latest developments in maximum likelihood methodology, including general latent variable models and new material for the practical implementation of integrated likelihood using the free ADMB software. Fundamental issues of statistical inference are also examined, with a presentation of some of the philosophical debates underlying the choice of statis
Likelihood based inference for partially observed renewal processes
van Lieshout, Maria Nicolette Margaretha
2016-01-01
This paper is concerned with inference for renewal processes on the real line that are observed in a broken interval. For such processes, the classic history-based approach cannot be used. Instead, we adapt tools from sequential spatial point process theory to propose a Monte Carlo maximum
Approximate maximum likelihood estimation for population genetic inference.
Bertl, Johanna; Ewing, Gregory; Kosiol, Carolin; Futschik, Andreas
2017-11-27
In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC) can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation. By moving along a simulated gradient or ascent direction, the algorithm produces a sequence of estimates that eventually converges to the maximum likelihood estimate, given a set of observed summary statistics. This strategy does not sample much from low-likelihood regions of the parameter space, and is fast, even when many summary statistics are involved. We put considerable efforts into providing tuning guidelines that improve the robustness and lead to good performance on problems with high-dimensional summary statistics and a low signal-to-noise ratio. We then investigate the performance of our resulting approach and study its properties in simulations. Finally, we re-estimate parameters describing the demographic history of Bornean and Sumatran orang-utans.
High-order Composite Likelihood Inference for Max-Stable Distributions and Processes
Castruccio, Stefano; Huser, Raphaël; Genton, Marc G.
2015-01-01
In multivariate or spatial extremes, inference for max-stable processes observed at a large collection of locations is a very challenging problem in computational statistics, and current approaches typically rely on less expensive composite likelihoods constructed from small subsets of data. In this work, we explore the limits of modern state-of-the-art computational facilities to perform full likelihood inference and to efficiently evaluate high-order composite likelihoods. With extensive simulations, we assess the loss of information of composite likelihood estimators with respect to a full likelihood approach for some widely-used multivariate or spatial extreme models, we discuss how to choose composite likelihood truncation to improve the efficiency, and we also provide recommendations for practitioners. This article has supplementary material online.
High-order Composite Likelihood Inference for Max-Stable Distributions and Processes
Castruccio, Stefano
2015-09-29
In multivariate or spatial extremes, inference for max-stable processes observed at a large collection of locations is a very challenging problem in computational statistics, and current approaches typically rely on less expensive composite likelihoods constructed from small subsets of data. In this work, we explore the limits of modern state-of-the-art computational facilities to perform full likelihood inference and to efficiently evaluate high-order composite likelihoods. With extensive simulations, we assess the loss of information of composite likelihood estimators with respect to a full likelihood approach for some widely-used multivariate or spatial extreme models, we discuss how to choose composite likelihood truncation to improve the efficiency, and we also provide recommendations for practitioners. This article has supplementary material online.
Castruccio, Stefano; Huser, Raphaël; Genton, Marc G.
2016-01-01
In multivariate or spatial extremes, inference for max-stable processes observed at a large collection of points is a very challenging problem and current approaches typically rely on less expensive composite likelihoods constructed from small subsets of data. In this work, we explore the limits of modern state-of-the-art computational facilities to perform full likelihood inference and to efficiently evaluate high-order composite likelihoods. With extensive simulations, we assess the loss of information of composite likelihood estimators with respect to a full likelihood approach for some widely used multivariate or spatial extreme models, we discuss how to choose composite likelihood truncation to improve the efficiency, and we also provide recommendations for practitioners. This article has supplementary material online.
Likelihood-Based Inference in Nonlinear Error-Correction Models
DEFF Research Database (Denmark)
Kristensen, Dennis; Rahbæk, Anders
We consider a class of vector nonlinear error correction models where the transfer function (or loadings) of the stationary relationships is nonlinear. This includes in particular the smooth transition models. A general representation theorem is given which establishes the dynamic properties...... and a linear trend in general. Gaussian likelihood-based estimators are considered for the long-run cointegration parameters, and the short-run parameters. Asymptotic theory is provided for these and it is discussed to what extent asymptotic normality and mixed normality can be found. A simulation study...
A simulation study of likelihood inference procedures in Rayleigh distribution with censored data
International Nuclear Information System (INIS)
Baklizi, S. A.; Baker, H. M.
2001-01-01
Inference procedures based on the likelihood function are considered for the one-parameter Rayleigh distribution with type 1 and type 2 censored data. Using simulation techniques, the finite sample performances of the maximum likelihood estimator and the large sample likelihood interval estimation procedures based on the Wald, the Rao, and the likelihood ratio statistics are investigated. It appears that the maximum likelihood estimator is unbiased. The approximate variance estimates obtained from the asymptotic normal distribution of the maximum likelihood estimator are accurate under type 2 censored data, while they tend to be smaller than the actual variances for small type 1 censored samples. It also appears that interval estimation based on the Wald and Rao statistics needs a much larger sample size than interval estimation based on the likelihood ratio statistic to attain reasonable accuracy. (authors). 15 refs., 4 tabs
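As an illustration of the estimator discussed in this abstract, the sketch below computes the maximum likelihood estimate of the Rayleigh scale under type 2 censoring. It assumes the parameterization f(x) = (x / sigma^2) exp(-x^2 / (2 sigma^2)), which the abstract does not state, and the function name rayleigh_mle_type2 is hypothetical. Under that assumption, setting the derivative of the censored log-likelihood to zero gives sigma^2 = [sum of x_i^2 over the r observed failures + (n - r) x_(r)^2] / (2r).

```python
import math


def rayleigh_mle_type2(observed, n):
    """MLE of the Rayleigh scale sigma from the r smallest of n lifetimes.

    Assumes the density f(x) = (x / sigma**2) * exp(-x**2 / (2 * sigma**2))
    and type 2 censoring: the experiment stops at the r-th failure, so the
    remaining n - r units are only known to exceed the largest observed time.
    """
    xs = sorted(observed)
    r = len(xs)
    if r == 0 or r > n:
        raise ValueError("need 1 <= r <= n observed failures")
    # Observed failures contribute x_i^2; each censored unit contributes
    # the square of the largest observed failure time.
    total = sum(x * x for x in xs) + (n - r) * xs[-1] ** 2
    return math.sqrt(total / (2.0 * r))


if __name__ == "__main__":
    # Two failures observed out of four units on test (made-up numbers).
    print(rayleigh_mle_type2([1.0, 2.0], n=4))
```

With r = n the formula reduces to the usual complete-sample estimate, which is a quick sanity check on the censoring term.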
Directory of Open Access Journals (Sweden)
Lester L. Yuan
2007-06-01
This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML) methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.
Empirical Bayesian inference and model uncertainty
International Nuclear Information System (INIS)
Poern, K.
1994-01-01
This paper presents a hierarchical or multistage empirical Bayesian approach for the estimation of uncertainty concerning the intensity of a homogeneous Poisson process. A class of contaminated gamma distributions is considered to describe the uncertainty concerning the intensity. These distributions in turn are defined through a set of secondary parameters, the knowledge of which is also described and updated via Bayes' formula. This two-stage Bayesian approach is an example where the modeling uncertainty is treated in a comprehensive way. Each contaminated gamma distribution, represented by a point in the 3D space of secondary parameters, can be considered as a specific model of the uncertainty about the Poisson intensity. Then, by the empirical Bayesian method, each individual model is assigned a posterior probability
Likelihood inference of non-constant diversification rates with incomplete taxon sampling.
Höhna, Sebastian
2014-01-01
Large-scale phylogenies provide a valuable source to study background diversification rates and investigate if the rates have changed over time. Unfortunately most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test if the true parameters and the sampling method can be recovered when the trees are small or medium sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and the estimates are unbiased for large trees but are biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered if the extinction rate is large and compared to a pure-birth model). Finally, I applied six different diversification rate models--ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models--on three large-scale empirical phylogenies (ants, mammals and snakes with respectively 149, 164 and 41 sampled species). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear to be
Likelihood inference of non-constant diversification rates with incomplete taxon sampling.
Directory of Open Access Journals (Sweden)
Sebastian Höhna
Large-scale phylogenies provide a valuable source to study background diversification rates and investigate if the rates have changed over time. Unfortunately most large-scale, dated phylogenies are sparsely sampled (fewer than 5% of the described species) and taxon sampling is not uniform. Instead, taxa are frequently sampled to obtain at least one representative per subgroup (e.g. family) and thus to maximize diversity (diversified sampling). So far, such complications have been ignored, potentially biasing the conclusions that have been reached. In this study I derive the likelihood of a birth-death process with non-constant (time-dependent) diversification rates and diversified taxon sampling. Using simulations I test if the true parameters and the sampling method can be recovered when the trees are small or medium sized (fewer than 200 taxa). The results show that the diversification rates can be inferred and the estimates are unbiased for large trees but are biased for small trees (fewer than 50 taxa). Furthermore, model selection by means of Akaike's Information Criterion favors the true model if the true rates differ sufficiently from alternative models (e.g. the birth-death model is recovered if the extinction rate is large and compared to a pure-birth model). Finally, I applied six different diversification rate models--ranging from a constant-rate pure birth process to a decreasing speciation rate birth-death process but excluding any rate shift models--on three large-scale empirical phylogenies (ants, mammals and snakes with respectively 149, 164 and 41 sampled species). All three phylogenies were constructed by diversified taxon sampling, as stated by the authors. However only the snake phylogeny supported diversified taxon sampling. Moreover, a parametric bootstrap test revealed that none of the tested models provided a good fit to the observed data. The model assumptions, such as homogeneous rates across species or no rate shifts, appear
Simple simulation of diffusion bridges with application to likelihood inference for diffusions
DEFF Research Database (Denmark)
Bladt, Mogens; Sørensen, Michael
2014-01-01
the accuracy and efficiency of the approximate method and compare it to exact simulation methods. In the study, our method provides a very good approximation to the distribution of a diffusion bridge for bridges that are likely to occur in applications to statistical inference. To illustrate the usefulness......With a view to statistical inference for discretely observed diffusion models, we propose simple methods of simulating diffusion bridges, approximately and exactly. Diffusion bridge simulation plays a fundamental role in likelihood and Bayesian inference for diffusion processes. First a simple......-dimensional diffusions and is applicable to all one-dimensional diffusion processes with finite speed-measure. One advantage of the new approach is that simple simulation methods like the Milstein scheme can be applied to bridge simulation. Another advantage over previous bridge simulation methods is that the proposed...
Empirical inference festschrift in honor of Vladimir N. Vapnik
Schölkopf, Bernhard; Vovk, Vladimir
2013-01-01
This book honours the outstanding contributions of Vladimir Vapnik, a rare example of a scientist for whom the following statements hold true simultaneously: his work led to the inception of a new field of research, the theory of statistical learning and empirical inference; he has lived to see the field blossom; and he is still as active as ever.
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference
Hall, Alex; Taylor, Andy
2017-06-01
We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
Generalized Empirical Likelihood-Based Focused Information Criterion and Model Averaging
Directory of Open Access Journals (Sweden)
Naoya Sueishi
2013-07-01
This paper develops model selection and averaging methods for moment restriction models. We first propose a focused information criterion based on the generalized empirical likelihood estimator. We address the issue of selecting an optimal model, rather than a correct model, for estimating a specific parameter of interest. Then, this study investigates a generalized empirical likelihood-based model averaging estimator that minimizes the asymptotic mean squared error. A simulation study suggests that our averaging estimator can be a useful alternative to existing post-selection estimators.
Inference for the Sharpe Ratio Using a Likelihood-Based Approach
Directory of Open Access Journals (Sweden)
Ying Liu
2012-01-01
The Sharpe ratio is the prominent risk-adjusted performance measure used by practitioners. Statistical testing of this ratio using its asymptotic distribution has lagged behind its use. In this paper, highly accurate likelihood analysis is applied for inference on the Sharpe ratio. Both the one- and two-sample problems are considered. The methodology has O(n^{-3/2}) distributional accuracy and can be implemented using any parametric return distribution structure. Simulations are provided to demonstrate the method's superior accuracy over existing methods used for testing in the literature.
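For context, the first-order asymptotic approach that higher-order likelihood methods aim to improve on can be sketched as follows. This is not the method of the paper: it is the textbook large-sample standard error sqrt((1 + SR^2 / 2) / n), which assumes independent, identically distributed normal excess returns. The function name and the sample returns are hypothetical.

```python
import math
import statistics


def sharpe_ratio_with_se(excess_returns):
    """Sample Sharpe ratio and its first-order asymptotic standard error.

    Assumes i.i.d. normal excess returns; the standard error is the usual
    large-sample approximation, not a higher-order likelihood result.
    """
    n = len(excess_returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = statistics.fmean(excess_returns)
    sd = statistics.stdev(excess_returns)  # sample standard deviation (n - 1)
    sr = mean / sd
    se = math.sqrt((1.0 + 0.5 * sr * sr) / n)
    return sr, se


if __name__ == "__main__":
    # Made-up per-period excess returns, for illustration only.
    sr, se = sharpe_ratio_with_se([0.01, 0.03, 0.02, 0.04, 0.00])
    print(f"Sharpe ratio = {sr:.4f}, approx. 95% interval half-width = {1.96 * se:.4f}")
```

With samples this small the normal approximation is crude, which is the kind of setting where more accurate distributional approximations matter.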
Bayesian Inference using Neural Net Likelihood Models for Protein Secondary Structure Prediction
Directory of Open Access Journals (Sweden)
Seong-Gon Kim
2011-06-01
Several techniques such as Neural Networks, Genetic Algorithms, Decision Trees and other statistical or heuristic methods have been used in the past to approach the complex non-linear task of predicting the Alpha-helices, Beta-sheets and Turns of a protein's secondary structure. This project introduces a new machine learning method that uses offline-trained Multilayered Perceptrons (MLP) as the likelihood models within a Bayesian Inference framework to predict the secondary structures of proteins. Varying window sizes are used to extract neighboring amino acid information, which is passed back and forth between the Neural Net models and the Bayesian Inference process until the posterior secondary structure probability converges.
Likelihood Inference of Nonlinear Models Based on a Class of Flexible Skewed Distributions
Directory of Open Access Journals (Sweden)
Xuedong Chen
2014-01-01
This paper deals with the issue of likelihood inference for nonlinear models with a flexible skew-t-normal (FSTN) distribution, which is proposed within a general framework of flexible skew-symmetric (FSS) distributions by combining with the skew-t-normal (STN) distribution. In comparison with the common skewed distributions such as skew normal (SN) and skew-t (ST), as well as scale mixtures of skew normal (SMSN), the FSTN distribution can accommodate more flexibility and robustness in the presence of skewed, heavy-tailed, and especially multimodal outcomes. However, for this distribution, the usual approach of maximum likelihood estimation based on the EM algorithm becomes unavailable, and an alternative is to return to the original Newton-Raphson type method. In order to improve the estimation, as well as confidence estimation and hypothesis testing for the parameters of interest, a modified Newton-Raphson iterative algorithm based on profile likelihood is presented in this paper for nonlinear regression models with the FSTN distribution, and the confidence interval and hypothesis test are then developed. Furthermore, a real example and simulation are conducted to demonstrate the usefulness and the superiority of our approach.
FPGA Acceleration of the phylogenetic likelihood function for Bayesian MCMC inference methods
Directory of Open Access Journals (Sweden)
Bakos Jason D
2010-04-01
Background: Likelihood (ML)-based phylogenetic inference has become a popular method for estimating the evolutionary relationships among species based on genomic sequence data. This method is used in applications such as RAxML, GARLI, MrBayes, PAML, and PAUP. The Phylogenetic Likelihood Function (PLF) is an important kernel computation for this method. The PLF consists of a loop with no conditional behavior or dependencies between iterations. As such it contains a high potential for exploiting parallelism using micro-architectural techniques. In this paper, we describe a technique for mapping the PLF and supporting logic onto a Field Programmable Gate Array (FPGA)-based co-processor. By leveraging the FPGA's on-chip DSP modules and the high-bandwidth local memory attached to the FPGA, the resultant co-processor can accelerate ML-based methods and outperform state-of-the-art multi-core processors. Results: We use the MrBayes 3 tool as a framework for designing our co-processor. For large datasets, we estimate that our accelerated MrBayes, if run on a current-generation FPGA, achieves a 10× speedup relative to software running on a state-of-the-art server-class microprocessor. The FPGA-based implementation achieves its performance by deeply pipelining the likelihood computations, performing multiple floating-point operations in parallel, and through a natural log approximation that is chosen specifically to leverage a deeply pipelined custom architecture. Conclusions: Heterogeneous computing, which combines general-purpose processors with special-purpose co-processors such as FPGAs and GPUs, is a promising approach for high-performance phylogeny inference as shown by the growing body of literature in this field. FPGAs in particular are well-suited for this task because of their low power consumption as compared to many-core processors and Graphics Processor Units (GPUs).
β-empirical Bayes inference and model diagnosis of microarray data
Directory of Open Access Journals (Sweden)
Hossain Mollah Mohammad
2012-06-01
Background: Microarray data enables the high-throughput survey of mRNA expression profiles at the genomic level; however, the data presents a challenging statistical problem because of the large number of transcripts with small sample sizes that are obtained. To reduce the dimensionality, various Bayesian or empirical Bayes hierarchical models have been developed. However, because of the complexity of the microarray data, no model can explain the data fully. It is generally difficult to scrutinize the irregular patterns of expression that are not expected by the usual statistical gene by gene models. Results: As an extension of empirical Bayes (EB) procedures, we have developed the β-empirical Bayes (β-EB) approach based on a β-likelihood measure which can be regarded as an 'evidence-based' weighted (quasi-) likelihood inference. The weight of a transcript t is described as a power function of its likelihood, f^β(y_t|θ). Genes with low likelihoods have unexpected expression patterns and low weights. By assigning low weights to outliers, the inference becomes robust. The value of β, which controls the balance between the robustness and efficiency, is selected by maximizing the predictive β0-likelihood by cross-validation. The proposed β-EB approach identified six significant (p−5 contaminated transcripts as differentially expressed (DE) in normal/tumor tissues from the head and neck of cancer patients. These six genes were all confirmed to be related to cancer; they were not identified as DE genes by the classical EB approach. When applied to the eQTL analysis of Arabidopsis thaliana, the proposed β-EB approach identified some potential master regulators that were missed by the EB approach. Conclusions: The simulation data and real gene expression data showed that the proposed β-EB method was robust against outliers. The distribution of the weights was used to scrutinize the irregular patterns of expression and diagnose the model
Wu, Yufeng
2012-03-01
Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu
2015-06-01
Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of a SEM model is T_ML, a slight modification to the likelihood ratio statistic. Under the normality assumption, T_ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T_ML rejects the correct model too often when p is not too small. Various corrections to T_ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T_ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T_ML, and they control type I errors reasonably well whenever N ≥ max(50,2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T_ML as reported in the literature, and they perform well.
de Queiroz, K; Poe, S
2001-06-01
Advocates of cladistic parsimony methods have invoked the philosophy of Karl Popper in an attempt to argue for the superiority of those methods over phylogenetic methods based on Ronald Fisher's statistical principle of likelihood. We argue that the concept of likelihood in general, and its application to problems of phylogenetic inference in particular, are highly compatible with Popper's philosophy. Examination of Popper's writings reveals that his concept of corroboration is, in fact, based on likelihood. Moreover, because probabilistic assumptions are necessary for calculating the probabilities that define Popper's corroboration, likelihood methods of phylogenetic inference--with their explicit probabilistic basis--are easily reconciled with his concept. In contrast, cladistic parsimony methods, at least as described by certain advocates of those methods, are less easily reconciled with Popper's concept of corroboration. If those methods are interpreted as lacking probabilistic assumptions, then they are incompatible with corroboration. Conversely, if parsimony methods are to be considered compatible with corroboration, then they must be interpreted as carrying implicit probabilistic assumptions. Thus, the non-probabilistic interpretation of cladistic parsimony favored by some advocates of those methods is contradicted by an attempt by the same authors to justify parsimony methods in terms of Popper's concept of corroboration. In addition to being compatible with Popperian corroboration, the likelihood approach to phylogenetic inference permits researchers to test the assumptions of their analytical methods (models) in a way that is consistent with Popper's ideas about the provisional nature of background knowledge.
Yan, Yuan
2017-07-13
Gaussian likelihood inference has been studied and used extensively in both statistical theory and applications due to its simplicity. However, in practice, the assumption of Gaussianity is rarely met in the analysis of spatial data. In this paper, we study the effect of non-Gaussianity on Gaussian likelihood inference for the parameters of the Matérn covariance model. By using Monte Carlo simulations, we generate spatial data from a Tukey g-and-h random field, a flexible trans-Gaussian random field, with the Matérn covariance function, where g controls skewness and h controls tail heaviness. We use maximum likelihood based on the multivariate Gaussian distribution to estimate the parameters of the Matérn covariance function. We illustrate the effects of non-Gaussianity of the data on the estimated covariance function by means of functional boxplots. Thanks to our tailored simulation design, a comparison of the maximum likelihood estimator under both the increasing and fixed domain asymptotics for spatial data is performed. We find that the maximum likelihood estimator based on Gaussian likelihood is satisfactory overall and preferable to the non-distribution-based weighted least squares estimator for data from the Tukey g-and-h random field. We also present the result for Gaussian kriging based on Matérn covariance estimates with data from the Tukey g-and-h random field and observe an overall satisfactory performance.
Yan, Yuan; Genton, Marc G.
2017-01-01
Gaussian likelihood inference has been studied and used extensively in both statistical theory and applications due to its simplicity. However, in practice, the assumption of Gaussianity is rarely met in the analysis of spatial data. In this paper, we study the effect of non-Gaussianity on Gaussian likelihood inference for the parameters of the Matérn covariance model. By using Monte Carlo simulations, we generate spatial data from a Tukey g-and-h random field, a flexible trans-Gaussian random field, with the Matérn covariance function, where g controls skewness and h controls tail heaviness. We use maximum likelihood based on the multivariate Gaussian distribution to estimate the parameters of the Matérn covariance function. We illustrate the effects of non-Gaussianity of the data on the estimated covariance function by means of functional boxplots. Thanks to our tailored simulation design, a comparison of the maximum likelihood estimator under both the increasing and fixed domain asymptotics for spatial data is performed. We find that the maximum likelihood estimator based on Gaussian likelihood is satisfactory overall and preferable to the non-distribution-based weighted least squares estimator for data from the Tukey g-and-h random field. We also present the result for Gaussian kriging based on Matérn covariance estimates with data from the Tukey g-and-h random field and observe an overall satisfactory performance.
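The role of g and h can be seen in the pointwise Tukey g-and-h transformation of a standard normal value z. The sketch below uses the standard form tau(z) = ((exp(g z) - 1) / g) exp(h z^2 / 2), with the g = 0 case taken as the limit z exp(h z^2 / 2). It transforms independent draws only; generating a spatially correlated field with Matérn covariance, as in the study, is not attempted here. The function name and parameter values are hypothetical.

```python
import math
import random


def tukey_gh(z, g, h):
    """Tukey g-and-h transform of a standard normal value z.

    g controls skewness (g = 0 gives a symmetric transform);
    h >= 0 controls tail heaviness (h = 0 leaves the tails Gaussian).
    """
    if h < 0:
        raise ValueError("h must be non-negative")
    # expm1 keeps precision for small g * z; g == 0 uses the limiting form.
    skew_part = z if g == 0 else math.expm1(g * z) / g
    return skew_part * math.exp(0.5 * h * z * z)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [tukey_gh(rng.gauss(0.0, 1.0), g=0.5, h=0.1) for _ in range(5)]
    print(draws)
```

Setting g = 0 and h = 0 returns z unchanged, so the Gaussian case is recovered as a special case.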
Directory of Open Access Journals (Sweden)
Shu-Hwa Chen
BACKGROUND: Selecting an appropriate substitution model and deriving a tree topology for a given sequence set are essential in phylogenetic analysis. However, such time-consuming, computationally intensive tasks rely on knowledge of substitution model theories and related expertise to run through all possible combinations of several separate programs. To ensure a thorough and efficient analysis and avert tedious manipulations of various programs, this work presents an intuitive framework, the phylogenetic reconstruction with automatic likelihood model selectors (PALM), with convincing, updated algorithms and a best-fit model selection mechanism for seamless phylogenetic analysis. METHODOLOGY: As an integrated framework of ClustalW, PhyML, MODELTEST, ProtTest, and several in-house programs, PALM evaluates the fitness of 56 substitution models for nucleotide sequences and 112 substitution models for protein sequences with scores in various criteria. The input for PALM can be either sequences in FASTA format or a sequence alignment file in PHYLIP format. To accelerate the computing of maximum likelihood and bootstrapping, this work integrates MPICH2/PhyML, PalmMonitor and Palm job controller across several machines with multiple processors and adopts the task parallelism approach. Moreover, an intuitive and interactive web component, PalmTree, is developed for displaying and operating the output tree with options of tree rooting, branch swapping, viewing the branch length values, and viewing bootstrap scores, as well as removing nodes to restart analysis iteratively. SIGNIFICANCE: The workflow of PALM is straightforward and coherent. Via a succinct, user-friendly interface, researchers unfamiliar with phylogenetic analysis can easily use this server to submit sequences, retrieve the output, and re-submit a job based on a previous result if some sequences are to be deleted or added for phylogenetic reconstruction. PALM results in an inference of
Statistical detection of EEG synchrony using empirical bayesian inference.
Directory of Open Access Journals (Sweden)
Archana K Singh
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
Statistical detection of EEG synchrony using empirical bayesian inference.
Singh, Archana K; Asoh, Hideki; Takeda, Yuji; Phillips, Steven
2015-01-01
There is growing interest in understanding how the brain utilizes synchronized oscillatory activity to integrate information across functionally connected regions. Computing phase-locking values (PLV) between EEG signals is a popular method for quantifying such synchronizations and elucidating their role in cognitive tasks. However, high-dimensionality in PLV data incurs a serious multiple testing problem. Standard multiple testing methods in neuroimaging research (e.g., false discovery rate, FDR) suffer severe loss of power, because they fail to exploit complex dependence structure between hypotheses that vary in spectral, temporal and spatial dimension. Previously, we showed that a hierarchical FDR and optimal discovery procedures could be effectively applied for PLV analysis to provide better power than FDR. In this article, we revisit the multiple comparison problem from a new Empirical Bayes perspective and propose the application of the local FDR method (locFDR; Efron, 2001) for PLV synchrony analysis to compute FDR as a posterior probability that an observed statistic belongs to a null hypothesis. We demonstrate the application of Efron's Empirical Bayes approach for PLV synchrony analysis for the first time. We use simulations to validate the specificity and sensitivity of locFDR and a real EEG dataset from a visual search study for experimental validation. We also compare locFDR with hierarchical FDR and optimal discovery procedures in both simulation and experimental analyses. Our simulation results showed that the locFDR can effectively control false positives without compromising on the power of PLV synchrony inference. Our results from the application locFDR on experiment data detected more significant discoveries than our previously proposed methods whereas the standard FDR method failed to detect any significant discoveries.
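The phase-locking value mentioned in this abstract has a simple standard definition: the magnitude of the average unit phasor of the phase difference between two signals. The sketch below assumes the instantaneous phases (in radians) have already been extracted from the EEG signals, for example with a Hilbert transform or wavelet filter; that preprocessing step and the function name are assumptions, not details taken from the paper.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """PLV between two equal-length sequences of instantaneous phases.

    Returns a value in [0, 1]: 1 means a constant phase difference,
    values near 0 mean the phase difference is spread around the circle.
    """
    if len(phases_a) != len(phases_b) or not phases_a:
        raise ValueError("phase sequences must be non-empty and equal length")
    # Average the unit phasors of the phase differences, then take the modulus.
    total = sum(cmath.exp(1j * (a - b)) for a, b in zip(phases_a, phases_b))
    return abs(total) / len(phases_a)


if __name__ == "__main__":
    # Signal b lags signal a by a fixed quarter cycle: perfectly locked.
    a = [0.1 * k for k in range(100)]
    b = [p - math.pi / 2 for p in a]
    print(phase_locking_value(a, b))
```

In practice one such value is computed for each electrode pair, frequency band and time window, which is where the multiple testing problem described in the abstract comes from.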
Chen, Ying-Ju; Ning, Wei; Gupta, Arjun K
2016-05-01
The mean residual life (MRL) function is one of the basic parameters of interest in survival analysis that describes the expected remaining time of an individual after a certain age. The study of changes in the MRL function is practical and interesting because it may help us to identify some factors such as age and gender that may influence the remaining lifetimes of patients after receiving a certain surgery. In this paper, we propose a detection procedure based on the empirical likelihood for the changes in MRL functions with right censored data. Two real examples, the Veterans' Administration lung cancer study and the Stanford heart transplant study, are given to illustrate the detection procedure. Copyright © 2016 John Wiley & Sons, Ltd.
Xu, Xu Steven; Yuan, Min; Yang, Haitao; Feng, Yan; Xu, Jinfeng; Pinheiro, Jose
2017-01-01
Covariate analysis based on population pharmacokinetics (PPK) is used to identify clinically relevant factors. The likelihood ratio test (LRT) based on nonlinear mixed effect model fits is currently recommended for covariate identification, whereas individual empirical Bayesian estimates (EBEs) are considered unreliable due to the presence of shrinkage. The objectives of this research were to investigate the type I error for LRT and EBE approaches, to confirm the similarity of power between the LRT and EBE approaches from a previous report and to explore the influence of shrinkage on LRT and EBE inferences. Using an oral one-compartment PK model with a single covariate impacting on clearance, we conducted a wide range of simulations according to a two-way factorial design. The results revealed that the EBE-based regression not only provided almost identical power for detecting a covariate effect, but also controlled the false positive rate better than the LRT approach. Shrinkage of EBEs is likely not the root cause for decrease in power or inflated false positive rate although the size of the covariate effect tends to be underestimated at high shrinkage. In summary, contrary to the current recommendations, EBEs may be a better choice for statistical tests in PPK covariate analysis compared to LRT. We proposed a three-step covariate modeling approach for population PK analysis to utilize the advantages of EBEs while overcoming their shortcomings, which allows not only markedly reducing the run time for population PK analysis, but also providing more accurate covariate tests.
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation-free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with Markov chain Monte Carlo...... (MCMC) techniques. Due to space limitations the focus is on spatial point processes....
Chen, Baojiang; Qin, Jing
2014-05-10
In statistical analysis, a regression model is needed if one is interested in finding the relationship between a response variable and covariates. When the response depends on a covariate, it may also depend on a function of this covariate. If one has no knowledge of this functional form but expects it to be monotonically increasing or decreasing, then the isotonic regression model is preferable. Estimation of parameters for isotonic regression models is based on the pool-adjacent-violators algorithm (PAVA), where the monotonicity constraints are built in. With missing data, people often employ the augmented estimating method to improve estimation efficiency by incorporating auxiliary information through a working regression model. However, under the framework of the isotonic regression model, the PAVA does not work as the monotonicity constraints are violated. In this paper, we develop an empirical likelihood-based method for the isotonic regression model to incorporate the auxiliary information. Because the monotonicity constraints still hold, the PAVA can be used for parameter estimation. Simulation studies demonstrate that the proposed method can yield more efficient estimates, and in some situations, the efficiency improvement is substantial. We apply this method to a dementia study. Copyright © 2013 John Wiley & Sons, Ltd.
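For readers unfamiliar with the pool-adjacent-violators algorithm mentioned above, here is a minimal sketch of the plain, complete-data version for a nondecreasing least-squares fit. The function name `pava` and the optional weight argument are choices made for this illustration; the empirical likelihood weighting and the missing-data handling proposed in the paper are not reproduced.

```python
def pava(y, w=None):
    """Weighted least-squares nondecreasing fit by pooling adjacent violators."""
    if w is None:
        w = [1.0] * len(y)
    # Each block stores [weighted mean, total weight, number of points pooled].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Pool the last two blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, n1 + n2])
    # Expand the pooled blocks back to one fitted value per observation.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted


fit = pava([1, 3, 2, 4])
```

The responses are assumed to be already ordered by the covariate. A decreasing fit can be obtained by negating the responses, fitting, and negating the result.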
A Composite Likelihood Inference in Latent Variable Models for Ordinal Longitudinal Responses
Vasdekis, Vassilis G. S.; Cagnone, Silvia; Moustaki, Irini
2012-01-01
The paper proposes a composite likelihood estimation approach that uses bivariate instead of multivariate marginal probabilities for ordinal longitudinal responses using a latent variable model. The model considers time-dependent latent variables and item-specific random effects to be accountable for the interdependencies of the multivariate…
DEFF Research Database (Denmark)
Cavaliere, Giuseppe; Nielsen, Morten Ørregaard; Taylor, Robert
We consider the problem of conducting estimation and inference on the parameters of univariate heteroskedastic fractionally integrated time series models. We first extend existing results in the literature, developed for conditional sum-of-squares estimators in the context of parametric fractional...... time series models driven by conditionally homoskedastic shocks, to allow for conditional and unconditional heteroskedasticity both of a quite general and unknown form. Global consistency and asymptotic normality are shown to still obtain; however, the covariance matrix of the limiting distribution...... of the estimator now depends on nuisance parameters derived both from the weak dependence and heteroskedasticity present in the shocks. We then investigate classical methods of inference based on the Wald, likelihood ratio and Lagrange multiplier tests for linear hypotheses on either or both of the long and short...
Directory of Open Access Journals (Sweden)
Fonseca Carlos M
2010-10-01
Abstract Background Irregularly shaped spatial clusters are difficult to delineate. A cluster found by an algorithm often spreads through large portions of the map, impacting its geographical meaning. Penalized likelihood methods for Kulldorff's spatial scan statistics have been used to control the excessive freedom of the shape of clusters. Penalty functions based on cluster geometry and non-connectivity have been proposed recently. Another approach involves the use of a multi-objective algorithm to maximize two objectives: the spatial scan statistics and the geometric penalty function. Results & Discussion We present a novel scan statistic algorithm employing a function based on the graph topology to penalize the presence of under-populated disconnection nodes in candidate clusters, the disconnection nodes cohesion function. A disconnection node is defined as a region within a cluster, such that its removal disconnects the cluster. By applying this function, the most geographically meaningful clusters are sifted through the immense set of possible irregularly shaped candidate cluster solutions. To evaluate the statistical significance of solutions for multi-objective scans, a statistical approach based on the concept of attainment function is used. In this paper we compared different penalized likelihoods employing the geometric and non-connectivity regularity functions and the novel disconnection nodes cohesion function. We also build multi-objective scans using those three functions and compare them with the previous penalized likelihood scans. An application is presented using comprehensive state-wide data for Chagas' disease in puerperal women in Minas Gerais state, Brazil. Conclusions We show that, compared to the other single-objective algorithms, multi-objective scans present better performance, regarding power, sensitivity and positive predictive value. The multi-objective non-connectivity scan is faster and better suited for the
Likelihood-based inference for cointegration with nonlinear error-correction
DEFF Research Database (Denmark)
Kristensen, Dennis; Rahbek, Anders Christian
2010-01-01
We consider a class of nonlinear vector error correction models where the transfer function (or loadings) of the stationary relationships is nonlinear. This includes in particular the smooth transition models. A general representation theorem is given which establishes the dynamic properties...... and a linear trend in general. Gaussian likelihood-based estimators are considered for the long-run cointegration parameters, and the short-run parameters. Asymptotic theory is provided for these and it is discussed to what extent asymptotic normality and mixed normality can be found. A simulation study...
Zhou, X.; Albertson, J. D.
2016-12-01
Natural gas is considered a bridge fuel towards clean energy because of its potentially lower greenhouse gas emissions compared with other fossil fuels. Despite numerous efforts, an efficient and cost-effective approach to monitor fugitive methane emissions along the natural gas production-supply chain has not been developed yet. Recently, mobile methane measurement has been introduced which applies a Bayesian approach to probabilistically infer methane emission rates and update estimates recursively when new measurements become available. However, the likelihood function, especially the error term which determines the shape of the estimate uncertainty, is not rigorously defined and evaluated with field data. To address this issue, we performed a series of near-source (using a specialized vehicle mounted with fast response methane analyzers and a GPS unit. Methane concentrations were measured at two different heights along mobile traversals downwind of the sources, and concurrent wind and temperature data were recorded by nearby 3-D sonic anemometers. With known methane release rates, the measurements were used to determine the functional form and the parameterization of the likelihood function in the Bayesian inference scheme under different meteorological conditions.
Energy Technology Data Exchange (ETDEWEB)
Weyant, Anja; Wood-Vasey, W. Michael [Pittsburgh Particle Physics, Astrophysics, and Cosmology Center (PITT PACC), Physics and Astronomy Department, University of Pittsburgh, Pittsburgh, PA 15260 (United States); Schafer, Chad, E-mail: anw19@pitt.edu [Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213 (United States)
2013-02-20
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
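The core idea described above, replacing likelihood evaluation with forward simulation, can be sketched with the simplest ABC variant, rejection sampling. Everything in the block is a toy stand-in chosen for illustration: the Gaussian forward model, the uniform prior, the sample-mean summary, the tolerance, and the helper name `abc_rejection` are assumptions, not the supernova pipeline or the specific ABC algorithm used by the authors.

```python
import random
import statistics


def abc_rejection(observed, simulate, prior_sample, distance, tol, n_draws, rng):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    accepted = []
    for _ in range(n_draws):
        theta = prior_sample(rng)           # draw a parameter from the prior
        synthetic = simulate(theta, rng)    # run the forward model
        if distance(synthetic, observed) <= tol:
            accepted.append(theta)          # approximate posterior sample
    return accepted


rng = random.Random(0)

# Hypothetical data: 50 draws from a unit-variance Gaussian with unknown mean.
observed_mean = statistics.fmean([rng.gauss(1.5, 1.0) for _ in range(50)])

posterior = abc_rejection(
    observed=observed_mean,
    simulate=lambda mu, r: statistics.fmean([r.gauss(mu, 1.0) for _ in range(50)]),
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    distance=lambda a, b: abs(a - b),
    tol=0.1,
    n_draws=5000,
    rng=rng,
)
posterior_mean = statistics.fmean(posterior)
```

The accepted draws approximate the posterior only up to the tolerance and the information retained by the summary statistic. Shrinking `tol` improves the approximation at the cost of a lower acceptance rate.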
International Nuclear Information System (INIS)
Weyant, Anja; Wood-Vasey, W. Michael; Schafer, Chad
2013-01-01
Cosmological inference becomes increasingly difficult when complex data-generating processes cannot be modeled by simple probability distributions. With the ever-increasing size of data sets in cosmology, there is an increasing burden placed on adequate modeling; systematic errors in the model will dominate where previously these were swamped by statistical errors. For example, Gaussian distributions are an insufficient representation for errors in quantities like photometric redshifts. Likewise, it can be difficult to quantify analytically the distribution of errors that are introduced in complex fitting codes. Without a simple form for these distributions, it becomes difficult to accurately construct a likelihood function for the data as a function of parameters of interest. Approximate Bayesian computation (ABC) provides a means of probing the posterior distribution when direct calculation of a sufficiently accurate likelihood is intractable. ABC allows one to bypass direct calculation of the likelihood but instead relies upon the ability to simulate the forward process that generated the data. These simulations can naturally incorporate priors placed on nuisance parameters, and hence these can be marginalized in a natural way. We present and discuss ABC methods in the context of supernova cosmology using data from the SDSS-II Supernova Survey. Assuming a flat cosmology and constant dark energy equation of state, we demonstrate that ABC can recover an accurate posterior distribution. Finally, we show that ABC can still produce an accurate posterior distribution when we contaminate the sample with Type IIP supernovae.
DREAM3: network inference using dynamic context likelihood of relatedness and the inferelator.
Directory of Open Access Journals (Sweden)
Aviv Madar
2010-03-01
Many current works aiming to learn regulatory networks from systems biology data must balance model complexity with respect to data availability and quality. Methods that learn regulatory associations based on unit-less metrics, such as Mutual Information, are attractive in that they scale well and reduce the number of free parameters (model complexity) per interaction to a minimum. In contrast, methods for learning regulatory networks based on explicit dynamical models are more complex and scale less gracefully, but are attractive as they may allow direct prediction of transcriptional dynamics and resolve the directionality of many regulatory interactions. We aim to investigate whether scalable information based methods (like the Context Likelihood of Relatedness method) and more explicit dynamical models (like Inferelator 1.0) prove synergistic when combined. We test a pipeline where a novel modification of the Context Likelihood of Relatedness (mixed-CLR, modified to use time series data) is first used to define likely regulatory interactions and then Inferelator 1.0 is used for final model selection and to build an explicit dynamical model. Our method ranked 2nd out of 22 in the DREAM3 100-gene in silico networks challenge. Mixed-CLR and Inferelator 1.0 are complementary, demonstrating a large performance gain relative to any single tested method, with precision being especially high at low recall values. Partitioning the provided data set into four groups (knock-down, knock-out, time-series, and combined) revealed that using comprehensive knock-out data alone provides optimal performance. Inferelator 1.0 proved particularly powerful at resolving the directionality of regulatory interactions, i.e. "who regulates who" (approximately of identified true positives were correctly resolved). Performance drops for high in-degree genes, i.e. as the number of regulators per target gene increases, but not with out-degree, i.e. performance is not affected by
Partial inversion of elliptic operator to speed up computation of likelihood in Bayesian inference
Litvinenko, Alexander
2017-08-09
In this paper, we speed up the solution of inverse problems in Bayesian settings. By computing the likelihood, the most expensive part of the Bayesian formula, one compares the available measurement data with the simulated data. To get simulated data, repeated solution of the forward problem is required. This could be a great challenge. Often, the available measurement is a functional $F(u)$ of the solution $u$ or a small part of $u$. Typical examples of $F(u)$ are the solution in a point, solution on a coarser grid, in a small subdomain, the mean value in a subdomain. It is a waste of computational resources to evaluate, first, the whole solution and then compute a part of it. In this work, we compute the functional $F(u)$ directly, without computing the full inverse operator and without computing the whole solution $u$. The main ingredients of the developed approach are the hierarchical domain decomposition technique, the finite element method and the Schur complements. To speed up computations and to reduce the storage cost, we approximate the forward operator and the Schur complement in the hierarchical matrix format. Applying the hierarchical matrix technique, we reduced the computing cost to $\mathcal{O}(k^2 n \log^2 n)$, where $k \ll n$ and $n$ is the number of degrees of freedom. Up to the $\mathcal{H}$-matrix accuracy, the computation of the functional $F(u)$ is exact. To reduce the computational resources further, we can approximate $F(u)$ on, for instance, multiple coarse meshes. The offered method is well suited for solving multiscale problems. A disadvantage of this method is the assumption that one has to have access to the discretisation and to the procedure of assembling the Galerkin matrix.
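The role of the Schur complement in obtaining only part of the solution can be seen from the standard block-elimination identity, written out below. The partition into unobserved unknowns $u_1$ and observed unknowns $u_2$, and the block names $A_{ij}$ and $f_i$, are notation assumed for this illustration only; the paper's hierarchical domain decomposition and $\mathcal{H}$-matrix approximation are not shown.

```latex
% Discretised forward problem, split into unobserved (1) and observed (2) unknowns,
% assuming A_{11} is invertible:
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \end{pmatrix}

% Eliminating u_1 leaves a reduced system for u_2 alone:
S = A_{22} - A_{21} A_{11}^{-1} A_{12},
\qquad
S\, u_2 = f_2 - A_{21} A_{11}^{-1} f_1
```

Solving the reduced system yields $u_2$ without ever forming $u_1$, which is the sense in which a functional depending only on $u_2$ can be evaluated without computing the whole solution.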
Maximum likelihood inference of small trees in the presence of long branches.
Parks, Sarah L; Goldman, Nick
2014-09-01
The statistical basis of maximum likelihood (ML), its robustness, and the fact that it appears to suffer less from biases lead to it being one of the most popular methods for tree reconstruction. Despite its popularity, very few analytical solutions for ML exist, so biases suffered by ML are not well understood. One possible bias is long branch attraction (LBA), a regularly cited term generally used to describe a propensity for long branches to be joined together in estimated trees. Although initially mentioned in connection with inconsistency of parsimony, LBA has been claimed to affect all major phylogenetic reconstruction methods, including ML. Despite the widespread use of this term in the literature, exactly what LBA is and what may be causing it is poorly understood, even for simple evolutionary models and small model trees. Studies looking at LBA have focused on the effect of two long branches on tree reconstruction. However, to understand the effect of two long branches it is also important to understand the effect of just one long branch. If ML struggles to reconstruct one long branch, then this may have an impact on LBA. In this study, we look at the effect of one long branch on three-taxon tree reconstruction. We show that, counterintuitively, long branches are preferentially placed at the tips of the tree. This can be understood through the use of analytical solutions to the ML equation and distance matrix methods. We go on to look at the placement of two long branches on four-taxon trees, showing that there is no attraction between long branches, but that for extreme branch lengths long branches are joined together disproportionally often. These results illustrate that even small model trees are still interesting to help understand how ML phylogenetic reconstruction works, and that LBA is a complicated phenomenon that deserves further study. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Local Likelihood Approach for High-Dimensional Peaks-Over-Threshold Inference
Baki, Zhuldyzay
2018-05-14
Global warming is affecting the Earth's climate year by year, the biggest difference being observable in increasing temperatures in the World Ocean. Following the long-term global ocean warming trend, average sea surface temperatures across the global tropics and subtropics have increased by 0.4–1 °C in the last 40 years. These rates become even higher in semi-enclosed southern seas, such as the Red Sea, threatening the survival of thermal-sensitive species. As average sea surface temperatures are projected to continue to rise, careful study of future developments of extreme temperatures is paramount for the sustainability of marine ecosystem and biodiversity. In this thesis, we use Extreme-Value Theory to study sea surface temperature extremes from a gridded dataset comprising 16703 locations over the Red Sea. The data were provided by Operational SST and Sea Ice Analysis (OSTIA), a satellite-based data system designed for numerical weather prediction. After pre-processing the data to account for seasonality and global trends, we analyze the marginal distribution of extremes, defined as observations exceeding a high spatially varying threshold, using the Generalized Pareto distribution. This model allows us to extrapolate beyond the observed data to compute the 100-year return levels over the entire Red Sea, confirming the increasing trend of extreme temperatures. To understand the dynamics governing the dependence of extreme temperatures in the Red Sea, we propose a flexible local approach based on R-Pareto processes, which extend the univariate Generalized Pareto distribution to the spatial setting. Assuming that the sea surface temperature varies smoothly over space, we perform inference based on the gradient score method over small regional neighborhoods, in which the data are assumed to be stationary in space. This approach allows us to capture spatial non-stationarity, and to reduce the overall computational cost by taking advantage of
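To make the return-level extrapolation mentioned above concrete, the following sketch evaluates the textbook peaks-over-threshold formula for an N-year return level once Generalized Pareto parameters are available. All numerical values (threshold, scale, shape, exceedance rate, daily sampling) are hypothetical placeholders, not estimates from the thesis, and the function names are chosen for this example.

```python
import math


def gpd_survival(x, u, sigma, xi):
    """P(X > x | X > u) under a Generalized Pareto tail, for x >= u."""
    z = (x - u) / sigma
    if abs(xi) < 1e-12:
        return math.exp(-z)                       # exponential limit as xi -> 0
    return max(0.0, 1.0 + xi * z) ** (-1.0 / xi)


def return_level(years, u, sigma, xi, zeta_u, obs_per_year):
    """Level exceeded on average once every `years` years."""
    m = years * obs_per_year                      # observations in the return period
    if abs(xi) < 1e-12:
        return u + sigma * math.log(m * zeta_u)
    return u + (sigma / xi) * ((m * zeta_u) ** xi - 1.0)


# Hypothetical parameters: threshold 30.0, scale 0.5, shape -0.2,
# 5% of daily observations exceeding the threshold.
params = dict(u=30.0, sigma=0.5, xi=-0.2)
level_100 = return_level(100, zeta_u=0.05, obs_per_year=365, **params)
exceed_prob = 0.05 * gpd_survival(level_100, **params)
```

By construction, the unconditional probability of exceeding `level_100` in a single observation equals one over the number of observations in 100 years, which gives a quick consistency check on the formula.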
Directory of Open Access Journals (Sweden)
Yi-Jen Mon
2012-10-01
A supervisory Adaptive Network-based Fuzzy Inference System (SANFIS) is proposed for the empirical control of a mobile robot. This controller includes an ANFIS controller and a supervisory controller. The ANFIS controller is off-line tuned by an adaptive fuzzy inference system; the supervisory controller is designed to compensate for the approximation error between the ANFIS controller and the ideal controller, and drive the trajectory of the system onto a specified surface (called the sliding surface or switching surface) while maintaining the trajectory onto this switching surface continuously to guarantee the system stability. This SANFIS controller can achieve favourable empirical control performance of the mobile robot in the empirical tests of driving the mobile robot with a square path. Practical experimental results demonstrate that the proposed SANFIS can achieve better control performance than that achieved using an ANFIS controller for empirical control of the mobile robot.
Xu, Maoqi; Chen, Liang
2018-01-01
Individual sample heterogeneity is one of the biggest obstacles in biomarker identification for complex diseases such as cancers. Current statistical models to identify differentially expressed genes between disease and control groups often overlook the substantial human sample heterogeneity. Meanwhile, traditional nonparametric tests lose detailed data information and sacrifice the analysis power, although they are distribution free and robust to heterogeneity. Here, we propose an empirical likelihood ratio test with a mean-variance relationship constraint (ELTSeq) for the differential expression analysis of RNA sequencing (RNA-seq). As a distribution-free nonparametric model, ELTSeq handles individual heterogeneity by estimating an empirical probability for each observation without making any assumption about read-count distribution. It also incorporates a constraint for the read-count overdispersion, which is widely observed in RNA-seq data. ELTSeq demonstrates a significant improvement over existing methods such as edgeR, DESeq, t-tests, Wilcoxon tests and the classic empirical likelihood-ratio test when handling heterogeneous groups. It will significantly advance the transcriptomics studies of cancers and other complex diseases. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Aspects of likelihood inference
Reid, Nancy
2013-01-01
Comment: Published in at http://dx.doi.org/10.3150/12-BEJSP03 the Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm)
Grimm, Guido W.; Renner, Susanne S.; Stamatakis, Alexandros; Hemleben, Vera
2007-01-01
The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly. PMID:19455198
Directory of Open Access Journals (Sweden)
Guido W. Grimm
2006-01-01
The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.
CERN. Geneva
2015-01-01
Most physics results at the LHC end in a likelihood ratio test. This includes discovery and exclusion for searches as well as mass, cross-section, and coupling measurements. The use of Machine Learning (multivariate) algorithms in HEP is mainly restricted to searches, which can be reduced to classification between two fixed distributions: signal vs. background. I will show how we can extend the use of ML classifiers to distributions parameterized by physical quantities like masses and couplings as well as nuisance parameters associated with systematic uncertainties. This allows one to approximate the likelihood ratio while still using a high dimensional feature vector for the data. Both the MEM and ABC approaches mentioned above aim to provide inference on model parameters (like cross-sections, masses, couplings, etc.). ABC is fundamentally tied to Bayesian inference and focuses on the “likelihood free” setting where only a simulator is available and one cannot directly compute the likelihood for the dat...
Inferring causal molecular networks: empirical assessment through a community-based effort.
Hill, Steven M; Heiser, Laura M; Cokelaer, Thomas; Unger, Michael; Nesser, Nicole K; Carlin, Daniel E; Zhang, Yang; Sokolov, Artem; Paull, Evan O; Wong, Chris K; Graim, Kiley; Bivol, Adrian; Wang, Haizhou; Zhu, Fan; Afsari, Bahman; Danilova, Ludmila V; Favorov, Alexander V; Lee, Wai Shing; Taylor, Dane; Hu, Chenyue W; Long, Byron L; Noren, David P; Bisberg, Alexander J; Mills, Gordon B; Gray, Joe W; Kellen, Michael; Norman, Thea; Friend, Stephen; Qutub, Amina A; Fertig, Elana J; Guan, Yuanfang; Song, Mingzhou; Stuart, Joshua M; Spellman, Paul T; Koeppl, Heinz; Stolovitzky, Gustavo; Saez-Rodriguez, Julio; Mukherjee, Sach
2016-04-01
It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense.
The phylogenetic likelihood library.
Flouri, T; Izquierdo-Carrasco, F; Darriba, D; Aberer, A J; Nguyen, L-T; Minh, B Q; Von Haeseler, A; Stamatakis, A
2015-03-01
We introduce the Phylogenetic Likelihood Library (PLL), a highly optimized application programming interface for developing likelihood-based phylogenetic inference and postanalysis software. The PLL implements appropriate data structures and functions that allow users to quickly implement common, error-prone, and labor-intensive tasks, such as likelihood calculations, model parameter as well as branch length optimization, and tree space exploration. The highly optimized and parallelized implementation of the phylogenetic likelihood function and a thorough documentation provide a framework for rapid development of scalable parallel phylogenetic software. By example of two likelihood-based phylogenetic codes we show that the PLL improves the sequential performance of current software by a factor of 2-10 while requiring only 1 month of programming time for integration. We show that, when numerical scaling for preventing floating point underflow is enabled, the double precision likelihood calculations in the PLL are up to 1.9 times faster than those in BEAGLE. On an empirical DNA dataset with 2000 taxa the AVX version of PLL is 4 times faster than BEAGLE (scaling enabled and required). The PLL is available at http://www.libpll.org under the GNU General Public License (GPL). © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Background: A common challenge in systems biology is to infer mechanistic descriptions of biological processes given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria are barriers to implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results: As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF) signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion: In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
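The convergence criterion named in the abstract above can be illustrated with a short sketch. The function below computes the classical Gelman-Rubin potential scale reduction factor for one scalar quantity from several chains; the function name and the (chains, draws) array layout are assumptions made for this illustration, and the code is not taken from the study itself.

```python
import numpy as np

def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar quantity.

    chains: array-like of shape (m, n), i.e. m chains of n post-burn-in draws.
    The name and layout are illustrative assumptions, not the paper's code.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    # Between-chain variance B and mean within-chain variance W.
    B = n * chain_means.var(ddof=1)
    W = chains.var(axis=1, ddof=1).mean()
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * W + B / n
    return float(np.sqrt(var_hat / W))
```

Values close to 1 suggest that the chains agree with one another; values well above 1 indicate that the chains have not yet mixed.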
Liu, Fang; Eugenio, Evercita C
2018-04-01
Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
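As a rough illustration of the likelihood-based approach reviewed above, the sketch below fits a plain beta regression (no zero/one inflation) by maximum likelihood. It assumes the common mean-precision parameterization y ~ Beta(mu*phi, (1-mu)*phi) with a logit link for mu; the function names, the starting values, and the choice of BFGS are assumptions for this example and do not reproduce any package discussed in the review.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, gammaln

def beta_negloglik(params, X, y):
    """Negative log-likelihood; params = (regression coefficients, log phi)."""
    beta, log_phi = params[:-1], params[-1]
    mu = expit(X @ beta)          # mean in (0, 1) via the logit link
    phi = np.exp(log_phi)         # precision kept positive via the log scale
    a, b = mu * phi, (1.0 - mu) * phi
    ll = (gammaln(phi) - gammaln(a) - gammaln(b)
          + (a - 1.0) * np.log(y) + (b - 1.0) * np.log1p(-y))
    return -ll.sum()

def fit_beta_regression(X, y):
    """Maximize the likelihood from a zero start (an arbitrary choice)."""
    start = np.zeros(X.shape[1] + 1)
    result = minimize(beta_negloglik, start, args=(X, y), method="BFGS")
    return result.x[:-1], float(np.exp(result.x[-1]))
```

The log-density terms are undefined at y = 0 or y = 1, which is one way to see why boundary values call for the zoib extension rather than this likelihood.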
The Reliability and Stability of an Inferred Phylogenetic Tree from Empirical Data.
Katsura, Yukako; Stanley, Craig E; Kumar, Sudhir; Nei, Masatoshi
2017-03-01
The reliability of a phylogenetic tree obtained from empirical data is usually measured by the bootstrap probability (Pb) of interior branches of the tree. If the bootstrap probability is high for most branches, the tree is considered to be reliable. If some interior branches show relatively low bootstrap probabilities, we are not sure that the inferred tree is really reliable. Here, we propose another quantity measuring the reliability of the tree called the stability of a subtree. This quantity refers to the probability (Ps) of obtaining a subtree of the inferred tree. We then show that if the tree is to be reliable, both Pb and Ps must be high. We also show that Ps is given by a bootstrap probability of the subtree with the closest outgroup sequence, and a computer program, RESTA, for computing the Pb and Ps values is presented. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Bayes Empirical Bayes Inference of Amino Acid Sites Under Positive Selection
DEFF Research Database (Denmark)
Yang, Ziheng; Wong, Wendy Shuk Wan; Nielsen, Rasmus
2005-01-01
, with ω > 1 indicating positive selection. Statistical distributions are used to model the variation in ω among sites, allowing a subset of sites to have ω > 1 while the rest of the sequence may be under purifying selection with ... probabilities that a site comes from the site class with ω > 1. Current implementations, however, use the naive EB (NEB) approach and fail to account for sampling errors in maximum likelihood estimates of model parameters, such as the proportions and ω ratios for the site classes. In small data sets lacking...... information, this approach may lead to unreliable posterior probability calculations. In this paper, we develop a Bayes empirical Bayes (BEB) approach to the problem, which assigns a prior to the model parameters and integrates over their uncertainties. We compare the new and old methods on real and simulated...
Empirical verification for application of Bayesian inference in situation awareness evaluations
International Nuclear Information System (INIS)
Kang, Seongkeun; Kim, Ar Ryum; Seong, Poong Hyun
2017-01-01
Highlights: • Situation awareness (SA) of human operators is important for safe operation of nuclear power plants (NPPs). • SA of human operators was empirically estimated using Bayesian inference. • In this empirical study, the effect of attention and working memory on SA was considered. • Complexity of the given task and design of the human machine interface (HMI) considerably affect SA of human operators. - Abstract: Bayesian methodology has been widely used in various research fields. According to current research, malfunctions of nuclear power plants can be detected using Bayesian inference, which continually accumulates newly incoming data and updates the estimate. However, these studies have been based on the assumption that people work like computers, that is, perfectly, a supposition that may cause problems in real-world applications. Studies in cognitive psychology indicate that when the amount of information to be processed becomes larger, people cannot retain the whole set of data in their heads due to limited attention and limited memory capacity, also known as working memory. The purpose of the current research is to consider how actual human situation awareness contrasts with our expectations, and how such disparity affects the results of conventional Bayesian inference, if at all. We compared situation awareness (SA) of ideal operators with SA of human operators, and for the human operators we used both a text-based human machine interface (HMI) and an infographic-based HMI to further compare two groups of human operators. In addition, two different scenarios were selected to examine how scenario complexity affects SA of human operators. As a result, when a malfunction occurred, the ideal operator found the malfunction with nearly 100% probability using Bayesian inference. In contrast, out of forty-six human operators, only 69.57% found the correct malfunction with the simple scenario and 58.70% with the complex scenario in the text-based HMI. In
Pal, Suvra; Balakrishnan, N
2017-10-01
In this paper, we consider a competing cause scenario and assume the number of competing causes to follow a Conway-Maxwell Poisson distribution, which can capture both the over- and under-dispersion usually encountered in discrete data. Assuming that the population of interest has a cure component and that the data are interval censored, as opposed to the usually considered right-censored data, the main contribution is in developing the steps of the expectation maximization algorithm for the determination of the maximum likelihood estimates of the model parameters of the flexible Conway-Maxwell Poisson cure rate model with Weibull lifetimes. An extensive Monte Carlo simulation study is carried out to demonstrate the performance of the proposed estimation method. Model discrimination within the Conway-Maxwell Poisson distribution is addressed using the likelihood ratio test and information-based criteria to select a suitable competing cause distribution that provides the best fit to the data. A simulation study is also carried out to demonstrate the loss in efficiency when selecting an improper competing cause distribution, which justifies the use of a flexible family of distributions for the number of competing causes. Finally, the proposed methodology and the flexibility of the Conway-Maxwell Poisson distribution are illustrated with two known data sets from the literature: smoking cessation data and breast cosmesis data.
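To make the dispersion property mentioned above concrete, here is a minimal sketch of the Conway-Maxwell Poisson probability mass function, P(M = m) proportional to eta^m / (m!)^phi. The function name, the parameter names eta and phi, and the truncation of the normalizing series at 200 terms are assumptions for this illustration; the cure rate model and the EM steps of the paper are not reproduced here.

```python
import math

def com_poisson_pmf(m, eta, phi, terms=200):
    """P(M = m) for a Conway-Maxwell Poisson variable.

    The normalizing constant is an infinite series; truncating it at
    `terms` is an assumption that is adequate for moderate eta and phi > 0.
    """
    # Work on the log scale: log of eta^j / (j!)^phi for each j.
    log_w = [j * math.log(eta) - phi * math.lgamma(j + 1) for j in range(terms)]
    peak = max(log_w)
    log_z = peak + math.log(sum(math.exp(w - peak) for w in log_w))
    return math.exp(m * math.log(eta) - phi * math.lgamma(m + 1) - log_z)
```

With phi = 1 this reduces to the ordinary Poisson distribution; phi > 1 gives under-dispersion (variance below the mean) and phi < 1 gives over-dispersion.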
Gengsheng Qin; Davis, Angela E; Jing, Bing-Yi
2011-06-01
For a continuous-scale diagnostic test, it is often of interest to find the range of the sensitivity of the test at the cut-off that yields a desired specificity. In this article, we first define a profile empirical likelihood ratio for the sensitivity of a continuous-scale diagnostic test and show that its limiting distribution is a scaled chi-square distribution. We then propose two new empirical likelihood-based confidence intervals for the sensitivity of the test at a fixed level of specificity by using the scaled chi-square distribution. Simulation studies are conducted to compare the finite sample performance of the newly proposed intervals with the existing intervals for the sensitivity in terms of coverage probability. A real example is used to illustrate the application of the recommended methods.
Xu, Jason; Guttorp, Peter; Kato-Maeda, Midori; Minin, Vladimir N
2015-12-01
Continuous-time birth-death-shift (BDS) processes are frequently used in stochastic modeling, with many applications in ecology and epidemiology. In particular, such processes can model evolutionary dynamics of transposable elements, which are important genetic markers in molecular epidemiology. Estimation of the effects of individual covariates on the birth, death, and shift rates of the process can be accomplished by analyzing patient data, but inferring these rates in a discretely and unevenly observed setting presents computational challenges. We propose a multi-type branching process approximation to BDS processes and develop a corresponding expectation maximization algorithm, where we use spectral techniques to reduce calculation of expected sufficient statistics to low-dimensional integration. These techniques yield an efficient and robust optimization routine for inferring the rates of the BDS process, and apply broadly to multi-type branching processes whose rates can depend on many covariates. After rigorously testing our methodology in simulation studies, we apply our method to study intrapatient time evolution of the IS6110 transposable element, a genetic marker frequently used during estimation of epidemiological clusters of Mycobacterium tuberculosis infections. © 2015, The International Biometric Society.
Rodriguez, Jesse M.
2013-01-01
Studies that map disease genes rely on accurate annotations that indicate whether individuals in the studied cohorts are related to each other or not. For example, in genome-wide association studies, the cohort members are assumed to be unrelated to one another. Investigators can correct for individuals in a cohort with previously-unknown shared familial descent by detecting genomic segments that are shared between them, which are considered to be identical by descent (IBD). Alternatively, elevated frequencies of IBD segments near a particular locus among affected individuals can be indicative of a disease-associated gene. As genotyping studies grow to use increasingly large sample sizes and meta-analyses begin to include many data sets, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting related pairs of individuals and shared haplotypic segments within these pairs. PARENTE is a computationally-efficient method based on an embedded likelihood ratio test. As demonstrated by the results of our simulations, our method exhibits better accuracy than the current state of the art, and can be used for the analysis of large genotyped cohorts. PARENTE's higher accuracy becomes even more significant in more challenging scenarios, such as detecting shorter IBD segments or when an extremely low false-positive rate is required. PARENTE is publicly and freely available at http://parente.stanford.edu/. © 2013 Springer-Verlag.
DEFF Research Database (Denmark)
Møller, Jesper
(This text, written by Jesper Møller, Aalborg University, is submitted for the collection ‘Stochastic Geometry: Highlights, Interactions and New Perspectives’, edited by Wilfrid S. Kendall and Ilya Molchanov, to be published by Clarendon Press, Oxford, and planned to appear as Section 4.1 with the title ‘Inference’.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
Szilagyi, Andrew D.
1977-01-01
Attempts to empirically verify the causal source and direction of causal influence between role ambiguity, role conflict and job satisfaction and performance for three organizational levels in a hospital environment. (Author/RK)
International Nuclear Information System (INIS)
Quigley, John; Walls, Lesley
2011-01-01
Mixing Bayes and Empirical Bayes inference provides reliability estimates for variant system designs by using relevant failure data - observed and anticipated - about engineering changes arising due to modification and innovation. A coherent inference framework is proposed to predict the realization of engineering concerns during product development so that informed decisions can be made about the system design and the analysis conducted to prove reliability. The proposed method involves combining subjective prior distributions for the number of engineering concerns with empirical priors for the non-parametric distribution of time to realize these concerns in such a way that we can cross-tabulate classes of concerns to failure events within time partitions at an appropriate level of granularity. To support efficient implementation, a computationally convenient hypergeometric approximation is developed for the counting distributions appropriate to our underlying stochastic model. The accuracy of our approximation over first-order alternatives is examined, and demonstrated, through an evaluation experiment. An industrial application illustrates model implementation and shows how estimates can be updated using information arising during development test and analysis.
Energy Technology Data Exchange (ETDEWEB)
Athron, Peter; Balazs, Csaba [Monash University, School of Physics and Astronomy, Melbourne, VIC (Australia); Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); Bringmann, Torsten; Dal, Lars A.; Gonzalo, Tomas E.; Krislock, Abram; Raklev, Are [University of Oslo, Department of Physics, Oslo (Norway); Buckley, Andy [University of Glasgow, SUPA, School of Physics and Astronomy, Glasgow (United Kingdom); Chrzaszcz, Marcin [Universitaet Zuerich, Physik-Institut, Zurich (Switzerland); Polish Academy of Sciences, H. Niewodniczanski Institute of Nuclear Physics, Krakow (Poland); Conrad, Jan; Edsjoe, Joakim; Farmer, Ben; Lundberg, Johan [AlbaNova University Centre, Oskar Klein Centre for Cosmoparticle Physics, Stockholm (Sweden); Stockholm University, Department of Physics, Stockholm (Sweden); Cornell, Jonathan M. [McGill University, Department of Physics, Montreal, QC (Canada); Dickinson, Hugh [University of Minnesota, Minnesota Institute for Astrophysics, Minneapolis, MN (United States); Jackson, Paul; White, Martin [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); University of Adelaide, Department of Physics, Adelaide, SA (Australia); Kvellestad, Anders; Savage, Christopher [NORDITA, Stockholm (Sweden); McKay, James [Imperial College London, Department of Physics, Blackett Laboratory, London (United Kingdom); Mahmoudi, Farvah [Lyon 1 Univ., ENS de Lyon, CNRS, Centre de Recherche Astrophysique de Lyon UMR5574, Saint-Genis-Laval (France); CERN, Theoretical Physics Department, Geneva (Switzerland); Institut Universitaire de France, Paris (France); Martinez, Gregory D. [University of California, Physics and Astronomy Department, Los Angeles, CA (United States); Putze, Antje [LAPTh, Universite de Savoie, CNRS, Annecy-le-Vieux (France); Ripken, Joachim [Max Planck Institute for Solar System Research, Goettingen (Germany); Rogan, Christopher [Harvard University, Department of Physics, Cambridge, MA (United States); Saavedra, Aldo [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); University of Sydney, Centre for Translational Data Science, Faculty of Engineering and Information Technologies, School of Physics, Sydney, NSW (Australia); Scott, Pat [Imperial College London, Department of Physics, Blackett Laboratory, London (United Kingdom); Seo, Seon-Hee [Seoul National University, Department of Physics and Astronomy, Seoul (Korea, Republic of); Serra, Nicola [Universitaet Zuerich, Physik-Institut, Zurich (Switzerland); Weniger, Christoph [University of Amsterdam, GRAPPA, Institute of Physics, Amsterdam (Netherlands); Wild, Sebastian [DESY, Hamburg (Germany); Collaboration: The GAMBIT Collaboration
2018-02-15
In Ref. (GAMBIT Collaboration: Athron et al., Eur. Phys. J. C. arXiv:1705.07908, 2017) we introduced the global-fitting framework GAMBIT. In this addendum, we describe a new minor version increment of this package. GAMBIT 1.1 includes full support for Mathematica backends, which we describe in some detail here. As an example, we backend SUSYHD (Vega and Villadoro, JHEP 07:159, 2015), which calculates the mass of the Higgs boson in the MSSM from effective field theory. We also describe updated likelihoods in PrecisionBit and DarkBit, and updated decay data included in DecayBit. (orig.)
Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Buckley, Andy; Chrząszcz, Marcin; Conrad, Jan; Cornell, Jonathan M.; Dal, Lars A.; Dickinson, Hugh; Edsjö, Joakim; Farmer, Ben; Gonzalo, Tomás E.; Jackson, Paul; Krislock, Abram; Kvellestad, Anders; Lundberg, Johan; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Raklev, Are; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Savage, Christopher; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; White, Martin; Wild, Sebastian
2018-02-01
In Ref. (GAMBIT Collaboration: Athron et al., Eur. Phys. J. C. arXiv:1705.07908, 2017) we introduced the global-fitting framework GAMBIT. In this addendum, we describe a new minor version increment of this package. GAMBIT 1.1 includes full support for Mathematica backends, which we describe in some detail here. As an example, we backend SUSYHD (Vega and Villadoro, JHEP 07:159, 2015), which calculates the mass of the Higgs boson in the MSSM from effective field theory. We also describe updated likelihoods in PrecisionBit and DarkBit, and updated decay data included in DecayBit.
International Nuclear Information System (INIS)
Athron, Peter; Balazs, Csaba; Bringmann, Torsten; Dal, Lars A.; Gonzalo, Tomas E.; Krislock, Abram; Raklev, Are; Buckley, Andy; Chrzaszcz, Marcin; Conrad, Jan; Edsjoe, Joakim; Farmer, Ben; Lundberg, Johan; Cornell, Jonathan M.; Dickinson, Hugh; Jackson, Paul; White, Martin; Kvellestad, Anders; Savage, Christopher; McKay, James; Mahmoudi, Farvah; Martinez, Gregory D.; Putze, Antje; Ripken, Joachim; Rogan, Christopher; Saavedra, Aldo; Scott, Pat; Seo, Seon-Hee; Serra, Nicola; Weniger, Christoph; Wild, Sebastian
2018-01-01
In Ref. (GAMBIT Collaboration: Athron et al., Eur. Phys. J. C. arXiv:1705.07908, 2017) we introduced the global-fitting framework GAMBIT. In this addendum, we describe a new minor version increment of this package. GAMBIT 1.1 includes full support for Mathematica backends, which we describe in some detail here. As an example, we backend SUSYHD (Vega and Villadoro, JHEP 07:159, 2015), which calculates the mass of the Higgs boson in the MSSM from effective field theory. We also describe updated likelihoods in PrecisionBit and DarkBit, and updated decay data included in DecayBit. (orig.)
Efficient Detection of Repeating Sites to Accelerate Phylogenetic Likelihood Calculations.
Kobert, K; Stamatakis, A; Flouri, T
2017-03-01
The phylogenetic likelihood function (PLF) is the major computational bottleneck in several applications of evolutionary biology such as phylogenetic inference, species delimitation, model selection, and divergence times estimation. Given the alignment, a tree and the evolutionary model parameters, the likelihood function computes the conditional likelihood vectors for every node of the tree. Vector entries for which all input data are identical result in redundant likelihood operations which, in turn, yield identical conditional values. Such operations can be omitted for improving run-time and, using appropriate data structures, reducing memory usage. We present a fast, novel method for identifying and omitting such redundant operations in phylogenetic likelihood calculations, and assess the performance improvement and memory savings attained by our method. Using empirical and simulated data sets, we show that a prototype implementation of our method yields up to 12-fold speedups and uses up to 78% less memory than one of the fastest and most highly tuned implementations of the PLF currently available. Our method is generic and can seamlessly be integrated into any phylogenetic likelihood implementation. [Algorithms; maximum likelihood; phylogenetic likelihood function; phylogenetics]. © The Author(s) 2016. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
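The kind of redundancy described above can be illustrated with a naive sketch: for the taxa below a given node, alignment columns that show identical characters yield identical conditional likelihood vectors at that node, so one computation per class suffices. The dictionary-based grouping below, the function name, and the dict-of-strings alignment format are assumptions for illustration only; this is not the authors' algorithm or data structure.

```python
def site_repeat_classes(alignment, taxa):
    """Assign each alignment column a class id for a given set of taxa.

    alignment: dict mapping taxon name -> sequence string (equal lengths).
    taxa: the taxa below the node of interest.
    Columns with the same id carry identical characters for every taxon
    in `taxa`, so their conditional likelihoods at that node coincide.
    """
    n_sites = len(next(iter(alignment.values())))
    first_seen = {}   # pattern -> class id, in order of first appearance
    classes = []
    for site in range(n_sites):
        pattern = tuple(alignment[t][site] for t in taxa)
        classes.append(first_seen.setdefault(pattern, len(first_seen)))
    return classes
```

Small subtrees tend to produce many repeated columns and therefore the largest savings; once all taxa are included, the classes reduce to the usual unique site patterns of the full alignment.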
Zhang, Kui; Wiener, Howard; Beasley, Mark; George, Varghese; Amos, Christopher I; Allison, David B
2006-08-01
Individual genome scans for quantitative trait loci (QTL) mapping often suffer from low statistical power and imprecise estimates of QTL location and effect. This lack of precision yields large confidence intervals for QTL location, which are problematic for subsequent fine mapping and positional cloning. In prioritizing areas for follow-up after an initial genome scan and in evaluating the credibility of apparent linkage signals, investigators typically examine the results of other genome scans of the same phenotype and informally update their beliefs about which linkage signals in their scan most merit confidence and follow-up via a subjective-intuitive integration approach. A method that acknowledges the wisdom of this general paradigm but formally borrows information from other scans to increase confidence in objectivity would be a benefit. We developed an empirical Bayes analytic method to integrate information from multiple genome scans. The linkage statistic obtained from a single genome scan study is updated by incorporating statistics from other genome scans as prior information. This technique does not require that all studies have an identical marker map or a common estimated QTL effect. The updated linkage statistic can then be used for the estimation of QTL location and effect. We evaluate the performance of our method by using extensive simulations based on actual marker spacing and allele frequencies from available data. Results indicate that the empirical Bayes method can account for between-study heterogeneity, estimate the QTL location and effect more precisely, and provide narrower confidence intervals than results from any single individual study. We also compared the empirical Bayes method with a method originally developed for meta-analysis (a closely related but distinct purpose). In the face of marked heterogeneity among studies, the empirical Bayes method outperforms the comparator.
DEFF Research Database (Denmark)
Justesen, Kristian Kjær; Andreasen, Søren Juhl; Shaker, Hamid Reza
2013-01-01
In this work, a dynamic MATLAB Simulink model of a H3-350 Reformed Methanol Fuel Cell (RMFC) stand-alone battery charger produced by Serenergy is developed on the basis of theoretical and empirical methods. The advantage of RMFC systems is that they use liquid methanol as a fuel instead of gaseous...... of the reforming process are implemented. Models of the cooling flow of the blowers for the fuel cell and the burner which supplies process heat for the reformer are made. The two blowers have a common exhaust, which means that the two blowers influence each other’s output. The models take this into account using...... an empirical approach. Fin efficiency models for the cooling effect of the air are also developed using empirical methods. A fuel cell model is also implemented based on a standard model which is adapted to fit the measured performance of the H3-350 module. All the individual parts of the model are verified...
DEFF Research Database (Denmark)
Justesen, Kristian Kjær; Andreasen, Søren Juhl; Shaker, Hamid Reza
2014-01-01
In this work, a dynamic MATLAB Simulink model of a H3-350 Reformed Methanol Fuel Cell (RMFC) stand-alone battery charger produced by Serenergy is developed on the basis of theoretical and empirical methods. The advantage of RMFC systems is that they use liquid methanol as a fuel instead of gaseous...... of the reforming process are implemented. Models of the cooling flow of the blowers for the fuel cell and the burner which supplies process heat for the reformer are made. The two blowers have a common exhaust, which means that the two blowers influence each other’s output. The models take this into account using...... an empirical approach. Fin efficiency models for the cooling effect of the air are also developed using empirical methods. A fuel cell model is also implemented based on a standard model which is adapted to fit the measured performance of the H3-350 module. All the individual parts of the model are verified...
Statistical inference an integrated approach
Migon, Helio S; Louzada, Francisco
2014-01-01
Introduction Information The concept of probability Assessing subjective probabilities An example Linear algebra and probability Notation Outline of the book Elements of Inference Common statistical models Likelihood-based functions Bayes theorem Exchangeability Sufficiency and exponential family Parameter elimination Prior Distribution Entirely subjective specification Specification through functional forms Conjugacy with the exponential family Non-informative priors Hierarchical priors Estimation Introduction to decision theory Bayesian point estimation Classical point estimation Empirical Bayes estimation Comparison of estimators Interval estimation Estimation in the Normal model Approximating Methods The general problem of inference Optimization techniques Asymptotic theory Other analytical approximations Numerical integration methods Simulation methods Hypothesis Testing Introduction Classical hypothesis testing Bayesian hypothesis testing Hypothesis testing and confidence intervals Asymptotic tests Prediction...
Bailer-Jones, Coryn A. L.
2017-04-01
Preface; 1. Probability basics; 2. Estimation and uncertainty; 3. Statistical models and inference; 4. Linear models, least squares, and maximum likelihood; 5. Parameter estimation: single parameter; 6. Parameter estimation: multiple parameters; 7. Approximating distributions; 8. Monte Carlo methods for inference; 9. Parameter estimation: Markov chain Monte Carlo; 10. Frequentist hypothesis testing; 11. Model comparison; 12. Dealing with more complicated problems; References; Index.
Likelihood estimators for multivariate extremes
Huser, Raphaël; Davison, Anthony C.; Genton, Marc G.
2015-01-01
The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.
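For reference, the logistic model mentioned above is usually written as follows for a D-dimensional vector with unit Fréchet margins. The notation (G, z, alpha, D) is chosen here for illustration and may differ from the paper's.

```latex
G(z_1, \dots, z_D)
  = \exp\!\left\{ -\left( \sum_{j=1}^{D} z_j^{-1/\alpha} \right)^{\alpha} \right\},
  \qquad z_j > 0, \quad 0 < \alpha \le 1 .
```

Here alpha = 1 corresponds to independent components, while alpha tending to 0 gives complete dependence, so a single parameter controls the extremal dependence structure.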
Likelihood estimators for multivariate extremes
Huser, Raphaël
2015-11-17
The main approach to inference for multivariate extremes consists in approximating the joint upper tail of the observations by a parametric family arising in the limit for extreme events. The latter may be expressed in terms of componentwise maxima, high threshold exceedances or point processes, yielding different but related asymptotic characterizations and estimators. The present paper clarifies the connections between the main likelihood estimators, and assesses their practical performance. We investigate their ability to estimate the extremal dependence structure and to predict future extremes, using exact calculations and simulation, in the case of the logistic model.
Modeling gene expression measurement error: a quasi-likelihood approach
Directory of Open Access Journals (Sweden)
Strimmer Korbinian
2003-03-01
Background: Using suitable error models for gene expression measurements is essential in the statistical analysis of microarray data. However, the true probabilistic model underlying gene expression intensity readings is generally not known. Instead, in currently used approaches some simple parametric model is assumed (usually a transformed normal distribution) or the empirical distribution is estimated. However, both these strategies may not be optimal for gene expression data, as the non-parametric approach ignores known structural information whereas the fully parametric models run the risk of misspecification. A further related problem is the choice of a suitable scale for the model (e.g. observed vs. log-scale). Results: Here a simple semi-parametric model for gene expression measurement error is presented. In this approach inference is based on an approximate likelihood function (the extended quasi-likelihood). Only partial knowledge about the unknown true distribution is required to construct this function. In the case of gene expression this information is available in the form of the postulated (e.g. quadratic) variance structure of the data. As the quasi-likelihood behaves (almost) like a proper likelihood, it allows for the estimation of calibration and variance parameters, and it is also straightforward to obtain corresponding approximate confidence intervals. Unlike most other frameworks, it also allows analysis on any preferred scale, i.e. both on the original linear scale as well as on a transformed scale. It can also be employed in regression approaches to model systematic (e.g. array or dye) effects. Conclusions: The quasi-likelihood framework provides a simple and versatile approach to analyze gene expression data that does not make any strong distributional assumptions about the underlying error model. For several simulated as well as real data sets it provides a better fit to the data than competing models. In an example it also
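For orientation, the extended quasi-likelihood in its standard form (due to Nelder and Pregibon) for a single observation y with mean mu, dispersion phi, and variance function V is given below. The quadratic variance function shown, with coefficients c_0 and c_2, is a hypothetical example of a "quadratic variance structure" and is not necessarily the parameterization used in the paper.

```latex
Q^{+}(y;\mu) = -\tfrac{1}{2}\log\{2\pi\phi\,V(y)\} - \frac{D(y;\mu)}{2\phi},
\qquad
D(y;\mu) = 2\int_{\mu}^{y} \frac{y-u}{V(u)}\,du,
\qquad
V(\mu) = c_0 + c_2\,\mu^{2}.
```

Only the mean-variance relationship enters this expression, which is why the approach needs no full distributional assumption.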
Statistical inference for financial engineering
Taniguchi, Masanobu; Ogata, Hiroaki; Taniai, Hiroyuki
2014-01-01
This monograph provides the fundamentals of statistical inference for financial engineering and covers some selected methods suitable for analyzing financial time series data. In order to describe the actual financial data, various stochastic processes, e.g. non-Gaussian linear processes, non-linear processes, long-memory processes, locally stationary processes etc. are introduced and their optimal estimation is considered as well. This book also includes several statistical approaches, e.g., discriminant analysis, the empirical likelihood method, control variate method, quantile regression, realized volatility etc., which have been recently developed and are considered to be powerful tools for analyzing the financial data, establishing a new bridge between time series and financial engineering. This book is well suited as a professional reference book on finance, statistics and statistical financial engineering. Readers are expected to have an undergraduate-level knowledge of statistics.
Learning Convex Inference of Marginals
Domke, Justin
2012-01-01
Graphical models trained using maximum likelihood are a common tool for probabilistic inference of marginal distributions. However, this approach suffers difficulties when either the inference process or the model is approximate. In this paper, the inference process is first defined to be the minimization of a convex function, inspired by free energy approximations. Learning is then done directly in terms of the performance of the inference process at univariate marginal prediction. The main ...
International Nuclear Information System (INIS)
Wall, M.J.W.
1992-01-01
The notion of "probability" is generalized to that of "likelihood," and a natural logical structure is shown to exist for any physical theory which predicts likelihoods. Two physically based axioms are given for this logical structure to form an orthomodular poset, with an order-determining set of states. The results strengthen the basis of the quantum logic approach to axiomatic quantum theory. 25 refs.
On Bayesian Testing of Additive Conjoint Measurement Axioms Using Synthetic Likelihood.
Karabatsos, George
2018-06-01
This article introduces a Bayesian method for testing the axioms of additive conjoint measurement. The method is based on an importance sampling algorithm that performs likelihood-free, approximate Bayesian inference using a synthetic likelihood to overcome the analytical intractability of this testing problem. This new method improves upon previous methods because it provides an omnibus test of the entire hierarchy of cancellation axioms, beyond double cancellation. It does so while accounting for the posterior uncertainty that is inherent in the empirical orderings that are implied by these axioms, together. The new method is illustrated through a test of the cancellation axioms on a classic survey data set, and through the analysis of simulated data.
Maximum likelihood of phylogenetic networks.
Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir
2006-11-01
Horizontal gene transfer (HGT) is believed to be ubiquitous among bacteria, and plays a major role in their genome diversification as well as their ability to develop resistance to antibiotics. In light of its evolutionary significance and implications for human health, developing accurate and efficient methods for detecting and reconstructing HGT is imperative. In this article we provide a new HGT-oriented likelihood framework for many problems that involve phylogeny-based HGT detection and reconstruction. Besides the formulation of various likelihood criteria, we show that most of these problems are NP-hard, and offer heuristics for efficient and accurate reconstruction of HGT under these criteria. We implemented our heuristics and used them to analyze biological as well as synthetic data. In both cases, our criteria and heuristics exhibited very good performance with respect to identifying the correct number of HGT events as well as inferring their correct location on the species tree. Implementation of the criteria as well as heuristics and hardness proofs are available from the authors upon request. Hardness proofs can also be downloaded at http://www.cs.tau.ac.il/~tamirtul/MLNET/Supp-ML.pdf
A guideline for the validation of likelihood ratio methods used for forensic evidence evaluation
Meuwly, Didier; Ramos, Daniel; Haraksim, Rudolf
2017-01-01
This Guideline proposes a protocol for the validation of forensic evaluation methods at the source level, using the Likelihood Ratio framework as defined within the Bayes’ inference model. In the context of the inference of identity of source, the Likelihood Ratio is used to evaluate the strength of
Model selection and inference a practical information-theoretic approach
Burnham, Kenneth P
1998-01-01
This book is unique in that it covers the philosophy of model-based data analysis and an omnibus strategy for the analysis of empirical data. The book introduces information theoretic approaches and focuses critical attention on a priori modeling and the selection of a good approximating model that best represents the inference supported by the data. Kullback-Leibler information represents a fundamental quantity in science and is Hirotugu Akaike's basis for model selection. The maximized log-likelihood function can be bias-corrected to provide an estimate of expected, relative Kullback-Leibler information. This leads to Akaike's Information Criterion (AIC) and various extensions, and these are relatively simple and easy to use in practice, but little taught in statistics classes and far less understood in the applied sciences than should be the case. The information theoretic approaches provide a unified and rigorous theory, an extension of likelihood theory, an important application of information theory, and are ...
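As a rough illustration of the bias correction mentioned above, the sketch below computes AIC in its standard form, -2 times the maximized log-likelihood plus 2 times the number of estimated parameters. The normal model, the function names, and the data values are assumptions made only for this example and do not come from the book.

```python
import math


def aic(max_log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * (maximized log-likelihood) + 2 * (number of estimated parameters)."""
    return -2.0 * max_log_likelihood + 2.0 * n_params


def gaussian_max_log_likelihood(data):
    """Maximized log-likelihood of an i.i.d. normal model (hypothetical example model)."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # ML variance estimate divides by n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


# Made-up observations, for illustration only.
data = [2.1, 1.9, 2.4, 2.0, 2.6]
print(aic(gaussian_max_log_likelihood(data), n_params=2))  # mean and variance are fitted
```

Lower AIC values indicate a better trade-off between fit and complexity; only differences between models fitted to the same data are meaningful.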
Multimodel inference and adaptive management
Rehme, S.E.; Powell, L.A.; Allen, Craig R.
2011-01-01
Ecology is an inherently complex science coping with correlated variables, nonlinear interactions and multiple scales of pattern and process, making it difficult for experiments to result in clear, strong inference. Natural resource managers, policy makers, and stakeholders rely on science to provide timely and accurate management recommendations. However, the time necessary to untangle the complexities of interactions within ecosystems is often far greater than the time available to make management decisions. One method of coping with this problem is multimodel inference. Multimodel inference assesses uncertainty by calculating likelihoods among multiple competing hypotheses, but multimodel inference results are often equivocal. Despite this, there may be pressure for ecologists to provide management recommendations regardless of the strength of their study’s inference. We reviewed papers in the Journal of Wildlife Management (JWM) and the journal Conservation Biology (CB) to quantify the prevalence of multimodel inference approaches, the resulting inference (weak versus strong), and how authors dealt with the uncertainty. Thirty-eight percent and 14%, respectively, of articles in the JWM and CB used multimodel inference approaches. Strong inference was rarely observed, with only 7% of JWM and 20% of CB articles resulting in strong inference. We found the majority of weak inference papers in both journals (59%) gave specific management recommendations. Model selection uncertainty was ignored in most recommendations for management. We suggest that adaptive management is an ideal method to resolve uncertainty when research results in weak inference.
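One common way to express the relative support for competing hypotheses in multimodel inference is through Akaike weights. The abstract does not state which weighting scheme the reviewed papers used, so the sketch below is an assumption offered only to make the idea concrete; the function name is hypothetical.

```python
import math


def akaike_weights(aic_values):
    """Convert AIC values into weights that sum to one.

    Each weight is proportional to exp(-delta / 2), where delta is the
    difference between a model's AIC and the smallest AIC in the set.
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


# Illustrative AIC values for three candidate models (not real results).
print(akaike_weights([100.0, 102.0, 110.0]))
```

When no single model receives a clearly dominant weight, the inference is weak in the sense discussed in the abstract, and model selection uncertainty should be carried into any recommendation.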
A short proof that phylogenetic tree reconstruction by maximum likelihood is hard.
Roch, Sebastien
2006-01-01
Maximum likelihood is one of the most widely used techniques to infer evolutionary histories. Although it is thought to be intractable, a proof of its hardness has been lacking. Here, we give a short proof that computing the maximum likelihood tree is NP-hard by exploiting a connection between likelihood and parsimony observed by Tuffley and Steel.
A Short Proof that Phylogenetic Tree Reconstruction by Maximum Likelihood is Hard
Roch, S.
2005-01-01
Maximum likelihood is one of the most widely used techniques to infer evolutionary histories. Although it is thought to be intractable, a proof of its hardness has been lacking. Here, we give a short proof that computing the maximum likelihood tree is NP-hard by exploiting a connection between likelihood and parsimony observed by Tuffley and Steel.
Variations on Bayesian Prediction and Inference
2016-05-09
... inference. 2.2.1 Background: There are a number of statistical inference problems that are not generally formulated via a full probability model ... the problem of inference about an unknown parameter, the Bayesian approach requires a full probability model/likelihood which can be an obstacle
Earthquake likelihood model testing
Schorlemmer, D.; Gerstenberger, M.C.; Wiemer, S.; Jackson, D.D.; Rhoades, D.A.
2007-01-01
Introduction: The Regional Earthquake Likelihood Models (RELM) project aims to produce and evaluate alternate models of earthquake potential (probability per unit volume, magnitude, and time) for California. Based on differing assumptions, these models are produced to test the validity of their assumptions and to explore which models should be incorporated in seismic hazard and risk evaluation. Tests based on physical and geological criteria are useful but we focus on statistical methods using future earthquake catalog data only. We envision two evaluations: a test of consistency with observed data and a comparison of all pairs of models for relative consistency. Both tests are based on the likelihood method, and both are fully prospective (i.e., the models are not adjusted to fit the test data). To be tested, each model must assign a probability to any possible event within a specified region of space, time, and magnitude. For our tests the models must use a common format: earthquake rates in specified “bins” with location, magnitude, time, and focal mechanism limits. Seismology cannot yet deterministically predict individual earthquakes; however, it should seek the best possible models for forecasting earthquake occurrence. This paper describes the statistical rules of an experiment to examine and test earthquake forecasts. The primary purposes of the tests described below are to evaluate physical models for earthquakes, assure that source models used in seismic hazard and risk studies are consistent with earthquake data, and provide quantitative measures by which models can be assigned weights in a consensus model or be judged as suitable for particular regions. In this paper we develop a statistical method for testing earthquake likelihood models. A companion paper (Schorlemmer and Gerstenberger 2007, this issue) discusses the actual implementation of these tests in the framework of the RELM initiative. Statistical testing of hypotheses is a common task and a
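To make the binned-forecast idea concrete, the following sketch scores a forecast against observed counts. It assumes that the counts in different bins are independent Poisson variables with the forecast rates as their means; that distributional choice, the function name, and the numbers are assumptions for illustration and are not taken from the abstract.

```python
import math


def poisson_log_likelihood(forecast_rates, observed_counts):
    """Joint log-likelihood of observed bin counts under a binned rate forecast.

    Assumes independent Poisson counts per bin (an assumption of this sketch).
    """
    total = 0.0
    for rate, count in zip(forecast_rates, observed_counts):
        if rate <= 0.0:
            raise ValueError("each bin needs a positive forecast rate")
        # log P(count | rate) = -rate + count * log(rate) - log(count!)
        total += -rate + count * math.log(rate) - math.lgamma(count + 1)
    return total


# Two hypothetical forecasts for the same three bins, and made-up observed counts.
observed = [0, 2, 1]
model_a = [0.1, 1.5, 0.8]
model_b = [0.5, 0.5, 0.5]
log_ratio = poisson_log_likelihood(model_a, observed) - poisson_log_likelihood(model_b, observed)
print(log_ratio)  # positive values favour model_a on these data
```

A consistency test would compare the observed log-likelihood with its distribution under catalogs simulated from the model itself, while a pairwise comparison would use the log-likelihood ratio as above.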
Rokicki, Slawa; Cohen, Jessica; Fink, Günther; Salomon, Joshua A; Landrum, Mary Beth
2018-01-01
Difference-in-differences (DID) estimation has become increasingly popular as an approach to evaluate the effect of a group-level policy on individual-level outcomes. Several statistical methodologies have been proposed to correct for the within-group correlation of model errors resulting from the clustering of data. Little is known about how well these corrections perform with the often small number of groups observed in health research using longitudinal data. First, we review the most commonly used modeling solutions in DID estimation for panel data, including generalized estimating equations (GEE), permutation tests, clustered standard errors (CSE), wild cluster bootstrapping, and aggregation. Second, we compare the empirical coverage rates and power of these methods using a Monte Carlo simulation study in scenarios in which we vary the degree of error correlation, the group size balance, and the proportion of treated groups. Third, we provide an empirical example using the Survey of Health, Ageing, and Retirement in Europe. When the number of groups is small, CSE are systematically biased downwards in scenarios when data are unbalanced or when there is a low proportion of treated groups. This can result in over-rejection of the null even when data are composed of up to 50 groups. Aggregation, permutation tests, bias-adjusted GEE, and wild cluster bootstrap produce coverage rates close to the nominal rate for almost all scenarios, though GEE may suffer from low power. In DID estimation with a small number of groups, analysis using aggregation, permutation tests, wild cluster bootstrap, or bias-adjusted GEE is recommended.
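Two of the recommended approaches, aggregation and permutation tests, can be combined in a few lines. The sketch below assumes each group has already been aggregated to a single pre-to-post change in its mean outcome, and it enumerates every possible reassignment of the treatment labels. The function names and the example numbers are hypothetical.

```python
from itertools import combinations


def did_from_changes(changes, treated):
    """DID estimate: mean change in treated groups minus mean change in control groups."""
    treated = set(treated)
    t = [c for i, c in enumerate(changes) if i in treated]
    u = [c for i, c in enumerate(changes) if i not in treated]
    return sum(t) / len(t) - sum(u) / len(u)


def permutation_p_value(changes, treated):
    """Two-sided exact permutation p-value over all reassignments of treatment labels."""
    observed = abs(did_from_changes(changes, treated))
    stats = [
        abs(did_from_changes(changes, combo))
        for combo in combinations(range(len(changes)), len(treated))
    ]
    # Small tolerance guards against floating-point ties.
    return sum(s >= observed - 1e-12 for s in stats) / len(stats)


# Made-up group-level changes for six groups; groups 0 and 1 received the policy.
changes = [4.0, 5.0, 1.0, 0.0, 1.0, 2.0]
print(did_from_changes(changes, [0, 1]))
print(permutation_p_value(changes, [0, 1]))
```

With few groups the smallest attainable p-value is one over the number of possible label assignments, which limits power regardless of the effect size.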
Inferring relationships between pairs of individuals from locus heterozygosities
Directory of Open Access Journals (Sweden)
Spinetti Isabella
2002-11-01
Background: The traditional exact method for inferring relationships between individuals from genetic data is not easily applicable in all situations that may be encountered in several fields of applied genetics. This study describes an approach that gives affordable results and is easily applicable; it is based on the probabilities that two individuals share 0, 1 or both alleles at a locus identical by state. Results: We show that these probabilities (zi) depend on locus heterozygosity (H), and are scarcely affected by variation of the distribution of allele frequencies. This allows us to obtain empirical curves relating zi's to H for a series of common relationships, so that the likelihood ratio of a pair of relationships between any two individuals, given their genotypes at a locus, is a function of a single parameter, H. Application to large samples of mother-child and full-sib pairs shows that the statistical power of this method to infer the correct relationship is not much lower than that of the exact method. Analysis of a large database of STR data proves that locus heterozygosity does not vary significantly among Caucasian populations, apart from special cases, so that the likelihood ratio of the more common relationships between pairs of individuals may be obtained by looking at tabulated zi values. Conclusions: A simple method is provided, which may be used by any scientist with the help of a calculator or a spreadsheet to compute the likelihood ratios of common alternative relationships between pairs of individuals.
Dimension-Independent Likelihood-Informed MCMC
Cui, Tiangang; Law, Kody; Marzouk, Youssef
2015-01-01
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters, which in principle can be described as functions. By exploiting low-dimensional structure in the change from prior to posterior [distributions], we introduce a suite of MCMC samplers that can adapt to the complex structure of the posterior distribution, yet are well-defined on function space. Posterior sampling in nonlinear inverse problems arising from various partial differential equations and also a stochastic differential equation are used to demonstrate the efficiency of these dimension-independent likelihood-informed samplers.
Dimension-Independent Likelihood-Informed MCMC
Cui, Tiangang
2015-01-07
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters, which in principle can be described as functions. By exploiting low-dimensional structure in the change from prior to posterior [distributions], we introduce a suite of MCMC samplers that can adapt to the complex structure of the posterior distribution, yet are well-defined on function space. Posterior sampling in nonlinear inverse problems arising from various partial differential equations and also a stochastic differential equation are used to demonstrate the efficiency of these dimension-independent likelihood-informed samplers.
Mixed normal inference on multicointegration
Boswijk, H.P.
2009-01-01
Asymptotic likelihood analysis of cointegration in I(2) models, see Johansen (1997, 2006), Boswijk (2000) and Paruolo (2000), has shown that inference on most parameters is mixed normal, implying hypothesis test statistics with an asymptotic χ² null distribution. The asymptotic distribution of the
Robust Gaussian Process Regression with a Student-t Likelihood
Jylänki, P.P.; Vanhatalo, J.; Vehtari, A.
2011-01-01
This paper considers the robust and efficient implementation of Gaussian process regression with a Student-t observation model, which has a non-log-concave likelihood. The challenge with the Student-t model is the analytically intractable inference which is why several approximative methods have
LIKELIHOOD ESTIMATION OF PARAMETERS USING SIMULTANEOUSLY MONITORED PROCESSES
DEFF Research Database (Denmark)
Friis-Hansen, Peter; Ditlevsen, Ove Dalager
2004-01-01
The topic is maximum likelihood inference from several simultaneously monitored response processes of a structure to obtain knowledge about the parameters of other not monitored but important response processes when the structure is subject to some Gaussian load field in space and time. The considered example is a ship sailing with a given speed through a Gaussian wave field.
Tapered composite likelihood for spatial max-stable models
Sang, Huiyan
2014-05-01
Spatial extreme value analysis is useful to environmental studies, in which extreme value phenomena are of interest and meaningful spatial patterns can be discerned. Max-stable process models are able to describe such phenomena. This class of models is asymptotically justified to characterize the spatial dependence among extremes. However, likelihood inference is challenging for such models because their corresponding joint likelihood is unavailable and only bivariate or trivariate distributions are known. In this paper, we propose a tapered composite likelihood approach by utilizing lower dimensional marginal likelihoods for inference on parameters of various max-stable process models. We consider a weighting strategy based on a "taper range" to exclude distant pairs or triples. The "optimal taper range" is selected to maximize various measures of the Godambe information associated with the tapered composite likelihood function. This method substantially reduces the computational cost and improves the efficiency over equally weighted composite likelihood estimators. We illustrate its utility with simulation experiments and an analysis of rainfall data in Switzerland.
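The weighting strategy described above can be sketched for the pairwise case as follows. The code uses a 0/1 taper weight that drops pairs of sites farther apart than the taper range. The bivariate log-density is passed in as a user-supplied function, because the max-stable pair densities themselves are not reproduced here; all names are hypothetical.

```python
import math
from itertools import combinations


def tapered_pairwise_loglik(coords, values, pair_loglik, taper_range):
    """Sum pairwise log-likelihood terms over site pairs within the taper range.

    coords      : list of site coordinates, e.g. (x, y) tuples
    values      : observation at each site
    pair_loglik : callable (value_i, value_j, distance) -> bivariate log-density;
                  a placeholder for a model-specific max-stable pair density
    taper_range : pairs farther apart than this receive weight zero
    """
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        distance = math.dist(coords[i], coords[j])
        if distance <= taper_range:  # 0/1 taper weight
            total += pair_loglik(values[i], values[j], distance)
    return total


# Toy check with a dummy pair term that simply counts the retained pairs.
sites = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
obs = [1.0, 2.0, 3.0]
print(tapered_pairwise_loglik(sites, obs, lambda a, b, d: 1.0, taper_range=2.0))
```

In the approach described by the abstract, the taper range would be chosen to maximize a measure of the Godambe information; that selection step is not shown here.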
Tapered composite likelihood for spatial max-stable models
Sang, Huiyan; Genton, Marc G.
2014-01-01
Spatial extreme value analysis is useful to environmental studies, in which extreme value phenomena are of interest and meaningful spatial patterns can be discerned. Max-stable process models are able to describe such phenomena. This class of models is asymptotically justified to characterize the spatial dependence among extremes. However, likelihood inference is challenging for such models because their corresponding joint likelihood is unavailable and only bivariate or trivariate distributions are known. In this paper, we propose a tapered composite likelihood approach by utilizing lower dimensional marginal likelihoods for inference on parameters of various max-stable process models. We consider a weighting strategy based on a "taper range" to exclude distant pairs or triples. The "optimal taper range" is selected to maximize various measures of the Godambe information associated with the tapered composite likelihood function. This method substantially reduces the computational cost and improves the efficiency over equally weighted composite likelihood estimators. We illustrate its utility with simulation experiments and an analysis of rainfall data in Switzerland.
Inference of directional selection and mutation parameters assuming equilibrium.
Vogl, Claus; Bergman, Juraj
2015-12-01
In a classical study, Wright (1931) proposed a model for the evolution of a biallelic locus under the influence of mutation, directional selection and drift. He derived the equilibrium distribution of the allelic proportion conditional on the scaled mutation rate, the mutation bias and the scaled strength of directional selection. The equilibrium distribution can be used for inference of these parameters with genome-wide datasets of "site frequency spectra" (SFS). Assuming that the scaled mutation rate is low, Wright's model can be approximated by a boundary-mutation model, where mutations are introduced into the population exclusively from sites fixed for the preferred or unpreferred allelic states. With the boundary-mutation model, inference can be partitioned: (i) the shape of the SFS distribution within the polymorphic region is determined by random drift and directional selection, but not by the mutation parameters, such that inference of the selection parameter relies exclusively on the polymorphic sites in the SFS; (ii) the mutation parameters can be inferred from the amount of polymorphic and monomorphic preferred and unpreferred alleles, conditional on the selection parameter. Herein, we derive maximum likelihood estimators for the mutation and selection parameters in equilibrium and apply the method to simulated SFS data as well as empirical data from a Madagascar population of Drosophila simulans. Copyright © 2015 Elsevier Inc. All rights reserved.
Likelihood devices in spatial statistics
Zwet, E.W. van
1999-01-01
One of the main themes of this thesis is the application to spatial data of modern semi- and nonparametric methods. Another, closely related theme is maximum likelihood estimation from spatial data. Maximum likelihood estimation is not common practice in spatial statistics. The method of moments
Empirical training for conditional random fields
Zhu, Zhemin; Hiemstra, Djoerd; Apers, Peter M.G.; Wombacher, Andreas
2013-01-01
In this paper (Zhu et al., 2013), we present a practically scalable training method for CRFs called Empirical Training (EP). We show that the standard training with unregularized log likelihood can have many maximum likelihood estimations (MLEs). Empirical training has a unique closed form MLE
Statistical inference: an integrated Bayesian/likelihood approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing. After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops (the Maximum Entropy and the Bayesian methods) into a single general inference scheme.
Kroese, A.H.; van der Meulen, E.A.; Poortema, Klaas; Schaafsma, W.
1995-01-01
The making of statistical inferences in distributional form is conceptually complicated because the epistemic 'probabilities' assigned are mixtures of fact and fiction. In this respect they are essentially different from 'physical' or 'frequency-theoretic' probabilities. The distributional form is
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behind entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME) includes as special cases both MaxEn...
Aggelopoulos, Nikolaos C
2015-08-01
Perceptual inference refers to the ability to infer sensory stimuli from predictions that result from internal neural representations built through prior experience. Methods of Bayesian statistical inference and decision theory model cognition adequately by using error sensing either in guiding action or in "generative" models that predict the sensory information. In this framework, perception can be seen as a process qualitatively distinct from sensation, a process of information evaluation using previously acquired and stored representations (memories) that is guided by sensory feedback. The stored representations can be utilised as internal models of sensory stimuli enabling long term associations, for example in operant conditioning. Evidence for perceptual inference is contributed by such phenomena as the cortical co-localisation of object perception with object memory, the response invariance in the responses of some neurons to variations in the stimulus, as well as from situations in which perception can be dissociated from sensation. In the context of perceptual inference, sensory areas of the cerebral cortex that have been facilitated by a priming signal may be regarded as comparators in a closed feedback loop, similar to the better known motor reflexes in the sensorimotor system. The adult cerebral cortex can be regarded as similar to a servomechanism, in using sensory feedback to correct internal models, producing predictions of the outside world on the basis of past experience. Copyright © 2015 Elsevier Ltd. All rights reserved.
Empirical Information Metrics for Prediction Power and Experiment Planning
Directory of Open Access Journals (Sweden)
Christopher Lee
2011-01-01
In principle, information theory could provide useful metrics for statistical inference. In practice this is impeded by divergent assumptions: Information theory assumes the joint distribution of variables of interest is known, whereas in statistical inference it is hidden and is the goal of inference. To integrate these approaches we note a common theme they share, namely the measurement of prediction power. We generalize this concept as an information metric, subject to several requirements: Calculation of the metric must be objective or model-free; unbiased; convergent; probabilistically bounded; and low in computational complexity. Unfortunately, widely used model selection metrics such as Maximum Likelihood, the Akaike Information Criterion and Bayesian Information Criterion do not necessarily meet all these requirements. We define four distinct empirical information metrics measured via sampling, with explicit Law of Large Numbers convergence guarantees, which meet these requirements: Ie, the empirical information, a measure of average prediction power; Ib, the overfitting bias information, which measures selection bias in the modeling procedure; Ip, the potential information, which measures the total remaining information in the observations not yet discovered by the model; and Im, the model information, which measures the model’s extrapolation prediction power. Finally, we show that Ip + Ie, Ip + Im, and Ie − Im are fixed constants for a given observed dataset (i.e. prediction target), independent of the model, and thus represent a fundamental subdivision of the total information contained in the observations. We discuss the application of these metrics to modeling and experiment planning.
Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications
International Nuclear Information System (INIS)
Mungan, Muhittin; Ramasco, José J
2010-01-01
Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have been recently proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes and an expectation maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search and identification of stabilizing nodes thus constitutes a novel technique to characterize the relevance of nodes in complex networks.
Obtaining reliable Likelihood Ratio tests from simulated likelihood functions
DEFF Research Database (Denmark)
Andersen, Laura Mørch
It is standard practice by researchers and the default option in many statistical programs to base test statistics for mixed models on simulations using asymmetric draws (e.g. Halton draws). This paper shows that when the estimated likelihood functions depend on standard deviations of mixed param...
Multiple Improvements of Multiple Imputation Likelihood Ratio Tests
Chan, Kin Wai; Meng, Xiao-Li
2017-01-01
Multiple imputation (MI) inference handles missing data by first properly imputing the missing values $m$ times, and then combining the $m$ analysis results from applying a complete-data procedure to each of the completed datasets. However, the existing method for combining likelihood ratio tests has multiple defects: (i) the combined test statistic can be negative in practice when the reference null distribution is a standard $F$ distribution; (ii) it is not invariant to re-parametrization; ...
Statistical modelling of survival data with random effects h-likelihood approach
Ha, Il Do; Lee, Youngjo
2017-01-01
This book provides a groundbreaking introduction to the likelihood inference for correlated survival data via the hierarchical (or h-) likelihood in order to obtain the (marginal) likelihood and to address the computational difficulties in inferences and extensions. The approach presented in the book overcomes shortcomings in the traditional likelihood-based methods for clustered survival data such as intractable integration. The text includes technical materials such as derivations and proofs in each chapter, as well as recently developed software programs in R (“frailtyHL”), while the real-world data examples together with an R package, “frailtyHL” in CRAN, provide readers with useful hands-on tools. Reviewing new developments since the introduction of the h-likelihood to survival analysis (methods for interval estimation of the individual frailty and for variable selection of the fixed effects in the general class of frailty models) and guiding future directions, the book is of interest to research...
Dissociating response conflict and error likelihood in anterior cingulate cortex.
Yeung, Nick; Nieuwenhuis, Sander
2009-11-18
Neuroimaging studies consistently report activity in anterior cingulate cortex (ACC) in conditions of high cognitive demand, leading to the view that ACC plays a crucial role in the control of cognitive processes. According to one prominent theory, the sensitivity of ACC to task difficulty reflects its role in monitoring for the occurrence of competition, or "conflict," between responses to signal the need for increased cognitive control. However, a contrasting theory proposes that ACC is the recipient rather than source of monitoring signals, and that ACC activity observed in relation to task demand reflects the role of this region in learning about the likelihood of errors. Response conflict and error likelihood are typically confounded, making the theories difficult to distinguish empirically. The present research therefore used detailed computational simulations to derive contrasting predictions regarding ACC activity and error rate as a function of response speed. The simulations demonstrated a clear dissociation between conflict and error likelihood: fast response trials are associated with low conflict but high error likelihood, whereas slow response trials show the opposite pattern. Using the N2 component as an index of ACC activity, an EEG study demonstrated that when conflict and error likelihood are dissociated in this way, ACC activity tracks conflict and is negatively correlated with error likelihood. These findings support the conflict-monitoring theory and suggest that, in speeded decision tasks, ACC activity reflects current task demands rather than the retrospective coding of past performance.
Forward and backward inference in spatial cognition.
Directory of Open Access Journals (Sweden)
Will D Penny
This paper shows that the various computations underlying spatial cognition can be implemented using statistical inference in a single probabilistic model. Inference is implemented using a common set of 'lower-level' computations involving forward and backward inference over time. For example, to estimate where you are in a known environment, forward inference is used to optimally combine location estimates from path integration with those from sensory input. To decide which way to turn to reach a goal, forward inference is used to compute the likelihood of reaching that goal under each option. To work out which environment you are in, forward inference is used to compute the likelihood of sensory observations under the different hypotheses. For reaching sensory goals that require a chaining together of decisions, forward inference can be used to compute a state trajectory that will lead to that goal, and backward inference to refine the route and estimate control signals that produce the required trajectory. We propose that these computations are reflected in recent findings of pattern replay in the mammalian brain. Specifically, we propose that theta sequences reflect decision making, theta flickering reflects model selection, and remote replay reflects route and motor planning. We also propose a mapping of the above computational processes onto lateral and medial entorhinal cortex and hippocampus.
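The forward inference step described in this abstract can be illustrated with a discrete-state filter. The following is a minimal sketch, not the paper's model: the three-location track, the `transition` matrix (standing in for path integration) and the `sensory` likelihoods are all hypothetical values chosen for illustration, and the convention that one transition precedes each observation is an assumption.

```python
import math


def forward_filter(prior, transition, likelihoods):
    """Return filtered beliefs p(state_t | obs_1..t) and the log evidence.

    prior:       list of p(state_0)
    transition:  transition[i][j] = p(state_t = j | state_{t-1} = i)
    likelihoods: one list per time step, likelihoods[t][j] = p(obs_t | state_t = j)
    """
    n = len(prior)
    belief = list(prior)
    log_evidence = 0.0
    history = []
    for lik in likelihoods:
        # Predict: push the previous belief through the motion model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight the prediction by the sensory likelihood.
        unnormalised = [predicted[j] * lik[j] for j in range(n)]
        z = sum(unnormalised)  # p(obs_t | obs_1..t-1)
        log_evidence += math.log(z)
        belief = [u / z for u in unnormalised]
        history.append(belief)
    return history, log_evidence


# Hypothetical example: an agent on a 3-cell track that tends to move right.
transition = [[0.2, 0.8, 0.0],
              [0.0, 0.2, 0.8],
              [0.0, 0.0, 1.0]]
sensory = [[0.1, 0.8, 0.1],   # first reading favours cell 1
           [0.1, 0.1, 0.8]]   # second reading favours cell 2
beliefs, log_evidence = forward_filter([1.0, 0.0, 0.0], transition, sensory)
```

The accumulated `log_evidence` is the log-likelihood of the observations under one model, which is the quantity one would compare across candidate environments when deciding which environment the observations came from.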
A maximum likelihood framework for protein design
Directory of Open Access Journals (Sweden)
Philippe Hervé
2006-06-01
Background: The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility. Results: We propose a formulation of the protein design problem in terms of model-based statistical inference. Our framework uses the maximum likelihood principle to optimize the unknown parameters of a statistical potential, which we call an inverse potential to contrast with classical potentials used for structure prediction. We propose an implementation based on Markov chain Monte Carlo, in which the likelihood is maximized by gradient descent and is numerically estimated by thermodynamic integration. The fit of the models is evaluated by cross-validation. We apply this to a simple pairwise contact potential, supplemented with a solvent-accessibility term, and show that the resulting models have a better predictive power than currently available pairwise potentials. Furthermore, the model comparison method presented here allows one to measure the relative contribution of each component of the potential, and to choose the optimal number of accessibility classes, which turns out to be much higher than classically considered. Conclusion: Altogether, this reformulation makes it possible to test a wide diversity of models, using different forms of potentials, or accounting for other factors than just the constraint of thermodynamic stability. Ultimately, such model-based statistical analyses may help to understand the forces
Gaussian copula as a likelihood function for environmental models
Wani, O.; Espadas, G.; Cecinati, F.; Rieckermann, J.
2017-12-01
Parameter estimation of environmental models always comes with uncertainty. To formally quantify this parametric uncertainty, a likelihood function needs to be formulated, which is defined as the probability of observations given fixed values of the parameter set. A likelihood function allows us to infer parameter values from observations using Bayes' theorem. The challenge is to formulate a likelihood function that reliably describes the error generating processes which lead to the observed monitoring data, such as rainfall and runoff. If the likelihood function is not representative of the error statistics, the parameter inference will give biased parameter values. Several uncertainty estimation methods that are currently being used employ Gaussian processes as a likelihood function, because of their favourable analytical properties. Box-Cox transformation is suggested to deal with non-symmetric and heteroscedastic errors, e.g. for flow data, which are typically more uncertain in high flows than in periods with low flows. The problem with transformations is that the results are conditional on hyper-parameters, for which it is difficult to formulate the analyst's belief a priori. In an attempt to address this problem, in this research work we suggest learning the nature of the error distribution from the errors made by the model in the "past" forecasts. We use a Gaussian copula to generate semiparametric error distributions. 1) We show that this copula can then be used as a likelihood function to infer parameters, breaking away from the practice of using multivariate normal distributions. 2) Based on the results from a didactic example of predicting rainfall runoff, we demonstrate that the copula captures the predictive uncertainty of the model. 3) Finally, we find that the properties of autocorrelation and heteroscedasticity of errors are captured well by the copula, eliminating the need to use transforms. In summary, our findings suggest that copulas are an
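To make the idea of a copula-based likelihood concrete, here is a minimal sketch of the log-density of a bivariate Gaussian copula, using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the function names, the pairing of two errors, the correlation `rho`, and the rank-based `empirical_cdf` helper for turning past errors into uniform scores are all hypothetical choices.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def empirical_cdf(past_errors, value):
    """Rank-based CDF estimate in (0, 1), learned from past model errors (assumed helper)."""
    rank = sum(1 for e in past_errors if e <= value)
    return (rank + 0.5) / (len(past_errors) + 1.0)


def gaussian_copula_logpdf(u1, u2, rho):
    """Log-density of the bivariate Gaussian copula at (u1, u2), with |rho| < 1."""
    z1 = _STD_NORMAL.inv_cdf(u1)
    z2 = _STD_NORMAL.inv_cdf(u2)
    one_minus_r2 = 1.0 - rho * rho
    # Ratio of the bivariate normal density to the product of its marginals.
    quad = rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2
    return -0.5 * math.log(one_minus_r2) - quad / (2.0 * one_minus_r2)


# Hypothetical usage: dependence between two consecutive forecast errors.
past = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.5, 2.2]
u_a = empirical_cdf(past, 0.8)
u_b = empirical_cdf(past, 1.0)
log_c = gaussian_copula_logpdf(u_a, u_b, rho=0.6)
```

A full log-likelihood would add the log marginal densities of the errors to this copula term; with `rho = 0` the copula term vanishes and the errors are treated as independent.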
Ego involvement increases doping likelihood.
Ring, Christopher; Kavussanu, Maria
2018-08-01
Achievement goal theory provides a framework to help understand how individuals behave in achievement contexts, such as sport. Evidence concerning the role of motivation in the decision to use banned performance enhancing substances (i.e., doping) is equivocal on this issue. The extant literature shows that dispositional goal orientation has been weakly and inconsistently associated with doping intention and use. It is possible that goal involvement, which describes the situational motivational state, is a stronger determinant of doping intention. Accordingly, the current study used an experimental design to examine the effects of goal involvement, manipulated using direct instructions and reflective writing, on doping likelihood in hypothetical situations in college athletes. The ego-involving goal increased doping likelihood compared to no goal and a task-involving goal. The present findings provide the first evidence that ego involvement can sway the decision to use doping to improve athletic performance.
Rohatgi, Vijay K
2003-01-01
Unified treatment of probability and statistics examines and analyzes the relationship between the two fields, exploring inferential issues. Numerous problems, examples, and diagrams--some with solutions--plus clear-cut, highlighted summaries of results. Advanced undergraduate to graduate level. Contents: 1. Introduction. 2. Probability Model. 3. Probability Distributions. 4. Introduction to Statistical Inference. 5. More on Mathematical Expectation. 6. Some Discrete Models. 7. Some Continuous Models. 8. Functions of Random Variables and Random Vectors. 9. Large-Sample Theory. 10. General Meth
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
...... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and "weight functions" that describe the phase space sensitivity of the measurements are incorporated into Bayesian likelihood probabilities, while prior probabilities enforce physical constraints. As an initial step, this poster uses Bayesian statistics to infer the DIII-D electron density profile from multiple diagnostic measurements. Likelihood functions.......
Dimension-independent likelihood-informed MCMC
Cui, Tiangang; Law, Kody; Marzouk, Youssef M.
2015-01-01
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Statistical Inference for a Class of Multivariate Negative Binomial Distributions
DEFF Research Database (Denmark)
Rubak, Ege H.; Møller, Jesper; McCullagh, Peter
This paper considers statistical inference procedures for a class of models for positively correlated count variables called α-permanental random fields, and which can be viewed as a family of multivariate negative binomial distributions. Their appealing probabilistic properties have earlier been...... studied in the literature, while this is the first statistical paper on α-permanental random fields. The focus is on maximum likelihood estimation, maximum quasi-likelihood estimation and on maximum composite likelihood estimation based on uni- and bivariate distributions. Furthermore, new results...
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyzes and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Practical Statistics for LHC Physicists: Descriptive Statistics, Probability and Likelihood (1/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Optimal inference with suboptimal models: Addiction and active Bayesian inference
Schwartenbeck, Philipp; FitzGerald, Thomas H.B.; Mathys, Christoph; Dolan, Ray; Wurst, Friedrich; Kronbichler, Martin; Friston, Karl
2015-01-01
When casting behaviour as active (Bayesian) inference, optimal inference is defined with respect to an agent’s beliefs – based on its generative model of the world. This contrasts with normative accounts of choice behaviour, in which optimal actions are considered in relation to the true structure of the environment – as opposed to the agent’s beliefs about worldly states (or the task). This distinction shifts an understanding of suboptimal or pathological behaviour away from aberrant inference as such, to understanding the prior beliefs of a subject that cause them to behave less ‘optimally’ than our prior beliefs suggest they should behave. Put simply, suboptimal or pathological behaviour does not speak against understanding behaviour in terms of (Bayes optimal) inference, but rather calls for a more refined understanding of the subject’s generative model upon which their (optimal) Bayesian inference is based. Here, we discuss this fundamental distinction and its implications for understanding optimality, bounded rationality and pathological (choice) behaviour. We illustrate our argument using addictive choice behaviour in a recently described ‘limited offer’ task. Our simulations of pathological choices and addictive behaviour also generate some clear hypotheses, which we hope to pursue in ongoing empirical work. PMID:25561321
Empirical direction in design and analysis
Anderson, Norman H
2001-01-01
The goal of Norman H. Anderson's new book is to help students develop skills of scientific inference. To accomplish this he organized the book around the "Experimental Pyramid": six levels that represent a hierarchy of considerations in empirical investigation (conceptual framework, phenomena, behavior, measurement, design, and statistical inference). To facilitate conceptual and empirical understanding, Anderson de-emphasizes computational formulas and null hypothesis testing. Other features include: emphasis on visual inspection as a basic skill in experimental analysis to help student
Estimation and inference in the same-different test
DEFF Research Database (Denmark)
Christensen, Rune Haubo Bojesen; Brockhoff, Per B.
2009-01-01
Inference for the Thurstonian delta in the same-different protocol via the well known Wald statistic is shown to be inappropriate in a wide range of situations. We introduce the likelihood root statistic as an alternative to the Wald statistic to produce CIs and p-values for assessing difference as well as similarity. We show that the likelihood root statistic is equivalent to the well known $G^2$ likelihood ratio statistic for tests of no difference. As an additional practical tool, we introduce the profile likelihood curve to provide a convenient graphical summary of the information in the data......
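For orientation, the two statistics contrasted in this abstract have the following standard textbook forms. The notation is an assumption for illustration and is not taken from the paper: $\delta$ is the parameter of interest, $\tau$ a nuisance parameter, $\ell$ the log-likelihood, and $\hat\delta$ the maximum likelihood estimate.

```latex
% Profile log-likelihood: maximise over the nuisance parameter for each fixed delta
\ell_p(\delta) = \max_{\tau}\, \ell(\delta, \tau)

% Wald statistic for H_0: \delta = \delta_0
w(\delta_0) = \frac{\hat\delta - \delta_0}{\operatorname{se}(\hat\delta)}

% Likelihood root statistic for H_0: \delta = \delta_0
r(\delta_0) = \operatorname{sign}(\hat\delta - \delta_0)
              \sqrt{2\,\bigl\{\ell_p(\hat\delta) - \ell_p(\delta_0)\bigr\}}

% Its square is the likelihood ratio statistic
r(\delta_0)^2 = 2\,\bigl\{\ell_p(\hat\delta) - \ell_p(\delta_0)\bigr\}

% Approximate 100(1-\alpha)% confidence interval from the profile likelihood
\bigl\{\delta : |r(\delta)| \le z_{1-\alpha/2}\bigr\}
```

Because $r(\delta_0)^2$ is the likelihood ratio statistic, the stated equivalence with $G^2$ for tests of no difference ($\delta_0 = 0$) follows from this general relationship. The confidence interval is read off the profile likelihood curve and, unlike a Wald interval, need not be symmetric about $\hat\delta$.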
DEFF Research Database (Denmark)
Khair, Tabish
2017-01-01
Review of 'Inglorious Empire: What the British did to India' by Shashi Tharoor, London, Hurst Publishers, 2017, 296 pp., £20.00
Corporate governance effect on financial distress likelihood: Evidence from Spain
Directory of Open Access Journals (Sweden)
Montserrat Manzaneque
2016-01-01
The paper explores some mechanisms of corporate governance (ownership and board characteristics) in Spanish listed companies and their impact on the likelihood of financial distress. An empirical study was conducted between 2007 and 2012 using a matched-pairs research design with 308 observations, half of them classified as distressed and half as non-distressed. Based on the previous study by Pindado, Rodrigues, and De la Torre (2008), a broader concept of bankruptcy is used to define business failure. Employing several conditional logistic models, as in other previous studies on bankruptcy, the results confirm that in difficult situations prior to bankruptcy, the impact of board ownership and proportion of independent directors on business failure likelihood is similar to that exerted in more extreme situations. The results go one step further and show a negative relationship between board size and the likelihood of financial distress. This result is interpreted as a way of creating diversity and improving access to information and resources, especially in contexts where the ownership is highly concentrated and large shareholders have great power to influence the board structure. However, the results confirm that ownership concentration does not have a significant impact on financial distress likelihood in the Spanish context. It is argued that large shareholders are passive as regards an enhanced monitoring of management and, alternatively, they do not have enough incentives to hold back the financial distress. These findings have important implications in the Spanish context, where several changes in the regulatory listing requirements have been carried out with respect to corporate governance, and where there is no empirical evidence in this respect.
Haraksim, Rudolf
2014-01-01
In this chapter the Likelihood Ratio (LR) inference model will be introduced, the theoretical aspects of probabilities will be discussed and the validation framework for LR methods used for forensic evidence evaluation will be presented. Prior to introducing the validation framework, following
Comparisons of likelihood and machine learning methods of individual classification
Guinand, B.; Topchy, A.; Page, K.S.; Burnham-Curtis, M. K.; Punch, W.F.; Scribner, K.T.
2002-01-01
Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin (“assignment tests”). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high FST), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0–2.8% lower error rates). The relative performance of each machine learning classifier improved relative to likelihood estimators for empirical data sets, suggesting an ability to “learn” and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks. In recent years, characterization of highly polymorphic molecular markers such as mini- and microsatellites and development of novel methods of analysis have enabled researchers to extend investigations of ecological and evolutionary processes below the population level to the level of
The Laplace Likelihood Ratio Test for Heteroscedasticity
Directory of Open Access Journals (Sweden)
J. Martin van Zyl
2011-01-01
It is shown that the likelihood ratio test for heteroscedasticity, assuming the Laplace distribution, gives good results for Gaussian and fat-tailed data. The likelihood ratio test, assuming normality, is very sensitive to any deviation from normality, especially when the observations are from a distribution with fat tails. Such a likelihood test can also be used as a robust test for a constant variance in residuals or a time series if the data is partitioned into groups.
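A likelihood ratio test of this kind can be sketched as follows for data partitioned into groups. This is an illustration under stated assumptions rather than the paper's exact procedure: each group is assumed to have its own Laplace location (estimated by the group median), the null hypothesis is a common scale across groups, and the function name `laplace_lr_heteroscedasticity` is hypothetical.

```python
import math
from statistics import median


def laplace_lr_heteroscedasticity(groups):
    """Likelihood ratio statistic for equal Laplace scale across groups.

    Returns (statistic, degrees_of_freedom). Under the null hypothesis the
    statistic is compared with a chi-square distribution on k - 1 degrees
    of freedom (an asymptotic approximation). Every group needs a nonzero
    mean absolute deviation.
    """
    n_total = 0
    abs_dev_total = 0.0
    sum_n_log_b = 0.0
    for g in groups:
        m = median(g)                      # Laplace MLE of location
        dev = sum(abs(x - m) for x in g)
        b = dev / len(g)                   # Laplace MLE of scale
        n_total += len(g)
        abs_dev_total += dev
        sum_n_log_b += len(g) * math.log(b)
    b_pooled = abs_dev_total / n_total     # common scale under the null
    statistic = 2.0 * (n_total * math.log(b_pooled) - sum_n_log_b)
    return statistic, len(groups) - 1


# Hypothetical residuals split into two groups with visibly different spread.
stat, df = laplace_lr_heteroscedasticity([[-2, -1, 0, 1, 2],
                                          [-20, -10, 0, 10, 20]])
```

The statistic follows from the maximised Laplace log-likelihood, $-n\log(2\hat b) - n$ per group, so only the scale estimates enter the comparison.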
MXLKID: a maximum likelihood parameter identifier
International Nuclear Information System (INIS)
Gavel, D.T.
1980-07-01
MXLKID (MaXimum LiKelihood IDentifier) is a computer program designed to identify unknown parameters in a nonlinear dynamic system. Using noisy measurement data from the system, the maximum likelihood identifier computes a likelihood function (LF). Identification of system parameters is accomplished by maximizing the LF with respect to the parameters. The main body of this report briefly summarizes the maximum likelihood technique and gives instructions and examples for running the MXLKID program. MXLKID is implemented in LRLTRAN on the CDC7600 computer at LLNL. A detailed mathematical description of the algorithm is given in the appendices. 24 figures, 6 tables.
Deep Learning for Population Genetic Inference.
Sheehan, Sara; Song, Yun S
2016-03-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Estimating rate of occurrence of rare events with empirical bayes: A railway application
International Nuclear Information System (INIS)
Quigley, John; Bedford, Tim; Walls, Lesley
2007-01-01
Classical approaches to estimating the rate of occurrence of events perform poorly when data are few. Maximum likelihood estimators result in overly optimistic point estimates of zero for situations where there have been no events. Alternative empirical-based approaches have been proposed based on median estimators or non-informative prior distributions. While these alternatives offer an improvement over point estimates of zero, they can be overly conservative. Empirical Bayes procedures offer an unbiased approach through pooling data across different hazards to support stronger statistical inference. This paper considers the application of Empirical Bayes to high consequence low-frequency events, where estimates are required for risk mitigation decision support such as "as low as reasonably possible". A summary of empirical Bayes methods is given and the choices of estimation procedures to obtain interval estimates are discussed. The approaches illustrated within the case study are based on the estimation of the rate of occurrence of train derailments within the UK. The usefulness of empirical Bayes within this context is discussed.
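The pooling idea can be illustrated with a gamma-Poisson model, in which a hazard with no recorded events still receives a positive rate estimate. The sketch below is an assumption-laden illustration, not the procedure used in the paper: the gamma prior is fitted by simple moment matching, the function name `empirical_bayes_rates` is hypothetical, and the counts and exposures are made-up numbers.

```python
def empirical_bayes_rates(counts, exposures):
    """Shrink per-hazard event rates towards the pooled rate.

    counts[i]    : number of events observed for hazard i
    exposures[i] : exposure (e.g. years or train-km) for hazard i
    Assumes counts[i] ~ Poisson(rate_i * exposures[i]) and
    rate_i ~ Gamma(alpha, beta), with the prior fitted by moment matching.
    """
    k = len(counts)
    pooled = sum(counts) / sum(exposures)
    raw = [x / t for x, t in zip(counts, exposures)]
    # Variance of raw rates = prior variance + Poisson sampling noise.
    sample_var = sum((r - pooled) ** 2 for r in raw) / (k - 1)
    poisson_noise = pooled * sum(1.0 / t for t in exposures) / k
    prior_var = sample_var - poisson_noise
    if prior_var <= 0.0:
        # No excess variation between hazards: shrink fully to the pooled rate.
        return [pooled] * k
    alpha = pooled ** 2 / prior_var   # gamma shape
    beta = pooled / prior_var         # gamma rate
    # Posterior mean of a Gamma(alpha + x, beta + t) distribution.
    return [(alpha + x) / (beta + t) for x, t in zip(counts, exposures)]


# Hypothetical data: four hazards, each observed for 10 units of exposure.
estimates = empirical_bayes_rates([0, 3, 1, 12], [10.0, 10.0, 10.0, 10.0])
```

In this example the maximum likelihood estimate for the first hazard is exactly zero, while the shrunken estimate is small but positive, and the estimate for the last hazard is pulled below its raw rate of 1.2.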
Directory of Open Access Journals (Sweden)
Daniel L. Rabosky
2006-01-01
Rates of species origination and extinction can vary over time during evolutionary radiations, and it is possible to reconstruct the history of diversification using molecular phylogenies of extant taxa only. Maximum likelihood methods provide a useful framework for inferring temporal variation in diversification rates. LASER is a package for the R programming environment that implements maximum likelihood methods based on the birth-death process to test whether diversification rates have changed over time. LASER contrasts the likelihood of phylogenetic data under models where diversification rates have changed over time to alternative models where rates have remained constant over time. Major strengths of the package include the ability to detect temporal increases in diversification rates and the inference of diversification parameters under multiple rate-variable models of diversification. The program and associated documentation are freely available from the R package archive at http://cran.r-project.org.
Inferring network structure from cascades
Ghonge, Sushrut; Vural, Dervis Can
2017-07-01
Many physical, biological, and social phenomena can be described by cascades taking place on a network. Often, the activity can be empirically observed, but not the underlying network of interactions. In this paper we offer three topological methods to infer the structure of any directed network given a set of cascade arrival times. Our formulas hold for a very general class of models where the activation probability of a node is a generic function of its degree and the number of its active neighbors. We report high success rates for synthetic and real networks, for several different cascade models.
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Practical Statistics for LHC Physicists: Frequentist Inference (2/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Sampling of systematic errors to estimate likelihood weights in nuclear data uncertainty propagation
International Nuclear Information System (INIS)
Helgesson, P.; Sjöstrand, H.; Koning, A.J.; Rydén, J.; Rochman, D.; Alhassan, E.; Pomp, S.
2016-01-01
In methodologies for nuclear data (ND) uncertainty assessment and propagation based on random sampling, likelihood weights can be used to infer experimental information into the distributions for the ND. As the included number of correlated experimental points grows large, the computational time for the matrix inversion involved in obtaining the likelihood can become a practical problem. There are also other problems related to the conventional computation of the likelihood, e.g., the assumption that all experimental uncertainties are Gaussian. In this study, a way to estimate the likelihood which avoids matrix inversion is investigated; instead, the experimental correlations are included by sampling of systematic errors. It is shown that the model underlying the sampling methodology (using univariate normal distributions for random and systematic errors) implies a multivariate Gaussian for the experimental points (i.e., the conventional model). It is also shown that the likelihood estimates obtained through sampling of systematic errors approach the likelihood obtained with matrix inversion as the sample size for the systematic errors grows large. In studied practical cases, it is seen that the estimates for the likelihood weights converge impractically slowly with the sample size, compared to matrix inversion. The computational time is estimated to be greater than for matrix inversion in cases with more experimental points, too. Hence, the sampling of systematic errors has little potential to compete with matrix inversion in cases where the latter is applicable. Nevertheless, the underlying model and the likelihood estimates can be easier to intuitively interpret than the conventional model and the likelihood function involving the inverted covariance matrix. Therefore, this work can both have pedagogical value and be used to help motivate the conventional assumption of a multivariate Gaussian for experimental data. The sampling of systematic errors could also
Deformation of log-likelihood loss function for multiclass boosting.
Kanamori, Takafumi
2010-09-01
The purpose of this paper is to study loss functions in multiclass classification. In classification problems, the decision function is estimated by minimizing an empirical loss function, and then, the output label is predicted by using the estimated decision function. We propose a class of loss functions which is obtained by a deformation of the log-likelihood loss function. There are four main reasons why we focus on the deformed log-likelihood loss function: (1) this is a class of loss functions which has not been deeply investigated so far, (2) in terms of computation, a boosting algorithm with a pseudo-loss is available to minimize the proposed loss function, (3) the proposed loss functions provide a clear correspondence between the decision functions and conditional probabilities of output labels, (4) the proposed loss functions satisfy the statistical consistency of the classification error rate which is a desirable property in classification problems. Based on (3), we show that the deformed log-likelihood loss provides a model of mislabeling which is useful as a statistical model of medical diagnostics. We also propose a robust loss function against outliers in multiclass classification based on our approach. The robust loss function is a natural extension of the existing robust loss function for binary classification. A model of mislabeling and a robust loss function are useful to cope with noisy data. Some numerical studies are presented to show the robustness of the proposed loss function. A mathematical characterization of the deformed log-likelihood loss function is also presented. Copyright 2010 Elsevier Ltd. All rights reserved.
Examples in parametric inference with R
Dixit, Ulhas Jayram
2016-01-01
This book discusses examples in parametric inference with R. Combining basic theory with modern approaches, it presents the latest developments and trends in statistical inference for students who do not have an advanced mathematical and statistical background. The topics discussed in the book are fundamental and common to many fields of statistical inference and thus serve as a point of departure for in-depth study. The book is divided into eight chapters: Chapter 1 provides an overview of topics on sufficiency and completeness, while Chapter 2 briefly discusses unbiased estimation. Chapter 3 focuses on the study of moments and maximum likelihood estimators, and Chapter 4 presents bounds for the variance. In Chapter 5, topics on consistent estimators are discussed. Chapter 6 discusses Bayes, while Chapter 7 studies some more powerful tests. Lastly, Chapter 8 examines unbiased and other tests. Senior undergraduate and graduate students in statistics and mathematics, and those who have taken an introductory cou...
Statistical inference based on divergence measures
Pardo, Leandro
2005-01-01
The idea of using functionals of Information Theory, such as entropies or divergences, in statistical inference is not new. However, in spite of the fact that divergence statistics have become a very good alternative to the classical likelihood ratio test and the Pearson-type statistic in discrete models, many statisticians remain unaware of this powerful approach. Statistical Inference Based on Divergence Measures explores classical problems of statistical inference, such as estimation and hypothesis testing, on the basis of measures of entropy and divergence. The first two chapters form an overview, from a statistical perspective, of the most important measures of entropy and divergence and study their properties. The author then examines the statistical analysis of discrete multivariate data with emphasis on problems in contingency tables and loglinear models using phi-divergence test statistics as well as minimum phi-divergence estimators. The final chapter looks at testing in general populations, prese...
Computation of the Likelihood in Biallelic Diffusion Models Using Orthogonal Polynomials
Directory of Open Access Journals (Sweden)
Claus Vogl
2014-11-01
In population genetics, parameters describing forces such as mutation, migration and drift are generally inferred from molecular data. Lately, approximate methods based on simulations and summary statistics have been widely applied for such inference, even though these methods waste information. In contrast, probabilistic methods of inference can be shown to be optimal, if their assumptions are met. In genomic regions where recombination rates are high relative to mutation rates, polymorphic nucleotide sites can be assumed to evolve independently from each other. The distribution of allele frequencies at a large number of such sites has been called “allele-frequency spectrum” or “site-frequency spectrum” (SFS). Conditional on the allelic proportions, the likelihoods of such data can be modeled as binomial. A simple model representing the evolution of allelic proportions is the biallelic mutation-drift or mutation-directional selection-drift diffusion model. With series of orthogonal polynomials, specifically Jacobi and Gegenbauer polynomials, or the related spheroidal wave function, the diffusion equations can be solved efficiently. In the neutral case, the product of the binomial likelihoods with the sum of such polynomials leads to finite series of polynomials, i.e., relatively simple equations, from which the exact likelihoods can be calculated. In this article, the use of orthogonal polynomials for inferring population genetic parameters is investigated.
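The binomial sampling step in this abstract can be sketched for the simplest case. The sketch assumes the allelic proportion has already reached a Beta(a, b) stationary distribution, so that the binomial likelihood integrates to a beta-binomial; it does not implement the orthogonal-polynomial expansion the article investigates, and the function names, parameter values and site counts are hypothetical.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def sfs_site_probability(k, n, a, b):
    """P(k copies of the focal allele among n) when the proportion is Beta(a, b)."""
    # Binomial likelihood integrated over the Beta density (beta-binomial).
    log_p = math.log(math.comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b)
    return math.exp(log_p)


def sfs_log_likelihood(counts, n, a, b):
    """Log-likelihood of an SFS; counts[k] = number of independent sites with k copies."""
    return sum(c * math.log(sfs_site_probability(k, n, a, b))
               for k, c in enumerate(counts))


n = 10                                          # sample size per site (hypothetical)
a, b = 0.2, 0.3                                 # assumed Beta parameters (scaled mutation rates)
counts = [40, 5, 3, 2, 1, 1, 1, 2, 3, 6, 36]    # hypothetical site counts for k = 0..10
total = sum(sfs_site_probability(k, n, a, b) for k in range(n + 1))
loglik = sfs_log_likelihood(counts, n, a, b)
print(total, loglik)
```

Because sites are treated as independent, the log-likelihood is a sum over the entries of the spectrum, which is what makes maximising over (a, b) cheap in this simplified setting.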
Asymptotic Likelihood Distribution for Correlated & Constrained Systems
Agarwal, Ujjwal
2016-01-01
This report describes my work as a summer student at CERN. It discusses the asymptotic distribution of the likelihood ratio when the total number of parameters is h and 2 of these are constrained and correlated.
Maximum likelihood estimation of phase-type distributions
DEFF Research Database (Denmark)
Esparza, Luz Judith R
This work is concerned with the statistical inference of phase-type distributions and the analysis of distributions with rational Laplace transform, known as matrix-exponential distributions. The thesis is focused on the estimation of the maximum likelihood parameters of phase-type distributions for both univariate and multivariate cases. Methods like the EM algorithm and Markov chain Monte Carlo are applied for this purpose. Furthermore, this thesis provides explicit formulae for computing the Fisher information matrix for discrete and continuous phase-type distributions, which is needed to find confidence regions for their estimated parameters. Finally, a new general class of distributions, called bilateral matrix-exponential distributions, is defined. These distributions have the entire real line as domain and can be used, for instance, for modelling. In addition, this class of distributions...
Maximum-Likelihood Detection Of Noncoherent CPM
Divsalar, Dariush; Simon, Marvin K.
1993-01-01
Simplified detectors proposed for use in maximum-likelihood-sequence detection of symbols in alphabet of size M transmitted by uncoded, full-response continuous phase modulation over radio channel with additive white Gaussian noise. Structures of receivers derived from particular interpretation of maximum-likelihood metrics. Receivers include front ends, structures of which depend only on M, analogous to those in receivers of coherent CPM. Parts of receivers following front ends have structures, complexity of which would depend on N.
Statistical inferences for bearings life using sudden death test
Directory of Open Access Journals (Sweden)
Morariu Cristin-Olimpiu
2017-01-01
In this paper we propose a calculation method for estimating reliability indicators and a complete statistical inference for the three-parameter Weibull distribution of bearing life. Using experimental values for the durability of bearings tested on stands by sudden death tests involves a series of particularities in maximum likelihood estimation and in carrying out the statistical inference. The paper details these features and also provides an example calculation.
Inferring the photometric and size evolution of galaxies from image simulations. I. Method
Carassou, Sébastien; de Lapparent, Valérie; Bertin, Emmanuel; Le Borgne, Damien
2017-09-01
Context. Current constraints on models of galaxy evolution rely on morphometric catalogs extracted from multi-band photometric surveys. However, these catalogs are altered by selection effects that are difficult to model, that correlate in non trivial ways, and that can lead to contradictory predictions if not taken into account carefully. Aims: To address this issue, we have developed a new approach combining parametric Bayesian indirect likelihood (pBIL) techniques and empirical modeling with realistic image simulations that reproduce a large fraction of these selection effects. This allows us to perform a direct comparison between observed and simulated images and to infer robust constraints on model parameters. Methods: We use a semi-empirical forward model to generate a distribution of mock galaxies from a set of physical parameters. These galaxies are passed through an image simulator reproducing the instrumental characteristics of any survey and are then extracted in the same way as the observed data. The discrepancy between the simulated and observed data is quantified, and minimized with a custom sampling process based on adaptive Markov chain Monte Carlo methods. Results: Using synthetic data matching most of the properties of a Canada-France-Hawaii Telescope Legacy Survey Deep field, we demonstrate the robustness and internal consistency of our approach by inferring the parameters governing the size and luminosity functions and their evolutions for different realistic populations of galaxies. We also compare the results of our approach with those obtained from the classical spectral energy distribution fitting and photometric redshift approach. Conclusions: Our pipeline infers efficiently the luminosity and size distribution and evolution parameters with a very limited number of observables (three photometric bands). When compared to SED fitting based on the same set of observables, our method yields results that are more accurate and free from
International Nuclear Information System (INIS)
Peggs, S.; Talman, R.
1987-01-01
As proton accelerators get larger, and include more magnets, the conventional tracking programs which simulate them run slower. The purpose of this paper is to describe a method, still under development, in which element-by-element tracking around one turn is replaced by a single map, which can be processed far faster. It is assumed for this method that a conventional program exists which can perform faithful tracking in the lattice under study for some hundreds of turns, with all lattice parameters held constant. An empirical map is then generated by comparison with the tracking program. A procedure has been outlined for determining an empirical Hamiltonian, which can represent motion through many nonlinear kicks, by taking data from a conventional tracking program. Though derived by an approximate method this Hamiltonian is analytic in form and can be subjected to further analysis of varying degrees of mathematical rigor. Even though the empirical procedure has only been described in one transverse dimension, there is good reason to hope that it can be extended to include two transverse dimensions, so that it can become a more practical tool in realistic cases.
Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies.
Rukhin, Andrew L
2011-01-01
A formulation of the problem of combining data from several sources is discussed in terms of random effects models. The unknown measurement precision is assumed not to be the same for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as simultaneous polynomial equations, the exact form of the Groebner basis for their stationary points is derived when there are two methods. A parametrization of these solutions which allows their comparison is suggested. A numerical method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted maximum likelihood, is studied. In the situation when methods variances are considered to be known an upper bound on the between-method variance is obtained. The relationship between likelihood equations and moment-type equations is also discussed.
Maximum likelihood estimation for integrated diffusion processes
DEFF Research Database (Denmark)
Baltazar-Larios, Fernando; Sørensen, Michael
We propose a method for obtaining maximum likelihood estimates of parameters in diffusion models when the data is a discrete time sample of the integral of the process, while no direct observations of the process itself are available. The data are, moreover, assumed to be contaminated by measurement errors. Integrated volatility is an example of this type of observations. Another example is ice-core data on oxygen isotopes used to investigate paleo-temperatures. The data can be viewed as incomplete observations of a model with a tractable likelihood function. Therefore we propose a simulated EM-algorithm to obtain maximum likelihood estimates of the parameters in the diffusion model. As part of the algorithm, we use a recent simple method for approximate simulation of diffusion bridges. In simulation studies for the Ornstein-Uhlenbeck process and the CIR process the proposed method works...
Empirical Bayes Approaches to Multivariate Fuzzy Partitions.
Woodbury, Max A.; Manton, Kenneth G.
1991-01-01
An empirical Bayes-maximum likelihood estimation procedure is presented for the application of fuzzy partition models in describing high dimensional discrete response data. The model describes individuals in terms of partial membership in multiple latent categories that represent bounded discrete spaces. (SLD)
Maintaining symmetry of simulated likelihood functions
DEFF Research Database (Denmark)
Andersen, Laura Mørch
This paper suggests solutions to two different types of simulation errors related to Quasi-Monte Carlo integration. Likelihood functions which depend on standard deviations of mixed parameters are symmetric in nature. This paper shows that antithetic draws preserve this symmetry and thereby improve precision substantially. Another source of error is that models testing away mixing dimensions must replicate the relevant dimensions of the quasi-random draws in the simulation of the restricted likelihood. These simulation errors are ignored in the standard estimation procedures used today...
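The symmetry argument in this abstract can be checked numerically with a small sketch. It assumes a one-dimensional mixed-logit style probability and uses pseudo-random normal draws from the standard library in place of the quasi-random draws the paper works with; the function name and parameter values are hypothetical.

```python
import math
import random


def simulated_probability(mu, sigma, draws):
    """Simulated probability: average of logistic(mu + sigma * u) over the draws."""
    terms = (1.0 / (1.0 + math.exp(-(mu + sigma * u))) for u in draws)
    return math.fsum(terms) / len(draws)


rng = random.Random(0)
plain = [rng.gauss(0.0, 1.0) for _ in range(50)]   # ordinary draws
antithetic = plain + [-u for u in plain]           # each draw paired with its mirror image

# Compare the simulated value at sigma and at -sigma for both sets of draws.
gap_plain = abs(simulated_probability(0.5, 0.7, plain)
                - simulated_probability(0.5, -0.7, plain))
gap_antithetic = abs(simulated_probability(0.5, 0.7, antithetic)
                     - simulated_probability(0.5, -0.7, antithetic))
print(gap_plain, gap_antithetic)
```

With the mirrored draws the set of values sigma * u is unchanged when sigma changes sign, so the simulated function is symmetric in sigma by construction, whereas the plain draws only reproduce the symmetry approximately.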
DEFF Research Database (Denmark)
Andersen, Jesper
2009-01-01
Collateral evolution is the problem of updating several library-using programs in response to API changes in the used library. In this dissertation we address the issue of understanding collateral evolutions by automatically inferring a high-level specification of the changes evident in a given set ...... specifications inferred by spdiff in Linux are shown. We find that the inferred specifications concisely capture the actual collateral evolution performed in the examples....
Fisher information and statistical inference for phase-type distributions
DEFF Research Database (Denmark)
Bladt, Mogens; Esparza, Luz Judith R; Nielsen, Bo Friis
2011-01-01
This paper is concerned with statistical inference for both continuous and discrete phase-type distributions. We consider maximum likelihood estimation, where traditionally the expectation-maximization (EM) algorithm has been employed. Certain numerical aspects of this method are revised and we...
Composite likelihood estimation of demographic parameters
Directory of Open Access Journals (Sweden)
Garrigan Daniel
2009-11-01
Background: Most existing likelihood-based methods for fitting historical demographic models to DNA sequence polymorphism data do not scale feasibly up to the level of whole-genome data sets. Computational economies can be achieved by incorporating two forms of pseudo-likelihood: composite and approximate likelihood methods. Composite likelihood enables scaling up to large data sets because it takes the product of marginal likelihoods as an estimator of the likelihood of the complete data set. This approach is especially useful when a large number of genomic regions constitutes the data set. Additionally, approximate likelihood methods can reduce the dimensionality of the data by summarizing the information in the original data by either a sufficient statistic, or a set of statistics. Both composite and approximate likelihood methods hold promise for analyzing large data sets or for use in situations where the underlying demographic model is complex and has many parameters. This paper considers a simple demographic model of allopatric divergence between two populations, in which one of the populations is hypothesized to have experienced a founder event, or population bottleneck. A large resequencing data set from human populations is summarized by the joint frequency spectrum, which is a matrix of the genomic frequency spectrum of derived base frequencies in two populations. A Bayesian Metropolis-coupled Markov chain Monte Carlo (MCMCMC) method for parameter estimation is developed that uses both composite and likelihood methods and is applied to the three different pairwise combinations of the human population resequence data. The accuracy of the method is also tested on data sets sampled from a simulated population model with known parameters. Results: The Bayesian MCMCMC method also estimates the ratio of effective population size for the X chromosome versus that of the autosomes. The method is shown to estimate, with reasonable
Generalized linear models with random effects unified analysis via H-likelihood
Lee, Youngjo; Pawitan, Yudi
2006-01-01
Since their introduction in 1972, generalized linear models (GLMs) have proven useful in the generalization of classical normal models. Presenting methods for fitting GLMs with random effects to data, Generalized Linear Models with Random Effects: Unified Analysis via H-likelihood explores a wide range of applications, including combining information over trials (meta-analysis), analysis of frailty models for survival data, genetic epidemiology, and analysis of spatial and temporal models with correlated errors. Written by pioneering authorities in the field, this reference provides an introduction to various theories and examines likelihood inference and GLMs. The authors show how to extend the class of GLMs while retaining as much simplicity as possible. By maximizing and deriving other quantities from h-likelihood, they also demonstrate how to use a single algorithm for all members of the class, resulting in a faster algorithm as compared to existing alternatives. Complementing theory with examples, many of...
Efficient Bit-to-Symbol Likelihood Mappings
Moision, Bruce E.; Nakashima, Michael A.
2010-01-01
This innovation is an efficient algorithm designed to perform bit-to-symbol and symbol-to-bit likelihood mappings that represent a significant portion of the complexity of an error-correction code decoder for high-order constellations. Recent implementation of the algorithm in hardware has yielded an 8-percent reduction in overall area relative to the prior design.
Likelihood-ratio-based biometric verification
Bazen, A.M.; Veldhuis, Raymond N.J.
2002-01-01
This paper presents results on optimal similarity measures for biometric verification based on fixed-length feature vectors. First, we show that the verification of a single user is equivalent to the detection problem, which implies that for single-user verification the likelihood ratio is optimal.
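The likelihood-ratio decision rule described here can be sketched as follows. The sketch assumes diagonal Gaussian models for both the claimed user and the background population, which is an illustrative modelling choice and is not stated in the abstract; the function names, model parameters and threshold are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log-density of a diagonal Gaussian evaluated at feature vector x."""
    return sum(-0.5 * math.log(2.0 * math.pi * v) - 0.5 * (xi - m) ** 2 / v
               for xi, m, v in zip(x, mean, var))


def log_likelihood_ratio(x, user_mean, user_var, bg_mean, bg_var):
    """log p(x | claimed user) - log p(x | background population)."""
    return log_gaussian(x, user_mean, user_var) - log_gaussian(x, bg_mean, bg_var)


def verify(x, user, background, threshold=0.0):
    """Accept the identity claim when the log-likelihood ratio reaches the threshold."""
    return log_likelihood_ratio(x, *user, *background) >= threshold


user = ([1.0, 1.0], [0.1, 0.1])          # hypothetical per-user mean and variance
background = ([0.0, 0.0], [1.0, 1.0])    # hypothetical population mean and variance
genuine_score = log_likelihood_ratio([1.0, 1.0], *user, *background)
impostor_score = log_likelihood_ratio([0.0, 0.0], *user, *background)
print(genuine_score, impostor_score)
```

Moving the threshold trades false accepts against false rejects, which is the usual way such a score is tuned in a detection setting.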
Likelihood Ratio-Based Biometric Verification
Bazen, A.M.; Veldhuis, Raymond N.J.
The paper presents results on optimal similarity measures for biometric verification based on fixed-length feature vectors. First, we show that the verification of a single user is equivalent to the detection problem, which implies that, for single-user verification, the likelihood ratio is optimal.
Inference for shared-frailty survival models with left-truncated data
van den Berg, G.J.; Drepper, B.
2016-01-01
Shared-frailty survival models specify that systematic unobserved determinants of duration outcomes are identical within groups of individuals. We consider random-effects likelihood-based statistical inference if the duration data are subject to left-truncation. Such inference with left-truncated
ABC of SV: Limited Information Likelihood Inference in Stochastic Volatility Jump-Diffusion Models
DEFF Research Database (Denmark)
Creel, Michael; Kristensen, Dennis
and latent variables. We show how the methods can incorporate intra-daily information to improve on the estimation and filtering. In particular, the availability of realized volatility measures helps us in learning about parameters and latent states. The method is employed in the estimation of a flexible...
Partial inversion of elliptic operator to speed up computation of likelihood in Bayesian inference
Litvinenko, Alexander
2017-01-01
we can approximate $F(u)$ on, for instance, multiple coarse meshes. The offered method is well suited for solving multiscale problems. A disadvantage of this method is the assumption that one has to have access to the discretisation and to the procedure of assembling the Galerkin matrix.
Local Likelihood Approach for High-Dimensional Peaks-Over-Threshold Inference
Baki, Zhuldyzay
2018-01-01
a gridded dataset comprising 16703 locations over the Red Sea. The data were provided by Operational SST and Sea Ice Analysis (OSTIA), a satellite-based data system designed for numerical weather prediction. After pre-processing the data to account
Exact sampling from conditional Boolean models with applications to maximum likelihood inference
Lieshout, van M.N.M.; Zwet, van E.W.
2001-01-01
We are interested in estimating the intensity parameter of a Boolean model of discs (the bombing model) from a single realization. To do so, we derive the conditional distribution of the points (germs) of the underlying Poisson process. We demonstrate how to apply coupling from the past to generate
Energy Technology Data Exchange (ETDEWEB)
Petrov, S.
1996-10-01
Languages with a solvable implication problem but without complete and consistent systems of inference rules ("poor" languages) are considered. The problem of existence of a finite complete and consistent inference rule system for a "poor" language is stated independently of the language or rules syntax. Several properties of the problem are proved. An application of results to the language of join dependencies is given.
Bayesian statistical inference
Directory of Open Access Journals (Sweden)
Bruno De Finetti
2017-04-01
This work was translated into English and published in the volume: Bruno De Finetti, Induction and Probability, Biblioteca di Statistica, eds. P. Monari, D. Cocchi, Clueb, Bologna, 1993. Bayesian Statistical Inference is one of the last fundamental philosophical papers in which we can find the essence of De Finetti's approach to statistical inference.
Geometric statistical inference
International Nuclear Information System (INIS)
Periwal, Vipul
1999-01-01
A reparametrization-covariant formulation of the inverse problem of probability is explicitly solved for finite sample sizes. The inferred distribution is explicitly continuous for finite sample size. A geometric solution of the statistical inference problem in higher dimensions is outlined.
Phylogenetic analysis using parsimony and likelihood methods.
Yang, Z
1996-02-01
The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981, J. Mol. Evol. 17: 368-376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. When the assumed model became more complex and realistic, e.g., when substitution rates were
Pointwise probability reinforcements for robust statistical inference.
Frénay, Benoît; Verleysen, Michel
2014-02-01
Statistical inference using machine learning techniques may be difficult with small datasets because of abnormally frequent data (AFDs). AFDs are observations that are much more frequent in the training sample than they should be, with respect to their theoretical probability, and include e.g. outliers. Estimates of parameters tend to be biased towards models which support such data. This paper proposes to introduce pointwise probability reinforcements (PPRs): the probability of each observation is reinforced by a PPR and a regularisation allows controlling the amount of reinforcement which compensates for AFDs. The proposed solution is very generic, since it can be used to robustify any statistical inference method which can be formulated as a likelihood maximisation. Experiments show that PPRs can be easily used to tackle regression, classification and projection: models are freed from the influence of outliers. Moreover, outliers can be filtered manually since an abnormality degree is obtained for each observation. Copyright © 2013 Elsevier Ltd. All rights reserved.
International Nuclear Information System (INIS)
Peggs, S.; Talman, R.
1986-08-01
As proton accelerators get larger, and include more magnets, the conventional tracking programs which simulate them run slower. At the same time, in order to more carefully optimize the higher cost of the accelerators, they must return more accurate results, even in the presence of a longer list of realistic effects, such as magnet errors and misalignments. For these reasons conventional tracking programs continue to be computationally bound, despite the continually increasing computing power available. This limitation is especially severe for a class of problems in which some lattice parameter is slowly varying, when a faithful description is only obtained by tracking for an exceedingly large number of turns. Examples are synchrotron oscillations in which the energy varies slowly with a period of, say, hundreds of turns, or magnet ripple or noise on a comparably slow time scale. In these cases one may wish to track for hundreds of periods of the slowly varying parameter. The purpose of this paper is to describe a method, still under development, in which element-by-element tracking around one turn is replaced by a single map, which can be processed far faster. Similar programs have already been written in which successive elements are "concatenated" with truncation to linear, sextupole, or octupole order, et cetera, using Lie algebraic techniques to preserve symplecticity. The method described here is rather more empirical than this but, in principle, contains information to all orders and is able to handle resonances in a more straightforward fashion.
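As a rough illustration of the "concatenation" idea mentioned in this abstract (and not of the empirical map method the paper itself describes), the sketch below composes linear transfer matrices into a single one-turn matrix and checks that tracking with it matches element-by-element tracking. The lattice, the function names, and the restriction to 2x2 linear optics are all assumptions made for the example.

```python
# Sketch: concatenate linear elements into a single one-turn map.
# The lattice below (thin-lens quadrupoles and drifts) is a made-up example.

def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )

def apply_map(m, state):
    """Apply a 2x2 matrix to a (position, angle) pair."""
    x, p = state
    return (m[0][0] * x + m[0][1] * p, m[1][0] * x + m[1][1] * p)

def drift(length):
    return ((1.0, length), (0.0, 1.0))

def thin_quad(focal_length):
    return ((1.0, 0.0), (-1.0 / focal_length, 1.0))

# Hypothetical FODO-like cell, listed in the order the beam meets the elements.
LATTICE = [thin_quad(2.0), drift(1.0), thin_quad(-2.0), drift(1.0)]

def one_turn_map(elements):
    """Concatenate all elements into one matrix."""
    m = ((1.0, 0.0), (0.0, 1.0))
    for e in elements:
        m = matmul(e, m)  # later elements multiply from the left
    return m

def track_elementwise(elements, state, turns):
    for _ in range(turns):
        for e in elements:
            state = apply_map(e, state)
    return state

def track_with_map(m, state, turns):
    for _ in range(turns):
        state = apply_map(m, state)
    return state

ONE_TURN = one_turn_map(LATTICE)
# A determinant of 1 is the 2x2 form of the symplectic condition.
DET = ONE_TURN[0][0] * ONE_TURN[1][1] - ONE_TURN[0][1] * ONE_TURN[1][0]
```

With the map precomputed, each turn costs one matrix application instead of one per element, which is where the speed-up comes from in the linear case; nonlinear elements would need a truncated polynomial map or some other representation.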
Automatic physical inference with information maximizing neural networks
Charnock, Tom; Lavaux, Guilhem; Wandelt, Benjamin D.
2018-04-01
Compressing large data sets to a manageable number of summaries that are informative about the underlying parameters vastly simplifies both frequentist and Bayesian inference. When only simulations are available, these summaries are typically chosen heuristically, so they may inadvertently miss important information. We introduce a simulation-based machine learning technique that trains artificial neural networks to find nonlinear functionals of data that maximize Fisher information: information maximizing neural networks (IMNNs). In test cases where the posterior can be derived exactly, likelihood-free inference based on automatically derived IMNN summaries produces nearly exact posteriors, showing that these summaries are good approximations to sufficient statistics. In a series of numerical examples of increasing complexity and astrophysical relevance we show that IMNNs are robustly capable of automatically finding optimal, nonlinear summaries of the data even in cases where linear compression fails: inferring the variance of Gaussian signal in the presence of noise, inferring cosmological parameters from mock simulations of the Lyman-α forest in quasar spectra, and inferring frequency-domain parameters from LISA-like detections of gravitational waveforms. In this final case, the IMNN summary outperforms linear data compression by avoiding the introduction of spurious likelihood maxima. We anticipate that the automatic physical inference method described in this paper will be essential to obtain both accurate and precise cosmological parameter estimates from complex and large astronomical data sets, including those from LSST and Euclid.
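A short worked example may clarify why linear compression can fail for the first test case named above. It assumes a simplified, noise-free version of that case: $n$ independent draws $d_i \sim \mathcal{N}(0, \theta)$ with unknown variance $\theta$. The notation is introduced here for illustration and is not taken from the paper.

```latex
\begin{align}
\ln \mathcal{L}(\theta) &= -\frac{n}{2}\ln(2\pi\theta) - \frac{1}{2\theta}\sum_{i=1}^{n} d_i^2, \\
F_{\text{data}}(\theta) &= -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial\theta^2} \right\rangle
  = \frac{n}{2\theta^2}. \\
\intertext{A linear summary $t = \sum_i a_i d_i$ is distributed as $\mathcal{N}\!\left(0, \theta \sum_i a_i^2\right)$, so}
F_{t}(\theta) &= \frac{1}{2\theta^2} \quad \text{for any choice of the weights } a_i, \\
\intertext{whereas the nonlinear summary $s = \sum_i d_i^2$ is sufficient and retains}
F_{s}(\theta) &= \frac{n}{2\theta^2} = F_{\text{data}}(\theta).
\end{align}
```

In this toy setting any linear compression keeps the information of a single data point, while a quadratic summary keeps all of it; a network trained to maximize Fisher information would be expected to approximate a summary of the second kind.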
Factors Associated with Young Adults’ Pregnancy Likelihood
Kitsantas, Panagiota; Lindley, Lisa L.; Wu, Huichuan
2014-01-01
OBJECTIVES While progress has been made to reduce adolescent pregnancies in the United States, rates of unplanned pregnancy among young adults (18–29 years) remain high. In this study, we assessed factors associated with perceived likelihood of pregnancy (likelihood of getting pregnant/getting partner pregnant in the next year) among sexually experienced young adults who were not trying to get pregnant and had ever used contraceptives. METHODS We conducted a secondary analysis of 660 young adults, 18–29 years old in the United States, from the cross-sectional National Survey of Reproductive and Contraceptive Knowledge. Logistic regression and classification tree analyses were conducted to generate profiles of young adults most likely to report anticipating a pregnancy in the next year. RESULTS Nearly one-third (32%) of young adults indicated they believed they had at least some likelihood of becoming pregnant in the next year. Young adults who believed that avoiding pregnancy was not very important were most likely to report pregnancy likelihood (odds ratio [OR], 5.21; 95% CI, 2.80–9.69), as were young adults for whom avoiding a pregnancy was important but not satisfied with their current contraceptive method (OR, 3.93; 95% CI, 1.67–9.24), attended religious services frequently (OR, 3.0; 95% CI, 1.52–5.94), were uninsured (OR, 2.63; 95% CI, 1.31–5.26), and were likely to have unprotected sex in the next three months (OR, 1.77; 95% CI, 1.04–3.01). DISCUSSION These results may help guide future research and the development of pregnancy prevention interventions targeting sexually experienced young adults. PMID:25782849
Review of Elaboration Likelihood Model of persuasion
藤原, 武弘; 神山, 貴弥
1989-01-01
This article mainly introduces the Elaboration Likelihood Model (ELM), a general theory of attitude change proposed by Petty & Cacioppo. The ELM postulates two routes to persuasion: the central route and the peripheral route. Attitude change by the central route is viewed as resulting from a diligent consideration of the issue-relevant information presented. On the other hand, attitude change by the peripheral route is viewed as resulting from peripheral cues in the persuasion context. Secondly we compare these tw...
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
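The long-memory part of an ARFIMA model is the fractional differencing operator $(1-B)^d$. The sketch below computes the coefficients of its standard binomial expansion and applies a truncated version to a series. It illustrates the operator only, not the approximate likelihood proposed in the paper; the function names are hypothetical.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms coefficients of (1 - B)^d, using the recursion
    w_0 = 1, w_k = w_{k-1} * (k - 1 - d) / k."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 - d) / k)
    return weights

def frac_diff(series, d):
    """Apply (1 - B)^d to a finite series, truncating the expansion
    at the start of the sample (values before t = 0 are treated as zero)."""
    w = frac_diff_weights(d, len(series))
    return [
        sum(w[k] * series[t - k] for k in range(t + 1))
        for t in range(len(series))
    ]
```

For d = 1 the weights reduce to (1, -1, 0, 0, ...), i.e. ordinary first differencing; for 0 < d < 0.5 the weights decay slowly as a power law, which is how the model encodes long-range dependence.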
HIERARCHICAL PROBABILISTIC INFERENCE OF COSMIC SHEAR
International Nuclear Information System (INIS)
Schneider, Michael D.; Dawson, William A.; Hogg, David W.; Marshall, Philip J.; Bard, Deborah J.; Meyers, Joshua; Lang, Dustin
2015-01-01
Point estimators for the shearing of galaxy images induced by gravitational lensing involve a complex inverse problem in the presence of noise, pixelization, and model uncertainties. We present a probabilistic forward modeling approach to gravitational lensing inference that has the potential to mitigate the biased inferences in most common point estimators and is practical for upcoming lensing surveys. The first part of our statistical framework requires specification of a likelihood function for the pixel data in an imaging survey given parameterized models for the galaxies in the images. We derive the lensing shear posterior by marginalizing over all intrinsic galaxy properties that contribute to the pixel data (i.e., not limited to galaxy ellipticities) and learn the distributions for the intrinsic galaxy properties via hierarchical inference with a suitably flexible conditional probability distribution specification. We use importance sampling to separate the modeling of small imaging areas from the global shear inference, thereby rendering our algorithm computationally tractable for large surveys. With simple numerical examples we demonstrate the improvements in accuracy from our importance sampling approach, as well as the significance of the conditional distribution specification for the intrinsic galaxy properties when the data are generated from an unknown number of distinct galaxy populations with different morphological characteristics.
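The importance sampling step can be illustrated generically: samples drawn once under a broad proposal density are reweighted to estimate an expectation under a different target density, without resampling. The sketch below is a textbook self-normalized estimator with made-up Gaussian densities; it is not the authors' pipeline, and all names and numbers are assumptions.

```python
import math
import random

def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_mean(n_samples, seed=0):
    """Estimate the mean of the target N(1, 1) using draws from the
    broader proposal N(0, 2^2), via self-normalized importance weights."""
    rng = random.Random(seed)
    total_w = 0.0
    total_wx = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, 2.0)  # draw from the proposal
        w = normal_pdf(x, 1.0, 1.0) / normal_pdf(x, 0.0, 2.0)  # target / proposal
        total_w += w
        total_wx += w * x
    return total_wx / total_w

ESTIMATE = importance_mean(20000)  # should be close to the target mean of 1
```

The proposal is deliberately wider than the target so that the weights stay bounded; a proposal narrower than the target would give a few very large weights and an unreliable estimate.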
Evolutionary inference via the Poisson Indel Process.
Bouchard-Côté, Alexandre; Jordan, Michael I
2013-01-22
We address the problem of the joint statistical inference of phylogenetic trees and multiple sequence alignments from unaligned molecular sequences. This problem is generally formulated in terms of string-valued evolutionary processes along the branches of a phylogenetic tree. The classic evolutionary process, the TKF91 model [Thorne JL, Kishino H, Felsenstein J (1991) J Mol Evol 33(2):114-124] is a continuous-time Markov chain model composed of insertion, deletion, and substitution events. Unfortunately, this model gives rise to an intractable computational problem: The computation of the marginal likelihood under the TKF91 model is exponential in the number of taxa. In this work, we present a stochastic process, the Poisson Indel Process (PIP), in which the complexity of this computation is reduced to linear. The Poisson Indel Process is closely related to the TKF91 model, differing only in its treatment of insertions, but it has a global characterization as a Poisson process on the phylogeny. Standard results for Poisson processes allow key computations to be decoupled, which yields the favorable computational profile of inference under the PIP model. We present illustrative experiments in which Bayesian inference under the PIP model is compared with separate inference of phylogenies and alignments.
Physician Bayesian updating from personal beliefs about the base rate and likelihood ratio.
Rottman, Benjamin Margolin
2017-02-01
Whether humans can accurately make decisions in line with Bayes' rule has been one of the most important yet contentious topics in cognitive psychology. Though a number of paradigms have been used for studying Bayesian updating, rarely have subjects been allowed to use their own preexisting beliefs about the prior and the likelihood. A study is reported in which physicians judged the posttest probability of a diagnosis for a patient vignette after receiving a test result, and the physicians' posttest judgments were compared to the normative posttest calculated from their own beliefs in the sensitivity and false positive rate of the test (likelihood ratio) and prior probability of the diagnosis. On the one hand, the posttest judgments were strongly related to the physicians' beliefs about both the prior probability as well as the likelihood ratio, and the priors were used considerably more strongly than in previous research. On the other hand, both the prior and the likelihoods were still not used quite as much as they should have been, and there was evidence of other nonnormative aspects to the updating, such as updating independent of the likelihood beliefs. By focusing on how physicians use their own prior beliefs for Bayesian updating, this study provides insight into how well experts perform probabilistic inference in settings in which they rely upon their own prior beliefs rather than experimenter-provided cues. It suggests that there is reason to be optimistic about experts' abilities, but that there is still considerable need for improvement.
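The normative posttest probability referred to above follows from Bayes' rule in odds form: posttest odds equal pretest odds times the likelihood ratio. A minimal sketch for a positive test result, with a hypothetical function name and made-up example values in the test:

```python
def posttest_probability(prior, sensitivity, false_positive_rate):
    """Posttest probability of disease after a positive test result.

    prior: pretest probability of the diagnosis (strictly between 0 and 1)
    sensitivity: P(positive test | disease)
    false_positive_rate: P(positive test | no disease), must be > 0
    """
    likelihood_ratio = sensitivity / false_positive_rate
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)
```

Comparing a physician's stated posttest judgment with the value this function returns for that same physician's stated prior, sensitivity, and false positive rate is the kind of comparison the study describes.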
Unbinned likelihood analysis of EGRET observations
International Nuclear Information System (INIS)
Digel, Seth W.
2000-01-01
We present a newly-developed likelihood analysis method for EGRET data that defines the likelihood function without binning the photon data or averaging the instrumental response functions. The standard likelihood analysis applied to EGRET data requires the photons to be binned spatially and in energy, and the point-spread functions to be averaged over energy and inclination angle. The full-width half maximum of the point-spread function increases by about 40% from on-axis to 30° inclination, and depending on the binning in energy can vary by more than that in a single energy bin. The new unbinned method avoids the loss of information that binning and averaging cause and can properly analyze regions where EGRET viewing periods overlap and photons with different inclination angles would otherwise be combined in the same bin. In the poster, we describe the unbinned analysis method and compare its sensitivity with binned analysis for detecting point sources in EGRET data.
Nagao, Makoto
1990-01-01
Knowledge and Inference discusses an important problem for software systems: How do we treat knowledge and ideas on a computer and how do we use inference to solve problems on a computer? The book talks about the problems of knowledge and inference for the purpose of merging artificial intelligence and library science. The book begins by clarifying the concept of "knowledge" from many points of view, followed by a chapter on the current state of library science and the place of artificial intelligence in library science. Subsequent chapters cover central topics in the artificial intellig
Logical inference and evaluation
International Nuclear Information System (INIS)
Perey, F.G.
1981-01-01
Most methodologies of evaluation currently used are based upon the theory of statistical inference. It is generally perceived that this theory is not capable of dealing satisfactorily with what are called systematic errors. Theories of logical inference should be capable of treating all of the information available, including that not involving frequency data. A theory of logical inference is presented as an extension of deductive logic via the concept of plausibility and the application of group theory. Some conclusions, based upon the application of this theory to evaluation of data, are also given
Mikulich-Gilbertson, Susan K; Wagner, Brandie D; Grunwald, Gary K; Riggs, Paula D; Zerbe, Gary O
2018-01-01
Medical research is often designed to investigate changes in a collection of response variables that are measured repeatedly on the same subjects. The multivariate generalized linear mixed model (MGLMM) can be used to evaluate random coefficient associations (e.g. simple correlations, partial regression coefficients) among outcomes that may be non-normal and differently distributed by specifying a multivariate normal distribution for their random effects and then evaluating the latent relationship between them. Empirical Bayes predictors are readily available for each subject from any mixed model and are observable and hence plottable. Here, we evaluate whether second-stage association analyses of empirical Bayes predictors from a MGLMM provide a good approximation and visual representation of these latent association analyses, using medical examples and simulations. Additionally, we compare these results with association analyses of empirical Bayes predictors generated from separate mixed models for each outcome, a procedure that could circumvent computational problems that arise when the dimension of the joint covariance matrix of random effects is large and prohibits estimation of latent associations. As has been shown in other analytic contexts, the p-values for all second-stage coefficients that were determined by naively assuming normality of empirical Bayes predictors provide a good approximation to p-values determined via permutation analysis. Analyzing outcomes that are interrelated with separate models in the first stage and then associating the resulting empirical Bayes predictors in a second stage results in different mean and covariance parameter estimates from the maximum likelihood estimates generated by a MGLMM. The potential for erroneous inference from using results from these separate models increases as the magnitude of the association among the outcomes increases. Thus if computable, scatterplots of the conditionally independent empirical Bayes
Ning, Jing; Chen, Yong; Piao, Jin
2017-07-01
Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure perform well and avoid the non-convergence problem when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press.
Probability and Statistical Inference
Prosper, Harrison B.
2006-01-01
These lectures introduce key concepts in probability and statistical inference at a level suitable for graduate students in particle physics. Our goal is to paint as vivid a picture as possible of the concepts covered.
On quantum statistical inference
Barndorff-Nielsen, O.E.; Gill, R.D.; Jupp, P.E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, developments in the theory of quantum measurements have
2018-02-15
expressed a variety of inference techniques on discrete and continuous distributions: exact inference, importance sampling, Metropolis-Hastings (MH...without redoing any math or rewriting any code. And although our main goal is composable reuse, our performance is also good because we can use...control paths. • The Hakaru language can express mixtures of discrete and continuous distributions, but the current disintegration transformation
Introductory statistical inference
Mukhopadhyay, Nitis
2014-01-01
This gracefully organized text reveals the rigorous theory of probability and statistical inference in the style of a tutorial, using worked examples, exercises, figures, tables, and computer simulations to develop and illustrate concepts. Drills and boxed summaries emphasize and reinforce important ideas and special techniques. Beginning with a review of the basic concepts and methods in probability theory, moments, and moment generating functions, the author moves to more intricate topics. Introductory Statistical Inference studies multivariate random variables, exponential families of dist
Cycle-Based Cluster Variational Method for Direct and Inverse Inference
Furtlehner, Cyril; Decelle, Aurélien
2016-08-01
Large scale inference problems of practical interest can often be addressed with the help of Markov random fields. This requires solving, in principle, two related problems: the first one is to find offline the parameters of the MRF from empirical data (inverse problem); the second one (direct problem) is to set up the inference algorithm to make it as precise, robust and efficient as possible. In this work we address both the direct and inverse problem with mean-field methods of statistical physics, going beyond the Bethe approximation and associated belief propagation algorithm. We elaborate on the idea that loop corrections to belief propagation can be dealt with in a systematic way on pairwise Markov random fields, by using the elements of a cycle basis to define regions in a generalized belief propagation setting. For the direct problem, the region graph is specified in such a way as to avoid feed-back loops as much as possible by selecting a minimal cycle basis. Following this line we are led to propose a two-level algorithm, where a belief propagation algorithm is run alternately at the level of each cycle and at the inter-region level. Next we observe that the inverse problem can be addressed region by region independently, with one small inverse problem per region to be solved. It turns out that each elementary inverse problem on the loop geometry can be solved efficiently. In particular, in the random Ising context we propose two complementary methods based respectively on fixed point equations and on a one-parameter log likelihood function minimization. Numerical experiments confirm the effectiveness of this approach both for the direct and inverse MRF inference. Heterogeneous problems of size up to 10^5 are addressed in a reasonable computational time, notably with better convergence properties than ordinary belief propagation.
Statistical theory and inference
Olive, David J
2014-01-01
This text is for a one semester graduate course in statistical theory and covers minimal and complete sufficient statistics, maximum likelihood estimators, method of moments, bias and mean square error, uniform minimum variance estimators and the Cramer-Rao lower bound, an introduction to large sample theory, likelihood ratio tests and uniformly most powerful tests and the Neyman Pearson Lemma. A major goal of this text is to make these topics much more accessible to students by using the theory of exponential families. Exponential families, indicator functions and the support of the distribution are used throughout the text to simplify the theory. More than 50 "brand name" distributions are used to illustrate the theory with many examples of exponential families, maximum likelihood estimators and uniformly minimum variance unbiased estimators. There are many homework problems with over 30 pages of solutions.
Multi-Channel Maximum Likelihood Pitch Estimation
DEFF Research Database (Denmark)
Christensen, Mads Græsbøll
2012-01-01
In this paper, a method for multi-channel pitch estimation is proposed. The method is a maximum likelihood estimator and is based on a parametric model where the signals in the various channels share the same fundamental frequency but can have different amplitudes, phases, and noise characteristics....... This essentially means that the model allows for different conditions in the various channels, like different signal-to-noise ratios, microphone characteristics and reverberation. Moreover, the method does not assume that a certain array structure is used but rather relies on a more general model and is hence...
Approximate maximum parsimony and ancestral maximum likelihood.
Alon, Noga; Chor, Benny; Pardi, Fabio; Rapoport, Anat
2010-01-01
We explore the maximum parsimony (MP) and ancestral maximum likelihood (AML) criteria in phylogenetic tree reconstruction. Both problems are NP-hard, so we seek approximate solutions. We formulate the two problems as Steiner tree problems under appropriate distances. The gist of our approach is the succinct characterization of Steiner trees for a small number of leaves for the two distances. This enables the use of known Steiner tree approximation algorithms. The approach leads to a 16/9 approximation ratio for AML and asymptotically to a 1.55 approximation ratio for MP.
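For readers unfamiliar with the MP criterion, the sketch below computes the parsimony score of a fixed binary tree with Fitch's algorithm. This is the classical small-parsimony computation, not the Steiner-tree approximation scheme of the paper; finding the best tree is the NP-hard part. The tree encoding (nested tuples of leaf names) and the function names are assumptions made for the example.

```python
def fitch_site(tree, leaf_states):
    """Fitch's algorithm for one site.

    tree: a leaf name (str) or a pair (left_subtree, right_subtree)
    leaf_states: dict mapping leaf name -> character state at this site
    Returns (candidate state set at this node, substitutions in the subtree).
    """
    if isinstance(tree, str):
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_site(left, leaf_states)
    right_set, right_cost = fitch_site(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children can agree on a state: no extra substitution needed here.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1

def parsimony_score(tree, alignment):
    """Sum the per-site Fitch costs over all columns of an alignment.

    alignment: dict mapping leaf name -> sequence string (equal lengths)."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {name: seq[i] for name, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total
```

For example, on the tree `(("a", "b"), ("c", "d"))` a site with states A, A, G, G needs one substitution, while a site with states A, G, A, G needs two.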
A Walk on the Wild Side: The Impact of Music on Risk-Taking Likelihood
Enström, Rickard; Schmaltz, Rodney
2017-01-01
From a marketing perspective, there has been substantial interest in the role of risk-perception on consumer behavior. Specific ‘problem music’ like rap and heavy metal has long been associated with delinquent behavior, including violence, drug use, and promiscuous sex. Although individuals’ risk preferences have been investigated across a range of decision-making situations, there has been little empirical work demonstrating the direct role music may have on the likelihood of engaging in risky activities. In the exploratory study reported here, we assessed the impact of listening to different styles of music while assessing risk-taking likelihood through a psychometric scale. Risk-taking likelihood was measured across ethical, financial, health and safety, recreational and social domains. Through the means of a canonical correlation analysis, the multivariate relationship between different music styles and individual risk-taking likelihood across the different domains is discussed. Our results indicate that listening to different types of music does influence risk-taking likelihood, though not in areas of health and safety. PMID:28539908
A Walk on the Wild Side: The Impact of Music on Risk-Taking Likelihood
Directory of Open Access Journals (Sweden)
Rickard Enström
2017-05-01
From a marketing perspective, there has been substantial interest in the role of risk-perception on consumer behavior. Specific ‘problem music’ like rap and heavy metal has long been associated with delinquent behavior, including violence, drug use, and promiscuous sex. Although individuals’ risk preferences have been investigated across a range of decision-making situations, there has been little empirical work demonstrating the direct role music may have on the likelihood of engaging in risky activities. In the exploratory study reported here, we assessed the impact of listening to different styles of music while assessing risk-taking likelihood through a psychometric scale. Risk-taking likelihood was measured across ethical, financial, health and safety, recreational and social domains. Through the means of a canonical correlation analysis, the multivariate relationship between different music styles and individual risk-taking likelihood across the different domains is discussed. Our results indicate that listening to different types of music does influence risk-taking likelihood, though not in areas of health and safety.
Causal Inference and Model Selection in Complex Settings
Zhao, Shandong
Propensity score methods have become a part of the standard toolkit for applied researchers who wish to ascertain causal effects from observational data. While they were originally developed for binary treatments, several researchers have proposed generalizations of the propensity score methodology for non-binary treatment regimes. In this article, we firstly review three main methods that generalize propensity scores in this direction, namely, inverse propensity weighting (IPW), the propensity function (P-FUNCTION), and the generalized propensity score (GPS), along with recent extensions of the GPS that aim to improve its robustness. We compare the assumptions, theoretical properties, and empirical performance of these methods. We propose three new methods that provide robust causal estimation based on the P-FUNCTION and GPS. While our proposed P-FUNCTION-based estimator performs well, we generally advise caution in that all available methods can be biased by model misspecification and extrapolation. In a related line of research, we consider adjustment for posttreatment covariates in causal inference. Even in a randomized experiment, observations might have different compliance performance under treatment and control assignment. This posttreatment covariate cannot be adjusted using standard statistical methods. We review the principal stratification framework which allows for modeling this effect as part of its Bayesian hierarchical models. We generalize the current model to add the possibility of adjusting for pretreatment covariates. We also propose a new estimator of the average treatment effect over the entire population. In a third line of research, we discuss the spectral line detection problem in high energy astrophysics. We carefully review how this problem can be statistically formulated as a precise hypothesis test with point null hypothesis, why a usual likelihood ratio test does not apply for problem of this nature, and a doable fix to correctly
Indirect Inference for Stochastic Differential Equations Based on Moment Expansions
Ballesio, Marco
2016-01-06
We provide an indirect inference method to estimate the parameters of time-homogeneous scalar diffusion and jump diffusion processes. We obtain a system of ODEs that approximate the time evolution of the first two moments of the process by applying a second order Taylor expansion of the SDE's infinitesimal generator in Dynkin's formula. This method allows a simple and efficient procedure to infer the parameters of such stochastic processes given the data, by maximizing the likelihood of an approximating Gaussian process described by the two moment equations. Finally, we perform numerical experiments for two datasets arising from organic and inorganic fouling deposition phenomena.
Goegebeur, Y.; de Boeck, P.; Molenberghs, G.
2010-01-01
The local influence diagnostics, proposed by Cook (1986), provide a flexible way to assess the impact of minor model perturbations on key model parameters’ estimates. In this paper, we apply the local influence idea to the detection of test speededness in a model describing nonresponse in test data,
Type Inference with Inequalities
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff
1991-01-01
Type inference can be phrased as constraint-solving over types. We consider an implicitly typed language equipped with recursive types, multiple inheritance, 1st order parametric polymorphism, and assignments. Type correctness is expressed as satisfiability of a possibly infinite collection of (monotonic) inequalities on the types of variables and expressions. A general result about systems of inequalities over semilattices yields a solvable form. We distinguish between deciding typability (the existence of solutions) and type inference (the computation of a minimal solution). In our case, both...
Bootstrapping phylogenies inferred from rearrangement data
Directory of Open Access Journals (Sweden)
Lin Yu
2012-08-01
Background: Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. Results: We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Conclusions: Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its
Bootstrapping phylogenies inferred from rearrangement data.
Lin, Yu; Rajan, Vaibhav; Moret, Bernard Me
2012-08-29
Large-scale sequencing of genomes has enabled the inference of phylogenies based on the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Many evolutionary models and associated algorithms have been designed over the last few years and have found use in comparative genomics and phylogenetic inference. However, the assessment of phylogenies built from such data has not been properly addressed to date. The standard method used in sequence-based phylogenetic inference is the bootstrap, but it relies on a large number of homologous characters that can be resampled; yet in the case of rearrangements, the entire genome is a single character. Alternatives such as the jackknife suffer from the same problem, while likelihood tests cannot be applied in the absence of well established probabilistic models. We present a new approach to the assessment of distance-based phylogenetic inference from whole-genome data; our approach combines features of the jackknife and the bootstrap and remains nonparametric. For each feature of our method, we give an equivalent feature in the sequence-based framework; we also present the results of extensive experimental testing, in both sequence-based and genome-based frameworks. Through the feature-by-feature comparison and the experimental results, we show that our bootstrapping approach is on par with the classic phylogenetic bootstrap used in sequence-based reconstruction, and we establish the clear superiority of the classic bootstrap for sequence data and of our corresponding new approach for rearrangement data over proposed variants. Finally, we test our approach on a small dataset of mammalian genomes, verifying that the support values match current thinking about the respective branches. Our method is the first to provide a standard of assessment to match that of the classic phylogenetic bootstrap for aligned sequences. Its support values follow a similar scale and its receiver
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.
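The recursive form of the mutant-count distribution mentioned above can be illustrated with a short sketch. It assumes the standard Lea-Coulson formulation of the Luria-Delbrück distribution, computed with the Ma-Sandri-Sarkar recursion, and a crude grid search over the expected number of mutations m. This is the conventional estimator the abstract compares against, not MLE-BD; the function names and the mutant counts are hypothetical.

```python
import math

def ld_pmf(m, n_max):
    """P(0), ..., P(n_max) mutants under the Lea-Coulson model.

    m is the expected number of mutations per culture.
    Recursion: p0 = exp(-m), pn = (m / n) * sum_{i<n} p_i / (n - i + 1).
    The cost grows quadratically in n_max, which is why large mutant
    counts make the conventional likelihood slow to evaluate.
    """
    p = [math.exp(-m)]
    for n in range(1, n_max + 1):
        p.append(m / n * sum(p[i] / (n - i + 1) for i in range(n)))
    return p

def log_likelihood(m, counts):
    """Log-likelihood of m given mutant counts from parallel cultures."""
    p = ld_pmf(m, max(counts))
    return sum(math.log(p[c]) for c in counts)

def mle_m(counts, lo=0.01, hi=20.0, steps=2000):
    """Grid-search maximiser of the log-likelihood (illustrative only)."""
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return max(grid, key=lambda m: log_likelihood(m, counts))

# Hypothetical mutant counts from ten parallel cultures.
counts = [0, 1, 0, 3, 12, 0, 2, 1, 0, 5]
m_hat = mle_m(counts)
```

Dividing `m_hat` by the final number of cells per culture would give a mutation rate per cell division under the usual assumptions; a real analysis would replace the grid search with a proper optimiser.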
Causal inference of asynchronous audiovisual speech
Directory of Open Access Journals (Sweden)
John F Magnotti
2013-11-01
Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
A Predictive Likelihood Approach to Bayesian Averaging
Directory of Open Access Journals (Sweden)
Tomáš Jeřábek
2015-01-01
Full Text Available Multivariate time series forecasting is applied in a wide range of economic activities related to regional competitiveness and is the basis of almost all macroeconomic analysis. In this paper we combine multivariate density forecasts of GDP growth, inflation and real interest rates from four models: two types of Bayesian vector autoregression (BVAR) models, a New Keynesian dynamic stochastic general equilibrium (DSGE) model of a small open economy, and a DSGE-VAR model. The performance of the models is evaluated using historical data covering the domestic economy and the foreign economy, which is represented by the countries of the Eurozone. Because the forecast accuracy of the models differs, weighting schemes based on the predictive likelihood, the trace of the past MSE matrix, and model ranks are used to combine the models. The equal-weight scheme is used as a simple combination scheme. The results show that optimally combined densities are comparable to the best individual models.
Maximum Likelihood Reconstruction for Magnetic Resonance Fingerprinting.
Zhao, Bo; Setsompop, Kawin; Ye, Huihui; Cauley, Stephen F; Wald, Lawrence L
2016-08-01
This paper introduces a statistical estimation framework for magnetic resonance (MR) fingerprinting, a recently proposed quantitative imaging paradigm. Within this framework, we present a maximum likelihood (ML) formalism to estimate multiple MR tissue parameter maps directly from highly undersampled, noisy k-space data. A novel algorithm, based on variable splitting, the alternating direction method of multipliers, and the variable projection method, is developed to solve the resulting optimization problem. Representative results from both simulations and in vivo experiments demonstrate that the proposed approach yields significantly improved accuracy in parameter estimation, compared to the conventional MR fingerprinting reconstruction. Moreover, the proposed framework provides new theoretical insights into the conventional approach. We show analytically that the conventional approach is an approximation to the ML reconstruction; more precisely, it is exactly equivalent to the first iteration of the proposed algorithm for the ML reconstruction, provided that a gridding reconstruction is used as an initialization.
Subtracting and Fitting Histograms using Profile Likelihood
D'Almeida, F M L
2008-01-01
It is known that many interesting signals expected at LHC are of unknown shape and strongly contaminated by background events. These signals will be difficult to detect during the first years of LHC operation due to the initial low luminosity. In this work, one presents a method of subtracting histograms based on the profile likelihood function when the background is previously estimated by Monte Carlo events and one has low statistics. Estimators for the signal in each bin of the histogram difference are calculated, as well as limits for the signals with 68.3% Confidence Level, in a low statistics case when one has an exponential background and a Gaussian signal. The method can also be used to fit histograms when the signal shape is known. Our results show a good performance and avoid the problem of negative values when subtracting histograms.
Model averaging, optimal inference and habit formation
Directory of Open Access Journals (Sweden)
Thomas H B FitzGerald
2014-06-01
Full Text Available Postulating that the brain performs approximate Bayesian inference generates principled and empirically testable models of neuronal function – the subject of much current interest in neuroscience and related disciplines. Current formulations address inference and learning under some assumed and particular model. In reality, organisms are often faced with an additional challenge – that of determining which model or models of their environment are the best for guiding behaviour. Bayesian model averaging – which says that an agent should weight the predictions of different models according to their evidence – provides a principled way to solve this problem. Importantly, because model evidence is determined by both the accuracy and complexity of the model, optimal inference requires that these be traded off against one another. This means an agent’s behaviour should show an equivalent balance. We hypothesise that Bayesian model averaging plays an important role in cognition, given that it is both optimal and realisable within a plausible neuronal architecture. We outline model averaging and how it might be implemented, and then explore a number of implications for brain and behaviour. In particular, we propose that model averaging can explain a number of apparently suboptimal phenomena within the framework of approximate (bounded) Bayesian inference, focussing particularly upon the relationship between goal-directed and habitual behaviour.
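The weighting rule described above, in which each model's prediction is weighted by its evidence, can be sketched in a few lines. This is a generic illustration of Bayesian model averaging rather than the authors' neuronal implementation; the log evidences, the predictions, and the function names are hypothetical.

```python
import math

def model_weights(log_evidences, log_priors=None):
    """Posterior model probabilities from log evidences.

    Uses the log-sum-exp trick so that very negative log evidences
    do not underflow. Equal model priors are assumed by default.
    """
    if log_priors is None:
        log_priors = [0.0] * len(log_evidences)
    scores = [e + p for e, p in zip(log_evidences, log_priors)]
    top = max(scores)
    unnorm = [math.exp(s - top) for s in scores]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def averaged_prediction(predictions, weights):
    """Evidence-weighted average of the models' point predictions."""
    return sum(w * y for w, y in zip(weights, predictions))

# Hypothetical log evidences and predictions for three candidate models.
log_ev = [-10.0, -12.0, -11.0]
preds = [1.0, 3.0, 2.0]
w = model_weights(log_ev)
y_bma = averaged_prediction(preds, w)
```

Because the evidence already penalises complexity, a more flexible model receives a large weight only when its extra accuracy outweighs that penalty.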
Modelling maximum likelihood estimation of availability
International Nuclear Information System (INIS)
Waller, R.A.; Tietjen, G.L.; Rock, G.W.
1975-01-01
Suppose the performance of a nuclear powered electrical generating power plant is continuously monitored to record the sequence of failure and repairs during sustained operation. The purpose of this study is to assess one method of estimating the performance of the power plant when the measure of performance is availability. That is, we determine the probability that the plant is operational at time t. To study the availability of a power plant, we first assume statistical models for the variables, X and Y, which denote the time-to-failure and the time-to-repair variables, respectively. Once those statistical models are specified, the availability, $A(t)$, can be expressed as a function of some or all of their parameters. Usually those parameters are unknown in practice and so $A(t)$ is unknown. This paper discusses the maximum likelihood estimator of $A(t)$ when the time-to-failure model for X is an exponential density with parameter $\lambda$, and the time-to-repair model for Y is an exponential density with parameter $\theta$. Under the assumption of exponential models for X and Y, it follows that the instantaneous availability at time t is $A(t) = \lambda/(\lambda+\theta) + [\theta/(\lambda+\theta)]\exp\{-[(1/\lambda)+(1/\theta)]t\}$ with $t > 0$. Also, the steady-state availability is $A(\infty) = \lambda/(\lambda+\theta)$. We use the observations from n failure-repair cycles of the power plant, say $X_1, X_2, \ldots, X_n, Y_1, Y_2, \ldots, Y_n$, to present the maximum likelihood estimators of $A(t)$ and $A(\infty)$. The exact sampling distributions for those estimators and some statistical properties are discussed before a simulation model is used to determine 95% simulation intervals for $A(t)$. The methodology is applied to two examples which approximate the operating history of two nuclear power plants. (author)
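A minimal sketch of the plug-in estimator follows. It assumes, as the form of the availability formula implies, that lambda and theta are the means of the two exponential densities, so their maximum likelihood estimates are the sample means of the observed failure and repair times; by the invariance of maximum likelihood, substituting them into A(t) gives the estimator of A(t). The function name and the data are hypothetical.

```python
import math

def availability_mle(failure_times, repair_times, t):
    """Plug-in ML estimates of A(t) and A(infinity).

    Assumes exponential time-to-failure with mean lam and exponential
    time-to-repair with mean theta (an assumption read off the formula).
    """
    lam = sum(failure_times) / len(failure_times)    # MLE of mean time to failure
    theta = sum(repair_times) / len(repair_times)    # MLE of mean time to repair
    steady = lam / (lam + theta)
    transient = theta / (lam + theta) * math.exp(-(1.0 / lam + 1.0 / theta) * t)
    return steady + transient, steady

# Hypothetical data from three failure-repair cycles (hours).
x = [100.0, 140.0, 60.0]
y = [10.0, 5.0, 15.0]
a_0, a_inf = availability_mle(x, y, t=0.0)
a_50, _ = availability_mle(x, y, t=50.0)
```

At t = 0 the estimate equals 1, since the plant is taken to start in the operating state, and it decays towards the steady-state value as t grows.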
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE) are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay) and “frequentist-Bayesianism” (Freq-Bay). After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i) endorses a role for explanatory value in the assessment of scientific hypotheses; (ii) avoids a purely subjectivist reading of prior probabilities; and (iii) fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
L.U.St: a tool for approximated maximum likelihood supertree reconstruction.
Akanni, Wasiu A; Creevey, Christopher J; Wilkinson, Mark; Pisani, Davide
2014-06-12
Supertrees combine disparate, partially overlapping trees to generate a synthesis that provides a high level perspective that cannot be attained from the inspection of individual phylogenies. Supertrees can be seen as meta-analytical tools that can be used to make inferences based on results of previous scientific studies. Their meta-analytical application has increased in popularity since it was realised that the power of statistical tests for the study of evolutionary trends critically depends on the use of taxon-dense phylogenies. Further to that, supertrees have found applications in phylogenomics where they are used to combine gene trees and recover species phylogenies based on genome-scale data sets. Here, we present the L.U.St package, a Python tool for approximate maximum likelihood supertree inference and illustrate its application using a genomic data set for the placental mammals. L.U.St allows the calculation of the approximate likelihood of a supertree, given a set of input trees, performs heuristic searches to look for the supertree of highest likelihood, and performs statistical tests of two or more supertrees. To this end, L.U.St implements a winning sites test allowing ranking of a collection of a-priori selected hypotheses, given as a collection of input supertree topologies. It also outputs a file of input-tree-wise likelihood scores that can be used as input to CONSEL for calculation of standard tests of two trees (e.g. Kishino-Hasegawa, Shimodaira-Hasegawa and Approximately Unbiased tests). This is the first fully parametric implementation of a supertree method; it has clearly understood properties and provides several advantages over currently available supertree approaches. It is easy to implement and works on any platform that has Python installed. bitBucket page - https://afro-juju@bitbucket.org/afro-juju/l.u.st.git. Davide.Pisani@bristol.ac.uk.
Watson, Jane
2007-01-01
Inference, or decision making, is seen in curriculum documents as the final step in a statistical investigation. For a formal statistical enquiry this may be associated with sophisticated tests involving probability distributions. For young students without the mathematical background to perform such tests, it is still possible to draw informal…
Hybrid Optical Inference Machines
1991-09-27
with labels. Now, events. a set of facts can be generated in the dyadic form "u, R 1,2" Eichmann and Caulfield [19] consider the same type of and can...these encoding schemes. These architectures are based primarily on optical inner 19. G. Eichmann and H. J. Caulfield, "Optical Learning (Inference)
Bacterial clonal diagnostics as a tool for evidence-based empiric antibiotic selection.
Directory of Open Access Journals (Sweden)
Veronika Tchesnokova
Full Text Available Despite the known clonal distribution of antibiotic resistance in many bacteria, empiric (pre-culture) antibiotic selection still relies heavily on species-level cumulative antibiograms, resulting in overuse of broad-spectrum agents and excessive antibiotic/pathogen mismatch. Urinary tract infections (UTIs), which account for a large share of antibiotic use, are caused predominantly by Escherichia coli, a highly clonal pathogen. In an observational clinical cohort study of urgent care patients with suspected UTI, we assessed the potential for E. coli clonal-level antibiograms to improve empiric antibiotic selection. A novel PCR-based clonotyping assay was applied to fresh urine samples to rapidly detect E. coli and the urine strain's clonotype. Based on a database of clonotype-specific antibiograms, the acceptability of various antibiotics for empiric therapy was inferred using a 20%, 10%, and 30% allowed resistance threshold. The test's performance characteristics and possible effects on prescribing were assessed. The rapid test identified E. coli clonotypes directly in patients' urine within 25-35 minutes, with high specificity and sensitivity compared to culture. Antibiotic selection based on a clonotype-specific antibiogram could reduce the relative likelihood of antibiotic/pathogen mismatch by ≥ 60%. Compared to observed prescribing patterns, clonal diagnostics-guided antibiotic selection could safely double the use of trimethoprim/sulfamethoxazole and minimize fluoroquinolone use. In summary, a rapid clonotyping test showed promise for improving empiric antibiotic prescribing for E. coli UTI, including reversing preferential use of fluoroquinolones over trimethoprim/sulfamethoxazole. The clonal diagnostics approach merges epidemiologic surveillance, antimicrobial stewardship, and molecular diagnostics to bring evidence-based medicine directly to the point of care.
Bacterial clonal diagnostics as a tool for evidence-based empiric antibiotic selection.
Tchesnokova, Veronika; Avagyan, Hovhannes; Rechkina, Elena; Chan, Diana; Muradova, Mariya; Haile, Helen Ghirmai; Radey, Matthew; Weissman, Scott; Riddell, Kim; Scholes, Delia; Johnson, James R; Sokurenko, Evgeni V
2017-01-01
Despite the known clonal distribution of antibiotic resistance in many bacteria, empiric (pre-culture) antibiotic selection still relies heavily on species-level cumulative antibiograms, resulting in overuse of broad-spectrum agents and excessive antibiotic/pathogen mismatch. Urinary tract infections (UTIs), which account for a large share of antibiotic use, are caused predominantly by Escherichia coli, a highly clonal pathogen. In an observational clinical cohort study of urgent care patients with suspected UTI, we assessed the potential for E. coli clonal-level antibiograms to improve empiric antibiotic selection. A novel PCR-based clonotyping assay was applied to fresh urine samples to rapidly detect E. coli and the urine strain's clonotype. Based on a database of clonotype-specific antibiograms, the acceptability of various antibiotics for empiric therapy was inferred using a 20%, 10%, and 30% allowed resistance threshold. The test's performance characteristics and possible effects on prescribing were assessed. The rapid test identified E. coli clonotypes directly in patients' urine within 25-35 minutes, with high specificity and sensitivity compared to culture. Antibiotic selection based on a clonotype-specific antibiogram could reduce the relative likelihood of antibiotic/pathogen mismatch by ≥ 60%. Compared to observed prescribing patterns, clonal diagnostics-guided antibiotic selection could safely double the use of trimethoprim/sulfamethoxazole and minimize fluoroquinolone use. In summary, a rapid clonotyping test showed promise for improving empiric antibiotic prescribing for E. coli UTI, including reversing preferential use of fluoroquinolones over trimethoprim/sulfamethoxazole. The clonal diagnostics approach merges epidemiologic surveillance, antimicrobial stewardship, and molecular diagnostics to bring evidence-based medicine directly to the point of care.
The Prior Can Often Only Be Understood in the Context of the Likelihood
Directory of Open Access Journals (Sweden)
Andrew Gelman
2017-10-01
Full Text Available A key sticking point of Bayesian analysis is the choice of prior distribution, and there is a vast literature on potential defaults including uniform priors, Jeffreys’ priors, reference priors, maximum entropy priors, and weakly informative priors. These methods, however, often manifest a key conceptual tension in prior modeling: a model encoding true prior information should be chosen without reference to the model of the measurement process, but almost all common prior modeling techniques are implicitly motivated by a reference likelihood. In this paper we resolve this apparent paradox by placing the choice of prior into the context of the entire Bayesian analysis, from inference to prediction to model evaluation.
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Chen, X.
2013-01-01
. However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...... sensitivity of the measurements are incorporated into Bayesian likelihood probabilities. Prior probabilities describe physical constraints. This poster will show reconstructions of classically described, low-power, MHD-quiescent distribution functions from actual FIDA measurements. A description of the full...
Asymptotic inference for waiting times and patiences in queues with abandonment
DEFF Research Database (Denmark)
Gorst-Rasmussen, Anders; Hansen, Martin Bøgsted
Motivated by applications in call center management, we propose a framework based on empirical process techniques for inference about the waiting time and patience distribution in multiserver queues with abandonment. The framework rigorises heuristics based on survival analysis of independent...
Asymptotic inference for waiting times and patiences in queues with abandonment
DEFF Research Database (Denmark)
Gorst-Rasmussen, Anders; Hansen, Martin Bøgsted
2009-01-01
Motivated by applications in call center management, we propose a framework based on empirical process techniques for inference about waiting time and patience distributions in multiserver queues with abandonment. The framework rigorises heuristics based on survival analysis of independent...
Explanation in causal inference methods for mediation and interaction
VanderWeele, Tyler
2015-01-01
A comprehensive examination of methods for mediation and interaction, VanderWeele's book is the first to approach this topic from the perspective of causal inference. Numerous software tools are provided, and the text is both accessible and easy to read, with examples drawn from diverse fields. The result is an essential reference for anyone conducting empirical research in the biomedical or social sciences.
Inference rule and problem solving
Energy Technology Data Exchange (ETDEWEB)
Goto, S
1982-04-01
Intelligent information processing means having man's intellectual activity executed on the computer, with inference, in place of ordinary calculation, as the basic operational mechanism. Many inference rules are derived from syllogisms in formal logic. The problem of programming this inference function is referred to as problem solving. Although inference and problem solving are logically closely related, the ability of current computers to perform inference is at a low level. To clarify the relation between inference and computers, nonmonotonic logic has been considered. The paper deals with the above topics. 16 references.
Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir
2011-01-01
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
Likelihood analysis of the minimal AMSB model
Energy Technology Data Exchange (ETDEWEB)
Bagnaschi, E.; Weiglein, G. [DESY, Hamburg (Germany); Borsato, M.; Chobanova, V.; Lucio, M.; Santos, D.M. [Universidade de Santiago de Compostela, Santiago de Compostela (Spain); Sakurai, K. [Institute for Particle Physics Phenomenology, University of Durham, Science Laboratories, Department of Physics, Durham (United Kingdom); University of Warsaw, Faculty of Physics, Institute of Theoretical Physics, Warsaw (Poland); Buchmueller, O.; Citron, M.; Costa, J.C.; Richards, A. [Imperial College, High Energy Physics Group, Blackett Laboratory, London (United Kingdom); Cavanaugh, R. [Fermi National Accelerator Laboratory, Batavia, IL (United States); University of Illinois at Chicago, Physics Department, Chicago, IL (United States); De Roeck, A. [Experimental Physics Department, CERN, Geneva (Switzerland); Antwerp University, Wilrijk (Belgium); Dolan, M.J. [School of Physics, University of Melbourne, ARC Centre of Excellence for Particle Physics at the Terascale, Melbourne (Australia); Ellis, J.R. [King's College London, Theoretical Particle Physics and Cosmology Group, Department of Physics, London (United Kingdom); CERN, Theoretical Physics Department, Geneva (Switzerland); Flaecher, H. [University of Bristol, H.H. Wills Physics Laboratory, Bristol (United Kingdom); Heinemeyer, S. [Campus of International Excellence UAM+CSIC, Madrid (Spain); Instituto de Fisica Teorica UAM-CSIC, Madrid (Spain); Instituto de Fisica de Cantabria (CSIC-UC), Cantabria (Spain); Isidori, G. [Physik-Institut, Universitaet Zuerich, Zurich (Switzerland); Luo, F. [Kavli IPMU (WPI), UTIAS, The University of Tokyo, Kashiwa, Chiba (Japan); Olive, K.A. [School of Physics and Astronomy, University of Minnesota, William I. Fine Theoretical Physics Institute, Minneapolis, MN (United States)
2017-04-15
We perform a likelihood analysis of the minimal anomaly-mediated supersymmetry-breaking (mAMSB) model using constraints from cosmology and accelerator experiments. We find that either a wino-like or a Higgsino-like neutralino LSP, $\chi^0_1$, may provide the cold dark matter (DM), both with similar likelihoods. The upper limit on the DM density from Planck and other experiments enforces $m_{\chi^0_1}$
Heersink, Daniel K; Caley, Peter; Paini, Dean R; Barry, Simon C
2016-05-01
The cost of an uncontrolled incursion of invasive alien species (IAS) arising from undetected entry through ports can be substantial, and knowledge of port-specific risks is needed to help allocate limited surveillance resources. Quantifying the establishment likelihood of such an incursion requires quantifying the ability of a species to enter, establish, and spread. Estimation of the approach rate of IAS into ports provides a measure of likelihood of entry. Data on the approach rate of IAS are typically sparse, and the combinations of risk factors relating to country of origin and port of arrival diverse. This presents challenges to making formal statistical inference on establishment likelihood. Here we demonstrate how these challenges can be overcome with judicious use of mixed-effects models when estimating the incursion likelihood into Australia of the European (Apis mellifera) and Asian (A. cerana) honeybees, along with the invasive parasites of biosecurity concern they host (e.g., Varroa destructor). Our results demonstrate how skewed the establishment likelihood is, with one-tenth of the ports accounting for 80% or more of the likelihood for both species. These results have been utilized by biosecurity agencies in the allocation of resources to the surveillance of maritime ports. © 2015 Society for Risk Analysis.
Likelihood Analysis of Supersymmetric SU(5) GUTs
Bagnaschi, E.
2017-01-01
We perform a likelihood analysis of the constraints from accelerator experiments and astrophysical observations on supersymmetric (SUSY) models with SU(5) boundary conditions on soft SUSY-breaking parameters at the GUT scale. The parameter space of the models studied has 7 parameters: a universal gaugino mass $m_{1/2}$, distinct masses for the scalar partners of matter fermions in five- and ten-dimensional representations of SU(5), $m_5$ and $m_{10}$, and for the $\mathbf{5}$ and $\mathbf{\bar 5}$ Higgs representations $m_{H_u}$ and $m_{H_d}$, a universal trilinear soft SUSY-breaking parameter $A_0$, and the ratio of Higgs vevs $\tan \beta$. In addition to previous constraints from direct sparticle searches, low-energy and flavour observables, we incorporate constraints based on preliminary results from 13 TeV LHC searches for jets + MET events and long-lived particles, as well as the latest PandaX-II and LUX searches for direct Dark Matter detection. In addition to previously-identified mechanisms for bringi...
Reducing the likelihood of long tennis matches.
Barnett, Tristan; Brown, Alan; Pollard, Graham
2006-01-01
Long matches can cause problems for tournaments. For example, the starting times of subsequent matches can be substantially delayed causing inconvenience to players, spectators, officials and television scheduling. They can even be seen as unfair in the tournament setting when the winner of a very long match, who may have negative aftereffects from such a match, plays the winner of an average or shorter length match in the next round. Long matches can also lead to injuries to the participating players. One factor that can lead to long matches is the use of the advantage set as the fifth set, as in the Australian Open, the French Open and Wimbledon. Another factor is long rallies and a greater than average number of points per game. This tends to occur more frequently on the slower surfaces such as at the French Open. The mathematical method of generating functions is used to show that the likelihood of long matches can be substantially reduced by using the tiebreak game in the fifth set, or more effectively by using a new type of game, the 50-40 game, throughout the match. Key Points: The cumulant generating function has nice properties for calculating the parameters of distributions in a tennis match. A final tiebreaker set reduces the length of matches, as currently being used in the US Open. A new 50-40 game reduces the length of matches whilst maintaining comparable probabilities for the better player to win the match.
Maximum likelihood window for time delay estimation
International Nuclear Information System (INIS)
Lee, Young Sup; Yoon, Dong Jin; Kim, Chi Yup
2004-01-01
Time delay estimation is critically important for detecting leak locations in underground pipelines. Because the exact leak location depends on the precision of the time delay between the sensor signals produced by leak noise and on the speed of elastic waves, time delay estimation has been one of the key research issues in leak locating with the time arrival difference method. In this study, an optimal Maximum Likelihood window is considered to obtain a better estimate of the time delay. Experiments confirm that this method provides much clearer and more precise peaks in the cross-correlation functions of leak signals. The leak location error was less than 1% of the distance between sensors; for example, the error was no greater than 3 m for a 300 m long underground pipeline. In addition to the experiments, a detailed theoretical analysis in terms of signal processing is presented. The improved leak locating with the suggested method is due to the windowing effect in the frequency domain, which weights the significant frequencies.
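The time arrival difference method can be sketched as follows. This is a plain (unweighted) cross-correlation, not the Maximum Likelihood window studied in the paper; the sampling rate, the wave speed of 1000 m/s and the synthetic white-noise signals are hypothetical values chosen only for illustration. The location formula assumes both sensors bracket the leak and that the wave speed is constant along the pipe.

```python
import numpy as np


def estimate_delay(x1, x2, fs):
    """Delay of x2 relative to x1 in seconds (positive if x2 lags x1)."""
    n1 = len(x1)
    corr = np.correlate(x2, x1, mode="full")
    # Lag k of the 'full' output runs from -(len(x1) - 1) to len(x2) - 1.
    lags = np.arange(-(n1 - 1), len(x2))
    return lags[np.argmax(corr)] / fs


def leak_distance_from_sensor1(delay, spacing, wave_speed):
    """Distance of the leak from sensor 1.

    Arrival times are d1 / c and (D - d1) / c, so the delay is
    (D - 2 * d1) / c and d1 = (D - c * delay) / 2.
    """
    return 0.5 * (spacing - wave_speed * delay)


# Synthetic demonstration: sensor 2 receives the same noise 25 samples later.
rng = np.random.default_rng(0)
fs = 1000.0                      # hypothetical sampling rate in Hz
noise = rng.standard_normal(1200)
x1 = noise[100:1100]
x2 = noise[75:1075]              # x2[n] == x1[n - 25]
delay = estimate_delay(x1, x2, fs)
location = leak_distance_from_sensor1(delay, spacing=300.0, wave_speed=1000.0)
```

A frequency-domain weighting such as the window described in the abstract would be applied to the cross-spectrum before locating the peak; that step is deliberately left out here.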
Stochastic processes inference theory
Rao, Malempati M
2014-01-01
This is the revised and enlarged 2nd edition of the authors’ original text, which was intended to be a modest complement to Grenander's fundamental memoir on stochastic processes and related inference theory. The present volume gives a substantial account of regression analysis, both for stochastic processes and measures, and includes recent material on Ridge regression with some unexpected applications, for example in econometrics. The first three chapters can be used for a quarter or semester graduate course on inference on stochastic processes. The remaining chapters provide more advanced material on stochastic analysis suitable for graduate seminars and discussions, leading to dissertation or research work. In general, the book will be of interest to researchers in probability theory, mathematical statistics and electrical and information theory.
Making Type Inference Practical
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Oxhøj, Nicholas; Palsberg, Jens
1992-01-01
We present the implementation of a type inference algorithm for untyped object-oriented programs with inheritance, assignments, and late binding. The algorithm significantly improves our previous one, presented at OOPSLA'91, since it can handle collection classes, such as List, in a useful way. Abo......, the complexity has been dramatically improved, from exponential time to low polynomial time. The implementation uses the techniques of incremental graph construction and constraint template instantiation to avoid representing intermediate results, doing superfluous work, and recomputing type information....... Experiments indicate that the implementation type checks as many as 100 lines per second. This results in a mature product, on which a number of tools can be based, for example a safety tool, an image compression tool, a code optimization tool, and an annotation tool. This may make type inference for object...
Directory of Open Access Journals (Sweden)
João Paulo Monteiro
2001-12-01
Russell's The Problems of Philosophy tries to establish a new theory of induction, while at the same time accusing Hume of "an irrational scepticism about induction". But a careful analysis of the theory of knowledge explicitly acknowledged by Hume reveals that, contrary to the standard interpretation in the XXth century, possibly influenced by Russell, Hume deals exclusively with causal inference (which he never classifies as "causal induction", although now we are entitled to do so), never with inductive inference in general, mainly generalizations about sensible qualities of objects (whether, e.g., "all crows are black" or not is not among Hume's concerns). Russell's theories are thus only false alternatives to Hume's, in (1912) or in his (1948).
Causal inference in econometrics
Kreinovich, Vladik; Sriboonchitta, Songsak
2016-01-01
This book is devoted to the analysis of causal inference which is one of the most difficult tasks in data analysis: when two phenomena are observed to be related, it is often difficult to decide whether one of them causally influences the other one, or whether these two phenomena have a common cause. This analysis is the main focus of this volume. To get a good understanding of the causal inference, it is important to have models of economic phenomena which are as accurate as possible. Because of this need, this volume also contains papers that use non-traditional economic models, such as fuzzy models and models obtained by using neural networks and data mining techniques. It also contains papers that apply different econometric models to analyze real-life economic dependencies.
Active inference, sensory attenuation and illusions.
Brown, Harriet; Adams, Rick A; Parees, Isabel; Edwards, Mark; Friston, Karl
2013-11-01
Active inference provides a simple and neurobiologically plausible account of how action and perception are coupled in producing (Bayes) optimal behaviour. This can be seen most easily as minimising prediction error: we can either change our predictions to explain sensory input through perception or actively change sensory input to fulfil our predictions. In active inference, this action is mediated by classical reflex arcs that minimise proprioceptive prediction error created by descending proprioceptive predictions. This creates a conflict between action and perception, in that self-generated movements require predictions to override the sensory evidence that one is not actually moving. However, ignoring sensory evidence means that externally generated sensations will not be perceived. Conversely, attending to (proprioceptive and somatosensory) sensations enables the detection of externally generated events but precludes generation of actions. This conflict can be resolved by attenuating the precision of sensory evidence during movement or, equivalently, attending away from the consequences of self-made acts. We propose that this Bayes optimal withdrawal of precise sensory evidence during movement is the cause of psychophysical sensory attenuation. Furthermore, it explains the force-matching illusion and reproduces empirical results almost exactly. Finally, if attenuation is removed, the force-matching illusion disappears and false (delusional) inferences about agency emerge. This is important, given the negative correlation between sensory attenuation and delusional beliefs in normal subjects--and the reduction in the magnitude of the illusion in schizophrenia. Active inference therefore links the neuromodulatory optimisation of precision to sensory attenuation and illusory phenomena during the attribution of agency in normal subjects. It also provides a functional account of deficits in syndromes characterised by false inference
Optimized Large-scale CMB Likelihood and Quadratic Maximum Likelihood Power Spectrum Estimation
Gjerløw, E.; Colombo, L. P. L.; Eriksen, H. K.; Górski, K. M.; Gruppuso, A.; Jewell, J. B.; Plaszczynski, S.; Wehus, I. K.
2015-11-01
We revisit the problem of exact cosmic microwave background (CMB) likelihood and power spectrum estimation with the goal of minimizing computational costs through linear compression. This idea was originally proposed for CMB purposes by Tegmark et al., and here we develop it into a fully functioning computational framework for large-scale polarization analysis, adopting WMAP as a working example. We compare five different linear bases (pixel space, harmonic space, noise covariance eigenvectors, signal-to-noise covariance eigenvectors, and signal-plus-noise covariance eigenvectors) in terms of compression efficiency, and find that the computationally most efficient basis is the signal-to-noise eigenvector basis, which is closely related to the Karhunen-Loeve and Principal Component transforms, in agreement with previous suggestions. For this basis, the information in 6836 unmasked WMAP sky map pixels can be compressed into a smaller set of 3102 modes, with a maximum error increase of any single multipole of 3.8% at ℓ ≤ 32 and a maximum shift in the mean values of a joint distribution of an amplitude-tilt model of 0.006σ. This compression reduces the computational cost of a single likelihood evaluation by a factor of 5, from 38 to 7.5 CPU seconds, and it also results in a more robust likelihood by implicitly regularizing nearly degenerate modes. Finally, we use the same compression framework to formulate a numerically stable and computationally efficient variation of the Quadratic Maximum Likelihood implementation, which requires less than 3 GB of memory and 2 CPU minutes per iteration for ℓ ≤ 32, rendering low-ℓ QML CMB power spectrum analysis fully tractable on a standard laptop.
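The signal-to-noise eigenvector basis mentioned above is the textbook Karhunen-Loeve construction, which can be sketched as a generalized eigenproblem. The code below is an illustration under stated assumptions, not the authors' pipeline: the covariance matrices are small random stand-ins, and the function name `sn_compress` and the choice of three retained modes are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh


def sn_compress(S, N, n_modes):
    """Return the n_modes signal-to-noise eigenmodes with largest eigenvalue.

    Solves S v = lam * N v. scipy normalizes the eigenvectors so that
    V.T @ N @ V is the identity, hence V.T @ S @ V is diag(lam).
    """
    lam, V = eigh(S, N)                      # eigenvalues in ascending order
    order = np.argsort(lam)[::-1][:n_modes]  # keep the highest S/N modes
    return lam[order], V[:, order]


# Toy covariances: a rank-3 signal plus diagonal noise on 8 "pixels".
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
S = A @ A.T
N = np.diag(rng.uniform(0.5, 2.0, size=8))
lam, V = sn_compress(S, N, n_modes=3)

# Compress a data vector: its covariance in the new basis is diag(lam) + I.
d = rng.standard_normal(8)
y = V.T @ d
```

In the compressed basis the likelihood only involves a diagonal covariance of size `n_modes`, which is where the cost saving comes from.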
Maximum likelihood versus likelihood-free quantum system identification in the atom maser
International Nuclear Information System (INIS)
Catana, Catalin; Kypraios, Theodore; Guţă, Mădălin
2014-01-01
We consider the problem of estimating a dynamical parameter of a Markovian quantum open system (the atom maser), by performing continuous time measurements in the system's output (outgoing atoms). Two estimation methods are investigated and compared. Firstly, the maximum likelihood estimator (MLE) takes into account the full measurement data and is asymptotically optimal in terms of its mean square error. Secondly, the ‘likelihood-free’ method of approximate Bayesian computation (ABC) produces an approximation of the posterior distribution for a given set of summary statistics, by sampling trajectories at different parameter values and comparing them with the measurement data via chosen statistics. Building on previous results which showed that atom counts are poor statistics for certain values of the Rabi angle, we apply MLE to the full measurement data and estimate its Fisher information. We then select several correlation statistics such as waiting times, distribution of successive identical detections, and use them as input of the ABC algorithm. The resulting posterior distribution follows closely the data likelihood, showing that the selected statistics capture ‘most’ statistical information about the Rabi angle. (paper)
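The 'likelihood-free' idea can be illustrated with a minimal ABC rejection sampler. This sketch does not model the atom maser: as a stand-in it infers the rate of exponentially distributed waiting times from their sample mean, and the prior range, tolerance, sample sizes and all function names are assumptions made for illustration.

```python
import numpy as np


def abc_rejection(observed, simulate, summary, prior_sampler, n_draws, tol, rng):
    """Keep prior draws whose simulated summary lies within tol of the data."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = prior_sampler(rng)
        s_sim = summary(simulate(theta, rng))
        if abs(s_sim - s_obs) <= tol:
            accepted.append(theta)
    return np.array(accepted)


def simulate_waiting_times(rate, rng, n=200):
    # Toy forward model: n independent exponential waiting times.
    return rng.exponential(1.0 / rate, size=n)


rng = np.random.default_rng(1)
observed = simulate_waiting_times(2.0, rng)       # "data" with true rate 2
posterior_draws = abc_rejection(
    observed,
    simulate=simulate_waiting_times,
    summary=np.mean,                              # chosen summary statistic
    prior_sampler=lambda r: r.uniform(0.1, 5.0),  # assumed flat prior
    n_draws=5000,
    tol=0.02,
    rng=rng,
)
```

The accepted draws approximate the posterior given the summary statistic only, so the quality of the result depends on how much information that statistic retains, which is the question the abstract addresses for the Rabi angle.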
Active inference and learning.
Friston, Karl; FitzGerald, Thomas; Rigoli, Francesco; Schwartenbeck, Philipp; O Doherty, John; Pezzulo, Giovanni
2016-09-01
This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
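The sampling view of inference described above can be sketched with a bootstrap particle filter for a discrete hidden Markov model, compared against the exact forward recursion. This is a generic software illustration, not the spiking network of the paper; the two-state transition and emission matrices, the observation sequence and the particle count are hypothetical.

```python
import numpy as np


def forward_filter(T, E, prior, obs):
    """Exact P(state_t | obs_1..t); T[i, j] = P(next=j | current=i)."""
    belief = prior.copy()
    out = []
    for t, y in enumerate(obs):
        if t > 0:
            belief = T.T @ belief          # predict one step ahead
        belief = belief * E[:, y]          # weight by likelihood P(y | state)
        belief = belief / belief.sum()
        out.append(belief)
    return np.array(out)


def particle_filter(T, E, prior, obs, n_particles, rng):
    """Approximate the same posteriors with samples of the hidden state."""
    k = len(prior)
    cum = np.cumsum(T, axis=1)
    particles = rng.choice(k, size=n_particles, p=prior)
    out = []
    for t, y in enumerate(obs):
        if t > 0:
            # Move every particle through the transition model.
            u = rng.random(n_particles)
            particles = (u[:, None] > cum[particles]).sum(axis=1)
            particles = np.minimum(particles, k - 1)  # guard rounding in cum
        w = E[particles, y]
        w = w / w.sum()
        particles = rng.choice(particles, size=n_particles, p=w)  # resample
        out.append(np.bincount(particles, minlength=k) / n_particles)
    return np.array(out)


T = np.array([[0.9, 0.1], [0.2, 0.8]])
E = np.array([[0.8, 0.2], [0.3, 0.7]])     # E[state, observation]
prior = np.array([0.5, 0.5])
obs = [0, 0, 1, 1, 1, 0]
exact = forward_filter(T, E, prior, obs)
approx = particle_filter(T, E, prior, obs, 20000, np.random.default_rng(0))
```

Each particle plays the role that a spike plays in the abstract: one sample of a hidden state, with the population histogram approximating the posterior.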
Probabilistic inductive inference: a survey
Ambainis, Andris
2001-01-01
Inductive inference is a recursion-theoretic theory of learning, first developed by E. M. Gold (1967). This paper surveys developments in probabilistic inductive inference. We mainly focus on finite inference of recursive functions, since this simple paradigm has produced the most interesting (and most complex) results.
Likelihood analysis of supersymmetric SU(5) GUTs
Energy Technology Data Exchange (ETDEWEB)
Bagnaschi, E.; Weiglein, G. [DESY, Hamburg (Germany); Costa, J.C.; Buchmueller, O.; Citron, M.; Richards, A.; De Vries, K.J. [Imperial College, High Energy Physics Group, Blackett Laboratory, London (United Kingdom); Sakurai, K. [University of Durham, Science Laboratories, Department of Physics, Institute for Particle Physics Phenomenology, Durham (United Kingdom); University of Warsaw, Faculty of Physics, Institute of Theoretical Physics, Warsaw (Poland); Borsato, M.; Chobanova, V.; Lucio, M.; Martinez Santos, D. [Universidade de Santiago de Compostela, Santiago de Compostela (Spain); Cavanaugh, R. [Fermi National Accelerator Laboratory, Batavia, IL (United States); University of Illinois at Chicago, Physics Department, Chicago, IL (United States); Roeck, A. de [CERN, Experimental Physics Department, Geneva (Switzerland); Antwerp University, Wilrijk (Belgium); Dolan, M.J. [University of Melbourne, ARC Centre of Excellence for Particle Physics at the Terascale, School of Physics, Parkville (Australia); Ellis, J.R. [King's College London, Theoretical Particle Physics and Cosmology Group, Department of Physics, London (United Kingdom); Theoretical Physics Department, CERN, Geneva 23 (Switzerland); Flaecher, H. [University of Bristol, H.H. Wills Physics Laboratory, Bristol (United Kingdom); Heinemeyer, S. [Campus of International Excellence UAM+CSIC, Cantoblanco, Madrid (Spain); Instituto de Fisica Teorica UAM-CSIC, Madrid (Spain); Instituto de Fisica de Cantabria (CSIC-UC), Santander (Spain); Isidori, G. [Universitaet Zuerich, Physik-Institut, Zurich (Switzerland); Olive, K.A. [University of Minnesota, William I. Fine Theoretical Physics Institute, School of Physics and Astronomy, Minneapolis, MN (United States)
2017-02-15
We perform a likelihood analysis of the constraints from accelerator experiments and astrophysical observations on supersymmetric (SUSY) models with SU(5) boundary conditions on soft SUSY-breaking parameters at the GUT scale. The parameter space of the models studied has seven parameters: a universal gaugino mass $m_{1/2}$, distinct masses for the scalar partners of matter fermions in five- and ten-dimensional representations of SU(5), $m_5$ and $m_{10}$, and for the $\mathbf{5}$ and $\mathbf{\bar 5}$ Higgs representations $m_{H_u}$ and $m_{H_d}$, a universal trilinear soft SUSY-breaking parameter $A_0$, and the ratio of Higgs vevs $\tan \beta$. In addition to previous constraints from direct sparticle searches, low-energy and flavour observables, we incorporate constraints based on preliminary results from 13 TeV LHC searches for jets + MET events and long-lived particles, as well as the latest PandaX-II and LUX searches for direct Dark Matter detection. In addition to previously identified mechanisms for bringing the supersymmetric relic density into the range allowed by cosmology, we identify a novel $\tilde{u}_R/\tilde{c}_R - \tilde{\chi}^0_1$ coannihilation mechanism that appears in the supersymmetric SU(5) GUT model and discuss the role of $\tilde{\nu}_\tau$ coannihilation. We find complementarity between the prospects for direct Dark Matter detection and SUSY searches at the LHC.
Likelihood analysis of supersymmetric SU(5) GUTs
Energy Technology Data Exchange (ETDEWEB)
Bagnaschi, E. [DESY, Hamburg (Germany); Costa, J.C. [Imperial College, London (United Kingdom). Blackett Lab.; Sakurai, K. [Durham Univ. (United Kingdom). Inst. for Particle Physics Phenomenology; Warsaw Univ. (Poland). Inst. of Theoretical Physics; Collaboration: MasterCode Collaboration; and others
2016-10-15
We perform a likelihood analysis of the constraints from accelerator experiments and astrophysical observations on supersymmetric (SUSY) models with SU(5) boundary conditions on soft SUSY-breaking parameters at the GUT scale. The parameter space of the models studied has 7 parameters: a universal gaugino mass $m_{1/2}$, distinct masses for the scalar partners of matter fermions in five- and ten-dimensional representations of SU(5), $m_5$ and $m_{10}$, and for the $\mathbf{5}$ and $\mathbf{\bar 5}$ Higgs representations $m_{H_u}$ and $m_{H_d}$, a universal trilinear soft SUSY-breaking parameter $A_0$, and the ratio of Higgs vevs $\tan \beta$. In addition to previous constraints from direct sparticle searches, low-energy and flavour observables, we incorporate constraints based on preliminary results from 13 TeV LHC searches for jets + MET events and long-lived particles, as well as the latest PandaX-II and LUX searches for direct Dark Matter detection. In addition to previously-identified mechanisms for bringing the supersymmetric relic density into the range allowed by cosmology, we identify a novel $\tilde{u}_R/\tilde{c}_R - \tilde{\chi}^0_1$ coannihilation mechanism that appears in the supersymmetric SU(5) GUT model and discuss the role of $\tilde{\nu}_\tau$ coannihilation. We find complementarity between the prospects for direct Dark Matter detection and SUSY searches at the LHC.
International Nuclear Information System (INIS)
Hadjisawas, Nicolas.
1982-01-01
After a critical study of the logical quantum mechanics formulations of Jauch and Piron, classical and quantum versions of statistical inference are studied. To this end, the significance of the Jaynes and Kullback principles (maximum likelihood, least squares principles) is brought out by the theorems established. In the quantum mechanics inference problem, a "distance" between states is defined. This concept is used to solve the quantum equivalent of the classical problem studied by Kullback. The "projection postulate" proposition is subsequently deduced.
Statistical inference for extended or shortened phase II studies based on Simon's two-stage designs.
Zhao, Junjun; Yu, Menggang; Feng, Xi-Ping
2015-06-07
Simon's two-stage designs are popular choices for conducting phase II clinical trials, especially in oncology, because they reduce the number of patients placed on ineffective experimental therapies. Recently Koyama and Chen (2008) discussed how to conduct proper inference for such studies because they found that inference procedures used with Simon's designs almost always ignore the actual sampling plan used. In particular, they proposed an inference method for studies when the actual second stage sample sizes differ from planned ones. We consider an alternative inference method based on the likelihood ratio. In particular, we order permissible sample paths under Simon's two-stage designs using their corresponding conditional likelihood. In this way, we can calculate p-values using the common definition: the probability of obtaining a test statistic value at least as extreme as that observed under the null hypothesis. In addition to providing inference for a couple of scenarios where Koyama and Chen's method can be difficult to apply, the resulting estimate based on our method appears to have certain advantages in terms of inference properties in many numerical simulations. It generally led to smaller biases and narrower confidence intervals while maintaining similar coverages. We also illustrated the two methods in a real data setting. Inference procedures used with Simon's designs almost always ignore the actual sampling plan. Reported P-values, point estimates and confidence intervals for the response rate are not usually adjusted for the design's adaptiveness. Proper statistical inference procedures should be used.
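To show why the sampling plan matters, the sketch below computes the operating characteristics of a planned Simon two-stage design directly from the binomial distribution. It covers only the planned design, not the likelihood-ordering p-value of the paper or the extended and shortened second stages it addresses. The design parameters (stop if at most 1 response in 10 patients, declare activity if more than 5 responses in 29) and the null response rate of 0.10 are illustrative values, and the function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of exactly k responses among n patients."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def simon_operating_characteristics(r1, n1, r, n, p):
    """Return (P(early termination), P(declare activity), expected sample size).

    The trial stops for futility after stage 1 if responses <= r1, and
    declares the therapy active if total responses exceed r after n patients.
    """
    n2 = n - n1
    pet = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    reject = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        need = max(r + 1 - x1, 0)  # responses still required in stage 2
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(need, n2 + 1))
        reject += binom_pmf(x1, n1, p) * tail
    expected_n = n1 + (1 - pet) * n2
    return pet, reject, expected_n


# Behaviour under an assumed null response rate of 0.10.
pet, alpha, expected_n = simon_operating_characteristics(1, 10, 5, 29, 0.10)
# Behaviour under an assumed alternative response rate of 0.30.
_, power, _ = simon_operating_characteristics(1, 10, 5, 29, 0.30)
```

Because the rejection probability sums over stage-1 outcomes that survive the futility rule, a p-value computed as if all 29 patients formed a single fixed sample would not match the design.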
A time series intervention analysis (TSIA) of dendrochronological data to infer the tree growth-climate-disturbance relations and forest disturbance history is described. Maximum likelihood is used to estimate the parameters of a structural time series model with components for ...
Phylogenetic inference with weighted codon evolutionary distances.
Criscuolo, Alexis; Michel, Christian J
2009-04-01
We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a unique distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices that have a significantly better treelikeness compared to those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate the part of the noise often associated with some codon positions, particularly the third position, which is known to induce a fast evolutionary rate. Simulation results show that fast distance-based tree reconstruction algorithms on distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not better than, those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed and shows that our codon evolutionary distances allow building a phylogenetic tree which is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and also significantly improved compared to standard nucleotide evolutionary distance estimates.
The behavior of the likelihood ratio test for testing missingness
Hens, Niel; Aerts, Marc; Molenberghs, Geert; Thijs, Herbert
2003-01-01
To assess the sensitivity of conclusions to model choices in the context of selection models for non-random dropout, one can oppose the different missingness mechanisms to each other, e.g. by likelihood ratio tests. The finite sample behavior of the null distribution and the power of the likelihood ratio test are studied under a variety of missingness mechanisms. Keywords: missing data; sensitivity analysis; likelihood ratio test; missing mechanisms
The Likelihood of Recent Record Warmth.
Mann, Michael E; Rahmstorf, Stefan; Steinman, Byron A; Tingley, Martin; Miller, Sonya K
2016-01-25
2014 was nominally the warmest year on record for both the globe and northern hemisphere based on historical records spanning the past one and a half centuries. It was the latest in a recent run of record temperatures spanning the past decade and a half. Press accounts reported odds as low as one-in-650 million that the observed run of global temperature records would be expected to occur in the absence of human-caused global warming. Press reports notwithstanding, the question of how likely observed temperature records may have been both with and without human influence is interesting in its own right. Here we attempt to address that question using a semi-empirical approach that combines the latest (CMIP5) climate model simulations with observations of global and hemispheric mean temperature. We find that individual record years and the observed runs of record-setting temperatures were extremely unlikely to have occurred in the absence of human-caused climate change, though not nearly as unlikely as press reports have suggested. These same record temperatures were, by contrast, quite likely to have occurred in the presence of anthropogenic climate forcing.
Nonparametric statistical inference
Gibbons, Jean Dickinson
2010-01-01
Overall, this remains a very fine book suitable for a graduate-level course in nonparametric statistics. I recommend it for all people interested in learning the basic ideas of nonparametric statistical inference. -Eugenia Stoimenova, Journal of Applied Statistics, June 2012 … one of the best books available for a graduate (or advanced undergraduate) text for a theory course on nonparametric statistics. … a very well-written and organized book on nonparametric statistics, especially useful and recommended for teachers and graduate students. -Biometrics, 67, September 2011 This excellently presente
Emotional inferences by pragmatics
Iza-Miqueleiz, Mauricio
2017-01-01
It has for long been taken for granted that, along the course of reading a text, world knowledge is often required in order to establish coherent links between sentences (McKoon & Ratcliff 1992, Iza & Ezquerro 2000). The content grasped from a text turns out to be strongly dependent upon the reader’s additional knowledge that allows a coherent interpretation of the text as a whole. The world knowledge directing the inference may be of distinctive nature. Gygax et al. (2007) showed that m...
DEFF Research Database (Denmark)
Andersen, Jesper; Lawall, Julia
2010-01-01
A key issue in maintaining Linux device drivers is the need to keep them up to date with respect to evolutions in Linux internal libraries. Currently, there is little tool support for performing and documenting such changes. In this paper we present a tool, spdiff, that identifies common changes...... developers can use it to extract an abstract representation of the set of changes that others have made. Our experiments on recent changes in Linux show that the inferred generic patches are more concise than the corresponding patches found in commits to the Linux source tree while being safe with respect...
Penalized Maximum Likelihood Estimation for univariate normal mixture distributions
International Nuclear Information System (INIS)
Ridolfi, A.; Idier, J.
2001-01-01
Due to singularities of the likelihood function, the maximum likelihood approach for the estimation of the parameters of normal mixture models is an acknowledged ill posed optimization problem. Ill posedness is solved by penalizing the likelihood function. In the Bayesian framework, it amounts to incorporating an inverted gamma prior in the likelihood function. A penalized version of the EM algorithm is derived, which is still explicit and which intrinsically assures that the estimates are not singular. Numerical evidence of the latter property is put forward with a test
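The role of the inverted gamma penalty can be illustrated with one EM iteration for a univariate normal mixture. This is a sketch under assumptions rather than the authors' derivation: each variance is given an inverse gamma prior with density proportional to `v**(-(alpha + 1)) * exp(-beta / v)`, and the hyperparameter values, data and function name are hypothetical. Setting the derivative of the penalized expected log-likelihood to zero gives the variance update `(ss + 2*beta) / (nk + 2*alpha + 2)`.

```python
import numpy as np


def penalized_em_step(x, w, mu, var, alpha, beta):
    """One EM iteration with an inverse gamma penalty on each variance."""
    # E-step: responsibilities of each component for each observation.
    diff = x[:, None] - mu
    dens = w * np.exp(-0.5 * diff**2 / var) / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    nk = resp.sum(axis=0)
    # M-step: the weights and means are the usual EM updates.
    w_new = nk / len(x)
    mu_new = (resp * x[:, None]).sum(axis=0) / nk
    ss = (resp * (x[:, None] - mu_new) ** 2).sum(axis=0)
    # Penalized variance update: bounded below by 2*beta / (n + 2*alpha + 2).
    var_new = (ss + 2 * beta) / (nk + 2 * alpha + 2)
    return w_new, mu_new, var_new


rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(4.0, 1.0, 40)])
alpha, beta = 1.0, 0.1                    # assumed hyperparameters
# Adversarial start: the second component sits on one data point with a
# tiny variance, the configuration that makes the plain likelihood diverge.
w = np.array([0.5, 0.5])
mu = np.array([x.mean(), x[0]])
var = np.array([x.var(), 1e-6])
for _ in range(50):
    w, mu, var = penalized_em_step(x, w, mu, var, alpha, beta)
```

With `beta = 0` and `alpha = -1` the update reduces to the unpenalized EM formula, for which a component collapsing onto a single observation drives its variance toward zero.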
Efficient simulation and likelihood methods for non-neutral multi-allele models.
Joyce, Paul; Genz, Alan; Buzbas, Erkan Ozge
2012-06-01
Throughout the 1980s, Simon Tavaré made numerous significant contributions to population genetics theory. As genetic data, in particular DNA sequence, became more readily available, a need to connect population-genetic models to data became the central issue. The seminal work of Griffiths and Tavaré (1994a, 1994b, 1994c) was among the first to develop a likelihood method to estimate the population-genetic parameters using full DNA sequences. Now, we are in the genomics era where methods need to scale up to handle massive data sets, and Tavaré has led the way to new approaches. However, performing statistical inference under non-neutral models has proved elusive. In tribute to Simon Tavaré, we present an article in the spirit of his work that provides a computationally tractable method for simulating and analyzing data under a class of non-neutral population-genetic models. Computational methods for approximating likelihood functions and generating samples under a class of allele-frequency based non-neutral parent-independent mutation models were proposed by Donnelly, Nordborg, and Joyce (DNJ) (Donnelly et al., 2001). DNJ (2001) simulated samples of allele frequencies from non-neutral models using neutral models as the auxiliary distribution in a rejection algorithm. However, patterns of allele frequencies produced by neutral models are dissimilar to patterns of allele frequencies produced by non-neutral models, making the rejection method inefficient. For example, in some cases the methods in DNJ (2001) require 10^9 rejections before a sample from the non-neutral model is accepted. Our method simulates samples directly from the distribution of non-neutral models, making simulation methods a practical tool to study the behavior of the likelihood and to perform inference on the strength of selection.
Application of the method of maximum likelihood to the determination of cepheid radii
International Nuclear Information System (INIS)
Balona, L.A.
1977-01-01
A method is described whereby the radius of any pulsating star can be obtained by applying the Principle of Maximum Likelihood. The relative merits of this method and of the usual Baade-Wesselink method are discussed in an Appendix. The new method is applied to 54 well-observed cepheids which include a number of spectroscopic binaries and two W Vir stars. An empirical period-radius relation is constructed and discussed in terms of two recent period-luminosity-colour calibrations. It is shown that the new method gives radii with an error of no more than 10 per cent. (author)
Analysis of Minute Features in Speckled Imagery with Maximum Likelihood Estimation
Directory of Open Access Journals (Sweden)
Alejandro C. Frery
2004-12-01
This paper deals with numerical problems arising when performing maximum likelihood parameter estimation in speckled imagery using small samples. The noise that appears in images obtained with coherent illumination, as is the case of sonar, laser, ultrasound-B, and synthetic aperture radar, is called speckle, and it can neither be assumed Gaussian nor additive. The properties of speckle noise are well described by the multiplicative model, a statistical framework from which stem several important distributions. Amongst these distributions, one is regarded as the universal model for speckled data, namely, the $\mathcal{G}^0$ law. This paper deals with amplitude data, so the $\mathcal{G}_A^0$ distribution will be used. The literature reports that techniques for obtaining estimates (maximum likelihood, based on moments and on order statistics) of the parameters of the $\mathcal{G}_A^0$ distribution require samples of hundreds, even thousands, of observations in order to obtain sensible values. This is verified for maximum likelihood estimation, and a proposal based on alternate optimization is made to alleviate this situation. The proposal is assessed with real and simulated data, showing that the convergence problems are no longer present. A Monte Carlo experiment is devised to estimate the quality of maximum likelihood estimators in small samples, and real data is successfully analyzed with the proposed alternated procedure. Stylized empirical influence functions are computed and used to choose a strategy for computing maximum likelihood estimates that is resistant to outliers.
Graphical models for inferring single molecule dynamics
Directory of Open Access Journals (Sweden)
Gonzalez Ruben L
2010-10-01
Background: The recent explosion of experimental techniques in single molecule biophysics has generated a variety of novel time series data requiring equally novel computational tools for analysis and inference. This article describes in general terms how graphical modeling may be used to learn from biophysical time series data using the variational Bayesian expectation maximization algorithm (VBEM). The discussion is illustrated by the example of single-molecule fluorescence resonance energy transfer (smFRET) versus time data, where the smFRET time series is modeled as a hidden Markov model (HMM) with Gaussian observables. A detailed description of smFRET is provided as well. Results: The VBEM algorithm returns the model’s evidence and an approximating posterior parameter distribution given the data. The former provides a metric for model selection via maximum evidence (ME), and the latter a description of the model’s parameters learned from the data. ME/VBEM provide several advantages over the more commonly used approach of maximum likelihood (ML) optimized by the expectation maximization (EM) algorithm, the most important being a natural form of model selection and a well-posed (non-divergent) optimization problem. Conclusions: The results demonstrate the utility of graphical modeling for inference of dynamic processes in single molecule biophysics.
de Queiroz, Kevin; Poe, Steven
2003-06-01
Kluge's (2001, Syst. Biol. 50:322-330) continued arguments that phylogenetic methods based on the statistical principle of likelihood are incompatible with the philosophy of science described by Karl Popper are based on false premises related to Kluge's misrepresentations of Popper's philosophy. Contrary to Kluge's conjectures, likelihood methods are not inherently verificationist; they do not treat every instance of a hypothesis as confirmation of that hypothesis. The historical nature of phylogeny does not preclude phylogenetic hypotheses from being evaluated using the probability of evidence. The low absolute probabilities of hypotheses are irrelevant to the correct interpretation of Popper's concept termed degree of corroboration, which is defined entirely in terms of relative probabilities. Popper did not advocate minimizing background knowledge; in any case, the background knowledge of both parsimony and likelihood methods consists of the general assumption of descent with modification and additional assumptions that are deterministic, concerning which tree is considered most highly corroborated. Although parsimony methods do not assume (in the sense of entailing) that homoplasy is rare, they do assume (in the sense of requiring to obtain a correct phylogenetic inference) certain things about patterns of homoplasy. Both parsimony and likelihood methods assume (in the sense of implying by the manner in which they operate) various things about evolutionary processes, although violation of those assumptions does not always cause the methods to yield incorrect phylogenetic inferences. Test severity is increased by sampling additional relevant characters rather than by character reanalysis, although either interpretation is compatible with the use of phylogenetic likelihood methods. Neither parsimony nor likelihood methods assess test severity (critical evidence) when used to identify a most highly corroborated tree(s) based on a single method or model and a
Planck intermediate results: XVI. Profile likelihoods for cosmological parameters
DEFF Research Database (Denmark)
Bartlett, J.G.; Cardoso, J.-F.; Delabrouille, J.
2014-01-01
We explore the 2013 Planck likelihood function with a high-precision multi-dimensional minimizer (Minuit). This allows a refinement of the ΛCDM best-fit solution with respect to previously-released results, and the construction of frequentist confidence intervals using profile likelihoods. The agr...
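The profile-likelihood construction mentioned in this abstract can be illustrated on a toy problem. The sketch below is not the Planck analysis and does not use Minuit: it assumes a Gaussian sample with unknown mean (the parameter of interest) and unknown variance (the nuisance parameter), profiles the variance out analytically, and reads off an approximate 68% interval where -2 log L rises by 1 above its minimum. The function names, grid, and simulated data are all hypothetical.

```python
import numpy as np


def profile_nll(mu, x):
    """-2 log profile likelihood of a Gaussian mean, up to an additive constant."""
    n = x.size
    # For fixed mu, the variance that maximises the likelihood is the mean squared deviation.
    sigma2_hat = np.mean((x - mu) ** 2)
    return n * np.log(sigma2_hat) + n


def profile_interval(x, grid, delta=1.0):
    """Range of grid values whose -2 log profile likelihood is within delta of the minimum."""
    curve = np.array([profile_nll(m, x) for m in grid])
    inside = grid[curve - curve.min() <= delta]
    return inside.min(), inside.max()


# Simulated data, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=200)
grid = np.linspace(1.0, 3.0, 2001)
lo, hi = profile_interval(x, grid)
```

With delta=1.0 the interval corresponds to roughly one standard error on either side of the sample mean; a larger delta (for example 3.84) would give an approximate 95% interval for a single parameter.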
Planck 2013 results. XV. CMB power spectra and likelihood
DEFF Research Database (Denmark)
Tauber, Jan; Bartlett, J.G.; Bucher, M.
2014-01-01
This paper presents the Planck 2013 likelihood, a complete statistical description of the two-point correlation function of the CMB temperature fluctuations that accounts for all known relevant uncertainties, both instrumental and astrophysical in nature. We use this likelihood to derive our best...
The modified signed likelihood statistic and saddlepoint approximations
DEFF Research Database (Denmark)
Jensen, Jens Ledet
1992-01-01
SUMMARY: For a number of tests in exponential families we show that the use of a normal approximation to the modified signed likelihood ratio statistic r* is equivalent to the use of a saddlepoint approximation. This is also true in a large deviation region where the signed likelihood ratio statistic r is of order √n. © 1992 Biometrika Trust.
Staley, Dennis M.; Negri, Jacquelyn A.; Kean, Jason W.; Laber, Jayme L.; Tillery, Anne C.; Youberg, Ann M.
2016-06-30
Wildfire can significantly alter the hydrologic response of a watershed to the extent that even modest rainstorms can generate dangerous flash floods and debris flows. To reduce public exposure to hazard, the U.S. Geological Survey produces post-fire debris-flow hazard assessments for select fires in the western United States. We use publicly available geospatial data describing basin morphology, burn severity, soil properties, and rainfall characteristics to estimate the statistical likelihood that debris flows will occur in response to a storm of a given rainfall intensity. Using an empirical database and refined geospatial analysis methods, we defined new equations for the prediction of debris-flow likelihood using logistic regression methods. We showed that the new logistic regression model outperformed previous models used to predict debris-flow likelihood.
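As a rough illustration of how a logistic regression model turns basin and rainfall predictors into a likelihood of occurrence, the sketch below evaluates the standard logistic link. The intercept, coefficients, and predictor values are hypothetical placeholders, not the equations published by the authors.

```python
import math


def debris_flow_probability(intercept, coefficients, predictors):
    """Logistic model: P = 1 / (1 + exp(-x)), with x = intercept + sum(b_i * X_i)."""
    x = intercept + sum(b * v for b, v in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical numbers for illustration only; not the published coefficients.
p = debris_flow_probability(-3.0, [0.4, 0.7, 0.9], [2.0, 1.5, 1.0])
```

In such a model each predictor would typically combine a terrain, burn-severity, or soil variable with a rainfall intensity measure, so that the estimated likelihood can be recomputed for any design storm.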
Likelihood analysis of parity violation in the compound nucleus
International Nuclear Information System (INIS)
Bowman, D.; Sharapov, E.
1993-01-01
We discuss the determination of the root mean-squared matrix element of the parity-violating interaction between compound-nuclear states using likelihood analysis. We briefly review the relevant features of the statistical model of the compound nucleus and the formalism of likelihood analysis. We then discuss the application of likelihood analysis to data on parity-violating longitudinal asymmetries. The reliability of the extracted value of the matrix element and errors assigned to the matrix element is stressed. We treat the situations where the spins of the p-wave resonances are not known and known using experimental data and Monte Carlo techniques. We conclude that likelihood analysis provides a reliable way to determine M and its confidence interval. We briefly discuss some problems associated with the normalization of the likelihood function.
Robust Demographic Inference from Genomic and SNP Data
Excoffier, Laurent; Dupanloup, Isabelle; Huerta-Sánchez, Emilia; Sousa, Vitor C.; Foll, Matthieu
2013-01-01
We introduce a flexible and robust simulation-based framework to infer demographic parameters from the site frequency spectrum (SFS) computed on large genomic datasets. We show that our composite-likelihood approach allows one to study evolutionary models of arbitrary complexity, which cannot be tackled by other current likelihood-based methods. For simple scenarios, our approach compares favorably in terms of accuracy and speed with , the current reference in the field, while showing better convergence properties for complex models. We first apply our methodology to non-coding genomic SNP data from four human populations. To infer their demographic history, we compare neutral evolutionary models of increasing complexity, including unsampled populations. We further show the versatility of our framework by extending it to the inference of demographic parameters from SNP chips with known ascertainment, such as that recently released by Affymetrix to study human origins. Whereas previous ways of handling ascertained SNPs were either restricted to a single population or only allowed the inference of divergence time between a pair of populations, our framework can correctly infer parameters of more complex models including the divergence of several populations, bottlenecks and migration. We apply this approach to the reconstruction of African demography using two distinct ascertained human SNP panels studied under two evolutionary models. The two SNP panels lead to globally very similar estimates and confidence intervals, and suggest an ancient divergence (>110 Ky) between Yoruba and San populations. Our methodology appears well suited to the study of complex scenarios from large genomic data sets. PMID:24204310
Leaché, Adam D; Banbury, Barbara L; Felsenstein, Joseph; de Oca, Adrián Nieto-Montes; Stamatakis, Alexandros
2015-11-01
Single nucleotide polymorphisms (SNPs) are useful markers for phylogenetic studies owing in part to their ubiquity throughout the genome and ease of collection. Restriction site associated DNA sequencing (RADseq) methods are becoming increasingly popular for SNP data collection, but an assessment of the best practises for using these data in phylogenetics is lacking. We use computer simulations, and new double digest RADseq (ddRADseq) data for the lizard family Phrynosomatidae, to investigate the accuracy of RAD loci for phylogenetic inference. We compare the two primary ways RAD loci are used during phylogenetic analysis, including the analysis of full sequences (i.e., SNPs together with invariant sites), or the analysis of SNPs on their own after excluding invariant sites. We find that using full sequences rather than just SNPs is preferable from the perspectives of branch length and topological accuracy, but not of computational time. We introduce two new acquisition bias corrections for dealing with alignments composed exclusively of SNPs, a conditional likelihood method and a reconstituted DNA approach. The conditional likelihood method conditions on the presence of variable characters only (the number of invariant sites that are unsampled but known to exist is not considered), while the reconstituted DNA approach requires the user to specify the exact number of unsampled invariant sites prior to the analysis. Under simulation, branch length biases increase with the amount of missing data for both acquisition bias correction methods, but branch length accuracy is much improved in the reconstituted DNA approach compared to the conditional likelihood approach. Phylogenetic analyses of the empirical data using concatenation or a coalescent-based species tree approach provide strong support for many of the accepted relationships among phrynosomatid lizards, suggesting that RAD loci contain useful phylogenetic signal across a range of divergence times despite the
Inference in Complex Systems Using Multi-Phase MCMC Sampling With Gradient Matching Burn-in
Lazarus, Alan; Husmeier, Dirk; Papamarkou, Theodore
2017-01-01
We propose a novel method for parameter inference that builds on the current research in gradient matching surrogate likelihood spaces. Adopting a three phase technique, we demonstrate that it is possible to obtain parameter estimates of limited bias whilst still adopting the paradigm of the computationally cheap surrogate approximation.
DEFF Research Database (Denmark)
Bonnevie, Rasmus; Schmidt, Mikkel Nørgaard; Mørup, Morten
2017-01-01
Variational methods for approximate inference in Bayesian models optimise a lower bound on the marginal likelihood, but the optimization problem often suffers from being nonconvex and high-dimensional. This can be alleviated by working in a collapsed domain where a part of the parameter space...
Statistical inference for discrete-time samples from affine stochastic delay differential equations
DEFF Research Database (Denmark)
Küchler, Uwe; Sørensen, Michael
2013-01-01
Statistical inference for discrete time observations of an affine stochastic delay differential equation is considered. The main focus is on maximum pseudo-likelihood estimators, which are easy to calculate in practice. A more general class of prediction-based estimating functions is investigated...
DEFF Research Database (Denmark)
Jacobsen, Christian Robert Dahl; Møller, Jesper
2017-01-01
We introduce new estimation methods for a subclass of the Gaussian scale mixture models for wavelet trees by Wainwright, Simoncelli and Willsky that rely on modern results for composite likelihoods and approximate Bayesian inference. Our methodology is illustrated for denoising and edge detection...
Directory of Open Access Journals (Sweden)
Ross S Williamson
2015-04-01
Stimulus dimensionality-reduction methods in neuroscience seek to identify a low-dimensional space of stimulus features that affect a neuron's probability of spiking. One popular method, known as maximally informative dimensions (MID), uses an information-theoretic quantity known as "single-spike information" to identify this space. Here we examine MID from a model-based perspective. We show that MID is a maximum-likelihood estimator for the parameters of a linear-nonlinear-Poisson (LNP) model, and that the empirical single-spike information corresponds to the normalized log-likelihood under a Poisson model. This equivalence implies that MID does not necessarily find maximally informative stimulus dimensions when spiking is not well described as Poisson. We provide several examples to illustrate this shortcoming, and derive a lower bound on the information lost when spiking is Bernoulli in discrete time bins. To overcome this limitation, we introduce model-based dimensionality reduction methods for neurons with non-Poisson firing statistics, and show that they can be framed equivalently in likelihood-based or information-theoretic terms. Finally, we show how to overcome practical limitations on the number of stimulus dimensions that MID can estimate by constraining the form of the non-parametric nonlinearity in an LNP model. We illustrate these methods with simulations and data from primate visual cortex.
Directory of Open Access Journals (Sweden)
Zhang Zhang
2009-06-01
A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.
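The three model-selection criteria named in this abstract have standard textbook definitions, sketched below for a model with maximized log-likelihood log_lik, k free parameters, and n observations. The akaike_weights helper shows one common way to turn criterion values into model-averaging weights; it is an assumption that this matches the weighting used by the authors, and all function names are hypothetical.

```python
import math


def aic(log_lik, k):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik


def aicc(log_lik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)


def bic(log_lik, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


def akaike_weights(criterion_values):
    """Normalized relative likelihoods exp(-delta / 2), where delta is the gap to the best model."""
    best = min(criterion_values)
    rel = [math.exp(-0.5 * (c - best)) for c in criterion_values]
    total = sum(rel)
    return [r / total for r in rel]
```

Lower values indicate a preferred model under each criterion, and the weights sum to one so they can be used directly to average per-site profiles across candidate models.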
Empirical Test Case Specification
DEFF Research Database (Denmark)
Kalyanova, Olena; Heiselberg, Per
This document presents the empirical test case specification for the IEA task on evaluating building energy simulation computer programs for Double Skin Facade (DSF) constructions. Two approaches are involved in this procedure: one is the comparative approach and the other is the empirical one. In the comparative approach the outcomes of different software tools are compared, while in the empirical approach the modelling results are compared with the results of experimental test cases.
Harbert, Robert S; Nixon, Kevin C
2015-08-01
• Plant distributions have long been understood to be correlated with the environmental conditions to which species are adapted. Climate is one of the major components driving species distributions. Therefore, it is expected that the plants coexisting in a community are reflective of the local environment, particularly climate. • Presented here is a method for the estimation of climate from local plant species coexistence data. The method, Climate Reconstruction Analysis using Coexistence Likelihood Estimation (CRACLE), is a likelihood-based method that employs specimen collection data at a global scale for the inference of species climate tolerance. CRACLE calculates the maximum joint likelihood of coexistence given individual species climate tolerance characterization to estimate the expected climate. • Plant distribution data for more than 4000 species were used to show that this method accurately infers expected climate profiles for 165 sites with diverse climatic conditions. Estimates differ from the WorldClim global climate model by less than 1.5°C on average for mean annual temperature and less than ∼250 mm for mean annual precipitation. This is a significant improvement upon other plant-based climate-proxy methods. • CRACLE validates long hypothesized interactions between climate and local associations of plant species. Furthermore, CRACLE successfully estimates climate that is consistent with the widely used WorldClim model and therefore may be applied to the quantitative estimation of paleoclimate in future studies. © 2015 Botanical Society of America, Inc.
Integrated empirical ethics: loss of normativity?
van der Scheer, Lieke; Widdershoven, Guy
2004-01-01
An important discussion in contemporary ethics concerns the relevance of empirical research for ethics. Specifically, two crucial questions pertain, respectively, to the possibility of inferring normative statements from descriptive statements, and to the danger of a loss of normativity if normative statements should be based on empirical research. Here we take part in the debate and defend integrated empirical ethical research: research in which normative guidelines are established on the basis of empirical research and in which the guidelines are empirically evaluated by focusing on observable consequences. We argue that in our concrete example normative statements are not derived from descriptive statements, but are developed within a process of reflection and dialogue that goes on within a specific praxis. Moreover, we show that the distinction in experience between the desirable and the undesirable precludes relativism. The normative guidelines so developed are both critical and normative: they help in choosing the right action and in evaluating that action. Finally, following Aristotle, we plead for a return to the view that morality and ethics are inherently related to one another, and for an acknowledgment of the fact that moral judgments have their origin in experience which is always related to historical and cultural circumstances.
Feature Inference Learning and Eyetracking
Rehder, Bob; Colner, Robert M.; Hoffman, Aaron B.
2009-01-01
Besides traditional supervised classification learning, people can learn categories by inferring the missing features of category members. It has been proposed that feature inference learning promotes learning a category's internal structure (e.g., its typical features and interfeature correlations) whereas classification promotes the learning of…
An Inference Language for Imaging
DEFF Research Database (Denmark)
Pedemonte, Stefano; Catana, Ciprian; Van Leemput, Koen
2014-01-01
We introduce iLang, a language and software framework for probabilistic inference. The iLang framework enables the definition of directed and undirected probabilistic graphical models and the automated synthesis of high performance inference algorithms for imaging applications. The iLang framewor...
Energy Technology Data Exchange (ETDEWEB)
Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Ahn, Sungsoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of); Shin, Jinwoo [Korea Advanced Inst. Science and Technology (KAIST), Daejeon (Korea, Republic of)
2017-05-25
Computing partition function is the most important statistical inference task arising in applications of Graphical Models (GM). Since it is computationally intractable, approximate methods have been used to resolve the issue in practice, where meanfield (MF) and belief propagation (BP) are arguably the most popular and successful approaches of a variational type. In this paper, we propose two new variational schemes, coined Gauged-MF (G-MF) and Gauged-BP (G-BP), improving MF and BP, respectively. Both provide lower bounds for the partition function by utilizing the so-called gauge transformation which modifies factors of GM while keeping the partition function invariant. Moreover, we prove that both G-MF and G-BP are exact for GMs with a single loop of a special structure, even though the bare MF and BP perform badly in this case. Our extensive experiments, on complete GMs of relatively small size and on large GM (up-to 300 variables) confirm that the newly proposed algorithms outperform and generalize MF and BP.
Social Inference Through Technology
Oulasvirta, Antti
Awareness cues are computer-mediated, real-time indicators of people’s undertakings, whereabouts, and intentions. Already in the mid-1970s, UNIX users could use commands such as “finger” and “talk” to find out who was online and to chat. The small icons in instant messaging (IM) applications that indicate coconversants’ presence in the discussion space are the successors of “finger” output. Similar indicators can be found in online communities, media-sharing services, Internet relay chat (IRC), and location-based messaging applications. But presence and availability indicators are only the tip of the iceberg. Technological progress has enabled richer, more accurate, and more intimate indicators. For example, there are mobile services that allow friends to query and follow each other’s locations. Remote monitoring systems developed for health care allow relatives and doctors to assess the wellbeing of homebound patients (see, e.g., Tang and Venables 2000). But users also utilize cues that have not been deliberately designed for this purpose. For example, online gamers pay attention to other characters’ behavior to infer what the other players are like “in real life.” There is a common denominator underlying these examples: shared activities rely on the technology’s representation of the remote person. The other human being is not physically present but present only through a narrow technological channel.
Statistical inference for noisy nonlinear ecological dynamic systems.
Wood, Simon N
2010-08-26
Chaotic ecological dynamic systems defy conventional statistical analysis. Systems with near-chaotic dynamics are little better. Such systems are almost invariably driven by endogenous dynamic processes plus demographic and environmental process noise, and are only observable with error. Their sensitivity to history means that minute changes in the driving noise realization, or the system parameters, will cause drastic changes in the system trajectory. This sensitivity is inherited and amplified by the joint probability density of the observable data and the process noise, rendering it useless as the basis for obtaining measures of statistical fit. Because the joint density is the basis for the fit measures used by all conventional statistical methods, this is a major theoretical shortcoming. The inability to make well-founded statistical inferences about biological dynamic models in the chaotic and near-chaotic regimes, other than on an ad hoc basis, leaves dynamic theory without the methods of quantitative validation that are essential tools in the rest of biological science. Here I show that this impasse can be resolved in a simple and general manner, using a method that requires only the ability to simulate the observed data on a system from the dynamic model about which inferences are required. The raw data series are reduced to phase-insensitive summary statistics, quantifying local dynamic structure and the distribution of observations. Simulation is used to obtain the mean and the covariance matrix of the statistics, given model parameters, allowing the construction of a 'synthetic likelihood' that assesses model fit. This likelihood can be explored using a straightforward Markov chain Monte Carlo sampler, but one further post-processing step returns pure likelihood-based inference. I apply the method to establish the dynamic nature of the fluctuations in Nicholson's classic blowfly experiments.
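The synthetic-likelihood idea described in this abstract (simulate data from the model, reduce each simulated series to summary statistics, fit a multivariate normal to those statistics, and evaluate the observed statistics under it) can be sketched as follows. The simulator here is an assumed toy, a noisy Ricker map with Poisson observations, and the three summary statistics are a deliberately simple hypothetical choice; neither is claimed to be the model or statistics used for the blowfly data.

```python
import numpy as np


def simulate_ricker(log_r, n_steps, rng, phi=10.0, sigma=0.3):
    """Toy simulator: noisy Ricker map observed through Poisson sampling."""
    n = 1.0
    y = np.empty(n_steps)
    for t in range(n_steps):
        n = n * np.exp(log_r - n + sigma * rng.normal())
        y[t] = rng.poisson(phi * n)
    return y


def summaries(y):
    """Phase-insensitive summaries: mean, standard deviation, mean lag-1 product."""
    return np.array([y.mean(), y.std(), np.mean(y[1:] * y[:-1])])


def synthetic_loglik(log_r, observed, n_sims, rng):
    """Gaussian log-density (up to a constant) of the observed summaries."""
    s_obs = summaries(observed)
    sims = np.array(
        [summaries(simulate_ricker(log_r, observed.size, rng)) for _ in range(n_sims)]
    )
    mu = sims.mean(axis=0)            # simulated mean of the statistics
    cov = np.cov(sims, rowvar=False)  # simulated covariance of the statistics
    diff = s_obs - mu
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet)


# Usage on simulated "observations" generated with log_r = 3.8.
rng = np.random.default_rng(1)
observed = simulate_ricker(3.8, 100, rng)
ll_true = synthetic_loglik(3.8, observed, 200, rng)
ll_wrong = synthetic_loglik(2.5, observed, 200, rng)
```

Because the function only needs the ability to simulate from the model, it can be dropped into a Metropolis-Hastings sampler over log_r; each evaluation is itself noisy, since it depends on a fresh batch of simulations.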
Posterior distributions for likelihood ratios in forensic science.
van den Hout, Ardo; Alberink, Ivo
2016-09-01
Evaluation of evidence in forensic science is discussed using posterior distributions for likelihood ratios. Instead of eliminating the uncertainty by integrating (Bayes factor) or by conditioning on parameter values, uncertainty in the likelihood ratio is retained by parameter uncertainty derived from posterior distributions. A posterior distribution for a likelihood ratio can be summarised by the median and credible intervals. Using the posterior mean of the distribution is not recommended. An analysis of forensic data for body height estimation is undertaken. The posterior likelihood approach has been criticised both theoretically and with respect to applicability. This paper addresses the latter and illustrates an interesting application area. Copyright © 2016 The Chartered Society of Forensic Sciences. Published by Elsevier Ireland Ltd. All rights reserved.
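A minimal sketch of the summary recommended in this abstract: compute the likelihood ratio once per posterior draw of the parameters, then report the median and a credible interval of the resulting distribution. The normal measurement model, the parameter values, and the simulated "posterior draws" below are assumptions made purely for illustration and are not taken from the paper's body-height analysis.

```python
import numpy as np


def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y; mu may be an array of draws."""
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


def lr_posterior_summary(y, mu1_draws, mu2_draws, sigma, level=0.95):
    """Median and equal-tailed credible interval of the likelihood ratio."""
    lr = normal_pdf(y, mu1_draws, sigma) / normal_pdf(y, mu2_draws, sigma)
    alpha = (1 - level) / 2
    lo, med, hi = np.quantile(lr, [alpha, 0.5, 1 - alpha])
    return med, (lo, hi)


# Hypothetical posterior draws for the mean under each proposition.
rng = np.random.default_rng(0)
mu1_draws = rng.normal(180.0, 1.0, size=5000)
mu2_draws = rng.normal(175.0, 1.0, size=5000)
median_lr, (lr_lo, lr_hi) = lr_posterior_summary(181.0, mu1_draws, mu2_draws, sigma=6.0)
```

The distribution of a ratio is typically right-skewed, which is one reason the median is a more stable summary than the posterior mean.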
Practical likelihood analysis for spatial generalized linear mixed models
DEFF Research Database (Denmark)
Bonat, W. H.; Ribeiro, Paulo Justiniano
2016-01-01
We investigate an algorithm for maximum likelihood estimation of spatial generalized linear mixed models based on the Laplace approximation. We compare our algorithm with a set of alternative approaches for two datasets from the literature. The Rhizoctonia root rot and the Rongelap are, respectively, examples of binomial and count datasets modeled by spatial generalized linear mixed models. Our results show that the Laplace approximation provides similar estimates to Markov Chain Monte Carlo likelihood, Monte Carlo expectation maximization, and modified Laplace approximation. Some advantages of Laplace approximation include the computation of the maximized log-likelihood value, which can be used for model selection and tests, and the possibility to obtain realistic confidence intervals for model parameters based on profile likelihoods. The Laplace approximation also avoids the tuning...
Algorithms of maximum likelihood data clustering with applications
Giada, Lorenzo; Marsili, Matteo
2002-12-01
We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share a common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson's coefficient of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.
Maximum likelihood estimation of finite mixture model for economic data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
A finite mixture model is a mixture model with finite dimension. These models provide a natural representation of heterogeneity in a finite number of latent classes, and they are also known as latent class models or unsupervised learning models. Recently, fitting finite mixture models by maximum likelihood estimation has drawn considerable attention from statisticians, mainly because maximum likelihood estimation is a powerful statistical method that provides consistent findings as the sample size increases to infinity. In the present paper, maximum likelihood estimation is therefore used to fit a finite mixture model in order to explore the relationship between nonlinear economic data. A two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship between stock market price and rubber price for the sampled countries. The results show a negative effect between rubber price and stock market price for Malaysia, Thailand, the Philippines and Indonesia.
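Maximum likelihood fitting of a two-component normal mixture is commonly carried out with the EM algorithm. The sketch below is a univariate version under that assumption; the abstract does not state which optimizer the authors used or whether their model is univariate, and the function name, initialisation, and simulated data are hypothetical.

```python
import numpy as np


def fit_two_normal_mixture(x, n_iter=200):
    """EM for a univariate two-component normal mixture; returns (weight, means, sds)."""
    w = 0.5
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])  # crude starting means
    sd = np.array([x.std(), x.std()])
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        dens = np.stack([
            wk * np.exp(-0.5 * ((x - m) / s) ** 2) / (s * np.sqrt(2 * np.pi))
            for wk, m, s in zip((w, 1 - w), mu, sd)
        ])
        resp = dens / dens.sum(axis=0)
        # M-step: responsibility-weighted means, standard deviations, and mixing weight.
        nk = resp.sum(axis=1)
        mu = (resp @ x) / nk
        sd = np.sqrt((resp * (x - mu[:, None]) ** 2).sum(axis=1) / nk)
        w = nk[0] / x.size
    return w, mu, sd


# Simulated data: 30% from N(-2, 1) and 70% from N(3, 1).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 1.0, 700)])
w, mu, sd = fit_two_normal_mixture(x)
```

The returned weight w is the estimated share of the first component; EM only finds a local maximum, so in practice several starting values would usually be tried.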
Attitude towards, and likelihood of, complaining in the banking ...
African Journals Online (AJOL)
aims to determine customers' attitudes towards complaining as well as their likelihood of voicing a .... is particularly powerful and impacts greatly on customer satisfaction and retention. ...... 'Cross-national analysis of hotel customers' attitudes ...
Narrow band interference cancelation in OFDM: A structured maximum likelihood approach
Sohail, Muhammad Sadiq; Al-Naffouri, Tareq Y.; Al-Ghadhban, Samir N.
2012-01-01
This paper presents a maximum likelihood (ML) approach to mitigate the effect of narrow band interference (NBI) in a zero padded orthogonal frequency division multiplexing (ZP-OFDM) system. The NBI is assumed to be time variant and asynchronous
DEFF Research Database (Denmark)
Philipsen, Kirsten Riber; Christiansen, Lasse Engbo; Mandsberg, Lotte Frigaard
2008-01-01
... are used for parameter estimation. The data is log-transformed such that a linear model can be applied. The transformation changes the variance structure, and hence an OD-dependent variance is implemented in the model. The autocorrelation in the data is demonstrated, and a correlation model with an exponentially decaying function of the time between observations is suggested. A model with a full covariance structure containing OD-dependent variance and an autocorrelation structure is compared to a model with variance only and with no variance or correlation implemented. It is shown that the model that best describes data is a model taking into account the full covariance structure. An inference study is made in order to determine whether the growth rate of the five bacteria strains is the same. After applying a likelihood-ratio test to models with a full covariance structure, it is concluded
A model independent safeguard against background mismodeling for statistical inference
Energy Technology Data Exchange (ETDEWEB)
Priel, Nadav; Landsman, Hagar; Manfredini, Alessandro; Budnik, Ranny [Department of Particle Physics and Astrophysics, Weizmann Institute of Science, Herzl St. 234, Rehovot (Israel); Rauch, Ludwig, E-mail: nadav.priel@weizmann.ac.il, E-mail: rauch@mpi-hd.mpg.de, E-mail: hagar.landsman@weizmann.ac.il, E-mail: alessandro.manfredini@weizmann.ac.il, E-mail: ran.budnik@weizmann.ac.il [Teilchen- und Astroteilchenphysik, Max-Planck-Institut für Kernphysik, Saupfercheckweg 1, 69117 Heidelberg (Germany)
2017-05-01
We propose a safeguard procedure for statistical inference that provides universal protection against mismodeling of the background. The method quantifies and incorporates the signal-like residuals of the background model into the likelihood function, using information available in a calibration dataset. This prevents possible false discovery claims that may arise through unknown mismodeling, and corrects the bias in limit setting created by overestimated or underestimated background. We demonstrate how the method removes the bias created by an incomplete background model using three realistic case studies.
INFERENCE AND SENSITIVITY IN STOCHASTIC WIND POWER FORECAST MODELS.
Elkantassi, Soumaya
2017-10-03
Reliable forecasting of wind power generation is crucial to optimal control of costs in generation of electricity with respect to the electricity demand. Here, we propose and analyze stochastic wind power forecast models described by parametrized stochastic differential equations, which introduce appropriate fluctuations in numerical forecast outputs. We use an approximate maximum likelihood method to infer the model parameters taking into account the time correlated sets of data. Furthermore, we study the validity and sensitivity of the parameters for each model. We applied our models to Uruguayan wind power production as determined by historical data and corresponding numerical forecasts for the period of March 1 to May 31, 2016.
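As a rough illustration of what an approximate maximum likelihood fit for a parametrized stochastic differential equation can look like, the sketch below uses a Gaussian Euler-Maruyama transition density. The mean-reverting model dX = -theta (X - p) dt + sigma dW, the function names, and the grid search are assumptions chosen for the example; the abstract does not specify the model or the approximation used.

```python
import math

def euler_log_likelihood(xs, forecasts, dt, theta, sigma):
    """Euler-Maruyama pseudo log-likelihood of an observed path.

    Assumed (hypothetical) model: dX = -theta * (X - p) dt + sigma dW,
    where p is the numerical forecast at the same time step. Each
    transition is approximated as Gaussian with variance sigma^2 * dt.
    """
    var = sigma ** 2 * dt
    total = 0.0
    for x_now, x_next, p in zip(xs[:-1], xs[1:], forecasts[:-1]):
        mean = x_now - theta * (x_now - p) * dt
        total += -0.5 * math.log(2.0 * math.pi * var)
        total -= (x_next - mean) ** 2 / (2.0 * var)
    return total

def fit_theta(xs, forecasts, dt, sigma, theta_grid):
    """Pick the mean-reversion rate on a grid that maximizes the pseudo log-likelihood."""
    return max(theta_grid,
               key=lambda th: euler_log_likelihood(xs, forecasts, dt, th, sigma))
```

A grid search keeps the sketch dependency-free; in practice a numerical optimizer over all parameters would be the more usual choice.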
INFERENCE AND SENSITIVITY IN STOCHASTIC WIND POWER FORECAST MODELS.
Elkantassi, Soumaya; Kalligiannaki, Evangelia; Tempone, Raul
2017-01-01
Reliable forecasting of wind power generation is crucial to optimal control of costs in generation of electricity with respect to the electricity demand. Here, we propose and analyze stochastic wind power forecast models described by parametrized stochastic differential equations, which introduce appropriate fluctuations in numerical forecast outputs. We use an approximate maximum likelihood method to infer the model parameters taking into account the time correlated sets of data. Furthermore, we study the validity and sensitivity of the parameters for each model. We applied our models to Uruguayan wind power production as determined by historical data and corresponding numerical forecasts for the period of March 1 to May 31, 2016.
On the likelihood function of Gaussian max-stable processes
Genton, M. G.; Ma, Y.; Sang, H.
2011-01-01
We derive a closed form expression for the likelihood function of a Gaussian max-stable process indexed by ℝ^d at p ≤ d+1 sites, d ≥ 1. We demonstrate the gain in efficiency in the maximum composite likelihood estimators of the covariance matrix from p=2 to p=3 sites in ℝ^2 by means of a Monte Carlo simulation study. © 2011 Biometrika Trust.
Incorporating Nuisance Parameters in Likelihoods for Multisource Spectra
Conway, J.S.
2011-01-01
We describe here the general mathematical approach to constructing likelihoods for fitting observed spectra in one or more dimensions with multiple sources, including the effects of systematic uncertainties represented as nuisance parameters, when the likelihood is to be maximized with respect to these parameters. We consider three types of nuisance parameters: simple multiplicative factors, source spectra "morphing" parameters, and parameters representing statistical uncertainties in the predicted source spectra.
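The simplest of the three cases, a multiplicative nuisance factor, might be sketched as below. This is an illustrative binned Poisson likelihood, not the construction from the paper: the Gaussian constraint centered at 1 with width `sigma`, the choice to scale only the background, and the grid-based maximization are assumptions made for the example.

```python
import math

def expected_counts(theta, signal, background):
    """Expected counts per bin: signal plus background scaled by the nuisance factor theta."""
    return [s + theta * b for s, b in zip(signal, background)]

def neg_log_likelihood(theta, observed, signal, background, sigma):
    """Binned Poisson negative log-likelihood plus an assumed Gaussian constraint on theta."""
    mu = expected_counts(theta, signal, background)
    poisson = sum(m - n * math.log(m) + math.lgamma(n + 1)
                  for n, m in zip(observed, mu))
    constraint = 0.5 * ((theta - 1.0) / sigma) ** 2
    return poisson + constraint

def profile_theta(observed, signal, background, sigma, lo=0.5, hi=1.5, steps=1001):
    """Maximize the likelihood over theta on a grid (i.e. minimize the negative log-likelihood)."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda t: neg_log_likelihood(t, observed, signal, background, sigma))
```

When the observed counts equal the nominal prediction, the maximizing value of `theta` sits at 1, which is a quick sanity check on any such implementation.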
On the likelihood function of Gaussian max-stable processes
Genton, M. G.
2011-05-24
We derive a closed form expression for the likelihood function of a Gaussian max-stable process indexed by ℝ^d at p ≤ d+1 sites, d ≥ 1. We demonstrate the gain in efficiency in the maximum composite likelihood estimators of the covariance matrix from p=2 to p=3 sites in ℝ^2 by means of a Monte Carlo simulation study. © 2011 Biometrika Trust.
Models for inference in dynamic metacommunity systems
Dorazio, Robert M.; Kery, Marc; Royle, J. Andrew; Plattner, Matthias
2010-01-01
A variety of processes are thought to be involved in the formation and dynamics of species assemblages. For example, various metacommunity theories are based on differences in the relative contributions of dispersal of species among local communities and interactions of species within local communities. Interestingly, metacommunity theories continue to be advanced without much empirical validation. Part of the problem is that statistical models used to analyze typical survey data either fail to specify ecological processes with sufficient complexity or they fail to account for errors in detection of species during sampling. In this paper, we describe a statistical modeling framework for the analysis of metacommunity dynamics that is based on the idea of adopting a unified approach, multispecies occupancy modeling, for computing inferences about individual species, local communities of species, or the entire metacommunity of species. This approach accounts for errors in detection of species during sampling and also allows different metacommunity paradigms to be specified in terms of species- and location-specific probabilities of occurrence, extinction, and colonization: all of which are estimable. In addition, this approach can be used to address inference problems that arise in conservation ecology, such as predicting temporal and spatial changes in biodiversity for use in making conservation decisions. To illustrate, we estimate changes in species composition associated with the species-specific phenologies of flight patterns of butterflies in Switzerland for the purpose of estimating regional differences in biodiversity.
Nonparametric inference of network structure and dynamics
Peixoto, Tiago P.
The network structure of complex systems determines their function and serves as evidence for the evolutionary mechanisms that lie behind them. Despite considerable effort in recent years, it remains an open challenge to formulate general descriptions of the large-scale structure of network systems, and how to reliably extract such information from data. Although many approaches have been proposed, few methods attempt to gauge the statistical significance of the uncovered structures, and hence the majority cannot reliably separate actual structure from stochastic fluctuations. Due to the sheer size and high-dimensionality of many networks, this represents a major limitation that prevents meaningful interpretations of the results obtained with such nonstatistical methods. In this talk, I will show how these issues can be tackled in a principled and efficient fashion by formulating appropriate generative models of network structure that can have their parameters inferred from data. By employing a Bayesian description of such models, the inference can be performed in a nonparametric fashion that does not require any a priori knowledge or ad hoc assumptions about the data. I will show how this approach can be used to perform model comparison, and how hierarchical models yield the most appropriate trade-off between model complexity and quality of fit based on the statistical evidence present in the data. I will also show how this general approach can be elegantly extended to networks with edge attributes, that are embedded in latent spaces, and that change in time. The latter is obtained via a fully dynamic generative network model, based on arbitrary-order Markov chains, that can also be inferred in a nonparametric fashion. Throughout the talk I will illustrate the application of the methods with many empirical networks such as the internet at the autonomous systems level, the global airport network, the network of actors and films, social networks, citations among
Empirical Philosophy of Science
DEFF Research Database (Denmark)
Mansnerus, Erika; Wagenknecht, Susann
2015-01-01
Empirical insights are proven fruitful for the advancement of Philosophy of Science, but the integration of philosophical concepts and empirical data poses considerable methodological challenges. Debates in Integrated History and Philosophy of Science suggest that the advancement of philosophical knowledge takes place through the integration of the empirical or historical research into the philosophical studies, as Chang, Nersessian, Thagard and Schickore argue in their work. Building upon their contributions we will develop a blueprint for an Empirical Philosophy of Science that draws upon qualitative methods from the social sciences in order to advance our philosophical understanding of science in practice. We will regard the relationship between philosophical conceptualization and empirical data as an iterative dialogue between theory and data, which is guided by a particular ‘feeling with...
Dark Energy Survey Year 1 Results: Multi-Probe Methodology and Simulated Likelihood Analyses
Energy Technology Data Exchange (ETDEWEB)
Krause, E.; et al.
2017-06-28
We present the methodology for and detail the implementation of the Dark Energy Survey (DES) 3x2pt DES Year 1 (Y1) analysis, which combines configuration-space two-point statistics from three different cosmological probes: cosmic shear, galaxy-galaxy lensing, and galaxy clustering, using data from the first year of DES observations. We have developed two independent modeling pipelines and describe the code validation process. We derive expressions for analytical real-space multi-probe covariances, and describe their validation with numerical simulations. We stress-test the inference pipelines in simulated likelihood analyses that vary 6-7 cosmology parameters plus 20 nuisance parameters and precisely resemble the analysis to be presented in the DES 3x2pt analysis paper, using a variety of simulated input data vectors with varying assumptions. We find that any disagreement between pipelines leads to changes in assigned likelihood $\Delta \chi^2 \le 0.045$ with respect to the statistical error of the DES Y1 data vector. We also find that angular binning and survey mask do not impact our analytic covariance at a significant level. We determine lower bounds on scales used for analysis of galaxy clustering (8 Mpc$~h^{-1}$) and galaxy-galaxy lensing (12 Mpc$~h^{-1}$) such that the impact of modeling uncertainties in the non-linear regime is well below statistical errors, and show that our analysis choices are robust against a variety of systematics. These tests demonstrate that we have a robust analysis pipeline that yields unbiased cosmological parameter inferences for the flagship 3x2pt DES Y1 analysis. We emphasize that the level of independent code development and subsequent code comparison as demonstrated in this paper is necessary to produce credible constraints from increasingly complex multi-probe analyses of current data.
A guideline for the validation of likelihood ratio methods used for forensic evidence evaluation.
Meuwly, Didier; Ramos, Daniel; Haraksim, Rudolf
2017-07-01
This Guideline proposes a protocol for the validation of forensic evaluation methods at the source level, using the Likelihood Ratio framework as defined within the Bayes' inference model. In the context of the inference of identity of source, the Likelihood Ratio is used to evaluate the strength of the evidence for a trace specimen, e.g. a fingermark, and a reference specimen, e.g. a fingerprint, to originate from common or different sources. Some theoretical aspects of probabilities necessary for this Guideline were discussed prior to its elaboration, which started after a workshop of forensic researchers and practitioners involved in this topic. In the workshop, the following questions were addressed: "which aspects of a forensic evaluation scenario need to be validated?", "what is the role of the LR as part of a decision process?" and "how to deal with uncertainty in the LR calculation?". The question "what to validate?" focuses on the validation methods and criteria, and "how to validate?" deals with the implementation of the validation protocol. Answers to these questions were deemed necessary with several objectives. First, concepts typical for validation standards [1], such as performance characteristics, performance metrics and validation criteria, will be adapted or applied by analogy to the LR framework. Second, a validation strategy will be defined. Third, validation methods will be described. Finally, a validation protocol and an example of validation report will be proposed, which can be applied to the forensic fields developing and validating LR methods for the evaluation of the strength of evidence at source level under the following propositions. Copyright © 2016. Published by Elsevier B.V.
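The role of the Likelihood Ratio within Bayes' rule, where posterior odds equal prior odds multiplied by the LR, can be illustrated with a few lines of arithmetic. This is a generic textbook sketch with hypothetical function names and made-up input probabilities; it is not part of the validation protocol described in the Guideline.

```python
def likelihood_ratio(p_evidence_same_source, p_evidence_diff_source):
    """LR = P(evidence | same source) / P(evidence | different sources)."""
    return p_evidence_same_source / p_evidence_diff_source

def posterior_odds(prior_odds, lr):
    """Bayes' rule in odds form: posterior odds = prior odds * LR."""
    return prior_odds * lr

def odds_to_probability(odds):
    """Convert odds in favor of a proposition to a probability."""
    return odds / (1.0 + odds)
```

For instance, with illustrative values of 0.8 and 0.01 for the two conditional probabilities, the LR is about 80; combined with prior odds of 0.01 this gives posterior odds of about 0.8. The LR itself carries only the strength of the evidence, while the prior odds come from outside the forensic evaluation.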
Pradhan, Vivek; Saha, Krishna K; Banerjee, Tathagata; Evans, John C
2014-07-30
Inference on the difference between two binomial proportions in the paired binomial setting is often an important problem in many biomedical investigations. Tang et al. (2010, Statistics in Medicine) discussed six methods to construct confidence intervals (henceforth, we abbreviate it as CI) for the difference between two proportions in paired binomial setting using method of variance estimates recovery. In this article, we propose weighted profile likelihood-based CIs for the difference between proportions of a paired binomial distribution. However, instead of the usual likelihood, we use weighted likelihood that is essentially making adjustments to the cell frequencies of a 2 × 2 table in the spirit of Agresti and Min (2005, Statistics in Medicine). We then conduct numerical studies to compare the performances of the proposed CIs with that of Tang et al. and Agresti and Min in terms of coverage probabilities and expected lengths. Our numerical study clearly indicates that the weighted profile likelihood-based intervals and Jeffreys interval (cf. Tang et al.) are superior in terms of achieving the nominal level, and in terms of expected lengths, they are competitive. Finally, we illustrate the use of the proposed CIs with real-life examples. Copyright © 2014 John Wiley & Sons, Ltd.
Inference of R0 and Transmission Heterogeneity from the Size Distribution of Stuttering Chains
Blumberg, Seth; Lloyd-Smith, James O.
2013-01-01
For many infectious disease processes such as emerging zoonoses and vaccine-preventable diseases, R0 < 1 and infections occur as self-limited stuttering transmission chains. A mechanistic understanding of transmission is essential for characterizing the risk of emerging diseases and monitoring spatio-temporal dynamics. Thus methods for inferring R0 and the degree of heterogeneity in transmission from stuttering chain data have important applications in disease surveillance and management. Previous researchers have used chain size distributions to infer R0, but estimation of the degree of individual-level variation in infectiousness (as quantified by the dispersion parameter, k) has typically required contact tracing data. Utilizing branching process theory along with a negative binomial offspring distribution, we demonstrate how maximum likelihood estimation can be applied to chain size data to infer both R0 and the dispersion parameter that characterizes heterogeneity. While the maximum likelihood value for R0 is a simple function of the average chain size, the associated confidence intervals are dependent on the inferred degree of transmission heterogeneity. As demonstrated for monkeypox data from the Democratic Republic of Congo, this impacts when a statistically significant change in R0 is detectable. In addition, by allowing for superspreading events, inference of k shifts the threshold above which a transmission chain should be considered anomalously large for a given value of R0 (thus reducing the probability of false alarms about pathogen adaptation). Our analysis of monkeypox also clarifies the various ways that imperfect observation can impact inference of transmission parameters, and highlights the need to quantitatively evaluate whether observation is likely to significantly bias results. PMID:23658504
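A minimal sketch of the estimation idea follows, assuming the standard branching-process result that a subcritical chain has mean size 1/(1 - R0), so that the maximum likelihood value is R0 = 1 - 1/(average chain size), together with the chain size distribution implied by a negative binomial offspring distribution with mean R0 and dispersion k. Function names and the grid search over k are illustrative choices, not the authors' implementation; the sketch also assumes every chain is fully observed and that at least one chain has size greater than 1.

```python
import math

def r0_mle(chain_sizes):
    """Maximum likelihood R0 from fully observed chain sizes: 1 - 1 / mean size."""
    mean_size = sum(chain_sizes) / len(chain_sizes)
    return 1.0 - 1.0 / mean_size

def log_chain_size_prob(j, r0, k):
    """Log-probability that a chain has total size j (index case included).

    Offspring counts are negative binomial with mean r0 and dispersion k.
    Requires r0 > 0 and k > 0.
    """
    return (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
            + (j - 1) * math.log(r0 / k)
            - (k * j + j - 1) * math.log(1.0 + r0 / k))

def log_likelihood(chain_sizes, r0, k):
    """Joint log-likelihood of independent chains."""
    return sum(log_chain_size_prob(j, r0, k) for j in chain_sizes)

def k_mle_on_grid(chain_sizes, k_grid):
    """Grid search for k with R0 fixed at its closed-form maximum likelihood value."""
    r0 = r0_mle(chain_sizes)
    return max(k_grid, key=lambda k: log_likelihood(chain_sizes, r0, k))
```

Fixing R0 at its closed-form value before searching over k is valid here because, under this model, the maximizing R0 does not depend on k; only the confidence intervals do.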
An Empirical Mass Function Distribution
Murray, S. G.; Robotham, A. S. G.; Power, C.
2018-03-01
The halo mass function, encoding the comoving number density of dark matter halos of a given mass, plays a key role in understanding the formation and evolution of galaxies. As such, it is a key goal of current and future deep optical surveys to constrain the mass function down to mass scales that typically host $L_\star$ galaxies. Motivated by the proven accuracy of Press–Schechter-type mass functions, we introduce a related but purely empirical form consistent with standard formulae to better than 4% in the medium-mass regime, $10^{10}$–$10^{13}$ $h^{-1}\,M_\odot$. In particular, our form consists of four parameters, each of which has a simple interpretation, and can be directly related to parameters of the galaxy distribution, such as $L_\star$. Using this form within a hierarchical Bayesian likelihood model, we show how individual mass-measurement errors can be successfully included in a typical analysis, while accounting for Eddington bias. We apply our form to a question of survey design in the context of a semi-realistic data model, illustrating how it can be used to obtain optimal balance between survey depth and angular coverage for constraints on mass function parameters. Open-source Python and R codes to apply our new form are provided at http://mrpy.readthedocs.org and https://cran.r-project.org/web/packages/tggd/index.html respectively.
Inferring Demographic History Using Two-Locus Statistics.
Ragsdale, Aaron P; Gutenkunst, Ryan N
2017-06-01
Population demographic history may be learned from contemporary genetic variation data. Methods based on aggregating the statistics of many single loci into an allele frequency spectrum (AFS) have proven powerful, but such methods ignore potentially informative patterns of linkage disequilibrium (LD) between neighboring loci. To leverage such patterns, we developed a composite-likelihood framework for inferring demographic history from aggregated statistics of pairs of loci. Using this framework, we show that two-locus statistics are more sensitive to demographic history than single-locus statistics such as the AFS. In particular, two-locus statistics escape the notorious confounding of depth and duration of a bottleneck, and they provide a means to estimate effective population size based on the recombination rather than mutation rate. We applied our approach to a Zambian population of Drosophila melanogaster. Notably, using both single- and two-locus statistics, we inferred a substantially lower ancestral effective population size than previous works and did not infer a bottleneck history. Together, our results demonstrate the broad potential for two-locus statistics to enable powerful population genetic inference. Copyright © 2017 by the Genetics Society of America.
Optimization methods for logical inference
Chandru, Vijay
2011-01-01
Merging logic and mathematics in deductive inference: an innovative, cutting-edge approach. Optimization methods for logical inference? Absolutely, say Vijay Chandru and John Hooker, two major contributors to this rapidly expanding field. And even though "solving logical inference problems with optimization methods may seem a bit like eating sauerkraut with chopsticks. . . it is the mathematical structure of a problem that determines whether an optimization model can help solve it, not the context in which the problem occurs." Presenting powerful, proven optimization techniques for logic in
Neandertal admixture in Eurasia confirmed by maximum-likelihood analysis of three genomes.
Lohse, Konrad; Frantz, Laurent A F
2014-04-01
Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4-7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination.
Efficient Maximum Likelihood Estimation for Pedigree Data with the Sum-Product Algorithm.
Engelhardt, Alexander; Rieger, Anna; Tresch, Achim; Mansmann, Ulrich
2016-01-01
We analyze data sets consisting of pedigrees with age at onset of colorectal cancer (CRC) as phenotype. The occurrence of familial clusters of CRC suggests the existence of a latent, inheritable risk factor. We aimed to compute the probability of a family possessing this risk factor as well as the hazard rate increase for these risk factor carriers. Due to the inheritability of this risk factor, the estimation necessitates a costly marginalization of the likelihood. We propose an improved EM algorithm by applying factor graphs and the sum-product algorithm in the E-step. This reduces the computational complexity from exponential to linear in the number of family members. Our algorithm is as precise as a direct likelihood maximization in a simulation study and a real family study on CRC risk. For 250 simulated families of size 19 and 21, the runtime of our algorithm is faster by a factor of 4 and 29, respectively. On the largest family (23 members) in the real data, our algorithm is 6 times faster. We introduce a flexible and runtime-efficient tool for statistical inference in biomedical event data with latent variables that opens the door for advanced analyses of pedigree data. © 2017 S. Karger AG, Basel.
Sellentin, Elena; Heavens, Alan F.
2018-01-01
We investigate whether a Gaussian likelihood, as routinely assumed in the analysis of cosmological data, is supported by simulated survey data. We define test statistics, based on a novel method that first destroys Gaussian correlations in a data set, and then measures the non-Gaussian correlations that remain. This procedure flags pairs of data points that depend on each other in a non-Gaussian fashion, and thereby identifies where the assumption of a Gaussian likelihood breaks down. Using this diagnosis, we find that non-Gaussian correlations in the CFHTLenS cosmic shear correlation functions are significant. With a simple exclusion of the most contaminated data points, the posterior for s8 is shifted without broadening, but we find no significant reduction in the tension with s8 derived from Planck cosmic microwave background data. However, we also show that the one-point distributions of the correlation statistics are noticeably skewed, such that sound weak-lensing data sets are intrinsically likely to lead to a systematically low lensing amplitude being inferred. The detected non-Gaussianities get larger with increasing angular scale such that for future wide-angle surveys such as Euclid or LSST, with their very small statistical errors, the large-scale modes are expected to be increasingly affected. The shifts in posteriors may then not be negligible and we recommend that these diagnostic tests be run as part of future analyses.
Models and Inference for Multivariate Spatial Extremes
Vettori, Sabrina
2017-12-07
The development of flexible and interpretable statistical methods is necessary in order to provide appropriate risk assessment measures for extreme events and natural disasters. In this thesis, we address this challenge by contributing to the developing research field of Extreme-Value Theory. We initially study the performance of existing parametric and non-parametric estimators of extremal dependence for multivariate maxima. As the dimensionality increases, non-parametric estimators are more flexible than parametric methods but present some loss in efficiency that we quantify under various scenarios. We introduce a statistical tool which imposes the required shape constraints on non-parametric estimators in high dimensions, significantly improving their performance. Furthermore, by embedding the tree-based max-stable nested logistic distribution in the Bayesian framework, we develop a statistical algorithm that identifies the most likely tree structures representing the data's extremal dependence using the reversible jump Monte Carlo Markov Chain method. A mixture of these trees is then used for uncertainty assessment in prediction through Bayesian model averaging. The computational complexity of full likelihood inference is significantly decreased by deriving a recursive formula for the nested logistic model likelihood. The algorithm performance is verified through simulation experiments which also compare different likelihood procedures. Finally, we extend the nested logistic representation to the spatial framework in order to jointly model multivariate variables collected across a spatial region. This situation emerges often in environmental applications but is not often considered in the current literature. Simulation experiments show that the new class of multivariate max-stable processes is able to detect both the cross and inner spatial dependence of a number of extreme variables at a relatively low computational cost, thanks to its Bayesian hierarchical
DarkBit. A GAMBIT module for computing dark matter observables and likelihoods
Energy Technology Data Exchange (ETDEWEB)
Bringmann, Torsten; Dal, Lars A. [University of Oslo, Department of Physics, Oslo (Norway); Conrad, Jan; Edsjoe, Joakim; Farmer, Ben [AlbaNova University Centre, Oskar Klein Centre for Cosmoparticle Physics, Stockholm (Sweden); Stockholm University, Department of Physics, Stockholm (Sweden); Cornell, Jonathan M. [McGill University, Department of Physics, Montreal, QC (Canada); Kahlhoefer, Felix; Wild, Sebastian [DESY, Hamburg (Germany); Kvellestad, Anders; Savage, Christopher [NORDITA, Stockholm (Sweden); Putze, Antje [LAPTh, Universite de Savoie, CNRS, Annecy-le-Vieux (France); Scott, Pat [Blackett Laboratory, Imperial College London, Department of Physics, London (United Kingdom); Weniger, Christoph [University of Amsterdam, GRAPPA, Institute of Physics, Amsterdam (Netherlands); White, Martin [University of Adelaide, Department of Physics, Adelaide, SA (Australia); Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale, Parkville (Australia); Collaboration: The GAMBIT Dark Matter Workgroup
2017-12-15
We introduce DarkBit, an advanced software code for computing dark matter constraints on various extensions to the Standard Model of particle physics, comprising both new native code and interfaces to external packages. This release includes a dedicated signal yield calculator for gamma-ray observations, which significantly extends current tools by implementing a cascade-decay Monte Carlo, as well as a dedicated likelihood calculator for current and future experiments (gamLike). This provides a general solution for studying complex particle physics models that predict dark matter annihilation to a multitude of final states. We also supply a direct detection package that models a large range of direct detection experiments (DDCalc), and that provides the corresponding likelihoods for arbitrary combinations of spin-independent and spin-dependent scattering processes. Finally, we provide custom relic density routines along with interfaces to DarkSUSY, micrOMEGAs, and the neutrino telescope likelihood package nulike. DarkBit is written in the framework of the Global And Modular Beyond the Standard Model Inference Tool (GAMBIT), providing seamless integration into a comprehensive statistical fitting framework that allows users to explore new models with both particle and astrophysics constraints, and a consistent treatment of systematic uncertainties. In this paper we describe its main functionality, provide a guide to getting started quickly, and show illustrative examples for results obtained with DarkBit (both as a stand-alone tool and as a GAMBIT module). This includes a quantitative comparison between two of the main dark matter codes (DarkSUSY and micrOMEGAs), and application of DarkBit's advanced direct and indirect detection routines to a simple effective dark matter model. (orig.)
Directory of Open Access Journals (Sweden)
Wang Huai-Chun
2009-09-01
Background: The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference. Results: PROCOV (protein covarion analysis) is a software tool that implements a number of previously proposed covarion models of protein evolution for phylogenetic inference in a maximum likelihood framework. Several algorithmic and implementation improvements in this tool over previous versions make computationally expensive tree searches with covarion models more efficient and analyses of large phylogenomic data sets tractable. PROCOV can be used to identify covarion sites by comparing the site likelihoods under the covarion process to the corresponding site likelihoods under a rates-across-sites (RAS) process. Those sites with the greatest log-likelihood difference between a 'covarion' and an RAS process were found to be of functional or structural significance in a dataset of bacterial and eukaryotic elongation factors. Conclusion: Covarion models implemented in PROCOV may be especially useful for phylogenetic estimation when ancient divergences between sequences have occurred and rates of evolution at sites are likely to have changed over the tree. It can also be used to study lineage-specific functional shifts in protein families that result in changes in the patterns of site variability among subtrees.
DEFF Research Database (Denmark)
A watershed moment of the twentieth century, the end of empire saw upheavals to global power structures and national identities. However, decolonisation profoundly affected individual subjectivities too. Life Writing After Empire examines how people around the globe have made sense of the post...... in order to understand how individual life writing reflects broader societal changes. From far-flung corners of the former British Empire, people have turned to life writing to manage painful or nostalgic memories, as well as to think about the past and future of the nation anew through the personal...
Theological reflections on empire
Directory of Open Access Journals (Sweden)
Allan A. Boesak
2009-11-01
Since the meeting of the World Alliance of Reformed Churches in Accra, Ghana (2004), and the adoption of the Accra Declaration, a debate has been raging in the churches about globalisation, socio-economic justice, ecological responsibility, political and cultural domination and globalised war. Central to this debate is the concept of empire and the way the United States is increasingly becoming its embodiment. Is the United States a global empire? This article argues that the United States has indeed become the expression of a modern empire and that this reality has considerable consequences, not just for global economics and politics but for theological reflection as well.
Pyron, R Alexander
2017-01-01
Here, I combine previously underutilized models and priors to perform more biologically realistic phylogenetic inference from morphological data, with an example from squamate reptiles. When coding morphological characters, it is often possible to denote ordered states with explicit reference to observed or hypothetical ancestral conditions. Using this logic, we can integrate across character-state labels and estimate meaningful rates of forward and backward transitions from plesiomorphy to apomorphy. I refer to this approach as MkA, for “asymmetric.” The MkA model incorporates the biological reality of limited reversal for many phylogenetically informative characters, and significantly increases likelihoods in the empirical data sets. Despite this, the phylogeny of Squamata remains contentious. Total-evidence analyses using combined morphological and molecular data and the MkA approach tend toward recent consensus estimates supporting a nested Iguania. However, support for this topology is not unambiguous across data sets or analyses, and no mechanism has been proposed to explain the widespread incongruence between partitions, or the hidden support for various topologies in those partitions. Furthermore, different morphological data sets produced by different authors contain both different characters and different states for the same or similar characters, resulting in drastically different placements for many important fossil lineages. Effort is needed to standardize ontology for morphology, resolve incongruence, and estimate a robust phylogeny. The MkA approach provides a preliminary avenue for investigating morphological evolution while accounting for temporal evidence and asymmetry in character-state changes.
Rodriguez, Jesse M.; Batzoglou, Serafim; Bercovici, Sivan
2013-01-01
, accurate and efficient detection of hidden relatedness becomes a challenge. To enable disease-mapping studies of increasingly large cohorts, a fast and accurate method to detect IBD segments is required. We present PARENTE, a novel method for detecting
Schoups, G.; Vrugt, J.A.
2010-01-01
Estimation of parameter and predictive uncertainty of hydrologic models has traditionally relied on several simplifying assumptions. Residual errors are often assumed to be independent and to be adequately described by a Gaussian probability distribution with a mean of zero and a constant variance.
On principles of inductive inference
Kostecki, Ryszard Paweł
2011-01-01
We propose an intersubjective epistemic approach to foundations of probability theory and statistical inference, based on relative entropy and category theory, and aimed to bypass the mathematical and conceptual problems of existing foundational approaches.
Statistical inference via fiducial methods
Salomé, Diemer
1998-01-01
In this thesis the attention is restricted to inductive reasoning using a mathematical probability model. A statistical procedure prescribes, for every theoretically possible set of data, the inference about the unknown of interest. ... See: Summary
Statistical inference for stochastic processes
National Research Council Canada - National Science Library
Basawa, Ishwar V; Prakasa Rao, B. L. S
1980-01-01
The aim of this monograph is to attempt to reduce the gap between theory and applications in the area of stochastic modelling, by directing the interest of future researchers to the inference aspects...
Camilo, Daniela Castro
2017-10-02
In order to model the complex non-stationary dependence structure of precipitation extremes over the entire contiguous U.S., we propose a flexible local approach based on factor copula models. Our sub-asymptotic spatial modeling framework yields non-trivial tail dependence structures, with a weakening dependence strength as events become more extreme, a feature commonly observed with precipitation data but not accounted for in classical asymptotic extreme-value models. To estimate the local extremal behavior, we fit the proposed model in small regional neighborhoods to high threshold exceedances, under the assumption of local stationarity. This allows us to gain in flexibility, while making inference for such a large and complex dataset feasible. Adopting a local censored likelihood approach, inference is made on a fine spatial grid, and local estimation is performed taking advantage of distributed computing resources and of the embarrassingly parallel nature of this estimation procedure. The local model is efficiently fitted at all grid points, and uncertainty is measured using a block bootstrap procedure. An extensive simulation study shows that our approach is able to adequately capture complex, non-stationary dependencies, while our study of U.S. winter precipitation data reveals interesting differences in local tail structures over space, which has important implications on regional risk assessment of extreme precipitation events. A comparison between past and current data suggests that extremes in certain areas might be slightly wider in extent nowadays than during the first half of the twentieth century.
Camilo, Daniela Castro; Huser, Raphaël
2017-01-01
In order to model the complex non-stationary dependence structure of precipitation extremes over the entire contiguous U.S., we propose a flexible local approach based on factor copula models. Our sub-asymptotic spatial modeling framework yields non-trivial tail dependence structures, with a weakening dependence strength as events become more extreme, a feature commonly observed with precipitation data but not accounted for in classical asymptotic extreme-value models. To estimate the local extremal behavior, we fit the proposed model in small regional neighborhoods to high threshold exceedances, under the assumption of local stationarity. This allows us to gain in flexibility, while making inference for such a large and complex dataset feasible. Adopting a local censored likelihood approach, inference is made on a fine spatial grid, and local estimation is performed taking advantage of distributed computing resources and of the embarrassingly parallel nature of this estimation procedure. The local model is efficiently fitted at all grid points, and uncertainty is measured using a block bootstrap procedure. An extensive simulation study shows that our approach is able to adequately capture complex, non-stationary dependencies, while our study of U.S. winter precipitation data reveals interesting differences in local tail structures over space, which has important implications on regional risk assessment of extreme precipitation events. A comparison between past and current data suggests that extremes in certain areas might be slightly wider in extent nowadays than during the first half of the twentieth century.
Zeilinger, Adam R; Olson, Dawn M; Andow, David A
2014-08-01
Consumer feeding preference among resource choices has critical implications for basic ecological and evolutionary processes, and can be highly relevant to applied problems such as ecological risk assessment and invasion biology. Within consumer choice experiments, also known as feeding preference or cafeteria experiments, measures of relative consumption and measures of consumer movement can provide distinct and complementary insights into the strength, causes, and consequences of preference. Despite the distinct value of inferring preference from measures of consumer movement, rigorous and biologically relevant analytical methods are lacking. We describe a simple, likelihood-based, biostatistical model for analyzing the transient dynamics of consumer movement in a paired-choice experiment. With experimental data consisting of repeated discrete measures of consumer location, the model can be used to estimate constant consumer attraction and leaving rates for two food choices, and differences in choice-specific attraction and leaving rates can be tested using model selection. The model enables calculation of transient and equilibrial probabilities of consumer-resource association, which could be incorporated into larger scale movement models. We explore the effect of experimental design on parameter estimation through stochastic simulation and describe methods to check that data meet model assumptions. Using a dataset of modest sample size, we illustrate the use of the model to draw inferences on consumer preference as well as underlying behavioral mechanisms. Finally, we include a user's guide and computer code scripts in R to facilitate use of the model by other researchers.
A Network Inference Workflow Applied to Virulence-Related Processes in Salmonella typhimurium
Energy Technology Data Exchange (ETDEWEB)
Taylor, Ronald C.; Singhal, Mudita; Weller, Jennifer B.; Khoshnevis, Saeed; Shi, Liang; McDermott, Jason E.
2009-04-20
Inference of the structure of mRNA transcriptional regulatory networks, protein regulatory or interaction networks, and protein activation/inactivation-based signal transduction networks are critical tasks in systems biology. In this article we discuss a workflow for the reconstruction of parts of the transcriptional regulatory network of the pathogenic bacterium Salmonella typhimurium based on the information contained in sets of microarray gene expression data now available for that organism, and describe our results obtained by following this workflow. The primary tool is one of the network inference algorithms deployed in the Software Environment for BIological Network Inference (SEBINI). Specifically, we selected the algorithm called Context Likelihood of Relatedness (CLR), which uses the mutual information contained in the gene expression data to infer regulatory connections. The associated analysis pipeline automatically stores the inferred edges from the CLR runs within SEBINI and, upon request, transfers the inferred edges into either Cytoscape or the plug-in Collective Analysis of Biological Interaction Networks (CABIN) tool for further post-analysis of the inferred regulatory edges. The following article presents the outcome of this workflow, as well as the protocols followed for microarray data collection, data cleansing, and network inference. Our analysis revealed several interesting interactions, functional groups, metabolic pathways, and regulons in S. typhimurium.
Active inference, communication and hermeneutics.
Friston, Karl J; Frith, Christopher D
2015-07-01
Hermeneutics refers to interpretation and translation of text (typically ancient scriptures) but also applies to verbal and non-verbal communication. In a psychological setting it nicely frames the problem of inferring the intended content of a communication. In this paper, we offer a solution to the problem of neural hermeneutics based upon active inference. In active inference, action fulfils predictions about how we will behave (e.g., predicting we will speak). Crucially, these predictions can be used to predict both self and others--during speaking and listening respectively. Active inference mandates the suppression of prediction errors by updating an internal model that generates predictions--both at fast timescales (through perceptual inference) and slower timescales (through perceptual learning). If two agents adopt the same model, then--in principle--they can predict each other and minimise their mutual prediction errors. Heuristically, this ensures they are singing from the same hymn sheet. This paper builds upon recent work on active inference and communication to illustrate perceptual learning using simulated birdsongs. Our focus here is the neural hermeneutics implicit in learning, where communication facilitates long-term changes in generative models that are trying to predict each other. In other words, communication induces perceptual learning and enables others to (literally) change our minds and vice versa. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.
Constraint likelihood analysis for a network of gravitational wave detectors
International Nuclear Information System (INIS)
Klimenko, S.; Rakhmanov, M.; Mitselmakher, G.; Mohanty, S.
2005-01-01
We propose a coherent method for detection and reconstruction of gravitational wave signals with a network of interferometric detectors. The method is derived by using the likelihood ratio functional for unknown signal waveforms. In the likelihood analysis, the global maximum of the likelihood ratio over the space of waveforms is used as the detection statistic. We identify a problem with this approach. In the case of an aligned pair of detectors, the detection statistic depends on the cross correlation between the detectors as expected, but this dependence disappears even for infinitesimally small misalignments. We solve the problem by applying constraints on the likelihood functional and obtain a new class of statistics. The resulting method can be applied to data from a network consisting of any number of detectors with arbitrary detector orientations. The method allows reconstruction of the source coordinates and the waveforms of the two polarization components of a gravitational wave. We study the performance of the method with numerical simulations and find the reconstruction of the source coordinates to be more accurate than in the standard likelihood method.
African Journals Online (AJOL)
FIRST LADY
2011-01-18
Jan 18, 2011 ... Empirical results reveal that consumption of sugar in. Kenya varies ... experiences in trade in different regions of the world. Some studies ... To assess the relationship between domestic sugar retail prices and sugar sales in ...
Scalable inference for stochastic block models
Peng, Chengbin
2017-12-08
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.
Probabilistic learning and inference in schizophrenia
Averbeck, Bruno B.; Evans, Simon; Chouhan, Viraj; Bristow, Eleanor; Shergill, Sukhwinder S.
2010-01-01
Patients with schizophrenia make decisions on the basis of less evidence when required to collect information to make an inference, a behavior often called jumping to conclusions. The underlying basis for this behaviour remains controversial. We examined the cognitive processes underpinning this finding by testing subjects on the beads task, which has been used previously to elicit jumping to conclusions behaviour, and a stochastic sequence learning task, with a similar decision theoretic structure. During the sequence learning task, subjects had to learn a sequence of button presses, while receiving noisy feedback on their choices. We fit a Bayesian decision making model to the sequence task and compared model parameters to the choice behavior in the beads task in both patients and healthy subjects. We found that patients did show a jumping to conclusions style; and those who picked early in the beads task tended to learn less from positive feedback in the sequence task. This favours the likelihood of patients selecting early because they have a low threshold for making decisions, and that they make choices on the basis of relatively little evidence. PMID:20810252
Probabilistic learning and inference in schizophrenia.
Averbeck, Bruno B; Evans, Simon; Chouhan, Viraj; Bristow, Eleanor; Shergill, Sukhwinder S
2011-04-01
Patients with schizophrenia make decisions on the basis of less evidence when required to collect information to make an inference, a behavior often called jumping to conclusions. The underlying basis for this behavior remains controversial. We examined the cognitive processes underpinning this finding by testing subjects on the beads task, which has been used previously to elicit jumping to conclusions behavior, and a stochastic sequence learning task, with a similar decision theoretic structure. During the sequence learning task, subjects had to learn a sequence of button presses, while receiving a noisy feedback on their choices. We fit a Bayesian decision making model to the sequence task and compared model parameters to the choice behavior in the beads task in both patients and healthy subjects. We found that patients did show a jumping to conclusions style; and those who picked early in the beads task tended to learn less from positive feedback in the sequence task. This favours the likelihood of patients selecting early because they have a low threshold for making decisions, and that they make choices on the basis of relatively little evidence. Published by Elsevier B.V.
Inferring gene networks from discrete expression data
Zhang, L.
2013-07-18
The modeling of gene networks from transcriptional expression data is an important tool in biomedical research to reveal signaling pathways and to identify treatment targets. Current gene network modeling is primarily based on the use of Gaussian graphical models applied to continuous data, which give a closed-form marginal likelihood. In this paper, we extend network modeling to discrete data, specifically data from serial analysis of gene expression, and RNA-sequencing experiments, both of which generate counts of mRNA transcripts in cell samples. We propose a generalized linear model to fit the discrete gene expression data and assume that the log ratios of the mean expression levels follow a Gaussian distribution. We restrict the gene network structures to decomposable graphs and derive the graphs by selecting the covariance matrix of the Gaussian distribution with the hyper-inverse Wishart priors. Furthermore, we incorporate prior network models based on gene ontology information, which avails existing biological information on the genes of interest. We conduct simulation studies to examine the performance of our discrete graphical model and apply the method to two real datasets for gene network inference. © The Author 2013. Published by Oxford University Press. All rights reserved.
Multiple sequence alignment accuracy and phylogenetic inference.
Ogden, T Heath; Rosenberg, Michael S
2006-04-01
Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.
Profile-likelihood Confidence Intervals in Item Response Theory Models.
Chalmers, R Philip; Pek, Jolynn; Liu, Yang
2017-01-01
Confidence intervals (CIs) are fundamental inferential devices which quantify the sampling variability of parameter estimates. In item response theory, CIs have been primarily obtained from large-sample Wald-type approaches based on standard error estimates, derived from the observed or expected information matrix, after parameters have been estimated via maximum likelihood. An alternative approach to constructing CIs is to quantify sampling variability directly from the likelihood function with a technique known as profile-likelihood confidence intervals (PL CIs). In this article, we introduce PL CIs for item response theory models, compare PL CIs to classical large-sample Wald-type CIs, and demonstrate important distinctions among these CIs. CIs are then constructed for parameters directly estimated in the specified model and for transformed parameters which are often obtained post-estimation. Monte Carlo simulation results suggest that PL CIs perform consistently better than Wald-type CIs for both non-transformed and transformed parameters.
Statistical inference for imperfect maintenance models with missing data
International Nuclear Information System (INIS)
Dijoux, Yann; Fouladirad, Mitra; Nguyen, Dinh Tuan
2016-01-01
The paper considers complex industrial systems with incomplete maintenance history. A corrective maintenance is performed after the occurrence of a failure and its efficiency is assumed to be imperfect. In maintenance analysis, the databases are not necessarily complete. Specifically, the observations are assumed to be window-censored. This situation arises relatively frequently after the purchase of a second-hand unit or in the absence of maintenance record during the burn-in phase. The joint assessment of the wear-out of the system and the maintenance efficiency is investigated under missing data. A review along with extensions of statistical inference procedures from an observation window are proposed in the case of perfect and minimal repair using the renewal and Poisson theories, respectively. Virtual age models are employed to model imperfect repair. In this framework, new estimation procedures are developed. In particular, maximum likelihood estimation methods are derived for the most classical virtual age models. The benefits of the new estimation procedures are highlighted by numerical simulations and an application to a real data set. - Highlights: • New estimation procedures for window-censored observations and imperfect repair. • Extensions of inference methods for perfect and minimal repair with missing data. • Overview of maximum likelihood method with complete and incomplete observations. • Benefits of the new procedures highlighted by simulation studies and real application.
Likelihood ratio sequential sampling models of recognition memory.
Osth, Adam F; Dennis, Simon; Heathcote, Andrew
2017-02-01
The mirror effect - a phenomenon whereby a manipulation produces opposite effects on hit and false alarm rates - is a benchmark regularity of recognition memory. A likelihood ratio decision process, basing recognition on the relative likelihood that a stimulus is a target or a lure, naturally predicts the mirror effect, and so has been widely adopted in quantitative models of recognition memory. Glanzer, Hilford, and Maloney (2009) demonstrated that likelihood ratio models, assuming Gaussian memory strength, are also capable of explaining regularities observed in receiver-operating characteristics (ROCs), such as greater target than lure variance. Despite its central place in theorising about recognition memory, however, this class of models has not been tested using response time (RT) distributions. In this article, we develop a linear approximation to the likelihood ratio transformation, which we show predicts the same regularities as the exact transformation. This development enabled us to develop a tractable model of recognition-memory RT based on the diffusion decision model (DDM), with inputs (drift rates) provided by an approximate likelihood ratio transformation. We compared this "LR-DDM" to a standard DDM where all targets and lures receive their own drift rate parameters. Both were implemented as hierarchical Bayesian models and applied to four datasets. Model selection taking into account parsimony favored the LR-DDM, which requires fewer parameters than the standard DDM but still fits the data well. These results support log-likelihood based models as providing an elegant explanation of the regularities of recognition memory, not only in terms of choices made but also in terms of the times it takes to make them. Copyright © 2016 Elsevier Inc. All rights reserved.
Unbinned likelihood maximisation framework for neutrino clustering in Python
Energy Technology Data Exchange (ETDEWEB)
Coenders, Stefan [Technische Universitaet Muenchen, Boltzmannstr. 2, 85748 Garching (Germany)
2016-07-01
Albeit having detected an astrophysical neutrino flux with IceCube, sources of astrophysical neutrinos remain hidden up to now. A detection of a neutrino point source is a smoking gun for hadronic processes and acceleration of cosmic rays. The search for neutrino sources has many degrees of freedom, for example steady versus transient, point-like versus extended sources, et cetera. Here, we introduce a Python framework designed for unbinned likelihood maximisations as used in searches for neutrino point sources by IceCube. Implementing source scenarios in a modular way, likelihood searches of various kinds can be implemented in a user-friendly way, without sacrificing speed and memory management.
Nearly Efficient Likelihood Ratio Tests of the Unit Root Hypothesis
DEFF Research Database (Denmark)
Jansson, Michael; Nielsen, Morten Ørregaard
Seemingly absent from the arsenal of currently available "nearly efficient" testing procedures for the unit root hypothesis, i.e. tests whose local asymptotic power functions are indistinguishable from the Gaussian power envelope, is a test admitting a (quasi-)likelihood ratio interpretation. We...... show that the likelihood ratio unit root test derived in a Gaussian AR(1) model with standard normal innovations is nearly efficient in that model. Moreover, these desirable properties carry over to more complicated models allowing for serially correlated and/or non-Gaussian innovations....
A note on estimating errors from the likelihood function
International Nuclear Information System (INIS)
Barlow, Roger
2005-01-01
The points at which the log likelihood falls by 1/2 from its maximum value are often used to give the 'errors' on a result, i.e. the 68% central confidence interval. The validity of this is examined for two simple cases: a lifetime measurement and a Poisson measurement. Results are compared with the exact Neyman construction and with the simple Bartlett approximation. It is shown that the accuracy of the log likelihood method is poor, and the Bartlett construction explains why it is flawed.
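As an illustration of the rule examined in this abstract, the following minimal sketch finds the points where a Poisson log-likelihood lies 1/2 below its maximum. The function names, the bisection helper, and the observed count of 5 are assumptions made for the example and are not taken from the paper; the sketch does not reproduce the Neyman or Bartlett comparisons.

```python
import math

def poisson_loglik(mu, n):
    """Poisson log-likelihood for n observed counts, dropping the constant -log(n!)."""
    return n * math.log(mu) - mu

def bisect(f, a, b, tol=1e-10):
    """Root of f on [a, b] by bisection; f(a) and f(b) must differ in sign."""
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if (fa < 0) == (fm < 0):
            a, fa = m, fm   # root lies in the upper half
        else:
            b = m           # root lies in the lower half
    return 0.5 * (a + b)

def delta_half_interval(n):
    """Points where the log-likelihood is 1/2 below its maximum (at mu = n, n >= 1)."""
    target = poisson_loglik(n, n) - 0.5
    g = lambda mu: poisson_loglik(mu, n) - target
    lo = bisect(g, 1e-9, n)                          # g < 0 near zero, g > 0 at mu = n
    hi = bisect(g, n, n + 10 * math.sqrt(n) + 10)    # g > 0 at mu = n, g < 0 far above
    return lo, hi

if __name__ == "__main__":
    lo, hi = delta_half_interval(5)   # hypothetical observation of 5 counts
    print(f"interval: [{lo:.3f}, {hi:.3f}]")
```

For small counts the resulting interval is asymmetric about the estimate, which is one reason its coverage is worth checking against an exact construction.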
Nearly Efficient Likelihood Ratio Tests for Seasonal Unit Roots
DEFF Research Database (Denmark)
Jansson, Michael; Nielsen, Morten Ørregaard
In an important generalization of zero frequency autoregressive unit root tests, Hylleberg, Engle, Granger, and Yoo (1990) developed regression-based tests for unit roots at the seasonal frequencies in quarterly time series. We develop likelihood ratio tests for seasonal unit roots and show...... that these tests are "nearly efficient" in the sense of Elliott, Rothenberg, and Stock (1996), i.e. that their local asymptotic power functions are indistinguishable from the Gaussian power envelope. Currently available nearly efficient testing procedures for seasonal unit roots are regression-based and require...... the choice of a GLS detrending parameter, which our likelihood ratio tests do not....
LDR: A Package for Likelihood-Based Sufficient Dimension Reduction
Directory of Open Access Journals (Sweden)
R. Dennis Cook
2011-03-01
We introduce a new MATLAB software package that implements several recently proposed likelihood-based methods for sufficient dimension reduction. Current capabilities include estimation of reduced subspaces with a fixed dimension d, as well as estimation of d by use of likelihood-ratio testing, permutation testing and information criteria. The methods are suitable for preprocessing data for both regression and classification. Implementations of related estimators are also available. Although the software is more oriented to command-line operation, a graphical user interface is also provided for prototype computations.
Likelihood ratio decisions in memory: three implied regularities.
Glanzer, Murray; Hilford, Andrew; Maloney, Laurence T
2009-06-01
We analyze four general signal detection models for recognition memory that differ in their distributional assumptions. Our analyses show that a basic assumption of signal detection theory, the likelihood ratio decision axis, implies three regularities in recognition memory: (1) the mirror effect, (2) the variance effect, and (3) the z-ROC length effect. For each model, we present the equations that produce the three regularities and show, in computed examples, how they do so. We then show that the regularities appear in data from a range of recognition studies. The analyses and data in our study support the following generalization: Individuals make efficient recognition decisions on the basis of likelihood ratios.
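The mirror effect can be sketched for the simplest case, assuming an equal-variance Gaussian model (lures distributed as N(0, 1), targets as N(d, 1)) and a criterion placed at a log likelihood ratio of zero. These distributional choices and all names below are assumptions for illustration; the paper itself analyzes four more general models.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def log_likelihood_ratio(x, d):
    """log[f_target(x) / f_lure(x)] for targets N(d, 1) and lures N(0, 1)."""
    return d * x - 0.5 * d * d

def hit_and_false_alarm_rates(d):
    """Rates when 'old' is reported whenever the log likelihood ratio exceeds 0.

    The log likelihood ratio is positive exactly when x > d / 2, so the
    hit rate is Phi(d / 2) and the false alarm rate is Phi(-d / 2).
    """
    return phi(d / 2.0), phi(-d / 2.0)

if __name__ == "__main__":
    for d in (1.0, 2.0):   # hypothetical weak and strong conditions
        hit, fa = hit_and_false_alarm_rates(d)
        print(f"d = {d}: hit rate {hit:.3f}, false alarm rate {fa:.3f}")
```

Under these assumptions, raising d moves the hit rate up and the false alarm rate down at the same time, which is the mirror pattern.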
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
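A minimal sketch of a step-size-controlled iteration of this kind follows. It assumes that the step-size enters as a relaxation of the successive-approximations map G, that is theta_new = theta + step_size * (G(theta) - theta), so that step_size = 1 returns G itself. It is further simplified to a two-component mixture with known unit variances, estimating only the mixing weight and the two means. The simplification, the function names, and the simulated data are assumptions for illustration, not the authors' implementation.

```python
import math
import random

def normal_pdf(x, mean, sd=1.0):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def successive_approximation(data, weight, mean1, mean2):
    """One pass of the likelihood-equation update G(theta), unit variances assumed."""
    resp = []
    for x in data:
        p1 = weight * normal_pdf(x, mean1)
        p2 = (1.0 - weight) * normal_pdf(x, mean2)
        resp.append(p1 / (p1 + p2))   # posterior probability of component 1
    total1 = sum(resp)
    total2 = len(data) - total1
    new_weight = total1 / len(data)
    new_mean1 = sum(r * x for r, x in zip(resp, data)) / total1
    new_mean2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / total2
    return new_weight, new_mean1, new_mean2

def relaxed_fit(data, start, step_size=1.0, iterations=200):
    """Iterate theta <- theta + step_size * (G(theta) - theta) from a starting guess."""
    theta = start
    for _ in range(iterations):
        target = successive_approximation(data, *theta)
        theta = tuple(t + step_size * (g - t) for t, g in zip(theta, target))
    return theta

if __name__ == "__main__":
    random.seed(0)
    # Hypothetical sample: 200 draws from N(0, 1) and 300 draws from N(4, 1).
    data = [random.gauss(0.0, 1.0) for _ in range(200)]
    data += [random.gauss(4.0, 1.0) for _ in range(300)]
    print(relaxed_fit(data, (0.5, -1.0, 5.0), step_size=1.0))
    print(relaxed_fit(data, (0.5, -1.0, 5.0), step_size=1.5))
```

The sketch does not guard against the weight leaving (0, 1), which can happen for step-sizes above 1 when the start is far from the solution; the convergence result quoted in the abstract is local.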
Understanding the properties of diagnostic tests - Part 2: Likelihood ratios.
Ranganathan, Priya; Aggarwal, Rakesh
2018-01-01
Diagnostic tests are used to identify subjects with and without disease. In a previous article in this series, we examined some attributes of diagnostic tests - sensitivity, specificity, and predictive values. In this second article, we look at likelihood ratios, which are useful for the interpretation of diagnostic test results in everyday clinical practice.
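As a brief illustration of the quantities named in this abstract, the standard definitions are LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, and a likelihood ratio converts pre-test odds into post-test odds by multiplication. The sketch below uses hypothetical function names and made-up input values; it is not taken from the article.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from a test's sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative

def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)
```

With an illustrative sensitivity of 0.9 and specificity of 0.8, `likelihood_ratios(0.9, 0.8)` gives an LR+ of about 4.5 and an LR- of about 0.125.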
Comparison of likelihood testing procedures for parallel systems with covariances
International Nuclear Information System (INIS)
Ayman Baklizi; Isa Daud; Noor Akma Ibrahim
1998-01-01
In this paper we investigate and compare the behavior of the likelihood ratio, Rao and Wald statistics for testing hypotheses on the parameters of the simple linear regression model based on parallel systems with covariances. These statistics are asymptotically equivalent (Barndorff-Nielsen and Cox, 1994). However, their relative performances in finite samples are generally unknown. A Monte Carlo experiment is conducted to simulate the sizes and the powers of these statistics for complete samples and in the presence of time censoring. Comparisons of the statistics are made according to the attainment of the assumed size of the test and their powers at various points in the parameter space. The results show that the likelihood ratio statistic appears to have the best performance in terms of the attainment of the assumed size of the test. Power comparisons show that the Rao statistic has some advantage over the Wald statistic in almost all of the space of alternatives, while the likelihood ratio statistic occupies either the first or the last position in terms of power. Overall, the likelihood ratio statistic appears to be more appropriate to the model under study, especially for small sample sizes.
Maximum likelihood estimation of the attenuated ultrasound pulse
DEFF Research Database (Denmark)
Rasmussen, Klaus Bolding
1994-01-01
The attenuated ultrasound pulse is divided into two parts: a stationary basic pulse and a nonstationary attenuation pulse. A standard ARMA model is used for the basic pulse, and a nonstandard ARMA model is derived for the attenuation pulse. The maximum likelihood estimator of the attenuated...
Planck 2013 results. XV. CMB power spectra and likelihood
Ade, P.A.R.; Armitage-Caplan, C.; Arnaud, M.; Ashdown, M.; Atrio-Barandela, F.; Aumont, J.; Baccigalupi, C.; Banday, A.J.; Barreiro, R.B.; Bartlett, J.G.; Battaner, E.; Benabed, K.; Benoit, A.; Benoit-Levy, A.; Bernard, J.P.; Bersanelli, M.; Bielewicz, P.; Bobin, J.; Bock, J.J.; Bonaldi, A.; Bonavera, L.; Bond, J.R.; Borrill, J.; Bouchet, F.R.; Boulanger, F.; Bridges, M.; Bucher, M.; Burigana, C.; Butler, R.C.; Calabrese, E.; Cardoso, J.F.; Catalano, A.; Challinor, A.; Chamballu, A.; Chiang, L.Y.; Chiang, H.C.; Christensen, P.R.; Church, S.; Clements, D.L.; Colombi, S.; Colombo, L.P.L.; Combet, C.; Couchot, F.; Coulais, A.; Crill, B.P.; Curto, A.; Cuttaia, F.; Danese, L.; Davies, R.D.; Davis, R.J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Delouis, J.M.; Desert, F.X.; Dickinson, C.; Diego, J.M.; Dole, H.; Donzelli, S.; Dore, O.; Douspis, M.; Dunkley, J.; Dupac, X.; Efstathiou, G.; Elsner, F.; Ensslin, T.A.; Eriksen, H.K.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A.A.; Franceschi, E.; Gaier, T.C.; Galeotta, S.; Galli, S.; Ganga, K.; Giard, M.; Giardino, G.; Giraud-Heraud, Y.; Gjerlow, E.; Gonzalez-Nuevo, J.; Gorski, K.M.; Gratton, S.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J.E.; Hansen, F.K.; Hanson, D.; Harrison, D.; Helou, G.; Henrot-Versille, S.; Hernandez-Monteagudo, C.; Herranz, D.; Hildebrandt, S.R.; Hivon, E.; Hobson, M.; Holmes, W.A.; Hornstrup, A.; Hovest, W.; Huffenberger, K.M.; Hurier, G.; Jaffe, T.R.; Jaffe, A.H.; Jewell, J.; Jones, W.C.; Juvela, M.; Keihanen, E.; Keskitalo, R.; Kiiveri, K.; Kisner, T.S.; Kneissl, R.; Knoche, J.; Knox, L.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lahteenmaki, A.; Lamarre, J.M.; Lasenby, A.; Lattanzi, M.; Laureijs, R.J.; Lawrence, C.R.; Le Jeune, M.; Leach, S.; Leahy, J.P.; Leonardi, R.; Leon-Tavares, J.; Lesgourgues, J.; Liguori, M.; Lilje, P.B.; Lindholm, V.; Linden-Vornle, M.; Lopez-Caniego, M.; Lubin, P.M.; Macias-Perez, J.F.; Maffei, B.; Maino, D.; Mandolesi, N.; Marinucci, D.; Maris, M.; 
Marshall, D.J.; Martin, P.G.; Martinez-Gonzalez, E.; Masi, S.; Matarrese, S.; Matthai, F.; Mazzotta, P.; Meinhold, P.R.; Melchiorri, A.; Mendes, L.; Menegoni, E.; Mennella, A.; Migliaccio, M.; Millea, M.; Mitra, S.; Miville-Deschenes, M.A.; Molinari, D.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Moss, A.; Munshi, D.; Naselsky, P.; Nati, F.; Natoli, P.; Netterfield, C.B.; Norgaard-Nielsen, H.U.; Noviello, F.; Novikov, D.; Novikov, I.; O'Dwyer, I.J.; Orieux, F.; Osborne, S.; Oxborrow, C.A.; Paci, F.; Pagano, L.; Pajot, F.; Paladini, R.; Paoletti, D.; Partridge, B.; Pasian, F.; Patanchon, G.; Paykari, P.; Perdereau, O.; Perotto, L.; Perrotta, F.; Piacentini, F.; Piat, M.; Pierpaoli, E.; Pietrobon, D.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Popa, L.; Poutanen, T.; Pratt, G.W.; Prezeau, G.; Prunet, S.; Puget, J.L.; Rachen, J.P.; Rahlin, A.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Ricciardi, S.; Riller, T.; Ringeval, C.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Roudier, G.; Rowan-Robinson, M.; Rubino-Martin, J.A.; Rusholme, B.; Sandri, M.; Sanselme, L.; Santos, D.; Savini, G.; Scott, D.; Seiffert, M.D.; Shellard, E.P.S.; Spencer, L.D.; Starck, J.L.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sureau, F.; Sutton, D.; Suur-Uski, A.S.; Sygnet, J.F.; Tauber, J.A.; Tavagnacco, D.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Tuovinen, J.; Turler, M.; Valenziano, L.; Valiviita, J.; Van Tent, B.; Varis, J.; Vielva, P.; Villa, F.; Vittorio, N.; Wade, L.A.; Wandelt, B.D.; Wehus, I.K.; White, M.; White, S.D.M.; Yvon, D.; Zacchei, A.; Zonca, A.
2014-01-01
We present the Planck likelihood, a complete statistical description of the two-point correlation function of the CMB temperature fluctuations. We use this likelihood to derive the Planck CMB power spectrum over three decades in l, covering 2 <= l <= 2500. [...] At l >= 50, we employ a correlated Gaussian likelihood approximation based on angular cross-spectra derived from the 100, 143 and 217 GHz channels. We validate our likelihood through an extensive suite of consistency tests, and assess the impact of residual foreground and instrumental uncertainties on cosmological parameters. We find good internal agreement among the high-l cross-spectra with residuals of a few uK^2 at l <= 1000. We compare our results with foreground-cleaned CMB maps, and with cross-spectra derived from the 70 GHz Planck map, and find broad agreement in terms of spectrum residuals and cosmological parameters. The best-fit LCDM cosmology is in excellent agreement with preliminary Planck polarisation spectra. The standard LCDM cosmology is well constrained b...
MAXIMUM-LIKELIHOOD-ESTIMATION OF THE ENTROPY OF AN ATTRACTOR
SCHOUTEN, JC; TAKENS, F; VANDENBLEEK, CM
In this paper, a maximum-likelihood estimate of the (Kolmogorov) entropy of an attractor is proposed that can be obtained directly from a time series. Also, the relative standard deviation of the entropy estimate is derived; it is dependent on the entropy and on the number of samples used in the
A simplification of the likelihood ratio test statistic for testing ...
African Journals Online (AJOL)
The traditional likelihood ratio test statistic for testing hypotheses about goodness of fit of multinomial probabilities in one-, two- and multi-dimensional contingency tables was simplified. Using the simplified version of the statistic to test the null hypothesis is easier and faster because calculating the expected ...
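For reference, the traditional statistic that this abstract starts from can be sketched as follows. The abstract does not give the simplified version, so only the standard form G^2 = 2 * sum(O_i * ln(O_i / E_i)) is shown, with E_i = n * p_i under the null hypothesis; the function name and inputs are illustrative assumptions.

```python
import math

def likelihood_ratio_statistic(observed, null_probabilities):
    """Traditional likelihood ratio (G^2) goodness-of-fit statistic.

    observed: cell counts; null_probabilities: cell probabilities under H0.
    Cells with a zero count contribute nothing to the sum.
    """
    n = sum(observed)
    total = 0.0
    for count, prob in zip(observed, null_probabilities):
        if count > 0:
            expected = n * prob
            total += count * math.log(count / expected)
    return 2.0 * total
```

Under the null hypothesis the statistic is usually compared with a chi-square distribution whose degrees of freedom equal the number of cells minus one (fewer if parameters are estimated).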
Adaptive Unscented Kalman Filter using Maximum Likelihood Estimation
DEFF Research Database (Denmark)
Mahmoudi, Zeinab; Poulsen, Niels Kjølstad; Madsen, Henrik
2017-01-01
The purpose of this study is to develop an adaptive unscented Kalman filter (UKF) by tuning the measurement noise covariance. We use the maximum likelihood estimation (MLE) and the covariance matching (CM) method to estimate the noise covariance. The multi-step prediction errors generated...
Likelihood-based Dynamic Factor Analysis for Measurement and Forecasting
Jungbacker, B.M.J.P.; Koopman, S.J.
2015-01-01
We present new results for the likelihood-based analysis of the dynamic factor model. The latent factors are modelled by linear dynamic stochastic processes. The idiosyncratic disturbance series are specified as autoregressive processes with mutually correlated innovations. The new results lead to
Composite likelihood and two-stage estimation in family studies
DEFF Research Database (Denmark)
Andersen, Elisabeth Anne Wreford
2004-01-01
In this paper register based family studies provide the motivation for linking a two-stage estimation procedure in copula models for multivariate failure time data with a composite likelihood approach. The asymptotic properties of the estimators in both parametric and semi-parametric models are d...
Reconceptualizing Social Influence in Counseling: The Elaboration Likelihood Model.
McNeill, Brian W.; Stoltenberg, Cal D.
1989-01-01
Presents Elaboration Likelihood Model (ELM) of persuasion (a reconceptualization of the social influence process) as alternative model of attitude change. Contends ELM unifies conflicting social psychology results and can potentially account for inconsistent research findings in counseling psychology. Provides guidelines on integrating…
Counseling Pretreatment and the Elaboration Likelihood Model of Attitude Change.
Heesacker, Martin
1986-01-01
Results of the application of the Elaboration Likelihood Model (ELM) to a counseling context revealed that more favorable attitudes toward counseling occurred as subjects' ego involvement increased and as intervention quality improved. Counselor credibility affected the degree to which subjects' attitudes reflected argument quality differences.…
Cases in which ancestral maximum likelihood will be confusingly misleading.
Handelman, Tomer; Chor, Benny
2017-05-07
Ancestral maximum likelihood (AML) is a phylogenetic tree reconstruction criterion that "lies between" maximum parsimony (MP) and maximum likelihood (ML). ML has long been known to be statistically consistent. On the other hand, Felsenstein (1978) showed that MP is statistically inconsistent, and even positively misleading: There are cases where the parsimony criterion, applied to data generated according to one tree topology, will be optimized on a different tree topology. The question of whether AML is statistically consistent or not has been open for a long time. Mossel et al. (2009) have shown that AML can "shrink" short tree edges, resulting in a star tree with no internal resolution, which yields a better AML score than the original (resolved) model. This result implies that AML is statistically inconsistent, but not that it is positively misleading, because the star tree is compatible with any other topology. We show that AML is confusingly misleading: For some simple, four-taxon (resolved) tree, the ancestral likelihood optimization criterion is maximized on an incorrect (resolved) tree topology, as well as on a star tree (both with specific edge lengths), while the tree with the original, correct topology has strictly lower ancestral likelihood. Interestingly, the two short edges in the incorrect, resolved tree topology are of length zero, and are not adjacent, so this resolved tree is in fact a simple path. While for MP, the underlying phenomenon can be described as long edge attraction, it turns out that here we have long edge repulsion. Copyright © 2017. Published by Elsevier Ltd.
Multilevel maximum likelihood estimation with application to covariance matrices
Czech Academy of Sciences Publication Activity Database
Turčičová, Marie; Mandel, J.; Eben, Kryštof
Published online: 23 January 2018; ISSN 0361-0926; R&D Projects: GA ČR GA13-34856S; Institutional support: RVO:67985807; Keywords: Fisher information * High dimension * Hierarchical maximum likelihood * Nested parameter spaces * Spectral diagonal covariance model * Sparse inverse covariance model; Subject RIV: BB - Applied Statistics, Operational Research; Impact factor: 0.311, year: 2016
Outlier Detection in Nonlinear Regression Using the Likelihood Displacement Statistic (Pendeteksian Outlier pada Regresi Nonlinier dengan Metode statistik Likelihood Displacement)
Directory of Open Access Journals (Sweden)
Siti Tabi'atul Hasanah
2012-11-01
An outlier is an observation that differs markedly (is extreme) from the other observations, or one that does not follow the general pattern of the model. Outliers sometimes carry information that no other data can provide, which is why they should not simply be eliminated. An outlier can also be an influential observation. Many methods can be used to detect outliers. Previous studies addressed outlier detection in linear regression; here, detection is developed for nonlinear regression, specifically multiplicative nonlinear regression. Detection uses the likelihood displacement (LD) statistic, a method that detects outliers by removing the suspected observation. The parameters are estimated by the maximum likelihood method. Applying the LD method yields the likelihood displacement of the observation suspected of being an outlier. The accuracy of the LD method in detecting outliers is then shown by comparing the MSE under LD with the MSE of the ordinary regression. The test statistic used is Λ. The null hypothesis is rejected when the observation is shown to be an outlier.
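The case-deletion idea can be sketched with the usual definition LD_i = 2 * [l(theta_hat) - l(theta_hat_(i))], where l is the full-data log-likelihood, theta_hat is the maximum likelihood estimate from all observations and theta_hat_(i) is the estimate with observation i removed. The model below is an assumption chosen for illustration only: a multiplicative regression y = a * x^b * exp(e) with normal e, fitted by least squares on the log scale. It is not the article's model, and the function names are hypothetical.

```python
import numpy as np

def fit_log_linear(log_x, log_y):
    """ML fit of log y = log a + b * log x + e, with e ~ N(0, sigma2)."""
    design = np.column_stack([np.ones_like(log_x), log_x])
    beta, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    residuals = log_y - design @ beta
    sigma2 = np.mean(residuals ** 2)          # ML (not unbiased) variance
    return beta, sigma2

def log_likelihood(beta, sigma2, log_x, log_y):
    """Gaussian log-likelihood of the log-scale model on the given data."""
    design = np.column_stack([np.ones_like(log_x), log_x])
    residuals = log_y - design @ beta
    n = len(log_y)
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - 0.5 * np.sum(residuals ** 2) / sigma2

def likelihood_displacement(x, y):
    """LD_i for every observation, always evaluated on the full data set."""
    log_x, log_y = np.log(x), np.log(y)
    beta, sigma2 = fit_log_linear(log_x, log_y)
    full = log_likelihood(beta, sigma2, log_x, log_y)
    ld = np.empty(len(log_x))
    for i in range(len(log_x)):
        keep = np.arange(len(log_x)) != i      # drop observation i
        beta_i, sigma2_i = fit_log_linear(log_x[keep], log_y[keep])
        ld[i] = 2.0 * (full - log_likelihood(beta_i, sigma2_i, log_x, log_y))
    return ld
```

Observations with a large LD_i change the fit substantially when removed, which is what marks them as candidate outliers or influential points.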
Interactive Instruction in Bayesian Inference
DEFF Research Database (Denmark)
Khan, Azam; Breslav, Simon; Hornbæk, Kasper
2018-01-01
An instructional approach is presented to improve human performance in solving Bayesian inference problems. Starting from the original text of the classic Mammography Problem, the textual expression is modified and visualizations are added according to Mayer’s principles of instruction. These principles concern coherence, personalization, signaling, segmenting, multimedia, spatial contiguity, and pretraining. Principles of self-explanation and interactivity are also applied. Four experiments on the Mammography Problem showed that these principles help participants answer the questions...... that an instructional approach to improving human performance in Bayesian inference is a promising direction....
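The inference asked for in the Mammography Problem is a single application of Bayes' rule. The sketch below is illustrative: the abstract gives no numbers, so the figures in the usage note (1% prevalence, 80% sensitivity, 9.6% false-positive rate) are an assumption based on a commonly quoted version of the problem, and the function name is hypothetical.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' rule."""
    true_positive_mass = prior * sensitivity
    false_positive_mass = (1.0 - prior) * false_positive_rate
    return true_positive_mass / (true_positive_mass + false_positive_mass)
```

With the assumed figures, `posterior_given_positive(0.01, 0.80, 0.096)` is roughly 0.078, far below the test's sensitivity, which is the point participants typically miss.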
On Maximum Entropy and Inference
Directory of Open Access Journals (Sweden)
Luigi Gresele
2017-11-01
Maximum entropy is a powerful concept that entails a sharp separation between relevant and irrelevant variables. It is typically invoked in inference, once an assumption is made on what the relevant variables are, in order to estimate a model from data, that affords predictions on all other (dependent) variables. Conversely, maximum entropy can be invoked to retrieve the relevant variables (sufficient statistics) directly from the data, once a model is identified by Bayesian model selection. We explore this approach in the case of spin models with interactions of arbitrary order, and we discuss how relevant interactions can be inferred. In this perspective, the dimensionality of the inference problem is not set by the number of parameters in the model, but by the frequency distribution of the data. We illustrate the method showing its ability to recover the correct model in a few prototype cases and discuss its application on a real dataset.
Empirical philosophy of science
DEFF Research Database (Denmark)
Wagenknecht, Susann; Nersessian, Nancy J.; Andersen, Hanne
2015-01-01
A growing number of philosophers of science make use of qualitative empirical data, a development that may reconfigure the relations between philosophy and sociology of science and that is reminiscent of efforts to integrate history and philosophy of science. Therefore, the first part of this introduction to the volume Empirical Philosophy of Science outlines the history of relations between philosophy and sociology of science on the one hand, and philosophy and history of science on the other. The second part of this introduction offers an overview of the papers in the volume, each of which is giving its own answer to questions such as: Why does the use of qualitative empirical methods benefit philosophical accounts of science? And how should these methods be used by the philosopher?
Learning and inference in a nonequilibrium Ising model with hidden nodes.
Dunn, Benjamin; Roudi, Yasser
2013-02-01
We study inference and reconstruction of couplings in a partially observed kinetic Ising model. With hidden spins, calculating the likelihood of a sequence of observed spin configurations requires performing a trace over the configurations of the hidden ones. This, as we show, can be represented as a path integral. Using this representation, we demonstrate that systematic approximate inference and learning rules can be derived using dynamical mean-field theory. Although naive mean-field theory leads to an unstable learning rule, taking into account Gaussian corrections allows learning the couplings involving hidden nodes. It also improves learning of the couplings between the observed nodes compared to when hidden nodes are ignored.
Exclusion probabilities and likelihood ratios with applications to mixtures.
Slooten, Klaas-Jan; Egeland, Thore
2016-01-01
The statistical evidence obtained from mixed DNA profiles can be summarised in several ways in forensic casework including the likelihood ratio (LR) and the Random Man Not Excluded (RMNE) probability. The literature has seen a discussion of the advantages and disadvantages of likelihood ratios and exclusion probabilities, and part of our aim is to bring some clarification to this debate. In a previous paper, we proved that there is a general mathematical relationship between these statistics: RMNE can be expressed as a certain average of the LR, implying that the expected value of the LR, when applied to an actual contributor to the mixture, is at least equal to the inverse of the RMNE. While the mentioned paper presented applications for kinship problems, the current paper demonstrates the relevance for mixture cases, and for this purpose, we prove some new general properties. We also demonstrate how to use the distribution of the likelihood ratio for donors of a mixture, to obtain estimates for exceedance probabilities of the LR for non-donors, of which the RMNE is a special case corresponding to LR > 0. In order to derive these results, we need to view the likelihood ratio as a random variable. In this paper, we describe how such a randomization can be achieved. The RMNE is usually invoked only for mixtures without dropout. In mixtures, artefacts like dropout and drop-in are commonly encountered and we address this situation too, illustrating our results with a basic but widely implemented model, a so-called binary model. The precise definitions, modelling and interpretation of the required concepts of dropout and drop-in are not entirely obvious, and we attempt to clarify them here in a general likelihood framework for a binary model.
DEFF Research Database (Denmark)
Gravier, Magali
2011-01-01
The article discusses the concepts of federation and empire in the context of the European Union (EU). Even if these two concepts are not usually contrasted to one another, the article shows that they refer to related types of polities. Furthermore, they can be used at the same time because they shed light on different and complementary aspects of the European integration process. The article concludes that the EU is at the crossroads between federation and empire and may remain an ‘imperial federation’ for several decades. This could mean that the EU is on the verge of transforming itself to another type...
Empirical comparison of theories
International Nuclear Information System (INIS)
Opp, K.D.; Wippler, R.
1990-01-01
The book represents the first comprehensive attempt to take an empirical approach for comparative assessment of theories in sociology. The aims, problems, and advantages of the empirical approach are discussed in detail, and the three theories selected for the purpose of this work are explained. Their comparative assessment is performed within the framework of several research projects, which among other subjects also investigate the social aspects of the protest against nuclear power plants. The theories analysed in this context are the theory of mental incongruities and that of the benefit, and their efficiency in explaining protest behaviour is compared. (orig./HSCH) [de]
DEFF Research Database (Denmark)
Grund, Cynthia M.
The toolbox for empirically exploring the ways that artistic endeavors convey and activate meaning on the part of performers and audiences continues to expand. Current work employing methods at the intersection of performance studies, philosophy, motion capture and neuroscience to better understand musical performance and reception is inspired by traditional approaches within aesthetics, but it also challenges some of the presuppositions inherent in them. As an example of such work I present a research project in empirical music aesthetics begun last year and of which I am a team member.
Eight challenges in phylodynamic inference
Directory of Open Access Journals (Sweden)
Simon D.W. Frost
2015-03-01
The field of phylodynamics, which attempts to enhance our understanding of infectious disease dynamics using pathogen phylogenies, has made great strides in the past decade. Basic epidemiological and evolutionary models are now well characterized with inferential frameworks in place. However, significant challenges remain in extending phylodynamic inference to more complex systems. These challenges include accounting for evolutionary complexities such as changing mutation rates, selection, reassortment, and recombination, as well as epidemiological complexities such as stochastic population dynamics, host population structure, and different patterns at the within-host and between-host scales. An additional challenge exists in making efficient inferences from an ever increasing corpus of sequence data.
Problem solving and inference mechanisms
Energy Technology Data Exchange (ETDEWEB)
Furukawa, K; Nakajima, R; Yonezawa, A; Goto, S; Aoyama, A
1982-01-01
The heart of the fifth generation computer will be powerful mechanisms for problem solving and inference. A deduction-oriented language is to be designed, which will form the core of the whole computing system. The language is based on predicate logic with the extended features of structuring facilities, meta structures and relational data base interfaces. Parallel computation mechanisms and specialized hardware architectures are being investigated to make possible efficient realization of the language features. The project includes research into an intelligent programming system, a knowledge representation language and system, and a meta inference system to be built on the core. 30 references.
Computation of the Likelihood of Joint Site Frequency Spectra Using Orthogonal Polynomials
Directory of Open Access Journals (Sweden)
Claus Vogl
2016-02-01
In population genetics, information about evolutionary forces, e.g., mutation, selection and genetic drift, is often inferred from DNA sequence information. Generally, DNA consists of two long strands of nucleotides or sites that pair via the complementary bases cytosine and guanine (C and G), on the one hand, and adenine and thymine (A and T), on the other. With whole genome sequencing, most genomic information stored in the DNA has become available for multiple individuals of one or more populations, at least in humans and model species, such as fruit flies of the genus Drosophila. In a genome-wide sample of L sites for M (haploid) individuals, the state of each site may be made binary, by binning the complementary bases, e.g., C with G to C/G, and contrasting C/G to A/T, to obtain a “site frequency spectrum” (SFS). Two such samples of either a single population from different time-points or two related populations from a single time-point are called joint site frequency spectra (joint SFS). While mathematical models describing the interplay of mutation, drift and selection have been available for more than 80 years, calculation of exact likelihoods from joint SFS is difficult. Sufficient statistics for inference of, e.g., mutation or selection parameters that would make use of all the information in the genomic data are rarely available. Hence, often suites of crude summary statistics are combined in simulation-based computational approaches. In this article, we use a bi-allelic boundary-mutation and drift population genetic model to compute the transition probabilities of joint SFS using orthogonal polynomials. This allows inference of population genetic parameters, such as the mutation rate (scaled by the population size) and the time separating the two samples. We apply this inference method to a population dataset of neutrally-evolving short intronic sites from six DNA sequences of the fruit fly Drosophila melanogaster and the reference
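The binning step described in this abstract can be sketched in a few lines. This is only an illustration of how a site frequency spectrum is tabulated from aligned sequences, under the assumption that every sequence has the same length and contains only the letters A, C, G and T; the function names are hypothetical, and the article's orthogonal-polynomial likelihood computation is not attempted here.

```python
import numpy as np

def binarize(sequences):
    """Code C/G as 1 and A/T as 0; returns an L x M array (sites x individuals)."""
    coded = [[1 if base in "CG" else 0 for base in seq] for seq in sequences]
    return np.array(coded, dtype=int).T

def site_frequency_spectrum(binary_sites):
    """Count the sites carrying k copies of the C/G state, for k = 0..M."""
    binary_sites = np.asarray(binary_sites, dtype=int)
    n_individuals = binary_sites.shape[1]
    copies_per_site = binary_sites.sum(axis=1)
    return np.bincount(copies_per_site, minlength=n_individuals + 1)
```

For M = 3 sequences `["ACG", "CCG", "ATG"]`, the three sites carry 1, 2 and 3 copies of the C/G state, so the spectrum has one site in each of the classes k = 1, 2, 3 and none in k = 0.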
Inference of trustworthiness from intuitive moral judgments.
Everett, Jim A C; Pizarro, David A; Crockett, M J
2016-06-01
Moral judgments play a critical role in motivating and enforcing human cooperation, and research on the proximate mechanisms of moral judgments highlights the importance of intuitive, automatic processes in forming such judgments. Intuitive moral judgments often share characteristics with deontological theories in normative ethics, which argue that certain acts (such as killing) are absolutely wrong, regardless of their consequences. Why do moral intuitions typically follow deontological prescriptions, as opposed to those of other ethical theories? Here, we test a functional explanation for this phenomenon by investigating whether agents who express deontological moral judgments are more valued as social partners. Across 5 studies, we show that people who make characteristically deontological judgments are preferred as social partners, perceived as more moral and trustworthy, and are trusted more in economic games. These findings provide empirical support for a partner choice account of moral intuitions whereby typically deontological judgments confer an adaptive function by increasing a person's likelihood of being chosen as a cooperation partner. Therefore, deontological moral intuitions may represent an evolutionarily prescribed prior that was selected for through partner choice mechanisms. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
Psychotic Experiences and Overhasty Inferences Are Related to Maladaptive Learning.
Directory of Open Access Journals (Sweden)
Heiner Stuke
2017-01-01
Theoretical accounts suggest that an alteration in the brain's learning mechanisms might lead to overhasty inferences, resulting in psychotic symptoms. Here, we sought to elucidate the suggested link between maladaptive learning and psychosis. Ninety-eight healthy individuals with varying degrees of delusional ideation and hallucinatory experiences performed a probabilistic reasoning task that allowed us to quantify overhasty inferences. Replicating previous results, we found a relationship between psychotic experiences and overhasty inferences during probabilistic reasoning. Computational modelling revealed that the behavioral data was best explained by a novel computational learning model that formalizes the adaptiveness of learning by a non-linear distortion of prediction error processing, where an increased non-linearity implies a growing resilience against learning from surprising and thus unreliable information (large prediction errors). Most importantly, a decreased adaptiveness of learning predicted delusional ideation and hallucinatory experiences. Our current findings provide a formal description of the computational mechanisms underlying overhasty inferences, thereby empirically substantiating theories that link psychosis to maladaptive learning.
Empirical research through design
Keyson, D.V.; Bruns, M.
2009-01-01
This paper describes the empirical research through design method (ERDM), which differs from current approaches to research through design by enforcing the need for the designer, after a series of pilot prototype based studies, to a-priori develop a number of testable interaction design hypotheses
Essays in empirical microeconomics
Péter, A.N.
2016-01-01
The empirical studies in this thesis investigate various factors that could affect individuals' labor market, family formation and educational outcomes. Chapter 2 focuses on scheduling as a potential determinant of individuals' productivity. Chapter 3 looks at the role of a family factor on
Worship, Reflection, Empirical Research
Ding Dong
2012-01-01
In my youth, I was a worshipper of Mao Zedong. From the latter stage of the Mao Era to the early years of Reform and Opening, I began to reflect on Mao and the Communist Revolution he launched. In recent years I’ve devoted myself to empirical historical research on Mao, seeking the truth about Mao and China’s modern history.
DEFF Research Database (Denmark)
Bang, Peter Fibiger
2007-01-01
This article seeks to establish a new set of organizing concepts for the analysis of the Roman imperial economy from Republic to late antiquity: tributary empire, portfolio capitalism and protection costs. Together these concepts explain economic developments in the Roman world better than the...
Empirically sampling Universal Dependencies
DEFF Research Database (Denmark)
Schluter, Natalie; Agic, Zeljko
2017-01-01
Universal Dependencies incur a high cost in computation for unbiased system development. We propose a 100% empirically chosen small subset of UD languages for efficient parsing system development. The technique used is based on measurements of model capacity globally. We show that the diversity o...
Indirect inference with time series observed with error
DEFF Research Database (Denmark)
Rossi, Eduardo; Santucci de Magistris, Paolo
We analyze the properties of the indirect inference estimator when the observed series are contaminated by measurement error. We show that the indirect inference estimates are asymptotically biased when the nuisance parameters of the measurement error distribution are neglected in the indirect estimation. We propose to solve this inconsistency by jointly estimating the nuisance and the structural parameters. Under standard assumptions, this estimator is consistent and asymptotically normal. A condition for the identification of ARMA plus noise is obtained. The proposed methodology is used to estimate the parameters of continuous-time stochastic volatility models with auxiliary specifications based on realized volatility measures. Monte Carlo simulations show the bias reduction of the indirect estimates obtained when the microstructure noise is explicitly modeled. Finally, an empirical...
Object-Oriented Type Inference
DEFF Research Database (Denmark)
Schwartzbach, Michael Ignatieff; Palsberg, Jens
1991-01-01
We present a new approach to inferring types in untyped object-oriented programs with inheritance, assignments, and late binding. It guarantees that all messages are understood, annotates the program with type information, allows polymorphic methods, and can be used as the basis of an op...
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Statistical inference and Aristotle's Rhetoric.
Macdonald, Ranald R
2004-11-01
Formal logic operates in a closed system where all the information relevant to any conclusion is present, whereas this is not the case when one reasons about events and states of the world. Pollard and Richardson drew attention to the fact that the reasoning behind statistical tests does not lead to logically justifiable conclusions. In this paper statistical inferences are defended not by logic but by the standards of everyday reasoning. Aristotle invented formal logic, but argued that people mostly get at the truth with the aid of enthymemes--incomplete syllogisms which include arguing from examples, analogies and signs. It is proposed that statistical tests work in the same way--in that they are based on examples, invoke the analogy of a model and use the size of the effect under test as a sign that the chance hypothesis is unlikely. Of existing theories of statistical inference only a weak version of Fisher's takes this into account. Aristotle anticipated Fisher by producing an argument of the form that there were too many cases in which an outcome went in a particular direction for that direction to be plausibly attributed to chance. We can therefore conclude that Aristotle would have approved of statistical inference and there is a good reason for calling this form of statistical inference classical.
A composite likelihood approach for spatially correlated survival data
Paik, Jane; Ying, Zhiliang
2013-01-01
The aim of this paper is to provide a composite likelihood approach to handle spatially correlated survival data using pairwise joint distributions. With e-commerce data, a recent question of interest in marketing research has been to describe spatially clustered purchasing behavior and to assess whether geographic distance is the appropriate metric to describe purchasing dependence. We present a model for the dependence structure of time-to-event data subject to spatial dependence to characterize purchasing behavior from the motivating example from e-commerce data. We assume the Farlie-Gumbel-Morgenstern (FGM) distribution and then model the dependence parameter as a function of geographic and demographic pairwise distances. For estimation of the dependence parameters, we present pairwise composite likelihood equations. We prove that the resulting estimators exhibit key properties of consistency and asymptotic normality under certain regularity conditions in the increasing-domain framework of spatial asymptotic theory. PMID:24223450
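For orientation, the standard form of the Farlie-Gumbel-Morgenstern family and a generic pairwise composite log-likelihood can be written as below. This is a sketch rather than the authors' exact specification: the symbols $S_1$, $S_2$, $\theta_{ij}$, $d_{ij}$ and the link $g$ are illustrative assumptions, and censoring is ignored for brevity.

```latex
% FGM joint survival function for a pair of event times (T_i, T_j),
% with marginal survival functions S_1, S_2 and dependence parameter \theta_{ij}
S(t_i, t_j; \theta_{ij}) = S_1(t_i)\, S_2(t_j)
  \bigl[ 1 + \theta_{ij} \{ 1 - S_1(t_i) \} \{ 1 - S_2(t_j) \} \bigr],
  \qquad |\theta_{ij}| \le 1 .

% Dependence parameter as a function of a pairwise distance d_{ij}
% (g is a hypothetical link mapping distances into [-1, 1])
\theta_{ij} = g(d_{ij}; \alpha) .

% Pairwise composite log-likelihood over all pairs, with f the joint density implied by S
\ell_c(\alpha) = \sum_{i < j} \log f(t_i, t_j; \theta_{ij}) .
```

The bound $|\theta_{ij}| \le 1$ is what limits the FGM family to fairly weak dependence, so any link $g$ has to respect it.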
Secondary Analysis under Cohort Sampling Designs Using Conditional Likelihood
Directory of Open Access Journals (Sweden)
Olli Saarela
2012-01-01
Under cohort sampling designs, additional covariate data are collected on cases of a specific type and a randomly selected subset of noncases, primarily for the purpose of studying associations with a time-to-event response of interest. With such data available, an interest may arise to reuse them for studying associations between the additional covariate data and a secondary non-time-to-event response variable, usually collected for the whole study cohort at the outset of the study. Following earlier literature, we refer to such a situation as secondary analysis. We outline a general conditional likelihood approach for secondary analysis under cohort sampling designs and discuss the specific situations of case-cohort and nested case-control designs. We also review alternative methods based on full likelihood and inverse probability weighting. We compare the alternative methods for secondary analysis in two simulated settings and apply them in a real-data example.
GENERALIZATION OF RAYLEIGH MAXIMUM LIKELIHOOD DESPECKLING FILTER USING QUADRILATERAL KERNELS
Directory of Open Access Journals (Sweden)
S. Sridevi
2013-02-01
Speckle noise is the most prevalent noise in clinical ultrasound images. It appears as light and dark spots that obscure the underlying pixel intensity. In fetal ultrasound images, edges and fine local details are what obstetricians and gynecologists rely on for prenatal diagnosis of congenital heart disease. A robust despeckling filter must therefore suppress speckle noise effectively while preserving these features. The proposed filter generalizes the Rayleigh maximum likelihood filter by using statistical tools as tuning parameters and by using quadrilateral kernels of different shapes to estimate the noise-free pixel from its neighborhood. The performance of the Median, Kuwahara, Frost, Homogeneous mask and Rayleigh maximum likelihood filters is compared with that of the proposed filter in terms of PSNR and image profile. The proposed filters outperform the conventional filters.
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
Energy Technology Data Exchange (ETDEWEB)
Lowell, A. W.; Boggs, S. E; Chiu, C. L.; Kierans, C. A.; Sleator, C.; Tomsick, J. A.; Zoglauer, A. C. [Space Sciences Laboratory, University of California, Berkeley (United States); Chang, H.-K.; Tseng, C.-H.; Yang, C.-Y. [Institute of Astronomy, National Tsing Hua University, Taiwan (China); Jean, P.; Ballmoos, P. von [IRAP Toulouse (France); Lin, C.-H. [Institute of Physics, Academia Sinica, Taiwan (China); Amman, M. [Lawrence Berkeley National Laboratory (United States)
2017-10-20
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ∼21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
Physical constraints on the likelihood of life on exoplanets
Lingam, Manasvi; Loeb, Abraham
2018-04-01
One of the most fundamental questions in exoplanetology is to determine whether a given planet is habitable. We estimate the relative likelihood of a planet's propensity towards habitability by considering key physical characteristics such as the role of temperature on ecological and evolutionary processes, and atmospheric losses via hydrodynamic escape and stellar wind erosion. From our analysis, we demonstrate that Earth-sized exoplanets in the habitable zone around M-dwarfs seemingly display much lower prospects of being habitable relative to Earth, owing to the higher incident ultraviolet fluxes and closer distances to the host star. We illustrate our results by specifically computing the likelihood (of supporting life) for the recently discovered exoplanets, Proxima b and TRAPPIST-1e, which we find to be several orders of magnitude smaller than that of Earth.
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures.
Theobald, Douglas L; Wuttke, Deborah S
2006-09-01
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.
A Cautionary Analysis of STAPLE Using Direct Inference of Segmentation Truth
DEFF Research Database (Denmark)
Van Leemput, Koen; Sabuncu, Mert R.
2014-01-01
In this paper we analyze the properties of the well-known segmentation fusion algorithm STAPLE, using a novel inference technique that analytically marginalizes out all model parameters. We demonstrate both theoretically and empirically that when the number of raters is large, or when consensus r...
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
Grünwald, P.; van Ommen, T.
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are
Fusion And Inference From Multiple And Massive Disparate Distributed Dynamic Data Sets
2017-07-01
computational execution together form a comprehensive, widely-applicable paradigm for statistical graph inference. Approved for Public Release; Distribution...always involve challenging empirical modeling and implementation issues. Our project has propelled the mathematical development, statistical design...D. J., and Sussman, D. L., “A limit theorem for scaled eigenvectors of random dot product graphs,” Sankhya A. Mathematical Statistics and
Inconsistency of Bayesian inference for misspecified linear models, and a proposal for repairing it
P.D. Grünwald (Peter); T. van Ommen (Thijs)
2017-01-01
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data
Filipiak, Katarzyna; Klein, Daniel; Roy, Anuradha
2017-01-01
The problem of testing the separability of a covariance matrix against an unstructured variance-covariance matrix is studied in the context of multivariate repeated measures data using Rao's score test (RST). The RST statistic is developed with the first component of the separable structure as a first-order autoregressive (AR(1)) correlation matrix or an unstructured (UN) covariance matrix under the assumption of multivariate normality. It is shown that the distribution of the RST statistic under the null hypothesis of any separability does not depend on the true values of the mean or the unstructured components of the separable structure. A significant advantage of the RST is that it can be performed for small samples, even smaller than the dimension of the data, where the likelihood ratio test (LRT) cannot be used, and it outperforms the standard LRT in a number of contexts. Monte Carlo simulations are then used to study the comparative behavior of the null distribution of the RST statistic, as well as that of the LRT statistic, in terms of sample size considerations, and for the estimation of the empirical percentiles. Our findings are compared with existing results where the first component of the separable structure is a compound symmetry (CS) correlation matrix. It is also shown by simulations that the empirical null distribution of the RST statistic converges faster than the empirical null distribution of the LRT statistic to the limiting χ² distribution. The tests are implemented on a real dataset from medical studies. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Menyoal Elaboration Likelihood Model (ELM) dan Teori Retorika
Yudi Perbawaningsih
2012-01-01
Persuasion is a communication process to establish or change attitudes, which can be understood through the theory of Rhetoric and the theory of the Elaboration Likelihood Model (ELM). This study elaborates these theories in a Public Lecture series which aims to persuade the students in choosing their concentration of study. The result shows that in terms of persuasion effectiveness it is not quite relevant to separate the message and its source. The quality of source is determined by the quality of ...
Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation
Rajiv D. Banker
1993-01-01
This paper provides a formal statistical basis for the efficiency evaluation techniques of data envelopment analysis (DEA). DEA estimators of the best practice monotone increasing and concave production function are shown to be also maximum likelihood estimators if the deviation of actual output from the efficient output is regarded as a stochastic variable with a monotone decreasing probability density function. While the best practice frontier estimator is biased below the theoretical front...
Maximum likelihood convolutional decoding (MCD) performance due to system losses
Webster, L.
1976-01-01
A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
Menyoal Elaboration Likelihood Model (ELM) Dan Teori Retorika
Perbawaningsih, Yudi
2012-01-01
Persuasion is a communication process to establish or change attitudes, which can be understood through the theory of Rhetoric and the theory of the Elaboration Likelihood Model (ELM). This study elaborates these theories in a Public Lecture series which aims to persuade the students in choosing their concentration of study. The result shows that in terms of persuasion effectiveness it is not quite relevant to separate the message and its source. The quality of source is determined by the quality of the mess...
Penggunaan Elaboration Likelihood Model dalam Menganalisis Penerimaan Teknologi Informasi
vitrian, vitrian2
2010-01-01
This article discusses some technology acceptance models in an organization. A thorough analysis of how technology is accepted helps managers plan the implementation of new technology and make sure that the new technology enhances the organization's performance. The Elaboration Likelihood Model (ELM) is one that sheds light on some behavioral factors in the acceptance of information technology. The basic tenet of ELM states that human behavior in principle can be influenced through central r...
Statistical Bias in Maximum Likelihood Estimators of Item Parameters.
1982-04-01
...the bias in the maximum likelihood...
Democracy, Autocracy and the Likelihood of International Conflict
Tangerås, Thomas
2008-01-01
This is a game-theoretic analysis of the link between regime type and international conflict. The democratic electorate can credibly punish the leader for bad conflict outcomes, whereas the autocratic selectorate cannot. For the fear of being thrown out of office, democratic leaders are (i) more selective about the wars they initiate and (ii) on average win more of the wars they start. Foreign policy behaviour is found to display strategic complementarities. The likelihood of interstate war, ...
Caching and interpolated likelihoods: accelerating cosmological Monte Carlo Markov chains
Energy Technology Data Exchange (ETDEWEB)
Bouland, Adam; Easther, Richard; Rosenfeld, Katherine, E-mail: adam.bouland@aya.yale.edu, E-mail: richard.easther@yale.edu, E-mail: krosenfeld@cfa.harvard.edu [Department of Physics, Yale University, New Haven CT 06520 (United States)
2011-05-01
We describe a novel approach to accelerating Monte Carlo Markov Chains. Our focus is cosmological parameter estimation, but the algorithm is applicable to any problem for which the likelihood surface is a smooth function of the free parameters and computationally expensive to evaluate. We generate a high-order interpolating polynomial for the log-likelihood using the first points gathered by the Markov chains as a training set. This polynomial then accurately computes the majority of the likelihoods needed in the latter parts of the chains. We implement a simple version of this algorithm as a patch (InterpMC) to CosmoMC and show that it accelerates parameter estimation by a factor of between two and four for well-converged chains. The current code is primarily intended as a "proof of concept", and we argue that there is considerable room for further performance gains. Unlike other approaches to accelerating parameter fits, we make no use of precomputed training sets or special choices of variables, and InterpMC is almost entirely transparent to the user.
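The idea of replacing an expensive log-likelihood with a polynomial fitted to values already computed can be sketched in a few lines. This is a one-parameter toy and not the InterpMC patch: `expensive_loglike`, `fit_surrogate` and `surrogate_loglike` are hypothetical names, the Gaussian target is an assumed stand-in for a costly likelihood, and a least-squares fit is used in place of the paper's high-order interpolation.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def expensive_loglike(theta):
    """Stand-in for a costly likelihood call (assumed toy: Gaussian in theta)."""
    return -0.5 * ((theta - 1.0) / 0.3) ** 2


def fit_surrogate(train_theta, train_logl, degree=4):
    """Least-squares polynomial fit to log-likelihood values already computed."""
    return P.polyfit(train_theta, train_logl, degree)


def surrogate_loglike(coeffs, theta):
    """Cheap polynomial evaluation used in place of the expensive call."""
    return P.polyval(theta, coeffs)


# "Training set": points visited early in a chain (here simply drawn at random).
rng = np.random.default_rng(0)
train_theta = rng.normal(1.0, 0.3, size=200)
train_logl = expensive_loglike(train_theta)
coeffs = fit_surrogate(train_theta, train_logl)

# Compare surrogate and true log-likelihood inside the region covered by training points.
test_theta = np.linspace(0.6, 1.4, 9)
max_err = np.max(
    np.abs(surrogate_loglike(coeffs, test_theta) - expensive_loglike(test_theta))
)
```

A practical sampler would also need a rule for falling back to the true likelihood whenever a proposed point lies outside the region covered by the training set, since polynomials extrapolate poorly.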
Caching and interpolated likelihoods: accelerating cosmological Monte Carlo Markov chains
International Nuclear Information System (INIS)
Bouland, Adam; Easther, Richard; Rosenfeld, Katherine
2011-01-01
We describe a novel approach to accelerating Monte Carlo Markov Chains. Our focus is cosmological parameter estimation, but the algorithm is applicable to any problem for which the likelihood surface is a smooth function of the free parameters and computationally expensive to evaluate. We generate a high-order interpolating polynomial for the log-likelihood using the first points gathered by the Markov chains as a training set. This polynomial then accurately computes the majority of the likelihoods needed in the latter parts of the chains. We implement a simple version of this algorithm as a patch (InterpMC) to CosmoMC and show that it accelerates parameter estimation by a factor of between two and four for well-converged chains. The current code is primarily intended as a "proof of concept", and we argue that there is considerable room for further performance gains. Unlike other approaches to accelerating parameter fits, we make no use of precomputed training sets or special choices of variables, and InterpMC is almost entirely transparent to the user.
Maximum likelihood as a common computational framework in tomotherapy
International Nuclear Information System (INIS)
Olivera, G.H.; Shepard, D.M.; Reckwerdt, P.J.; Ruchala, K.; Zachman, J.; Fitchard, E.E.; Mackie, T.R.
1998-01-01
Tomotherapy is a dose delivery technique using helical or axial intensity modulated beams. One of the strengths of the tomotherapy concept is that it can incorporate a number of processes into a single piece of equipment. These processes include treatment optimization planning, dose reconstruction and kilovoltage/megavoltage image reconstruction. A common computational technique that could be used for all of these processes would be very appealing. The maximum likelihood estimator, originally developed for emission tomography, can serve as a useful tool in imaging and radiotherapy. We believe that this approach can play an important role in the processes of optimization planning, dose reconstruction and kilovoltage and/or megavoltage image reconstruction. These processes involve computations that require comparable physical methods. They are also based on equivalent assumptions, and they have similar mathematical solutions. As a result, the maximum likelihood approach is able to provide a common framework for all three of these computational problems. We will demonstrate how maximum likelihood methods can be applied to optimization planning, dose reconstruction and megavoltage image reconstruction in tomotherapy. Results for planning optimization, dose reconstruction and megavoltage image reconstruction will be presented. Strengths and weaknesses of the methodology are analysed. Future directions for this work are also suggested. (author)
Active Inference and Learning in the Cerebellum.
Friston, Karl; Herreros, Ivan
2016-09-01
This letter offers a computational account of Pavlovian conditioning in the cerebellum based on active inference and predictive coding. Using eyeblink conditioning as a canonical paradigm, we formulate a minimal generative model that can account for spontaneous blinking, startle responses, and (delay or trace) conditioning. We then establish the face validity of the model using simulated responses to unconditioned and conditioned stimuli to reproduce the sorts of behavior that are observed empirically. The scheme's anatomical validity is then addressed by associating variables in the predictive coding scheme with nuclei and neuronal populations to match the (extrinsic and intrinsic) connectivity of the cerebellar (eyeblink conditioning) system. Finally, we try to establish predictive validity by reproducing selective failures of delay conditioning, trace conditioning, and extinction using (simulated and reversible) focal lesions. Although rather metaphorical, the ensuing scheme can account for a remarkable range of anatomical and neurophysiological aspects of cerebellar circuitry, and for the specificity of lesion-deficit mappings that have been established experimentally. From a computational perspective, this work shows how conditioning or learning can be formulated in terms of minimizing variational free energy (or maximizing Bayesian model evidence) using exactly the same principles that underlie predictive coding in perception.
DEFF Research Database (Denmark)
Rasch, Astrid
Decolonisation was a major event of the twentieth century, redrawing maps and impacting on identity narratives around the globe. As new nations defined their place in the world, the national and imperial past was retold in new cultural memories. These developments have been studied at the level of the collective, but insufficient attention has been paid to how individuals respond to such narrative changes. This dissertation examines the relationship between individual and collective memory at the end of empire through analysis of 13 end of empire autobiographies by public intellectuals from Australia, the Anglophone Caribbean and Zimbabwe. I conceive of memory as reconstructive and social, with individual memory striving to make sense of the past in the present in dialogue with surrounding narratives. By examining recurring tropes in the autobiographies, like colonial education, journeys to the imperial...
International Nuclear Information System (INIS)
Guillemoles, A.; Lazareva, A.
2008-01-01
Gazprom is conquering the world. The Russian industrial giant owns the largest gas reserves and enjoys considerable power. Gazprom publishes journals, owns hospitals and airplanes, and has even built cities where most of the inhabitants work for it. With 400,000 workers, Gazprom represents 8% of Russia's GDP. This inquiry describes the history and operation of this empire and shows how it has become a centerpiece of the government's strategy to reconquer Russian influence on a world scale. Is it going to be a winning game? Are the corruption affairs and the expected depletion of resources going to weaken the empire? The authors shed light on the political and diplomatic strategies that are played out around the crucial issue of energy supply. (J.S.)
Communicating likelihoods and probabilities in forecasts of volcanic eruptions
Doyle, Emma E. H.; McClure, John; Johnston, David M.; Paton, Douglas
2014-02-01
The issuing of forecasts and warnings of natural hazard events, such as volcanic eruptions, earthquake aftershock sequences and extreme weather often involves the use of probabilistic terms, particularly when communicated by scientific advisory groups to key decision-makers, who can differ greatly in relative expertise and function in the decision making process. Recipients may also differ in their perception of relative importance of political and economic influences on interpretation. Consequently, the interpretation of these probabilistic terms can vary greatly due to the framing of the statements, and whether verbal or numerical terms are used. We present a review from the psychology literature on how the framing of information influences communication of these probability terms. It is also unclear as to how people rate their perception of an event's likelihood throughout a time frame when a forecast time window is stated. Previous research has identified that, when presented with a 10-year time window forecast, participants viewed the likelihood of an event occurring ‘today’ as being less than that in year 10. Here we show that this skew in perception also occurs for short-term time windows (under one week) that are of most relevance for emergency warnings. In addition, unlike the long-time window statements, the use of the phrasing “within the next…” instead of “in the next…” does not mitigate this skew, nor do we observe significant differences between the perceived likelihoods of scientists and non-scientists. This finding suggests that effects occurring due to the shorter time window may be ‘masking’ any differences in perception due to wording or career background observed for long-time window forecasts. These results have implications for scientific advice, warning forecasts, emergency management decision-making, and public information as any skew in perceived event likelihood towards the end of a forecast time window may result in
Christiansen, Bo
2015-04-01
Linear regression methods are without doubt the most used approaches to describe and predict data in the physical sciences. They are often good first order approximations and they are in general easier to apply and interpret than more advanced methods. However, even the properties of univariate regression can lead to debate over the appropriateness of various models as witnessed by the recent discussion about climate reconstruction methods. Before linear regression is applied important choices have to be made regarding the origins of the noise terms and regarding which of the two variables under consideration should be treated as the independent variable. These decisions are often not easy to make but they may have a considerable impact on the results. We seek to give a unified probabilistic - Bayesian with flat priors - treatment of univariate linear regression and prediction by taking, as starting point, the general errors-in-variables model (Christiansen, J. Clim., 27, 2014-2031, 2014). Other versions of linear regression can be obtained as limits of this model. We derive the likelihood of the model parameters and predictands of the general errors-in-variables model by marginalizing over the nuisance parameters. The resulting likelihood is relatively simple and easy to analyze and calculate. The well known unidentifiability of the errors-in-variables model is manifested as the absence of a well-defined maximum in the likelihood. However, this does not mean that probabilistic inference cannot be made; the marginal likelihoods of model parameters and the predictands have, in general, well-defined maxima. We also include a probabilistic version of classical calibration and show how it is related to the errors-in-variables model. The results are illustrated by an example from the coupling between the lower stratosphere and the troposphere in the Northern Hemisphere winter.
Directory of Open Access Journals (Sweden)
Kodner Robin B
2010-10-01
Background: Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results: This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions: Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
Statistical learning and selective inference.
Taylor, Jonathan; Tibshirani, Robert J
2015-06-23
We describe the problem of "selective inference." This addresses the following challenge: Having mined a set of data to find potential associations, how do we properly assess the strength of these associations? The fact that we have "cherry-picked"--searched for the strongest associations--means that we must set a higher bar for declaring significant the associations that we see. This challenge becomes more important in the era of big data and complex statistical modeling. The cherry tree (dataset) can be very large and the tools for cherry picking (statistical learning methods) are now very sophisticated. We describe some recent new developments in selective inference and illustrate their use in forward stepwise regression, the lasso, and principal components analysis.
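The "cherry-picking" problem can be made concrete with a small simulation. The sketch below is not one of the selective inference procedures the abstract refers to; it only illustrates, under an assumed global null with independent standard normal test statistics, how often the strongest of many associations clears the usual threshold, and contrasts that with a plain Bonferroni cutoff. The function name and the hard-coded critical values are illustrative assumptions.

```python
import numpy as np


def max_stat_rejection_rate(n_features=50, n_sims=2000, z_crit=1.96, seed=0):
    """Fraction of null datasets in which the strongest |z| exceeds z_crit."""
    rng = np.random.default_rng(seed)
    # Under the global null, each feature's z-statistic is standard normal.
    z = rng.standard_normal((n_sims, n_features))
    # Cherry-pick the strongest association in each simulated dataset.
    best = np.abs(z).max(axis=1)
    return float(np.mean(best > z_crit))


# Naive test: treat the selected statistic as if it had been chosen in advance.
rate = max_stat_rejection_rate(z_crit=1.96)

# Bonferroni cutoff for 50 two-sided tests at overall level 0.05
# (3.29 is the approximate normal quantile for a two-sided 0.001 level).
rate_bonf = max_stat_rejection_rate(z_crit=3.29)
```

With 50 independent null features the naive rate should sit near 1 - 0.95^50, roughly 0.92, which is the sense in which a "higher bar" is required after selection.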
Bayesian inference with ecological applications
Link, William A
2009-01-01
This text is written to provide a mathematically sound but accessible and engaging introduction to Bayesian inference specifically for environmental scientists, ecologists and wildlife biologists. It emphasizes the power and usefulness of Bayesian methods in an ecological context. The advent of fast personal computers and easily available software has simplified the use of Bayesian and hierarchical models. One obstacle remains for ecologists and wildlife biologists, namely the near absence of Bayesian texts written specifically for them. The book includes many relevant examples, is supported by software and examples on a companion website and will become an essential grounding in this approach for students and research ecologists. Engagingly written text specifically designed to demystify a complex subject. Examples drawn from ecology and wildlife research. An essential grounding for graduate and research ecologists in the increasingly prevalent Bayesian approach to inference. Companion website with analyt...
Bayesian inference on proportional elections.
Directory of Open Access Journals (Sweden)
Gabriel Hideki Vatanabe Brunello
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not guarantee that the candidate with the highest percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied to proportional elections. In this context, the purpose of this paper was to perform Bayesian inference on proportional elections considering the Brazilian system of seat distribution. More specifically, a methodology was developed to estimate the probability that a given party will have representation in the chamber of deputies. Inferences were made in a Bayesian setting using the Monte Carlo simulation technique, and the developed methodology was applied to data from the 2010 Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software.
Causal inference based on counterfactuals
Directory of Open Access Journals (Sweden)
Höfler M
2005-09-01
Background: The counterfactual or potential outcome model has become increasingly standard for causal inference in epidemiological and medical studies. Discussion: This paper provides an overview of the counterfactual and related approaches. A variety of conceptual as well as practical issues when estimating causal effects are reviewed. These include causal interactions, imperfect experiments, adjustment for confounding, time-varying exposures, competing risks and the probability of causation. It is argued that the counterfactual model of causal effects captures the main aspects of causality in health sciences and relates to many statistical procedures. Summary: Counterfactuals are the basis of causal inference in medicine and epidemiology. Nevertheless, the estimation of counterfactual differences poses several difficulties, primarily in observational studies. These problems, however, reflect fundamental barriers only when learning from observations, and this does not invalidate the counterfactual concept.
System Support for Forensic Inference
Gehani, Ashish; Kirchner, Florent; Shankar, Natarajan
Digital evidence is playing an increasingly important role in prosecuting crimes. The reasons are manifold: financially lucrative targets are now connected online, systems are so complex that vulnerabilities abound, and strong digital identities are being adopted, making audit trails more useful. If the discoveries of forensic analysts are to hold up to scrutiny in court, they must meet the standard for scientific evidence. Software systems are currently developed without consideration of this fact. This paper argues for the development of a formal framework for constructing “digital artifacts” that can serve as proxies for physical evidence; a system so imbued would facilitate sound digital forensic inference. A case study involving a filesystem augmentation that provides transparent support for forensic inference is described.
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre C. R. Martins
2006-11-01
In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated with them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors, they can be understood as adaptations to the solution of real-life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. In that sense, it should be no surprise that humans reason with probability as has been observed.
Statistical inference on residual life
Jeong, Jong-Hyeon
2014-01-01
This is a monograph on the concept of residual life, which is an alternative summary measure of time-to-event data, or survival data. The mean residual life has been used for many years under the name of life expectancy, so it is a natural concept for summarizing survival or reliability data. It is also more interpretable than the popular hazard function, especially for communications between patients and physicians regarding the efficacy of a new drug in the medical field. This book reviews existing statistical methods to infer the residual life distribution. The review and comparison includes existing inference methods for mean and median, or quantile, residual life analysis through medical data examples. The concept of the residual life is also extended to competing risks analysis. The targeted audience includes biostatisticians, graduate students, and PhD (bio)statisticians. Knowledge in survival analysis at an introductory graduate level is advisable prior to reading this book.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
Statistical inference a short course
Panik, Michael J
2012-01-01
A concise, easily accessible introduction to descriptive and inferential techniques. Statistical Inference: A Short Course offers a concise presentation of the essentials of basic statistics for readers seeking to acquire a working knowledge of statistical concepts, measures, and procedures. The author conducts tests on the assumptions of randomness and normality, and provides nonparametric methods for cases where parametric approaches might not work. The book also explores how to determine a confidence interval for a population median while also providing coverage of ratio estimation, randomness, and causal
On Quantum Statistical Inference, II
Barndorff-Nielsen, O. E.; Gill, R. D.; Jupp, P. E.
2003-01-01
Interest in problems of statistical inference connected to measurements of quantum systems has recently increased substantially, in step with dramatic new developments in experimental techniques for studying small quantum systems. Furthermore, theoretical developments in the theory of quantum measurements have brought the basic mathematical framework for the probability calculations much closer to that of classical probability theory. The present paper reviews this field and proposes and inte...
Nonparametric predictive inference in reliability
International Nuclear Information System (INIS)
Coolen, F.P.A.; Coolen-Schrijner, P.; Yan, K.J.
2002-01-01
We introduce a recently developed statistical approach, called nonparametric predictive inference (NPI), to reliability. Bounds for the survival function for a future observation are presented. We illustrate how NPI can deal with right-censored data, and discuss aspects of competing risks. We present possible applications of NPI for Bernoulli data, and we briefly outline applications of NPI for replacement decisions. The emphasis is on introduction and illustration of NPI in reliability contexts; detailed mathematical justifications are presented elsewhere.
Variational inference & deep learning: A new synthesis
Kingma, D.P.
2017-01-01
In this thesis, Variational Inference and Deep Learning: A New Synthesis, we propose novel solutions to the problems of variational (Bayesian) inference, generative modeling, representation learning, semi-supervised learning, and stochastic optimization.
Continuous Integrated Invariant Inference, Phase I
National Aeronautics and Space Administration — The proposed project will develop a new technique for invariant inference and embed this and other current invariant inference and checking techniques in an...
Anderson, Eric C; Ng, Thomas C
2016-02-01
We develop a computational framework for addressing pedigree inference problems using small numbers (80-400) of single nucleotide polymorphisms (SNPs). Our approach relaxes the assumptions, which are commonly made, that sampling is complete with respect to the pedigree and that there is no genotyping error. It relies on representing the inferred pedigree as a factor graph and invoking the Sum-Product algorithm to compute and store quantities that allow the joint probability of the data to be rapidly computed under a large class of rearrangements of the pedigree structure. This allows efficient MCMC sampling over the space of pedigrees, and, hence, Bayesian inference of pedigree structure. In this paper we restrict ourselves to inference of pedigrees without loops using SNPs assumed to be unlinked. We present the methodology in general for multigenerational inference, and we illustrate the method by applying it to the inference of full sibling groups in a large sample (n=1157) of Chinook salmon typed at 95 SNPs. The results show that our method provides a better point estimate and estimate of uncertainty than the currently best-available maximum-likelihood sibling reconstruction method. Extensions of this work to more complex scenarios are briefly discussed. Published by Elsevier Inc.
Adaptive Inference on General Graphical Models
Acar, Umut A.; Ihler, Alexander T.; Mettu, Ramgopal; Sumer, Ozgur
2012-01-01
Many algorithms and applications involve repeatedly solving variations of the same inference problem; for example we may want to introduce new evidence to the model or perform updates to conditional dependencies. The goal of adaptive inference is to take advantage of what is preserved in the model and perform inference more rapidly than from scratch. In this paper, we describe techniques for adaptive inference on general graphs that support marginal computation and updates to the conditional ...
Network Model-Assisted Inference from Respondent-Driven Sampling Data.
Gile, Krista J; Handcock, Mark S
2015-06-01
Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population.
Network Model-Assisted Inference from Respondent-Driven Sampling Data
Gile, Krista J.; Handcock, Mark S.
2015-01-01
Summary Respondent-Driven Sampling is a widely-used method for sampling hard-to-reach human populations by link-tracing over their social networks. Inference from such data requires specialized techniques because the sampling process is both partially beyond the control of the researcher, and partially implicitly defined. Therefore, it is not generally possible to directly compute the sampling weights for traditional design-based inference, and likelihood inference requires modeling the complex sampling process. As an alternative, we introduce a model-assisted approach, resulting in a design-based estimator leveraging a working network model. We derive a new class of estimators for population means and a corresponding bootstrap standard error estimator. We demonstrate improved performance compared to existing estimators, including adjustment for an initial convenience sample. We also apply the method and an extension to the estimation of HIV prevalence in a high-risk population. PMID:26640328
Maximum likelihood estimation of semiparametric mixture component models for competing risks data.
Choi, Sangbum; Huang, Xuelin
2014-09-01
In the analysis of competing risks data, the cumulative incidence function is a useful quantity to characterize the crude risk of failure from a specific event type. In this article, we consider an efficient semiparametric analysis of mixture component models on cumulative incidence functions. Under the proposed mixture model, latency survival regressions given the event type are performed through a class of semiparametric models that encompasses the proportional hazards model and the proportional odds model, allowing for time-dependent covariates. The marginal proportions of the occurrences of cause-specific events are assessed by a multinomial logistic model. Our mixture modeling approach is advantageous in that it makes a joint estimation of model parameters associated with all competing risks under consideration, satisfying the constraint that the cumulative probability of failing from any cause adds up to one given any covariates. We develop a novel maximum likelihood scheme based on semiparametric regression analysis that facilitates efficient and reliable estimation. Statistical inferences can be conveniently made from the inverse of the observed information matrix. We establish the consistency and asymptotic normality of the proposed estimators. We validate small sample properties with simulations and demonstrate the methodology with a data set from a study of follicular lymphoma. © 2014, The International Biometric Society.
Bayesian inference in processing experimental data: principles and basic applications
International Nuclear Information System (INIS)
D'Agostini, G
2003-01-01
This paper introduces general ideas and some basic methods of the Bayesian probability theory applied to physics measurements. Our aim is to make the reader familiar, through examples rather than rigorous formalism, with concepts such as the following: model comparison (including the automatic Ockham's Razor filter provided by the Bayesian approach); parametric inference; quantification of the uncertainty about the value of physical quantities, also taking into account systematic effects; role of marginalization; posterior characterization; predictive distributions; hierarchical modelling and hyperparameters; Gaussian approximation of the posterior and recovery of conventional methods, especially maximum likelihood and chi-square fits under well-defined conditions; conjugate priors, transformation invariance and maximum entropy motivated priors; and Monte Carlo (MC) estimates of expectation, including a short introduction to Markov Chain MC methods
The Multivariate Generalised von Mises Distribution: Inference and Applications
DEFF Research Database (Denmark)
Navarro, Alexandre Khae Wu; Frellsen, Jes; Turner, Richard
2017-01-01
Circular variables arise in a multitude of data-modelling contexts ranging from robotics to the social sciences, but they have been largely overlooked by the machine learning community. This paper partially redresses this imbalance by extending some standard probabilistic modelling tools to the c... These models can leverage standard modelling tools (e.g. kernel functions and automatic relevance determination). Third, we show that the posterior distribution in these models is a mGvM distribution which enables development of an efficient variational free-energy scheme for performing approximate inference... and approximate maximum-likelihood learning...
Applying exclusion likelihoods from LHC searches to extended Higgs sectors
International Nuclear Information System (INIS)
Bechtle, Philip; Heinemeyer, Sven; Staal, Oscar; Stefaniak, Tim; Weiglein, Georg
2015-01-01
LHC searches for non-standard Higgs bosons decaying into tau lepton pairs constitute a sensitive experimental probe for physics beyond the Standard Model (BSM), such as supersymmetry (SUSY). Recently, the limits obtained from these searches have been presented by the CMS collaboration in a nearly model-independent fashion - as a narrow resonance model - based on the full 8 TeV dataset. In addition to publishing a 95% C.L. exclusion limit, the full likelihood information for the narrow resonance model has been released. This provides valuable information that can be incorporated into global BSM fits. We present a simple algorithm that maps an arbitrary model with multiple neutral Higgs bosons onto the narrow resonance model and derives the corresponding value for the exclusion likelihood from the CMS search. This procedure has been implemented into the public computer code HiggsBounds (version 4.2.0 and higher). We validate our implementation by cross-checking against the official CMS exclusion contours in three Higgs benchmark scenarios in the Minimal Supersymmetric Standard Model (MSSM), and find very good agreement. Going beyond validation, we discuss the combined constraints of the ττ search and the rate measurements of the SM-like Higgs at 125 GeV in a recently proposed MSSM benchmark scenario, where the lightest Higgs boson obtains SM-like couplings independently of the decoupling of the heavier Higgs states. Technical details for how to access the likelihood information within HiggsBounds are given in the appendix. The program is available at http://higgsbounds.hepforge.org. (orig.)
Elaboration likelihood and the perceived value of labels
DEFF Research Database (Denmark)
Poulsen, Carsten Stig; Juhl, Hans Jørn
2001-01-01
In this paper the increasingly popular method of choice based on conjoint analysis is used and data are collected by pairwise comparisons. A latent class model is formulated allowing that the resulting data can be analyzed with segmentation in mind. The empirical study is on food labeling...
Australian food life style segments and elaboration likelihood differences
DEFF Research Database (Denmark)
Brunsø, Karen; Reid, Mike
As the global food marketing environment becomes more competitive, the international and comparative perspective of consumers' attitudes and behaviours becomes more important for both practitioners and academics. This research employs the Food-Related Life Style (FRL) instrument in Australia...... in order to 1) determine Australian Life Style Segments and compare these with their European counterparts, and to 2) explore differences in elaboration likelihood among the Australian segments, e.g. consumers' interest and motivation to perceive product related communication. The results provide new...
Maximum-likelihood method for numerical inversion of Mellin transform
International Nuclear Information System (INIS)
Iqbal, M.
1997-01-01
A method is described for inverting the Mellin transform. It uses an expansion in Laguerre polynomials and converts the Mellin transform to a Laplace transform; the maximum-likelihood regularization method is then used to recover the original function of the Mellin transform. The performance of the method is illustrated by the inversion of test functions available in the literature (J. Inst. Math. Appl., 20 (1977) 73; Math. Comput., 53 (1989) 589). The effectiveness of the method is demonstrated by means of tables and diagrams.
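The link between the Mellin and Laplace transforms mentioned in the abstract can be made concrete with the textbook change of variables below. This is a sketch of the standard identity only; the paper's own route through the Laguerre expansion may differ, and the symbols M, f and g are notation assumed here, not taken from the source.

```latex
\begin{align*}
M(s) &= \int_0^{\infty} x^{s-1} f(x)\,dx
  && \text{(Mellin transform of } f\text{)} \\
     &= \int_{-\infty}^{\infty} e^{-st}\, f(e^{-t})\,dt
  && \text{(substituting } x = e^{-t},\ dx = -e^{-t}\,dt\text{)} \\
     &= \int_{-\infty}^{\infty} e^{-st}\, g(t)\,dt,
  && g(t) := f(e^{-t}).
\end{align*}
```

The last line is a two-sided Laplace transform of g, so recovering g by a Laplace inversion scheme and setting f(x) = g(-ln x) recovers the original function.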
How to Improve the Likelihood of CDM Approval?
DEFF Research Database (Denmark)
Brandt, Urs Steiner; Svendsen, Gert Tinggaard
2014-01-01
How can the likelihood of Clean Development Mechanism (CDM) approval be improved in the face of institutional shortcomings? To answer this question, we focus on the three institutional shortcomings of income sharing, risk sharing and corruption prevention concerning afforestation/reforestation (A....../R). Furthermore, three main stakeholders are identified, namely investors, governments and agents in a principal-agent model regarding monitoring and enforcement capacity. Developing countries such as West Africa have, despite huge potentials, not been integrated in A/R CDM projects yet. Remote sensing, however...
Maximum Likelihood and Bayes Estimation in Randomly Censored Geometric Distribution
Directory of Open Access Journals (Sweden)
Hare Krishna
2017-01-01
In this article, we study the geometric distribution under randomly censored data. Maximum likelihood estimators and confidence intervals based on the Fisher information matrix are derived for the unknown parameters with randomly censored data. Bayes estimators are also developed using beta priors under generalized entropy and LINEX loss functions. Bayesian credible and highest posterior density (HPD) credible intervals are also obtained for the parameters. Expected time on test and reliability characteristics are also analyzed in this article. To compare the various estimates developed in the article, a Monte Carlo simulation study is carried out. Finally, for illustration purposes, a randomly censored real data set is discussed.
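A minimal sketch of maximum likelihood for censored geometric data follows. It is a simplification, not the article's model: it assumes lifetimes on {1, 2, ...} with P(X = x) = p(1 - p)^(x - 1), treats a unit censored at time t as contributing P(X > t) = (1 - p)^t, and ignores the censoring distribution's own parameter. The function name is hypothetical. Under these assumptions the log-likelihood is D log p + (T - D) log(1 - p), where D is the number of observed failures and T the total time on test, so the estimate is D / T and the observed information gives a Wald standard error.

```python
import math


def geometric_censored_mle(times, observed):
    """MLE of the geometric parameter p from right-censored data.

    times[i]    : observed time t_i = min(lifetime, censoring time), t_i >= 1
    observed[i] : True if the failure was seen, False if censored (X > t_i)

    Assumed model (illustration only): P(X = x) = p * (1 - p) ** (x - 1).
    Returns (p_hat, standard_error).
    """
    if len(times) != len(observed) or not times:
        raise ValueError("times and observed must be non-empty and equal length")
    if any(t < 1 for t in times):
        raise ValueError("times must be >= 1 under the assumed support")
    events = sum(1 for d in observed if d)      # D: number of uncensored units
    total_time = sum(times)                     # T: total time on test
    if events == 0:
        raise ValueError("need at least one uncensored observation")
    p_hat = events / total_time
    # Inverse observed information at p_hat: p^2 (1 - p) / D
    std_err = math.sqrt(p_hat ** 2 * (1.0 - p_hat) / events)
    return p_hat, std_err


if __name__ == "__main__":
    p_hat, se = geometric_censored_mle([3, 1, 4, 2, 5], [True, True, False, True, False])
    print(p_hat, (p_hat - 1.96 * se, p_hat + 1.96 * se))
```

The Wald interval printed at the end can fall outside (0, 1) for small samples; it is shown only to connect the estimate to the information-based intervals the abstract mentions.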
Elemental composition of cosmic rays using a maximum likelihood method
International Nuclear Information System (INIS)
Ruddick, K.
1996-01-01
We present a progress report on our attempts to determine the composition of cosmic rays in the knee region of the energy spectrum. We have used three different devices to measure properties of the extensive air showers produced by primary cosmic rays: the Soudan 2 underground detector measures the muon flux deep underground, a proportional tube array samples shower density at the surface of the earth, and a Cherenkov array observes light produced high in the atmosphere. We have begun maximum likelihood fits to these measurements with the hope of determining the nuclear mass number A on an event by event basis. (orig.)
Process criticality accident likelihoods, consequences and emergency planning
International Nuclear Information System (INIS)
McLaughlin, T.P.
1992-01-01
Evaluation of criticality accident risks in the processing of significant quantities of fissile materials is both complex and subjective, largely due to the lack of accident statistics. Thus, complying with national and international standards and regulations that require an evaluation of the net benefit of a criticality accident alarm system is also subjective. A review of guidance found in the literature on potential accident magnitudes is presented for different material forms and arrangements. Reasoned arguments are also presented concerning accident prevention and accident likelihoods for these material forms and arrangements. (Author)
Likelihood Estimation of Gamma Ray Bursts Duration Distribution
Horvath, Istvan
2005-01-01
Two classes of Gamma Ray Bursts have been identified so far, characterized by T90 durations shorter and longer than approximately 2 seconds. It was shown that the BATSE 3B data allow a good fit with three Gaussian distributions in log T90. In the same volume of ApJ, another paper suggested that a third class of GRBs may exist. Using the full BATSE catalog, we present here the maximum likelihood estimation, which gives a 0.5% probability of there being only two subclasses. The MC simulation co...
Process criticality accident likelihoods, consequences, and emergency planning
Energy Technology Data Exchange (ETDEWEB)
McLaughlin, T.P.
1991-01-01
Evaluation of criticality accident risks in the processing of significant quantities of fissile materials is both complex and subjective, largely due to the lack of accident statistics. Thus, complying with standards such as ISO 7753, which mandates that the need for an alarm system be evaluated, is also subjective. A review of guidance found in the literature on potential accident magnitudes is presented for different material forms and arrangements. Reasoned arguments are also presented concerning accident prevention and accident likelihoods for these material forms and arrangements. 13 refs., 1 fig., 1 tab.
Improved Likelihood Function in Particle-based IR Eye Tracking
DEFF Research Database (Denmark)
Satria, R.; Sorensen, J.; Hammoud, R.
2005-01-01
In this paper we propose a log likelihood-ratio function of foreground and background models used in a particle filter to track the eye region in dark-bright pupil image sequences. This model fuses information from both dark and bright pupil images and their difference image into one model. Our...... enhanced tracker overcomes the issues of prior selection of static thresholds during the detection of feature observations in the bright-dark difference images. The auto-initialization process is performed using a cascaded classifier trained with AdaBoost and adapted to IR eye images. Experiments show good...
Estimating likelihood of future crashes for crash-prone drivers
Subasish Das; Xiaoduan Sun; Fan Wang; Charles Leboeuf
2015-01-01
At-fault crash-prone drivers are usually considered as the high risk group for possible future incidents or crashes. In Louisiana, 34% of crashes are repeatedly committed by the at-fault crash-prone drivers who represent only 5% of the total licensed drivers in the state. This research has conducted an exploratory data analysis based on the driver faultiness and proneness. The objective of this study is to develop a crash prediction model to estimate the likelihood of future crashes for the a...
Similar tests and the standardized log likelihood ratio statistic
DEFF Research Database (Denmark)
Jensen, Jens Ledet
1986-01-01
When testing an affine hypothesis in an exponential family the 'ideal' procedure is to calculate the exact similar test, or an approximation to this, based on the conditional distribution given the minimal sufficient statistic under the null hypothesis. In contrast to this there is a 'primitive......' approach in which the marginal distribution of a test statistic is considered and any nuisance parameter appearing in the test statistic is replaced by an estimate. We show here that when using standardized likelihood ratio statistics the 'primitive' procedure is in fact an 'ideal' procedure to order O(n -3...
Maximum Likelihood Joint Tracking and Association in Strong Clutter
Directory of Open Access Journals (Sweden)
Leonid I. Perlovsky
2013-01-01
We have developed a maximum likelihood formulation for a joint detection, tracking and association problem. An efficient non-combinatorial algorithm for this problem is developed for the case of strong clutter in radar data. By using an iterative procedure of the dynamic logic process “from vague-to-crisp” explained in the paper, the new tracker overcomes the combinatorial complexity of tracking in highly cluttered scenarios and results in an orders-of-magnitude improvement in signal-to-clutter ratio.
Inference of R0 and transmission heterogeneity from the size distribution of stuttering chains.
Directory of Open Access Journals (Sweden)
Seth Blumberg
For many infectious disease processes such as emerging zoonoses and vaccine-preventable diseases, R0 < 1 and infections occur as self-limited stuttering transmission chains. A mechanistic understanding of transmission is essential for characterizing the risk of emerging diseases and monitoring spatio-temporal dynamics. Thus methods for inferring R0 and the degree of heterogeneity in transmission from stuttering chain data have important applications in disease surveillance and management. Previous researchers have used chain size distributions to infer R0, but estimation of the degree of individual-level variation in infectiousness (as quantified by the dispersion parameter, k) has typically required contact tracing data. Utilizing branching process theory along with a negative binomial offspring distribution, we demonstrate how maximum likelihood estimation can be applied to chain size data to infer both R0 and the dispersion parameter that characterizes heterogeneity. While the maximum likelihood value for R0 is a simple function of the average chain size, the associated confidence intervals are dependent on the inferred degree of transmission heterogeneity. As demonstrated for monkeypox data from the Democratic Republic of Congo, this impacts when a statistically significant change in R0 is detectable. In addition, by allowing for superspreading events, inference of k shifts the threshold above which a transmission chain should be considered anomalously large for a given value of R0 (thus reducing the probability of false alarms about pathogen adaptation). Our analysis of monkeypox also clarifies the various ways that imperfect observation can impact inference of transmission parameters, and highlights the need to quantitatively evaluate whether observation is likely to significantly bias results.
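The statement that the maximum likelihood value of R0 is a simple function of the average chain size can be sketched as follows. The code assumes the standard subcritical branching-process result that the mean chain size is 1 / (1 - R0), so the estimate is 1 - 1 / (mean size), and it assumes perfectly observed chains each started by one primary case. The chain-size probability below is the usual closed form for a negative binomial offspring distribution with mean R0 and dispersion k; it is written from general branching-process theory and should be checked against the paper before use. All function names are hypothetical.

```python
import math


def r0_mle(chain_sizes):
    """Point estimate of R0 from chain sizes: 1 - 1 / (average chain size)."""
    if not chain_sizes or any(s < 1 for s in chain_sizes):
        raise ValueError("chain sizes must be a non-empty list of integers >= 1")
    mean_size = sum(chain_sizes) / len(chain_sizes)
    return 1.0 - 1.0 / mean_size


def chain_size_log_pmf(j, r0, k):
    """log P(chain size = j) for negative binomial offspring (mean r0, dispersion k)."""
    return (math.lgamma(k * j + j - 1) - math.lgamma(k * j) - math.lgamma(j + 1)
            + (j - 1) * math.log(r0 / k)
            - (k * j + j - 1) * math.log1p(r0 / k))


def log_likelihood(chain_sizes, r0, k):
    """Joint log-likelihood of independent chains under the assumed model."""
    return sum(chain_size_log_pmf(j, r0, k) for j in chain_sizes)


if __name__ == "__main__":
    sizes = [1, 1, 2, 4]
    print("R0 estimate:", r0_mle(sizes))
    # The likelihood surface in k shows how heterogeneity is inferred jointly.
    for k in (0.1, 0.5, 1.0, 5.0):
        print(k, log_likelihood(sizes, r0_mle(sizes), k))
```

Under this model the maximizing R0 does not depend on k, while the curvature of the likelihood (and hence any confidence interval) does, which matches the abstract's remark about interval width.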
Epistemology and Empirical Investigation
DEFF Research Database (Denmark)
Ahlström, Kristoffer
2008-01-01
Recently, Hilary Kornblith has argued that epistemological investigation is substantially empirical. In the present paper, I will first show that his claim is not contingent upon the further and, admittedly, controversial assumption that all objects of epistemological investigation are natural kinds....... Then, I will argue that, contrary to what Kornblith seems to assume, this methodological contention does not imply that there is no need for attending to our epistemic concepts in epistemology. Understanding the make-up of our concepts and, in particular, the purposes they fill, is necessary...
Likelihood Approximation With Parallel Hierarchical Matrices For Large Spatial Datasets
Litvinenko, Alexander
2017-11-01
The main goal of this article is to introduce the parallel hierarchical matrix library HLIBpro to the statistical community. We describe the HLIBCov package, which is an extension of the HLIBpro library for approximating large covariance matrices and maximizing likelihood functions. We show that an approximate Cholesky factorization of a dense matrix of size $2M \times 2M$ can be computed on a modern multi-core desktop in a few minutes. Further, HLIBCov is used for estimating unknown parameters such as the covariance length, variance and smoothness parameter of a Matérn covariance function by maximizing the joint Gaussian log-likelihood function. The computational bottleneck here is expensive linear algebra arithmetic due to large and dense covariance matrices. Therefore covariance matrices are approximated in the hierarchical ($\mathcal{H}$-) matrix format with computational cost $\mathcal{O}(k^2 n \log^2 n / p)$ and storage $\mathcal{O}(k n \log n)$, where the rank $k$ is a small integer (typically $k<25$), $p$ the number of cores and $n$ the number of locations on a fairly general mesh. We demonstrate a synthetic example, where the true values of the parameters are known. For reproducibility we provide the C++ code, the documentation, and the synthetic data.
Likelihood Approximation With Parallel Hierarchical Matrices For Large Spatial Datasets
Litvinenko, Alexander; Sun, Ying; Genton, Marc G.; Keyes, David E.
2017-01-01
The main goal of this article is to introduce the parallel hierarchical matrix library HLIBpro to the statistical community. We describe the HLIBCov package, which is an extension of the HLIBpro library for approximating large covariance matrices and maximizing likelihood functions. We show that an approximate Cholesky factorization of a dense matrix of size $2M \times 2M$ can be computed on a modern multi-core desktop in a few minutes. Further, HLIBCov is used for estimating unknown parameters such as the covariance length, variance and smoothness parameter of a Matérn covariance function by maximizing the joint Gaussian log-likelihood function. The computational bottleneck here is expensive linear algebra arithmetic due to large and dense covariance matrices. Therefore covariance matrices are approximated in the hierarchical ($\mathcal{H}$-) matrix format with computational cost $\mathcal{O}(k^2 n \log^2 n / p)$ and storage $\mathcal{O}(k n \log n)$, where the rank $k$ is a small integer (typically $k<25$), $p$ the number of cores and $n$ the number of locations on a fairly general mesh. We demonstrate a synthetic example, where the true values of the parameters are known. For reproducibility we provide the C++ code, the documentation, and the synthetic data.
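To make the bottleneck concrete, here is a minimal dense sketch of the joint Gaussian log-likelihood evaluated through a Cholesky factor. It does not use HLIBpro or HLIBCov and does not attempt the hierarchical approximation; it assumes one-dimensional locations and the exponential kernel (the Matérn covariance with smoothness 1/2), and every function name, location and observation in it is hypothetical.

```python
import math

def exp_cov(locs, variance, length):
    """Dense covariance matrix for the exponential kernel (Matern, smoothness 1/2)."""
    return [[variance * math.exp(-abs(a - b) / length) for b in locs] for a in locs]

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][m] * l[j][m] for m in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def gaussian_loglik(z, cov):
    """Zero-mean Gaussian log-likelihood of z, computed from the Cholesky factor."""
    n = len(z)
    l = cholesky(cov)
    v = []  # forward substitution: solve L v = z
    for i in range(n):
        v.append((z[i] - sum(l[i][m] * v[m] for m in range(i))) / l[i][i])
    logdet = 2.0 * sum(math.log(l[i][i]) for i in range(n))
    return -0.5 * (n * math.log(2 * math.pi) + logdet + sum(x * x for x in v))

# Hypothetical data: scan the covariance length and keep the best value.
locs = [0.0, 0.5, 1.0, 1.5, 2.0]
z = [0.3, 0.1, -0.2, -0.4, 0.2]
lengths = [0.1 * i for i in range(1, 31)]
best_length = max(lengths, key=lambda ell: gaussian_loglik(z, exp_cov(locs, 1.0, ell)))
print(best_length)
```

The dense factorization costs on the order of n^3 operations, which is what the hierarchical-matrix format described in the abstract is meant to avoid for large n.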
Superfast maximum-likelihood reconstruction for quantum tomography
Shang, Jiangwei; Zhang, Zhengyun; Ng, Hui Khoon
2017-06-01
Conventional methods for computing maximum-likelihood estimators (MLE) often converge slowly in practical situations, leading to a search for simplifying methods that rely on additional assumptions for their validity. In this work, we provide a fast and reliable algorithm for maximum-likelihood reconstruction that avoids this slow convergence. Our method uses a state-of-the-art convex optimization scheme, an accelerated projected-gradient method, which allows one to accommodate the quantum nature of the problem in a different way than in the standard methods. We demonstrate the power of our approach by comparing its performance with other algorithms for n-qubit state tomography. In particular, an eight-qubit situation that purportedly took weeks of computation time in 2005 can now be completed in under a minute for a single set of data, with far higher accuracy than previously possible. This refutes the common claim that MLE reconstruction is slow and reduces the need for alternative methods that often come with difficult-to-verify assumptions. In fact, recent methods assuming Gaussian statistics or relying on compressed sensing ideas are demonstrably inapplicable for the situation under consideration here. Our algorithm can be applied to general optimization problems over the quantum state space; the philosophy of projected gradients can further be utilized for optimization contexts with general constraints.
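One building block of a projected-gradient scheme over quantum states can be shown without any linear algebra library. Projecting a Hermitian matrix onto the set of density matrices (in Frobenius norm) amounts to diagonalizing it, projecting the eigenvalues onto the probability simplex, and rebuilding the matrix. The sketch below covers only the simplex step, using the usual sort-and-threshold algorithm; it is an illustration with a hypothetical function name, not the authors' implementation, and it omits the eigendecomposition and the acceleration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) == 1}."""
    u = sorted(v, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        running_sum += ui
        t = (running_sum - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the shift from the last index that stays positive
    return [max(x - theta, 0.0) for x in v]

# Eigenvalues of a hypothetical Hermitian iterate that left the state space.
print(project_to_simplex([0.9, 0.4, -0.1]))  # approximately [0.75, 0.25, 0.0]
```

After each gradient step on the log-likelihood, applying this projection to the eigenvalues keeps the iterate positive semidefinite with unit trace.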
Simulation-based marginal likelihood for cluster strong lensing cosmology
Killedar, M.; Borgani, S.; Fabjan, D.; Dolag, K.; Granato, G.; Meneghetti, M.; Planelles, S.; Ragone-Figueroa, C.
2018-01-01
Comparisons between observed and predicted strong lensing properties of galaxy clusters have been routinely used to claim either tension or consistency with Λ cold dark matter cosmology. However, standard approaches to such cosmological tests are unable to quantify the preference for one cosmology over another. We advocate approximating the relevant Bayes factor using a marginal likelihood that is based on the following summary statistic: the posterior probability distribution function for the parameters of the scaling relation between Einstein radii and cluster mass, α and β. We demonstrate, for the first time, a method of estimating the marginal likelihood using the X-ray selected z > 0.5 Massive Cluster Survey clusters as a case in point and employing both N-body and hydrodynamic simulations of clusters. We investigate the uncertainty in this estimate and consequential ability to compare competing cosmologies, which arises from incomplete descriptions of baryonic processes, discrepancies in cluster selection criteria, redshift distribution and dynamical state. The relation between triaxial cluster masses at various overdensities provides a promising alternative to the strong lensing test.
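For reference, the Bayes factor mentioned in this abstract is the ratio of two marginal likelihoods. In the generic form below, the symbols are notational assumptions rather than the authors' notation: $D$ is the summary data, $C_1$ and $C_2$ are the competing cosmologies, and $\theta$ collects the parameters that are integrated out.

```latex
B_{12} = \frac{p(D \mid C_1)}{p(D \mid C_2)},
\qquad
p(D \mid C_i) = \int p(D \mid \theta, C_i)\, p(\theta \mid C_i)\, \mathrm{d}\theta
```

A value of $B_{12}$ above one favours $C_1$, which is the kind of quantified preference that, according to the abstract, standard tension-or-consistency tests cannot provide.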
Risk factors and likelihood of Campylobacter colonization in broiler flocks
Directory of Open Access Journals (Sweden)
SL Kuana
2007-09-01
Campylobacter was investigated in cecal droppings, feces, and cloacal swabs of 22 flocks of 3 to 5 week-old broilers. Risk factors and the likelihood of the presence of this agent in these flocks were determined. Management practices, such as cleaning and disinfection, feeding, drinkers, and litter treatments, were assessed. Results were evaluated using the Odds Ratio (OR) test, and their significance was tested by Fisher's test (p<0.05). A Campylobacter prevalence of 81.8% was found in the broiler flocks (18/22), and within positive flocks, it varied between 85 and 100%. Campylobacter incidence among sample types was homogeneous, being 81.8% in cecal droppings, 80.9% in feces, and 80.4% in cloacal swabs (230). Flocks fed by automatic feeding systems presented higher incidence of Campylobacter as compared to those fed by tube feeders. Litter was reused in 63.6% of the farms, and, despite the lack of statistical significance, there was higher likelihood of Campylobacter incidence when litter was reused. Foot bath was not used in 45.5% of the flocks, whereas the use of foot bath associated with deficient lime management increased the number of positive flocks, although with no statistical significance. The evaluated parameters were not significantly associated with Campylobacter colonization in the assessed broiler flocks.
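The odds ratio and Fisher's exact test used in this study can be illustrated on a single 2x2 table. The margins below follow the abstract (22 flocks, 18 positive, litter reused in 14 of them, i.e. 63.6%), but the split of positive flocks between the two litter groups is not reported, so the cell counts are hypothetical and the resulting numbers are not the study's results.

```python
from math import comb

def odds_ratio(a, b, c, d):
    """Odds ratio for the table [[a, b], [c, d]]; undefined when b or c is zero."""
    return (a * d) / (b * c)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value: total probability of all tables with the
    same margins that are no more likely than the observed one."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical cells: rows = litter reused / not reused, columns = positive / negative.
a, b, c, d = 12, 2, 6, 2
print(odds_ratio(a, b, c, d), fisher_exact_two_sided(a, b, c, d))
```

With only 22 flocks, an odds ratio above one can easily sit alongside a large p-value, which matches the pattern the abstract reports: higher likelihood of Campylobacter with reused litter, but no statistical significance.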
Questioning the Elaboration Likelihood Model (ELM) and Rhetorical Theory
Directory of Open Access Journals (Sweden)
Yudi Perbawaningsih
2012-06-01
Abstract: Persuasion is a communication process for establishing or changing attitudes, and it can be understood through the theory of Rhetoric and the Elaboration Likelihood Model (ELM). This study applies these theories to a Public Lecture series intended to persuade students in choosing their concentration of study, a choice based on how they process information. Using a survey method, the study finds that, in terms of persuasion effectiveness, it is not quite relevant to separate the message from its source. The two are fused: the quality of the source is determined by the quality of the message it delivers, and vice versa. Separating the persuasion process into two routes, as described in ELM theory, would therefore not be relevant.
Corporate brand extensions based on the purchase likelihood: governance implications
Directory of Open Access Journals (Sweden)
Spyridon Goumas
2018-03-01
This paper examines the purchase likelihood of hypothetical service brand extensions from product companies focusing on consumer electronics, based on sector categorization and perceptions of fit between the existing product category and the image of the company. Prior research has recognized that levels of brand knowledge ease the transference of associations and affect to the new products. Similarity to the existing products of the parent company and perceived image also influence the success of brand extensions. However, sector categorization may interfere with this relationship. The purpose of this study is to examine Greek consumers’ attitudes towards hypothetical brand extensions, and how these are affected by consumers’ existing knowledge about the brand, sector categorization and perceptions of image and category fit of cross-sector extensions. This aim is examined in the context of technological categories, where less-known companies exhibited significance in purchase likelihood and, contrary to the existing literature, service companies did not perform as positively as expected. Additional insights into sector categorization are added to the existing literature. The effect of both image and category fit is also examined, and predictions regarding the effect of each are made.
Gauging the likelihood of stable cavitation from ultrasound contrast agents.
Bader, Kenneth B; Holland, Christy K
2013-01-07
The mechanical index (MI) was formulated to gauge the likelihood of adverse bioeffects from inertial cavitation. However, the MI formulation did not consider bubble activity from stable cavitation. This type of bubble activity can be readily nucleated from ultrasound contrast agents (UCAs) and has the potential to promote beneficial bioeffects. Here, the presence of stable cavitation is determined numerically by tracking the onset of subharmonic oscillations within a population of bubbles for frequencies up to 7 MHz and peak rarefactional pressures up to 3 MPa. In addition, the acoustic pressure rupture threshold of a UCA population was determined using the Marmottant model. The threshold for subharmonic emissions of optimally sized bubbles was found to be lower than the inertial cavitation threshold for all frequencies studied. The rupture thresholds of optimally sized UCAs were found to be lower than the threshold for subharmonic emissions for either single cycle or steady state acoustic excitations. Because the thresholds of both subharmonic emissions and UCA rupture are linearly dependent on frequency, an index of the form $I_{\mathrm{CAV}} = P_r/f$ (where $P_r$ is the peak rarefactional pressure in MPa and $f$ is the frequency in MHz) was derived to gauge the likelihood of subharmonic emissions due to stable cavitation activity nucleated from UCAs.
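The two indices differ only in how they scale with frequency, which a short sketch makes explicit. The cavitation index follows the formula in the abstract; the mechanical index uses its standard definition (peak rarefactional pressure in MPa divided by the square root of the frequency in MHz). The function names and the example exposure are hypothetical, and no threshold values are implied.

```python
import math

def mechanical_index(p_r_mpa, f_mhz):
    """Standard MI: peak rarefactional pressure (MPa) over sqrt(frequency in MHz)."""
    return p_r_mpa / math.sqrt(f_mhz)

def cavitation_index(p_r_mpa, f_mhz):
    """Index from the abstract: peak rarefactional pressure (MPa) over frequency (MHz)."""
    return p_r_mpa / f_mhz

# Hypothetical exposure: 2 MPa peak rarefactional pressure at 4 MHz.
print(mechanical_index(2.0, 4.0), cavitation_index(2.0, 4.0))  # 1.0 0.5
```

Because the stable-cavitation thresholds are reported to be linear in frequency, dividing by f rather than by its square root keeps the index constant along a threshold curve.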
Safe semi-supervised learning based on weighted likelihood.
Kawakita, Masanori; Takeuchi, Jun'ichi
2014-05-01
We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n' unlabeled data are available in addition to n labeled data. However, almost all of the previous semi-supervised methods require additional assumptions (not only unlabeled data) to make improvements on supervised learning. If such assumptions are not met, then the methods possibly perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n'→∞ and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wide range of situations as long as n≤n'. Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n'
Empirical microeconomics action functionals
Baaquie, Belal E.; Du, Xin; Tanputraman, Winson
2015-06-01
A statistical generalization of microeconomics has been made in Baaquie (2013), where the market price of every traded commodity, at each instant of time, is considered to be an independent random variable. The dynamics of commodity market prices is modeled by an action functional, and the focus of this paper is to empirically determine the action functionals for different commodities. The correlation functions of the model are defined using a Feynman path integral. The model is calibrated using the unequal time correlation of the market commodity prices as well as their cubic and quartic moments using a perturbation expansion. The consistency of the perturbation expansion is verified by a numerical evaluation of the path integral. Nine commodities drawn from the energy, metal and grain sectors are studied and their market behavior is described by the model to an accuracy of over 90% using only six parameters. The paper empirically establishes the existence of the action functional for commodity prices that was postulated to exist in Baaquie (2013).
Maximum-Entropy Inference with a Programmable Annealer
Chancellor, Nicholas; Szoke, Szilard; Vinci, Walter; Aeppli, Gabriel; Warburton, Paul A.
2016-03-01
Optimisation problems typically involve finding the ground state (i.e. the minimum energy configuration) of a cost function with respect to many variables. If the variables are corrupted by noise then this maximises the likelihood that the solution is correct. The maximum entropy solution on the other hand takes the form of a Boltzmann distribution over the ground and excited states of the cost function to correct for noise. Here we use a programmable annealer for the information decoding problem which we simulate as a random Ising model in a field. We show experimentally that finite temperature maximum entropy decoding can give slightly better bit-error-rates than the maximum likelihood approach, confirming that useful information can be extracted from the excited states of the annealer. Furthermore we introduce a bit-by-bit analytical method which is agnostic to the specific application and use it to show that the annealer samples from a highly Boltzmann-like distribution. Machines of this kind are therefore candidates for use in a variety of machine learning applications which exploit maximum entropy inference, including language processing and image recognition.
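The difference between the two decoding rules can be shown by brute force on a tiny Ising model. This is a sketch under assumed conventions (energy E(s) = -sum_i h_i s_i - sum_{i<j} J_ij s_i s_j, spins in {-1, +1}); the fields, couplings and function names are hypothetical, and exhaustive enumeration stands in for the annealer.

```python
import itertools
import math

def energy(s, h, J):
    """Ising energy with local fields h and couplings J[(i, j)] (assumed sign convention)."""
    e = -sum(hi * si for hi, si in zip(h, s))
    e -= sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())
    return e

def decode(h, J, beta):
    """Return (maximum likelihood decoding, maximum entropy decoding, magnetizations)."""
    n = len(h)
    states = list(itertools.product((-1, 1), repeat=n))
    energies = [energy(s, h, J) for s in states]
    e_min = min(energies)
    ground = states[energies.index(e_min)]  # maximum likelihood: the ground state
    weights = [math.exp(-beta * (e - e_min)) for e in energies]  # Boltzmann weights
    z = sum(weights)
    mags = [sum(w * s[i] for w, s in zip(weights, states)) / z for i in range(n)]
    maxent = tuple(1 if m >= 0 else -1 for m in mags)  # bit-wise sign of the thermal average
    return ground, maxent, mags

# Hypothetical three-spin chain with noisy fields.
h = [1.0, 0.1, 0.1]
J = {(0, 1): 1.0, (1, 2): 1.0}
print(decode(h, J, beta=1.0))
```

At large beta the Boltzmann weights concentrate on the ground state and the two decodings coincide; at finite temperature the excited states contribute to each bit's average, which is the information the abstract says can improve bit-error rates.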
Sweller, Naomi; Hayes, Brett K
2010-08-01
Three studies examined how task demands that impact on attention to typical or atypical category features shape the category representations formed through classification learning and inference learning. During training categories were learned via exemplar classification or by inferring missing exemplar features. In the latter condition inferences were made about missing typical features alone (typical feature inference) or about both missing typical and atypical features (mixed feature inference). Classification and mixed feature inference led to the incorporation of typical and atypical features into category representations, with both kinds of features influencing inferences about familiar (Experiments 1 and 2) and novel (Experiment 3) test items. Those in the typical inference condition focused primarily on typical features. Together with formal modelling, these results challenge previous accounts that have characterized inference learning as producing a focus on typical category features. The results show that two different kinds of inference learning are possible and that these are subserved by different kinds of category representations.
A combinatorial perspective of the protein inference problem.
Yang, Chao; He, Zengyou; Yu, Weichuan
2013-01-01
In a shotgun proteomics experiment, proteins are the most biologically meaningful output. The success of proteomics studies depends on the ability to accurately and efficiently identify proteins. Many methods have been proposed to facilitate the identification of proteins from peptide identification results. However, the relationship between protein identification and peptide identification has not been thoroughly explained before. In this paper, we devote ourselves to a combinatorial perspective of the protein inference problem. We employ combinatorial mathematics to calculate the conditional protein probabilities (protein probability means the probability that a protein is correctly identified) under three assumptions, which lead to a lower bound, an upper bound, and an empirical estimation of protein probabilities, respectively. The combinatorial perspective enables us to obtain an analytical expression for protein inference. Our method achieves comparable results with ProteinProphet in a more efficient manner in experiments on two data sets of standard protein mixtures and two data sets of real samples. Based on our model, we study the impact of unique peptides and degenerate peptides (degenerate peptides are peptides shared by at least two proteins) on protein probabilities. Meanwhile, we also study the relationship between our model and ProteinProphet. We name our program ProteinInfer. Its Java source code, our supplementary document and experimental results are available at: http://bioinformatics.ust.hk/proteininfer.
Generative inference for cultural evolution.
Kandler, Anne; Powell, Adam
2018-04-05
One of the major challenges in cultural evolution is to understand why and how various forms of social learning are used in human populations, both now and in the past. To date, much of the theoretical work on social learning has been done in isolation of data, and consequently many insights focus on revealing the learning processes or the distributions of cultural variants that are expected to have evolved in human populations. In population genetics, recent methodological advances have allowed a greater understanding of the explicit demographic and/or selection mechanisms that underlie observed allele frequency distributions across the globe, and their change through time. In particular, generative frameworks, often using coalescent-based simulation coupled with approximate Bayesian computation (ABC), have provided robust inferences on the human past, with no reliance on a priori assumptions of equilibrium. Here, we demonstrate the applicability and utility of generative inference approaches to the field of cultural evolution. The framework advocated here uses observed population-level frequency data directly to establish the likely presence or absence of particular hypothesized learning strategies. In this context, we discuss the problem of equifinality and argue that, in the light of sparse cultural data and the multiplicity of possible social learning processes, the exclusion of those processes inconsistent with the observed data might be the most instructive outcome. Finally, we summarize the findings of generative inference approaches applied to a number of case studies. This article is part of the theme issue 'Bridging cultural gaps: interdisciplinary studies in human cultural evolution'.
Using DNA fingerprints to infer familial relationships within NHANES III households.
Katki, Hormuzd A; Sanders, Christopher L; Graubard, Barry I; Bergen, Andrew W
2010-06-01
Developing, targeting, and evaluating genomic strategies for population-based disease prevention require population-based data. In response to this urgent need, genotyping has been conducted within the Third National Health and Nutrition Examination Survey (NHANES III), the nationally-representative household-interview health survey in the U.S. However, before these genetic analyses can occur, family relationships within households must be accurately ascertained. Unfortunately, reported family relationships within NHANES III households based on questionnaire data are incomplete and inconclusive with regard to the actual biological relatedness of family members. We inferred family relationships within households using DNA fingerprints (Identifiler®) that contain the DNA loci used by law enforcement agencies for forensic identification of individuals. However, performance of these loci for relationship inference is not well understood. We evaluated two competing statistical methods for relationship inference on pairs of household members: an exact likelihood ratio relying on allele frequencies, and an Identical By State (IBS) likelihood ratio that only requires matching alleles. We modified these methods to account for genotyping errors and population substructure. The two methods usually agree on the rankings of the most likely relationships. However, the IBS method underestimates the likelihood ratio by not accounting for the informativeness of matching rare alleles. The likelihood ratio is sensitive to estimates of population substructure, and parent-child relationships are sensitive to the specified genotyping error rate. These loci were unable to distinguish second-degree relationships and cousins from being unrelated. The genetic data is also useful for verifying reported relationships and identifying data quality issues. An important by-product is the first explicitly nationally-representative estimates of allele frequencies at these ubiquitous forensic loci.
What 'empirical turn in bioethics'?
Hurst, Samia
2010-10-01
Uncertainty as to how we should articulate empirical data and normative reasoning seems to underlie most difficulties regarding the 'empirical turn' in bioethics. This article examines three different ways in which we could understand 'empirical turn'. Using real facts in normative reasoning is trivial and would not represent a 'turn'. Becoming an empirical discipline through a shift to the social and neurosciences would be a turn away from normative thinking, which we should not take. Conducting empirical research to inform normative reasoning is the usual meaning given to the term 'empirical turn'. In this sense, however, the turn is incomplete. Bioethics has imported methodological tools from empirical disciplines, but too often it has not imported the standards to which researchers in these disciplines are held. Integrating empirical and normative approaches also represents true added difficulties. Addressing these issues from the standpoint of debates on the fact-value distinction can cloud very real methodological concerns by displacing the debate to a level of abstraction where they need not be apparent. Ideally, empirical research in bioethics should meet standards for empirical and normative validity similar to those used in the source disciplines for these methods, and articulate these aspects clearly and appropriately. More modestly, criteria to ensure that none of these standards are completely left aside would improve the quality of empirical bioethics research and partly clear the air of critiques addressing its theoretical justification, when its rigour in the particularly difficult context of interdisciplinarity is what should be at stake.
sick: The Spectroscopic Inference Crank
Casey, Andrew R.
2016-03-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
SICK: THE SPECTROSCOPIC INFERENCE CRANK
Energy Technology Data Exchange (ETDEWEB)
Casey, Andrew R., E-mail: arc@ast.cam.ac.uk [Institute of Astronomy, University of Cambridge, Madingley Road, Cambridge, CB3 0HA (United Kingdom)
2016-03-15
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
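The conditional intensity function that the first approach builds on can be written down directly for the most common special case. The sketch below assumes an exponential excitation kernel, so that the intensity is mu + sum over past events of alpha * exp(-beta * (t - t_i)); this kernel, the parameter names and the event times are assumptions for illustration, not details taken from the paper. The log-likelihood it returns is the quantity an MCMC sampler would combine with a prior.

```python
import math

def intensity(t, events, mu, alpha, beta):
    """Conditional intensity at time t given the events strictly before t."""
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti < t)

def log_likelihood(events, t_end, mu, alpha, beta):
    """Log-likelihood of event times observed on [0, t_end]:
    sum of log-intensities at the events minus the integrated intensity."""
    log_term = sum(math.log(intensity(ti, events, mu, alpha, beta)) for ti in events)
    compensator = mu * t_end + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (t_end - ti)) for ti in events)
    return log_term - compensator

# Hypothetical event times on the window [0, 10].
events = [0.8, 1.1, 1.3, 4.0, 4.2, 7.5]
print(log_likelihood(events, 10.0, mu=0.3, alpha=0.8, beta=2.0))
```

Setting alpha to zero removes the self-excitation and recovers the homogeneous Poisson log-likelihood, which is a convenient sanity check. The double loop makes this quadratic in the number of events.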
Inference in hybrid Bayesian networks
International Nuclear Information System (INIS)
Langseth, Helge; Nielsen, Thomas D.; Rumi, Rafael; Salmeron, Antonio
2009-01-01
Since the 1980s, Bayesian networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability techniques (like fault trees and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (the so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability.
SICK: THE SPECTROSCOPIC INFERENCE CRANK
International Nuclear Information System (INIS)
Casey, Andrew R.
2016-01-01
There exists an inordinate amount of spectral data in both public and private astronomical archives that remain severely under-utilized. The lack of reliable open-source tools for analyzing large volumes of spectra contributes to this situation, which is poised to worsen as large surveys successively release orders of magnitude more spectra. In this article I introduce sick, the spectroscopic inference crank, a flexible and fast Bayesian tool for inferring astrophysical parameters from spectra. sick is agnostic to the wavelength coverage, resolving power, or general data format, allowing any user to easily construct a generative model for their data, regardless of its source. sick can be used to provide a nearest-neighbor estimate of model parameters, a numerically optimized point estimate, or full Markov Chain Monte Carlo sampling of the posterior probability distributions. This generality empowers any astronomer to capitalize on the plethora of published synthetic and observed spectra, and make precise inferences for a host of astrophysical (and nuisance) quantities. Model intensities can be reliably approximated from existing grids of synthetic or observed spectra using linear multi-dimensional interpolation, or a Cannon-based model. Additional phenomena that transform the data (e.g., redshift, rotational broadening, continuum, spectral resolution) are incorporated as free parameters and can be marginalized away. Outlier pixels (e.g., cosmic rays or poorly modeled regimes) can be treated with a Gaussian mixture model, and a noise model is included to account for systematically underestimated variance. Combining these phenomena into a scalar-justified, quantitative model permits precise inferences with credible uncertainties on noisy data. I describe the common model features, the implementation details, and the default behavior, which is balanced to be suitable for most astronomical applications. Using a forward model on low-resolution, high signal
ColliderBit. A GAMBIT module for the calculation of high-energy collider observables and likelihoods
Energy Technology Data Exchange (ETDEWEB)
Balazs, Csaba [Monash University, School of Physics and Astronomy, Melbourne, VIC (Australia); Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); Buckley, Andy [University of Glasgow, SUPA, School of Physics and Astronomy, Glasgow (United Kingdom); Dal, Lars A.; Krislock, Abram; Raklev, Are [University of Oslo, Department of Physics, Oslo (Norway); Farmer, Ben [AlbaNova University Centre, Oskar Klein Centre for Cosmoparticle Physics, Stockholm (Sweden); Jackson, Paul; Murnane, Daniel; White, Martin [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); University of Adelaide, Department of Physics, Adelaide, SA (Australia); Kvellestad, Anders [NORDITA, Stockholm (Sweden); Putze, Antje [Universite de Savoie, LAPTh, Annecy-le-Vieux (France); Rogan, Christopher [Harvard University, Department of Physics, Cambridge, MA (United States); Saavedra, Aldo [Australian Research Council Centre of Excellence for Particle Physics at the Tera-scale (Australia); The University of Sydney, Faculty of Engineering and Information Technologies, Centre for Translational Data Science, School of Physics, Sydney, NSW (Australia); Scott, Pat [Imperial College London, Blackett Laboratory, Department of Physics, London (United Kingdom); Weniger, Christoph [University of Amsterdam, GRAPPA, Institute of Physics, Amsterdam (Netherlands); Collaboration: The GAMBIT Scanner Workgroup
2017-11-15
We describe ColliderBit, a new code for the calculation of high energy collider observables in theories of physics beyond the Standard Model (BSM). ColliderBit features a generic interface to BSM models, a unique parallelised Monte Carlo event generation scheme suitable for large-scale supercomputer applications, and a number of LHC analyses, covering a reasonable range of the BSM signatures currently sought by ATLAS and CMS. ColliderBit also calculates likelihoods for Higgs sector observables, and LEP searches for BSM particles. These features are provided by a combination of new code unique to ColliderBit, and interfaces to existing state-of-the-art public codes. ColliderBit is both an important part of the GAMBIT framework for BSM inference, and a standalone tool for efficiently applying collider constraints to theories of new physics.
The emotion of compassion and the likelihood of its expression in nursing practice.
Newham, Roger Alan
2017-07-01
Philosophical and empirical work on the nature of the emotions is extensive, and there are many theories of emotions. However, all agree that emotions are not knee jerk reactions to stimuli and are open to rational assessment or warrant. This paper's focus is on the condition or conditions for compassion as an emotion and the likelihood that it or they can be met in nursing practice. Thus, it is attempting to keep, as far as possible, compassion as an emotion separate from both moral norms and professional norms. This is because empirical or causal conditions that can make experiencing and acting out of compassion difficult seem especially relevant in nursing practice. I consider how theories of emotion in general and of compassion in particular are somewhat contested, but all recent accounts agree that emotions are not totally immune to reason. Then, using accounts of constitutive conditions of the emotion of compassion, I will show how they are often likely to be quite fragile or unstable in practice and particularly so within much nursing practice. In addition, some of the conditions for compassion will be shown to be problematic for nursing practice. It is difficult to keep ideas of compassion separate from morality, and this connection is noticeable in the claims made of compassion for nursing and so I will briefly highlight one such connection that of the need for normative theory to give an account of the value that emotions such as compassion presume and that compassionate motivation is separate from moral motivation and may conflict with it. The fragility or instability of the emotion of compassion in practice has implications for both what can be expected and what should be expected of compassion; at least if what is wanted is a realist rather than idealist account of "should." © 2016 John Wiley & Sons Ltd.
Nonparametric predictive inference for combining diagnostic tests with parametric copula
Muhammad, Noryanti; Coolen, F. P. A.; Coolen-Maturi, T.
2017-09-01
Measuring the accuracy of diagnostic tests is crucial in many application areas including medicine and health care. The Receiver Operating Characteristic (ROC) curve is a popular statistical tool for describing the performance of diagnostic tests. The area under the ROC curve (AUC) is often used as a measure of the overall performance of the diagnostic test. In this paper, we are interested in developing strategies for combining test results in order to increase the diagnostic accuracy. We introduce nonparametric predictive inference (NPI) for combining two diagnostic test results while taking the dependence structure into account using a parametric copula. NPI is a frequentist statistical framework for inference on a future observation based on past data observations. NPI uses lower and upper probabilities to quantify uncertainty and is based on only a few modelling assumptions. A copula, in turn, is a well-known statistical concept for modelling dependence of random variables: it is a joint distribution function whose marginals are all uniformly distributed, and it can be used to model the dependence separately from the marginal distributions. In this research, we estimate the copula density using a parametric method, namely maximum likelihood estimation (MLE). We investigate the performance of the proposed method via data sets from the literature and discuss results to show how our method performs for different families of copulas. Finally, we briefly outline related challenges and opportunities for future research.
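As a point of reference for the AUC mentioned in this abstract, the following is a minimal sketch of the standard empirical AUC (the Mann-Whitney form) for a single test. It is not the NPI lower/upper probability construction or the copula-based combination the authors describe; the function name and the sample scores are hypothetical, and it assumes that higher scores indicate disease.

```python
def empirical_auc(diseased, healthy):
    """Empirical AUC: the fraction of (diseased, healthy) pairs in which the
    diseased subject has the higher test score, counting ties as one half.
    Assumes higher scores indicate disease (an illustrative convention)."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


# Hypothetical test scores, for illustration only.
auc = empirical_auc([3, 4, 5], [1, 2, 3])
print(round(auc, 4))
```

An AUC of 1.0 corresponds to perfect separation of the two groups and 0.5 to a test that is no better than chance.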
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio; Sawlan, Zaid A; Scavino, Marco; Tempone, Raul
2016-01-01
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Using MOEA with Redistribution and Consensus Branches to Infer Phylogenies.
Min, Xiaoping; Zhang, Mouzhao; Yuan, Sisi; Ge, Shengxiang; Liu, Xiangrong; Zeng, Xiangxiang; Xia, Ningshao
2017-12-26
In recent years, more and more research has focused on using metaheuristics to infer phylogenies, which is an NP-hard problem. Maximum Parsimony and Maximum Likelihood are two effective ways to conduct inference. Based on these methods, which can also be considered as the optimality criteria for phylogenies, various kinds of multi-objective metaheuristics have been used to reconstruct phylogenies. However, combining these two time-consuming methods makes those multi-objective metaheuristics slower than single-objective ones. Therefore, we propose a novel multi-objective optimization algorithm, MOEA-RC, to accelerate the process of rebuilding phylogenies using structural information of elites in current populations. We compare MOEA-RC with two representative multi-objective algorithms, MOEA/D and NSGA-II, and a non-consensus version of MOEA-RC on three real-world datasets. The results show that, within a given number of iterations, MOEA-RC achieves better solutions than the other algorithms.
Bayesian inference from count data using discrete uniform priors.
Directory of Open Access Journals (Sweden)
Federico Comoglio
We consider a set of sample counts obtained by sampling arbitrary fractions of a finite volume containing a homogeneously dispersed population of identical objects. We report a Bayesian derivation of the posterior probability distribution of the population size using a binomial likelihood and non-conjugate, discrete uniform priors under sampling with or without replacement. Our derivation yields a computationally feasible formula that can prove useful in a variety of statistical problems involving absolute quantification under uncertainty. We implemented our algorithm in the R package dupiR and compared it with a previously proposed Bayesian method based on a Gamma prior. As a showcase, we demonstrate that our inference framework can be used to estimate bacterial survival curves from measurements characterized by extremely low or zero counts and rather high sampling fractions. All in all, we provide a versatile, general purpose algorithm to infer population sizes from count data, which can find application in a broad spectrum of biological and physical problems.
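The idea of combining a binomial likelihood with a discrete uniform prior can be illustrated with a minimal single-sample sketch. This is a simplified illustration under stated assumptions, not the formula derived in the paper and not the dupiR interface: it assumes one count `k` observed in a known sampled fraction `r`, and the function name and prior bounds `n_min`, `n_max` are hypothetical.

```python
from math import comb


def posterior_population_size(k, r, n_min, n_max):
    """Posterior P(n | k) over integer population sizes n in [n_min, n_max].

    Assumes a discrete uniform prior on n and a binomial likelihood
    P(k | n) = C(n, k) * r**k * (1 - r)**(n - k), where r is the sampled
    fraction. Sizes below k have zero likelihood and are skipped.
    """
    support = range(max(n_min, k), n_max + 1)
    weights = {n: comb(n, k) * r**k * (1 - r) ** (n - k) for n in support}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}


# Hypothetical measurement: 5 objects counted in 30% of the volume.
post = posterior_population_size(k=5, r=0.3, n_min=0, n_max=500)
map_n = max(post, key=post.get)           # most probable population size
mean_n = sum(n * p for n, p in post.items())
print(map_n, round(mean_n, 2))
```

Because the prior is flat, the posterior is the normalised likelihood over the prior's support, so the choice of `n_max` matters mainly when the count is very low and the likelihood tail is long.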
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2015-01-07
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
Bayesian Inference for Linear Parabolic PDEs with Noisy Boundary Conditions
Ruggeri, Fabrizio
2016-01-06
In this work we develop a hierarchical Bayesian setting to infer unknown parameters in initial-boundary value problems (IBVPs) for one-dimensional linear parabolic partial differential equations. Noisy boundary data and known initial condition are assumed. We derive the likelihood function associated with the forward problem, given some measurements of the solution field subject to Gaussian noise. Such function is then analytically marginalized using the linearity of the equation. Gaussian priors have been assumed for the time-dependent Dirichlet boundary values. Our approach is applied to synthetic data for the one-dimensional heat equation model, where the thermal diffusivity is the unknown parameter. We show how to infer the thermal diffusivity parameter when its prior distribution is lognormal or modeled by means of a space-dependent stationary lognormal random field. We use the Laplace method to provide approximated Gaussian posterior distributions for the thermal diffusivity. Expected information gains and predictive posterior densities for observable quantities are numerically estimated for different experimental setups.
On parametrised cold dense matter equation of state inference
Riley, Thomas E.; Raaijmakers, Geert; Watts, Anna L.
2018-04-01
Constraining the equation of state of cold dense matter in compact stars is a major science goal for observing programmes being conducted using X-ray, radio, and gravitational wave telescopes. We discuss Bayesian hierarchical inference of parametrised dense matter equations of state. In particular we generalise and examine two inference paradigms from the literature: (i) direct posterior equation of state parameter estimation, conditioned on observations of a set of rotating compact stars; and (ii) indirect parameter estimation, via transformation of an intermediary joint posterior distribution of exterior spacetime parameters (such as gravitational masses and coordinate equatorial radii). We conclude that the former paradigm is not only tractable for large-scale analyses, but is principled and flexible from a Bayesian perspective whilst the latter paradigm is not. The thematic problem of Bayesian prior definition emerges as the crux of the difference between these paradigms. The second paradigm should in general only be considered as an ill-defined approach to the problem of utilising archival posterior constraints on exterior spacetime parameters; we advocate for an alternative approach whereby such information is repurposed as an approximative likelihood function. We also discuss why conditioning on a piecewise-polytropic equation of state model - currently standard in the field of dense matter study - can easily violate conditions required for transformation of a probability density distribution between spaces of exterior (spacetime) and interior (source matter) parameters.
Post-model selection inference and model averaging
Directory of Open Access Journals (Sweden)
Georges Nguefack-Tsague
2011-07-01
Although model selection is routinely used in practice nowadays, little is known about its precise effects on any subsequent inference that is carried out. The same goes for the effects induced by the closely related technique of model averaging. This paper is concerned with the use of the same data first to select a model and then to carry out inference, in particular point estimation and point prediction. The properties of the resulting estimator, called a post-model-selection estimator (PMSE), are hard to derive. Using selection criteria such as hypothesis testing, AIC, BIC, HQ and Cp, we illustrate that, in terms of risk function, no single PMSE dominates the others. The same conclusion holds more generally for any penalised likelihood information criterion. We also compare various model averaging schemes and show that no single one dominates the others in terms of risk function. Since PMSEs can be regarded as a special case of model averaging, with 0-1 random-weights, we propose a connection between the two theories, in the frequentist approach, by taking account of the selection procedure when performing model averaging. We illustrate the point by simulating a simple linear regression model.
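The remark that a PMSE is model averaging with 0-1 weights can be made concrete with a short sketch. It uses the widely used Akaike weights as one example of a smooth averaging scheme; whether this particular scheme is among those compared in the paper is an assumption, and the function names and the AIC values and estimates below are hypothetical.

```python
from math import exp


def akaike_weights(aics):
    """Smooth model-averaging weights proportional to exp(-0.5 * delta AIC)."""
    best = min(aics)
    raw = [exp(-0.5 * (a - best)) for a in aics]
    total = sum(raw)
    return [r / total for r in raw]


def selection_weights(aics):
    """0-1 weights: all weight on the minimum-AIC model (post-model selection)."""
    best_index = min(range(len(aics)), key=aics.__getitem__)
    return [1.0 if i == best_index else 0.0 for i in range(len(aics))]


def averaged_estimate(estimates, weights):
    """Weighted combination of the per-model point estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical AIC values and slope estimates from three candidate models.
aics = [102.0, 100.0, 105.0]
slopes = [1.8, 2.1, 2.6]
print(averaged_estimate(slopes, akaike_weights(aics)))     # model averaging
print(averaged_estimate(slopes, selection_weights(aics)))  # PMSE: 2.1
```

In both cases the weights depend on the data through the criterion, which is why the selection step has to be accounted for when studying the risk of the resulting estimator.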
Krishnan, Neeraja M; Seligmann, Hervé; Stewart, Caro-Beth; De Koning, A P Jason; Pollock, David D
2004-10-01
Reconstruction of ancestral DNA and amino acid sequences is an important means of inferring information about past evolutionary events. Such reconstructions suggest changes in molecular function and evolutionary processes over the course of evolution and are used to infer adaptation and convergence. Maximum likelihood (ML) is generally thought to provide relatively accurate reconstructed sequences compared to parsimony, but both methods lead to the inference of multiple directional changes in nucleotide frequencies in primate mitochondrial DNA (mtDNA). To better understand this surprising result, as well as to better understand how parsimony and ML differ, we constructed a series of computationally simple "conditional pathway" methods that differed in the number of substitutions allowed per site along each branch, and we also evaluated the entire Bayesian posterior frequency distribution of reconstructed ancestral states. We analyzed primate mitochondrial cytochrome b (Cyt-b) and cytochrome oxidase subunit I (COI) genes and found that ML reconstructs ancestral frequencies that are often more different from tip sequences than are parsimony reconstructions. In contrast, frequency reconstructions based on the posterior ensemble more closely resemble extant nucleotide frequencies. Simulations indicate that these differences in ancestral sequence inference are probably due to deterministic bias caused by high uncertainty in the optimization-based ancestral reconstruction methods (parsimony, ML, Bayesian maximum a posteriori). In contrast, ancestral nucleotide frequencies based on an average of the Bayesian set of credible ancestral sequences are much less biased. The methods involving simpler conditional pathway calculations have slightly reduced likelihood values compared to full likelihood calculations, but they can provide fairly unbiased nucleotide reconstructions and may be useful in more complex phylogenetic analyses than considered here due to their speed and
Subjective randomness as statistical inference.
Griffiths, Thomas L; Daniels, Dylan; Austerweil, Joseph L; Tenenbaum, Joshua B
2018-06-01
Some events seem more random than others. For example, when tossing a coin, a sequence of eight heads in a row does not seem very random. Where do these intuitions about randomness come from? We argue that subjective randomness can be understood as the result of a statistical inference assessing the evidence that an event provides for having been produced by a random generating process. We show how this account provides a link to previous work relating randomness to algorithmic complexity, in which random events are those that cannot be described by short computer programs. Algorithmic complexity is both incomputable and too general to capture the regularities that people can recognize, but viewing randomness as statistical inference provides two paths to addressing these problems: considering regularities generated by simpler computing machines, and restricting the set of probability distributions that characterize regularity. Building on previous work exploring these different routes to a more restricted notion of randomness, we define strong quantitative models of human randomness judgments that apply not just to binary sequences - which have been the focus of much of the previous work on subjective randomness - but also to binary matrices and spatial clustering. Copyright © 2018 Elsevier Inc. All rights reserved.
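The notion of randomness as evidence for a random generating process can be sketched as a log-likelihood ratio. The alternative "regular" hypothesis used here, a coin with unknown bias under a uniform prior, is an assumption chosen for brevity and is much simpler than the models developed in the paper; the function name is hypothetical.

```python
from math import factorial, log


def randomness_score(seq):
    """Log-likelihood ratio log[P(seq | random) / P(seq | regular)].

    'random' is a fair coin. 'regular' is, as an illustrative assumption,
    a coin with unknown bias and a uniform prior on that bias, whose
    marginal likelihood is h! * t! / (n + 1)!.
    """
    n = len(seq)
    h = seq.count("H")
    t = n - h
    p_random = 0.5 ** n
    p_regular = factorial(h) * factorial(t) / factorial(n + 1)
    return log(p_random / p_regular)


print(randomness_score("HHHHHHHH"))  # negative: evidence against randomness
print(randomness_score("HTHHTTHT"))  # positive: evidence for randomness
```

Under this toy alternative only the number of heads matters, so a perfectly alternating sequence scores as random; capturing such regularities is what motivates the richer hypothesis spaces discussed in the abstract.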
Maximum likelihood positioning algorithm for high-resolution PET scanners
International Nuclear Information System (INIS)
Gross-Weege, Nicolas; Schug, David; Hallen, Patrick; Schulz, Volkmar
2016-01-01
Purpose: In high-resolution positron emission tomography (PET), lightsharing elements are incorporated into typical detector stacks to read out scintillator arrays in which one scintillator element (crystal) is smaller than the size of the readout channel. In order to identify the hit crystal by means of the measured light distribution, a positioning algorithm is required. One commonly applied positioning algorithm uses the center of gravity (COG) of the measured light distribution. The COG algorithm is limited in spatial resolution by noise and intercrystal Compton scatter. The purpose of this work is to develop a positioning algorithm which overcomes this limitation. Methods: The authors present a maximum likelihood (ML) algorithm which compares a set of expected light distributions given by probability density functions (PDFs) with the measured light distribution. Instead of modeling the PDFs by using an analytical model, the PDFs of the proposed ML algorithm are generated assuming a single-gamma-interaction model from measured data. The algorithm was evaluated with a hot-rod phantom measurement acquired with the preclinical HYPERION II D PET scanner. In order to assess the performance with respect to sensitivity, energy resolution, and image quality, the ML algorithm was compared to a COG algorithm which calculates the COG from a restricted set of channels. The authors studied the energy resolution of the ML and the COG algorithm regarding incomplete light distributions (missing channel information caused by detector dead time). Furthermore, the authors investigated the effects of using a filter based on the likelihood values on sensitivity, energy resolution, and image quality. Results: A sensitivity gain of up to 19% was demonstrated in comparison to the COG algorithm for the selected operation parameters. Energy resolution and image quality were on a similar level for both algorithms. Additionally, the authors demonstrated that the performance of the ML
EGG: Empirical Galaxy Generator
Schreiber, C.; Elbaz, D.; Pannella, M.; Merlin, E.; Castellano, M.; Fontana, A.; Bourne, N.; Boutsia, K.; Cullen, F.; Dunlop, J.; Ferguson, H. C.; Michałowski, M. J.; Okumura, K.; Santini, P.; Shu, X. W.; Wang, T.; White, C.
2018-04-01
The Empirical Galaxy Generator (EGG) generates fake galaxy catalogs and images with realistic positions, morphologies and fluxes from the far-ultraviolet to the far-infrared. The catalogs are generated by egg-gencat and stored in binary FITS tables (column oriented). Another program, egg-2skymaker, is used to convert the generated catalog into ASCII tables suitable for ingestion by SkyMaker (ascl:1010.066) to produce realistic high resolution images (e.g., Hubble-like), while egg-gennoise and egg-genmap can be used to generate the low resolution images (e.g., Herschel-like). These tools can be used to test source extraction codes, or to evaluate the reliability of any map-based science (stacking, dropout identification, etc.).
Inference for exponentiated general class of distributions based on record values
Directory of Open Access Journals (Sweden)
Samah N. Sindi
2017-09-01
The main objective of this paper is to suggest and study a new exponentiated general class (EGC) of distributions. Maximum likelihood, Bayesian and empirical Bayesian estimators of the parameter of the EGC of distributions based on lower record values are obtained. Furthermore, Bayesian prediction of future records is considered. Based on lower record values, the exponentiated Weibull distribution, its special cases of distributions and exponentiated Gompertz distribution are applied to the EGC of distributions.
DEFF Research Database (Denmark)
Hooghoudt, Jan Otto; Barroso, Margarida; Waagepetersen, Rasmus Plenge
2017-01-01
Förster resonance energy transfer (FRET) is a quantum-physical phenomenon where energy may be transferred from one molecule to a neighbour molecule if the molecules are close enough. Using fluorophore molecule marking of proteins in a cell it is possible to measure in microscopic images to what....... In this paper we propose a new likelihood-based approach to statistical inference for FRET microscopic data. The likelihood function is obtained from a detailed modeling of the FRET data generating mechanism conditional on a protein configuration. We next follow a Bayesian approach and introduce a spatial point...
Statistical inference for template aging
Schuckers, Michael E.
2006-04-01
A change in classification error rates for a biometric device is often referred to as template aging. Here we offer two methods for determining whether the effect of time is statistically significant. The first of these is the use of a generalized linear model to determine if these error rates change linearly over time. This approach generalizes previous work assessing the impact of covariates using generalized linear models. The second approach uses likelihood ratio test methodology. The focus here is on statistical methods for estimation, not the underlying cause of the change in error rates over time. These methodologies are applied to data from the National Institute of Standards and Technology Biometric Score Set Release 1. The results of these applications are discussed.
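A likelihood ratio test for a change in error rate can be sketched for the simplest case of two time periods with independent binomial error counts. The abstract does not give the exact form of the author's test, so this is an illustrative assumption rather than the method of the paper; the function names and the counts are hypothetical, and the p-value uses the asymptotic chi-square distribution with one degree of freedom.

```python
from math import erfc, log, sqrt


def _xlogy(x, y):
    """x * log(y), with the convention 0 * log(0) = 0."""
    return 0.0 if x == 0 else x * log(y)


def binom_loglik(errors, trials, p):
    """Binomial log-likelihood, omitting the constant binomial coefficient."""
    return _xlogy(errors, p) + _xlogy(trials - errors, 1.0 - p)


def lrt_error_rate_change(e1, n1, e2, n2):
    """Test H0: equal error rates in both periods against H1: different rates.

    Returns the statistic 2 * (loglik_H1 - loglik_H0) and an asymptotic
    p-value from the chi-square distribution with 1 degree of freedom,
    whose survival function is erfc(sqrt(x / 2)).
    """
    p_pooled = (e1 + e2) / (n1 + n2)
    ll_null = binom_loglik(e1, n1, p_pooled) + binom_loglik(e2, n2, p_pooled)
    ll_alt = binom_loglik(e1, n1, e1 / n1) + binom_loglik(e2, n2, e2 / n2)
    stat = max(2.0 * (ll_alt - ll_null), 0.0)  # guard against rounding below 0
    return stat, erfc(sqrt(stat / 2.0))


# Hypothetical counts: 10 errors in 1000 early comparisons, 40 in 1000 later.
stat, p_value = lrt_error_rate_change(10, 1000, 40, 1000)
print(round(stat, 2), p_value)
```

Repeated comparisons involving the same individuals are typically correlated, which this sketch ignores; a realistic analysis of biometric match data would need to account for that dependence.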
Scientific inference learning from data
Vaughan, Simon
2013-01-01
Providing the knowledge and practical experience to begin analysing scientific data, this book is ideal for physical sciences students wishing to improve their data handling skills. The book focuses on explaining and developing the practice and understanding of basic statistical analysis, concentrating on a few core ideas, such as the visual display of information, modelling using the likelihood function, and simulating random data. Key concepts are developed through a combination of graphical explanations, worked examples, example computer code and case studies using real data. Students will develop an understanding of the ideas behind statistical methods and gain experience in applying them in practice. Further resources are available at www.cambridge.org/9781107607590, including data files for the case studies so students can practise analysing data, and exercises to test students' understanding.
CSIR Research Space (South Africa)
Kok, S
2012-07-01
continuously as the correlation function hyper-parameters approach zero. Since the global minimizer of the maximum likelihood function is an asymptote in this case, it is unclear if maximum likelihood estimation (MLE) remains valid. Numerical ill...