Hu, Zixi; Yao, Zhewei; Li, Jinglai
2015-01-01
Many scientific and engineering problems require to perform Bayesian inferences for unknowns of infinite dimension. In such problems, many standard Markov Chain Monte Carlo (MCMC) algorithms become arbitrary slow under the mesh refinement, which is referred to as being dimension dependent. To this end, a family of dimensional independent MCMC algorithms, known as the preconditioned Crank-Nicolson (pCN) methods, were proposed to sample the infinite dimensional parameters. In this work we devel...
Directory of Open Access Journals (Sweden)
Allen Rodrigo
2006-01-01
Full Text Available Using the structured serial coalescent with Bayesian MCMC and serial samples, we estimate population size when some demes are not sampled or are hidden, ie ghost demes. It is found that even with the presence of a ghost deme, accurate inference was possible if the parameters are estimated with the true model. However with an incorrect model, estimates were biased and can be positively misleading. We extend these results to the case where there are sequences from the ghost at the last time sample. This case can arise in HIV patients, when some tissue samples and viral sequences only become available after death. When some sequences from the ghost deme are available at the last sampling time, estimation bias is reduced and accurate estimation of parameters associated with the ghost deme is possible despite sampling bias. Migration rates for this case are also shown to be good estimates when migration values are low.
Bayesian Optimization for Adaptive MCMC
Mahendran, Nimalan; Wang, Ziyu; Hamze, Firas; De Freitas, Nando
2011-01-01
This paper proposes a new randomized strategy for adaptive MCMC using Bayesian optimization. This approach applies to non-differentiable objective functions and trades off exploration and exploitation to reduce the number of potentially costly objective function evaluations. We demonstrate the strategy in the complex setting of sampling from constrained, discrete and densely connected probabilistic graphical models where, for each variation of the problem, one needs to adjust the parameters o...
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Grosse, Roger B.; Ancha, Siddharth; Roy, Daniel M.
2016-01-01
Markov chain Monte Carlo (MCMC) is one of the main workhorses of probabilistic inference, but it is notoriously hard to measure the quality of approximate posterior samples. This challenge is particularly salient in black box inference methods, which can hide details and obscure inference failures. In this work, we extend the recently introduced bidirectional Monte Carlo technique to evaluate MCMC-based posterior inference algorithms. By running annealed importance sampling (AIS) chains both ...
Complexity analysis of accelerated MCMC methods for Bayesian inversion
Hoang, Viet Ha; Schwab, Christoph; Stuart, Andrew M.
2013-08-01
The Bayesian approach to inverse problems, in which the posterior probability distribution on an unknown field is sampled for the purposes of computing posterior expectations of quantities of interest, is starting to become computationally feasible for partial differential equation (PDE) inverse problems. Balancing the sources of error arising from finite-dimensional approximation of the unknown field, the PDE forward solution map and the sampling of the probability space under the posterior distribution are essential for the design of efficient computational Bayesian methods for PDE inverse problems. We study Bayesian inversion for a model elliptic PDE with an unknown diffusion coefficient. We provide complexity analyses of several Markov chain Monte Carlo (MCMC) methods for the efficient numerical evaluation of expectations under the Bayesian posterior distribution, given data δ. Particular attention is given to bounds on the overall work required to achieve a prescribed error level ε. Specifically, we first bound the computational complexity of ‘plain’ MCMC, based on combining MCMC sampling with linear complexity multi-level solvers for elliptic PDE. Our (new) work versus accuracy bounds show that the complexity of this approach can be quite prohibitive. Two strategies for reducing the computational complexity are then proposed and analyzed: first, a sparse, parametric and deterministic generalized polynomial chaos (gpc) ‘surrogate’ representation of the forward response map of the PDE over the entire parameter space, and, second, a novel multi-level Markov chain Monte Carlo strategy which utilizes sampling from a multi-level discretization of the posterior and the forward PDE. For both of these strategies, we derive asymptotic bounds on work versus accuracy, and hence asymptotic bounds on the computational complexity of the algorithms. In particular, we provide sufficient conditions on the regularity of the unknown coefficients of the PDE and on the
boa: An R Package for MCMC Output Convergence Assessment and Posterior Inference
Directory of Open Access Journals (Sweden)
Brian J. Smith
2007-10-01
Full Text Available Markov chain Monte Carlo (MCMC is the most widely used method of estimating joint posterior distributions in Bayesian analysis. The idea of MCMC is to iteratively produce parameter values that are representative samples from the joint posterior. Unlike frequentist analysis where iterative model fitting routines are monitored for convergence to a single point, MCMC output is monitored for convergence to a distribution. Thus, specialized diagnostic tools are needed in the Bayesian setting. To this end, the R package boa was created. This manuscript presents the user's manual for boa, which outlines the use of and methodology upon which the software is based. Included is a description of the menu system, data management capabilities, and statistical/graphical methods for convergence assessment and posterior inference. Throughout the manual, a linear regression example is used to illustrate the software.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...... intensity function, while the second approach is based on an underlying clustering and branching structure in the Hawkes process. For practical use, MCMC (Markov chain Monte Carlo) methods are employed. The two approaches are compared numerically using three examples of the Hawkes process....
Frühwirth-Schnatter, Sylvia
1990-01-01
In the paper at hand we apply it to Bayesian statistics to obtain "Fuzzy Bayesian Inference". In the subsequent sections we will discuss a fuzzy valued likelihood function, Bayes' theorem for both fuzzy data and fuzzy priors, a fuzzy Bayes' estimator, fuzzy predictive densities and distributions, and fuzzy H.P.D .-Regions. (author's abstract)
Computationally efficient Bayesian inference for inverse problems.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move
Grzegorczyk, Marco; Husmeier, Dirk
2008-01-01
Applications of Bayesian networks in systems biology are computationally demanding due to the large number of model parameters. Conventional MCMC schemes based on proposal moves in structure space tend to be too slow in mixing and convergence, and have recently been superseded by proposal moves in t
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC
Directory of Open Access Journals (Sweden)
Miklós István
2009-08-01
Full Text Available Abstract Background We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. Results We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/
Analysis of KATRIN data using Bayesian inference
Riis, Anna Sejersen; Weinheimer, Christian
2011-01-01
The KATRIN (KArlsruhe TRItium Neutrino) experiment will be analyzing the tritium beta-spectrum to determine the mass of the neutrino with a sensitivity of 0.2 eV (90% C.L.). This approach to a measurement of the absolute value of the neutrino mass relies only on the principle of energy conservation and can in some sense be called model-independent as compared to cosmology and neutrino-less double beta decay. However by model independent we only mean in case of the minimal extension of the standard model. One should therefore also analyse the data for non-standard couplings to e.g. righthanded or sterile neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk we have been investigating Markov Chain Monte Carlo (MCMC) methods which are very well suited for probing multi-parameter spaces. We found that implementing the KATRIN chi squared function in the COSMOMC package - an MCMC code using Bayesian parameter inference - solved the ...
Bayesian-MCMC-based parameter estimation of stealth aircraft RCS models
Xia, Wei; Dai, Xiao-Xia; Feng, Yuan
2015-12-01
When modeling a stealth aircraft with low RCS (Radar Cross Section), conventional parameter estimation methods may cause a deviation from the actual distribution, owing to the fact that the characteristic parameters are estimated via directly calculating the statistics of RCS. The Bayesian-Markov Chain Monte Carlo (Bayesian-MCMC) method is introduced herein to estimate the parameters so as to improve the fitting accuracies of fluctuation models. The parameter estimations of the lognormal and the Legendre polynomial models are reformulated in the Bayesian framework. The MCMC algorithm is then adopted to calculate the parameter estimates. Numerical results show that the distribution curves obtained by the proposed method exhibit improved consistence with the actual ones, compared with those fitted by the conventional method. The fitting accuracy could be improved by no less than 25% for both fluctuation models, which implies that the Bayesian-MCMC method might be a good candidate among the optimal parameter estimation methods for stealth aircraft RCS models. Project supported by the National Natural Science Foundation of China (Grant No. 61101173), the National Basic Research Program of China (Grant No. 613206), the National High Technology Research and Development Program of China (Grant No. 2012AA01A308), the State Scholarship Fund by the China Scholarship Council (CSC), and the Oversea Academic Training Funds, and University of Electronic Science and Technology of China (UESTC).
Roche, Alexis
2015-01-01
This paper revisits the concept of composite likelihood from the perspective of probabilistic inference, and proposes a generalization called super composite likelihood for sharper inference in multiclass problems. It is argued that, beside providing a new interpretation and a general justification of na\\"ive Bayes procedures, super composite likelihood yields a much wider class of discriminative models suitable for unsupervised and weakly supervised learning.
Bayesian inference tools for inverse problems
Mohammad-Djafari, Ali
2013-08-01
In this paper, first the basics of Bayesian inference with a parametric model of the data is presented. Then, the needed extensions are given when dealing with inverse problems and in particular the linear models such as Deconvolution or image reconstruction in Computed Tomography (CT). The main point to discuss then is the prior modeling of signals and images. A classification of these priors is presented, first in separable and Markovien models and then in simple or hierarchical with hidden variables. For practical applications, we need also to consider the estimation of the hyper parameters. Finally, we see that we have to infer simultaneously on the unknowns, the hidden variables and the hyper parameters. Very often, the expression of this joint posterior law is too complex to be handled directly. Indeed, rarely we can obtain analytical solutions to any point estimators such the Maximum A posteriori (MAP) or Posterior Mean (PM). Three main tools are then can be used: Laplace approximation (LAP), Markov Chain Monte Carlo (MCMC) and Bayesian Variational Approximations (BVA). To illustrate all these aspects, we will consider a deconvolution problem where we know that the input signal is sparse and propose to use a Student-t prior for that. Then, to handle the Bayesian computations with this model, we use the property of Student-t which is modelling it via an infinite mixture of Gaussians, introducing thus hidden variables which are the variances. Then, the expression of the joint posterior of the input signal samples, the hidden variables (which are here the inverse variances of those samples) and the hyper-parameters of the problem (for example the variance of the noise) is given. From this point, we will present the joint maximization by alternate optimization and the three possible approximation methods. Finally, the proposed methodology is applied in different applications such as mass spectrometry, spectrum estimation of quasi periodic biological signals and
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. %XX In that sense, it should be no surprise that humans reason with % probability as it has been observed.
Numerical approximations for speeding up mcmc inference in the infinite relational model
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Albers, Kristoffer Jon
The infinite relational model (IRM) is a powerful model for discovering clusters in complex networks; however, the computational speed of Markov chain Monte Carlo inference in the model can be a limiting factor when analyzing large networks. We investigate how using numerical approximations of th...... on test data. We demonstrate that the computational time for MCMC inference can be reduced by a factor of two without affecting the performance, making it worthwhile in practical situations when on a computational budget....
Bayesian Inference for Radio Observations
Lochner, Michelle; Zwart, Jonathan T L; Smirnov, Oleg; Bassett, Bruce A; Oozeer, Nadeem; Kunz, Martin
2015-01-01
(Abridged) New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inaccurate uncertainty estimates and biased results because such methods ignore any correlations between parameters. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realisation of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. Thi...
Bayesian inference on proportional elections.
Brunello, Gabriel Hideki Vatanabe; Nakano, Eduardo Yoshio
2015-01-01
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software. PMID:25786259
Skill Rating by Bayesian Inference
Di Fatta, Giuseppe; Haworth, Guy McCrossan; Regan, Kenneth W.
2009-01-01
Systems Engineering often involves computer modelling the behaviour of proposed systems and their components. Where a component is human, fallibility must be modelled by a stochastic agent. The identification of a model of decision-making over quantifiable options is investigated using the game-domain of Chess. Bayesian methods are used to infer the distribution of players’ skill levels from the moves they play rather than from their competitive results. The approach is used on large sets of ...
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael;
2009-01-01
and reliability block diagrams). However, limitations in the BNs' calculation engine have prevented BNs from becoming equally popular for domains containing mixtures of both discrete and continuous variables (so-called hybrid domains). In this paper we focus on these difficulties, and summarize some of the last...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
DEFF Research Database (Denmark)
Picchini, Umberto; Forman, Julie Lyng
2016-01-01
applications. A simulation study is conducted to compare our strategy with exact Bayesian inference, the latter resulting two orders of magnitude slower than ABC-MCMC for the considered set-up. Finally, the ABC algorithm is applied to a large size protein data. The suggested methodology is fairly general and...
New inference strategies for solving Markov Decision Processes using reversible jump MCMC
Hoffman, Matthias; Kueck, Hendrik; De Freitas, Nando; Doucet, Arnaud
2012-01-01
In this paper we build on previous work which uses inferences techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and ...
Quantum Inference on Bayesian Networks
Yoder, Theodore; Low, Guang Hao; Chuang, Isaac
2014-03-01
Because quantum physics is naturally probabilistic, it seems reasonable to expect physical systems to describe probabilities and their evolution in a natural fashion. Here, we use quantum computation to speedup sampling from a graphical probability model, the Bayesian network. A specialization of this sampling problem is approximate Bayesian inference, where the distribution on query variables is sampled given the values e of evidence variables. Inference is a key part of modern machine learning and artificial intelligence tasks, but is known to be NP-hard. Classically, a single unbiased sample is obtained from a Bayesian network on n variables with at most m parents per node in time (nmP(e) - 1 / 2) , depending critically on P(e) , the probability the evidence might occur in the first place. However, by implementing a quantum version of rejection sampling, we obtain a square-root speedup, taking (n2m P(e) -1/2) time per sample. The speedup is the result of amplitude amplification, which is proving to be broadly applicable in sampling and machine learning tasks. In particular, we provide an explicit and efficient circuit construction that implements the algorithm without the need for oracle access.
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
New inference strategies for solving Markov Decision Processes using reversible jump MCMC
Hoffman, Matthias; de Freitas, Nando; Doucet, Arnaud
2012-01-01
In this paper we build on previous work which uses inferences techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating and ...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by eva...
Parallel local approximation MCMC for expensive models
Conrad, Patrick; Davis, Andrew; Marzouk, Youssef; Pillai, Natesh; Smith, Aaron
2016-01-01
Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of partial differential equations. In recent work [Conrad et al. JASA 2015, arXiv:1402.1694] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior--adapted approximations harness regularity of the model to reduce the c...
Fast MCMC sampling for Markov jump processes and continuous time Bayesian networks
Rao, Vinayak
2012-01-01
Markov jump processes and continuous time Bayesian networks are important classes of continuous time dynamical systems. In this paper, we tackle the problem of inferring unobserved paths in these models by introducing a fast auxiliary variable Gibbs sampler. Our approach is based on the idea of uniformization, and sets up a Markov chain over paths by sampling a finite set of virtual jump times and then running a standard hidden Markov model forward filtering-backward sampling algorithm over states at the set of extant and virtual jump times. We demonstrate significant computational benefits over a state-of-the-art Gibbs sampler on a number of continuous time Bayesian networks.
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
inference algorithms based on the proposed prior representation for sparse channel estimation in orthogonal frequency-division multiplexing receivers. The inference algorithms, which are mainly obtained from variational Bayesian methods, exploit the underlying sparse structure of wireless channel responses......This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development of...... Bayesian inference algorithms for sparse channel estimation. Sparse inference methods aim at finding the sparse representation of a signal given in some overcomplete dictionary of basis vectors. Within this context, one of our main contributions to the field of SBL is a hierarchical representation of...
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.
AGNfitter: A Bayesian MCMC approach to fitting spectral energy distributions of AGN
Rivera, Gabriela Calistro; Hennawi, Joseph F; Hogg, David W
2016-01-01
We present AGNfitter, a publicly available open-source algorithm implementing a fully Bayesian Markov Chain Monte Carlo method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) from the sub-mm to the UV, allowing one to robustly disentangle the physical processes responsible for their emission. AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates. We tested AGNfitter's performace on real data by fitting the SEDs of a sample...
Bayesian inference for Markov jump processes with informative observations.
Golightly, Andrew; Wilkinson, Darren J
2015-04-01
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis. PMID:25720091
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++
Kernel Bayesian Inference with Posterior Regularization
Song, Yang; Jun ZHU; Ren, Yong
2016-01-01
We propose a vector-valued regression problem whose solution is equivalent to the reproducing kernel Hilbert space (RKHS) embedding of the Bayesian posterior distribution. This equivalence provides a new understanding of kernel Bayesian inference. Moreover, the optimization problem induces a new regularization for the posterior embedding estimator, which is faster and has comparable performance to the squared regularization in kernel Bayes' rule. This regularization coincides with a former th...
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process. PMID:27121574
Bayesian inference on genetic merit under uncertain paternity
Directory of Open Access Journals (Sweden)
Tempelman Robert J
2003-09-01
Full Text Available Abstract A hierarchical animal model was developed for inference on genetic merit of livestock with uncertain paternity. Fully conditional posterior distributions for fixed and genetic effects, variance components, sire assignments and their probabilities are derived to facilitate a Bayesian inference strategy using MCMC methods. We compared this model to a model based on the Henderson average numerator relationship (ANRM in a simulation study with 10 replicated datasets generated for each of two traits. Trait 1 had a medium heritability (h2 for each of direct and maternal genetic effects whereas Trait 2 had a high h2 attributable only to direct effects. The average posterior probabilities inferred on the true sire were between 1 and 10% larger than the corresponding priors (the inverse of the number of candidate sires in a mating pasture for Trait 1 and between 4 and 13% larger than the corresponding priors for Trait 2. The predicted additive and maternal genetic effects were very similar using both models; however, model choice criteria (Pseudo Bayes Factor and Deviance Information Criterion decisively favored the proposed hierarchical model over the ANRM model.
Type Ia Supernova Light Curve Inference: Hierarchical Bayesian Analysis in the Near Infrared
Mandel, Kaisey S; Friedman, Andrew S; Kirshner, Robert P
2009-01-01
We present a comprehensive statistical analysis of the properties of Type Ia SN light curves in the near infrared using recent data from PAIRITEL and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction and intrinsic variations, for coherent statistical inference. SN Ia light curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR dataset. The logical structure of the hierarchical Bayesian model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient MCMC algorithm exploiting the conditional structure using Gibbs sampling. We apply this framework to the JHK_s SN Ia light curve data. A new light curve model captures the observed J-band light curve shape variations. The intrinsic variances in peak absolute magnitudes are: sigm...
BAMBI: blind accelerated multimodal Bayesian inference
Graff, Philip; Hobson, Michael P; Lasenby, Anthony
2011-01-01
In this paper we present an algorithm for rapid Bayesian analysis that combines the benefits of nested sampling and artificial neural networks. The blind accelerated multimodal Bayesian inference (BAMBI) algorithm implements the MultiNest package for nested sampling as well as the training of an artificial neural network (NN) to learn the likelihood function. In the case of computationally expensive likelihoods, this allows the substitution of a much more rapid approximation in order to increase significantly the speed of the analysis. We begin by demonstrating, with a few toy examples, the ability of a NN to learn complicated likelihood surfaces. BAMBI's ability to decrease running time for Bayesian inference is then demonstrated in the context of estimating cosmological parameters from WMAP and other observations. We show that valuable speed increases are achieved in addition to obtaining NNs trained on the likelihood functions for the different model and data combinations. These NNs can then be used for an...
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay and “frequentist-Bayesianism” (Freq-Bay. After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i endorses a role for explanatory value in the assessment of scientific hypotheses; (ii avoids a purely subjectivist reading of prior probabilities; and (iii fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
The NIFTY way of Bayesian signal inference
International Nuclear Information System (INIS)
We introduce NIFTY, 'Numerical Information Field Theory', a software package for the development of Bayesian signal inference algorithms that operate independently from any underlying spatial grid and its resolution. A large number of Bayesian and Maximum Entropy methods for 1D signal reconstruction, 2D imaging, as well as 3D tomography, appear formally similar, but one often finds individualized implementations that are neither flexible nor easily transferable. Signal inference in the framework of NIFTY can be done in an abstract way, such that algorithms, prototyped in 1D, can be applied to real world problems in higher-dimensional settings. NIFTY as a versatile library is applicable and already has been applied in 1D, 2D, 3D and spherical settings. A recent application is the D3PO algorithm targeting the non-trivial task of denoising, deconvolving, and decomposing photon observations in high energy astronomy
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Bayesian approach to change point detection of unemployment rate via MCMC methods
Czech Academy of Sciences Publication Activity Database
Reisnerová, Soňa
Plzeň : University of West Bohemia in Pilsen, 2006 - (Lukáš, L.), s. 447-452 ISBN 978-80-7043-480-2. [Mathematical Methods in Economics 2006. Plzeň (CZ), 13.09.2006-15.09.2006] R&D Projects: GA AV ČR IAA101120604 Institutional research plan: CEZ:AV0Z10750506 Keywords : Change point * unemployment rate * MCMC * poisson model Subject RIV: BB - Applied Statistics, Operational Research
Koepke, C.; Irving, J.
2015-12-01
Bayesian solutions to inverse problems in near-surface geophysics and hydrology have gained increasing popularity as a means of estimating not only subsurface model parameters, but also their corresponding uncertainties that can be used in probabilistic forecasting and risk analysis. In particular, Markov-chain-Monte-Carlo (MCMC) methods have attracted much recent attention as a means of statistically sampling from the Bayesian posterior distribution. In this regard, two approaches are commonly used to improve the computational tractability of the Bayesian-MCMC approach: (i) Forward models involving a simplification of the underlying physics are employed, which offer a significant reduction in the time required to calculate data, but generally at the expense of model accuracy, and (ii) the model parameter space is represented using a limited set of spatially correlated basis functions as opposed to a more intuitive high-dimensional pixel-based parameterization. It has become well understood that model inaccuracies resulting from (i) can lead to posterior parameter distributions that are highly biased and overly confident. Further, when performing model reduction as described in (ii), it is not clear how the prior distribution for the basis weights should be defined because simple (e.g., Gaussian or uniform) priors that may be suitable for a pixel-based parameterization may result in a strong prior bias when used for the weights. To address the issue of model error resulting from known forward model approximations, we generate a set of error training realizations and analyze them with principal component analysis (PCA) in order to generate a sparse basis. The latter is used in the MCMC inversion to remove the main model-error component from the residuals. To improve issues related to prior bias when performing model reduction, we also use a training realization approach, but this time models are simulated from the prior distribution and analyzed using independent
Murakami, Yohei; Takada, Shoji
2013-01-01
When exact values of model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For the purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Biological experiments are often performed with cell population, and the results are represented by histograms. On another front, experiments sometimes indicate the existence of a specific bifurcation patt...
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures
Brewer, Brendon J; Lewis, Geraint F
2015-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or "blobs") whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the "Cosmic Horseshoe" system, and find some hints of potential s...
Bayesian Posterior Distributions Without Markov Chains
Cole, Stephen R.; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B.
2012-01-01
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976–1983) assessing the relation between residential ex...
Human collective intelligence as distributed Bayesian inference
Krafft, Peter M; Pan, Wei; Della Penna, Nicolás; Altshuler, Yaniv; Shmueli, Erez; Tenenbaum, Joshua B; Pentland, Alex
2016-01-01
Collective intelligence is believed to underly the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where inves...
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Universal Darwinism as a process of Bayesian inference
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment". Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description clo...
Efficient Bayesian inference for ARFIMA processes
Graves, T.; Gramacy, R. B.; Franzke, C. L. E.; Watkins, N. W.
2015-03-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the inference of LRD. Rather than Mandelbrot's fractional Gaussian noise, we use the more flexible Autoregressive Fractional Integrated Moving Average (ARFIMA) model which is widely used in time series analysis, and of increasing interest in climate science. Unlike most previous work on the inference of LRD, which is frequentist in nature, we provide a systematic treatment of Bayesian inference. In particular, we provide a new approximate likelihood for efficient parameter inference, and show how nuisance parameters (e.g. short memory effects) can be integrated over in order to focus on long memory parameters, and hypothesis testing more directly. We illustrate our new methodology on the Nile water level data, with favorable comparison to the standard estimators.
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Optical tweezers calibration with Bayesian inference
Türkcan, Silvan; Richly, Maximilian U.; Le Gall, Antoine; Fiszman, Nicolas; Masson, Jean-Baptiste; Westbrook, Nathalie; Perronet, Karen; Alexandrou, Antigoni
2014-09-01
We present a new method for calibrating an optical-tweezer setup that is based on Bayesian inference1. This method employs an algorithm previously used to analyze the confined trajectories of receptors within lipid rafts2,3. The main advantages of this method are that it does not require input parameters and is insensitive to systematic errors like the drift of the setup. Additionally, it exploits a much larger amount of the information stored in the recorded bead trajectory than standard calibration approaches. The additional information can be used to detect deviations from the perfect harmonic potential or detect environmental influences on the bead. The algorithm infers the diffusion coefficient and the potential felt by a trapped bead, and only requires the bead trajectory as input. We demonstrate that this method outperforms the equipartition method and the power-spectrum method in input information required (bead radius and trajectory length) and in output accuracy. Furthermore, by inferring a higher order potential our method can reveal deviations from the assumed second-order potential. More generally, this method can also be used for magnetic-tweezer calibration.
Strategies for MCMC computation inquantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibánēz-Escriche, Noelia; Sorensen, Daniel
likelihood inference is complicated since it is not possible to evaluate explicitly the likelihood function and conventional Gibbs sampling is difficult since the full conditional distributions are not anymore of standard forms. The aim of this paper is to discuss strategies to obtain efficient Markov chain...... sparse inverse. Maximum likelihood inference and Bayesian inference for the linear mixed model are well-studied topics (Sorensen and Gianola, 2002). Regarding Bayesian inference, with appropriate choice of priors, the full conditional distributions are standard distributions and Gibbs sampling can be...... both in size and regarding the inferences concerning the genetic covariance parameters. Section 2 discusses general strategies for obtaining efficient MCMC algorithms while Section 3 considers these strategies in the specific context of the San Cristobal-Gaudy et al. (1998) model. Section 4 presents...
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Brix, M; Ghim, Y -c
2016-01-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy system, measuring Li I line radiation using 26 channels with ~1 cm spatial resolution and 10~20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly devel...
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
Mira Antonietta; Tenconi Paolo; Bressanini Dario
2003-01-01
We propose a general purpose variance reduction technique for MCMC estimators. The idea is obtained by combining standard variance reduction principles known for regular Monte Carlo simulations (Ripley, 1987) and the Zero-Variance principle introduced in the physics literature (Assaraf and Caffarel, 1999). The potential of the new idea is illustrated with some toy examples and an application to Bayesian estimation
De Freitas, Nando; Hojen-Sorensen, Pedro; Jordan, Michael I.; Russell, Stuart
2013-01-01
We propose a new class of learning algorithms that combines variational approximation and Markov chain Monte Carlo (MCMC) simulation. Naive algorithms that use the variational approximation as proposal distribution can perform poorly because this approximation tends to underestimate the true variance and other features of the data. We solve this problem by introducing more sophisticated MCMC algorithms. One of these algorithms is a mixture of two MCMC kernels: a random walk Metropolis kernel ...
International Nuclear Information System (INIS)
A new method for the experimental determination of stopping powers based on Bayesian Inference with the Markov chain Monte Carlo (MCMC) algorithm has been devised. This method avoids the difficulties related to thin target preparation. By measuring the RBS spectra for a known material, and using the known underlying physics, the stopping powers are determined by best matching the simulated spectra with the experimental spectra. Using silicon, SiO2 and Al2O3 as test cases, good agreement is obtained between calculated and experimental data. (author)
International Nuclear Information System (INIS)
Sparsity has become a key concept for solving of high-dimensional inverse problems using variational regularization techniques. Recently, using similar sparsity-constraints in the Bayesian framework for inverse problems by encoding them in the prior distribution has attracted attention. Important questions about the relation between regularization theory and Bayesian inference still need to be addressed when using sparsity promoting inversion. A practical obstacle for these examinations is the lack of fast posterior sampling algorithms for sparse, high-dimensional Bayesian inversion. Accessing the full range of Bayesian inference methods requires being able to draw samples from the posterior probability distribution in a fast and efficient way. This is usually done using Markov chain Monte Carlo (MCMC) sampling algorithms. In this paper, we develop and examine a new implementation of a single component Gibbs MCMC sampler for sparse priors relying on L1-norms. We demonstrate that the efficiency of our Gibbs sampler increases when the level of sparsity or the dimension of the unknowns is increased. This property is contrary to the properties of the most commonly applied Metropolis–Hastings (MH) sampling schemes. We demonstrate that the efficiency of MH schemes for L1-type priors dramatically decreases when the level of sparsity or the dimension of the unknowns is increased. Practically, Bayesian inversion for L1-type priors using MH samplers is not feasible at all. As this is commonly believed to be an intrinsic feature of MCMC sampling, the performance of our Gibbs sampler also challenges common beliefs about the applicability of sample based Bayesian inference. (paper)
Bayesian Inference and Optimal Design in the Sparse Linear Model
Seeger, Matthias; Steinke, Florian; Tsuda, Koji
2007-01-01
The sparse linear model has seen many successful applications in Statistics, Machine Learning, and Computational Biology, such as identification of gene regulatory networks from micro-array expression data. Prior work has either approximated Bayesian inference by expensive Markov chain Monte Carlo, or replaced it by point estimation. We show how to obtain a good approximation to Bayesian analysis efficiently, using the Expectation Propagation method. We also address the problems of optimal de...
Kernel Approximate Bayesian Computation for Population Genetic Inferences
Nakagome, Shigeki; Fukumizu, Kenji; Mano, Shuhei
2012-01-01
Approximate Bayesian computation (ABC) is a likelihood-free approach for Bayesian inferences based on a rejection algorithm method that applies a tolerance of dissimilarity between summary statistics from observed and simulated data. Although several improvements to the algorithm have been proposed, none of these improvements avoid the following two sources of approximation: 1) lack of sufficient statistics: sampling is not from the true posterior density given data but from an approximate po...
Methods for Bayesian power spectrum inference with galaxy surveys
Jasche, Jens
2013-01-01
We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves over previous Bayesian methods by performing a joint inference of the three dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate sub samples. The method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal to noise regimes by using a determini...
Approximate Bayesian inference for complex ecosystems
Michael P H Stumpf
2014-01-01
Mathematical models have been central to ecology for nearly a century. Simple models of population dynamics have allowed us to understand fundamental aspects underlying the dynamics and stability of ecological systems. What has remained a challenge, however, is to meaningfully interpret experimental or observational data in light of mathematical models. Here, we review recent developments, notably in the growing field of approximate Bayesian computation (ABC), that allow us to calibrate mathe...
A Bayesian Approach to Protein Inference Problem in Shotgun Proteomics
Li, Yong Fuga; Arnold, Randy J.; Li, Yixue; Radivojac, Predrag; Sheng, Quanhu; Tang, Haixu
2009-01-01
The protein inference problem represents a major challenge in shotgun proteomics. In this article, we describe a novel Bayesian approach to address this challenge by incorporating the predicted peptide detectabilities as the prior probabilities of peptide identification. We propose a rigorious probabilistic model for protein inference and provide practical algoritmic solutions to this problem. We used a complex synthetic protein mixture to test our method and obtained promising results.
Bayesian Inference of Kinematics and Memberships of Open Cluster
Shao, Z. Y.; Chen, L.; Zhong, J.; Hou, J. L.
2014-07-01
Based on the Bayesian Inference (BI) method, the Multiple-modelling approach is improved to combine coordinative positions, proper motions (PM) and radial velocities (RV), to separate the motion of the open cluster from field stars, as well as to describe the intrinsic kinematic status of the cluster.
Pig Data and Bayesian Inference on Multinomial Probabilities
Kern, John C.
2006-01-01
Bayesian inference on multinomial probabilities is conducted based on data collected from the game Pass the Pigs[R]. Prior information on these probabilities is readily available from the instruction manual, and is easily incorporated in a Dirichlet prior. Posterior analysis of the scoring probabilities quantifies the discrepancy between empirical…
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference. The...
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Bayesian Inference in the Modern Design of Experiments
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [ORNL; Webster, Clayton G [ORNL; Gunzburger, Max D [ORNL
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, we utilize a compactly supported higher-order hierar- chical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use hierarchical surplus as an error indi- cator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear ura- nium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
An equilibrium validation technique based on Bayesian inference
International Nuclear Information System (INIS)
In recent years, Bayesian probability theory has been used in a number of experiments to fold uncertainties and interdependences in the diagnostic data and forward models, and together with prior knowledge of the state of the plasma, thus increase accuracy of inferred physics variables. Key developments include the application to current and flux surface tomography, effective charge, the electron energy distribution function, neutron spectrometry and density. Virtual observations have also been introduced to better constrain inferred quantities in current tomography. In this work we present Bayesian inference results of toroidal and poloidal current and flux surface tomography. Whilst the uncertainty in these profiles, as well as the uncertainty in inferred parameters such as the safety factor profile is small (<5%), the inference can change substantially depending on the physics model used. We also present Bayesian inference results for Thomson scattering and charge-exchange recombination spectroscopy. In separate work we have computed radial force balance components on the midplane in the Mega Ampere Spherical tokamak. Our aim is to establish a validation framework for different equilibrium physics models. We find that in the overlapping region of the core (normalized poloidal flux less than 0.4) and motional Stark effect (MSE) chords, the plasma is consistent with static Grad-Shafranov force balance to within two standard deviations. In the outboard edge region, where MSE data are also available, the pressure gradient exceeds the Lorentz force. Most likely, this is because the poloidal current is not constrained to zero at the plasma edge. To lowest order, the results suggest computing components of force balance are useful to assess data-consistency, independent of any equilibrium solution. To first order, we have integrated the residue to force balance to infer an energetic particle pressure.
Dimension-independent likelihood-informed MCMC
Cui, Tiangang
2015-10-08
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Höhna, Sebastian; Landis, Michael J; Heath, Tracy A; Boussau, Bastien; Lartillot, Nicolas; Moore, Brian R; Huelsenbeck, John P; Ronquist, Fredrik
2016-07-01
Programs for Bayesian inference of phylogeny currently implement a unique and ﬁxed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be speciﬁed interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-speciﬁcation language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous ﬂexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our ﬁeld. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]. PMID:27235697
Revealing ecological networks using Bayesian network inference algorithms.
Milns, Isobel; Beale, Colin M; Smith, V Anne
2010-07-01
Understanding functional relationships within ecological networks can help reveal keys to ecosystem stability or fragility. Revealing these relationships is complicated by the difficulties of isolating variables or performing experimental manipulations within a natural ecosystem, and thus inferences are often made by matching models to observational data. Such models, however, require assumptions-or detailed measurements-of parameters such as birth and death rate, encounter frequency, territorial exclusion, and predation success. Here, we evaluate the use of a Bayesian network inference algorithm, which can reveal ecological networks based upon species and habitat abundance alone. We test the algorithm's performance and applicability on observational data of avian communities and habitat in the Peak District National Park, United Kingdom. The resulting networks correctly reveal known relationships among habitat types and known interspecific relationships. In addition, the networks produced novel insights into ecosystem structure and identified key species with high connectivity. Thus, Bayesian networks show potential for becoming a valuable tool in ecosystem analysis. PMID:20715607
Parsing optical scanned 3D data by Bayesian inference
Xiong, Hanwei; Xu, Jun; Xu, Chenxi; Pan, Ming
2015-10-01
Optical devices are always used to digitize complex objects to get their shapes in form of point clouds. The results have no semantic meaning about the objects, and tedious process is indispensable to segment the scanned data to get meanings. The reason for a person to perceive an object correctly is the usage of knowledge, so Bayesian inference is used to the goal. A probabilistic And-Or-Graph is used as a unified framework of representation, learning, and recognition for a large number of object categories, and a probabilistic model defined on this And-Or-Graph is learned from a relatively small training set per category. Given a set of 3D scanned data, the Bayesian inference constructs a most probable interpretation of the object, and a semantic segment is obtained from the part decomposition. Some examples are given to explain the method.
Fast Bayesian inference for slow-roll inflation
Ringeval, Christophe
2013-01-01
We present and discuss a new approach increasing by orders of magnitude the speed of performing Bayesian inference and parameter estimation within the framework of slow-roll inflation. The method relies on the determination of an effective likelihood for inflation which is a function of the primordial amplitude of the scalar perturbations complemented with the necessary number of the so-called Hubble flow functions to reach the desired accuracy. Starting from any cosmological data set, the effective likelihood is obtained by marginalisation over the standard cosmological parameters, here viewed as "nuisance" from the early Universe point of view. As being low-dimensional, basic machine-learning algorithms can be trained to accurately reproduce its multidimensional shape and then be used as a proxy to perform fast Bayesian inference on the inflationary models. The robustness and accuracy of the method are illustrated using the Planck Cosmic Microwave Background (CMB) data to perform primordial parameter estima...
Bayesian Inference for Smoking Cessation with a Latent Cure State
Luo, Sheng; Crainiceanu, Ciprian M.; Thomas A Louis; Chatterjee, Nilanjan
2009-01-01
We present a Bayesian approach to modeling dynamic smoking addiction behavior processes when cure is not directly observed due to censoring. Subject-specific probabilities model the stochastic transitions among three behavioral states: smoking, transient quitting, and permanent quitting (absorbent state). A multivariate normal distribution for random effects is used to account for the potential correlation among the subject-specific transition probabilities. Inference is conducted using a Bay...
Practical Statistics for LHC Physicists: Bayesian Inference (3/3)
CERN. Geneva
2015-01-01
These lectures cover those principles and practices of statistics that are most relevant for work at the LHC. The first lecture discusses the basic ideas of descriptive statistics, probability and likelihood. The second lecture covers the key ideas in the frequentist approach, including confidence limits, profile likelihoods, p-values, and hypothesis testing. The third lecture covers inference in the Bayesian approach. Throughout, real-world examples will be used to illustrate the practical application of the ideas. No previous knowledge is assumed.
Bayesian Fusion Algorithm for Inferring Trust in Wireless Sensor Networks
Mohammad Momani; Subhash Challa; Rami Alhmouz
2010-01-01
This paper introduces a new Bayesian fusion algorithm to combine more than one trust component (data trust and communication trust) to infer the overall trust between nodes. This research work proposes that one trust component is not enough when deciding on whether or not to trust a specific node in a wireless sensor network. This paper discusses and analyses the results from the communication trust component (binary) and the data trust component (continuous) and proves that either component ...
Halo detection via large-scale Bayesian inference
Merson, Alexander I.; Jasche, Jens; Abdalla, Filipe B.; Lahav, Ofer; Wandelt, Benjamin; Jones, D. Heath; Colless, Matthew
2016-08-01
We present a proof-of-concept of a novel and fully Bayesian methodology designed to detect haloes of different masses in cosmological observations subject to noise and systematic uncertainties. Our methodology combines the previously published Bayesian large-scale structure inference algorithm, HAmiltonian Density Estimation and Sampling algorithm (HADES), and a Bayesian chain rule (the Blackwell-Rao estimator), which we use to connect the inferred density field to the properties of dark matter haloes. To demonstrate the capability of our approach, we construct a realistic galaxy mock catalogue emulating the wide-area 6-degree Field Galaxy Survey, which has a median redshift of approximately 0.05. Application of HADES to the catalogue provides us with accurately inferred three-dimensional density fields and corresponding quantification of uncertainties inherent to any cosmological observation. We then use a cosmological simulation to relate the amplitude of the density field to the probability of detecting a halo with mass above a specified threshold. With this information, we can sum over the HADES density field realisations to construct maps of detection probabilities and demonstrate the validity of this approach within our mock scenario. We find that the probability of successful detection of haloes in the mock catalogue increases as a function of the signal to noise of the local galaxy observations. Our proposed methodology can easily be extended to account for more complex scientific questions and is a promising novel tool to analyse the cosmic large-scale structure in observations.
Methods for Bayesian Power Spectrum Inference with Galaxy Surveys
Jasche, Jens; Wandelt, Benjamin D.
2013-12-01
We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves upon previous Bayesian methods by performing a joint inference of the three-dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases, and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate subsamples. This method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal-to-noise regimes by using a deterministic reversible jump algorithm. This approach reduces the correlation length of the sampler by several orders of magnitude, turning the otherwise numerically unfeasible problem of joint parameter exploration into a numerically manageable task. We test our method on an artificial mock galaxy survey, emulating characteristic features of the Sloan Digital Sky Survey data release 7, such as its survey geometry and luminosity-dependent biases. These tests demonstrate the numerical feasibility of our large scale Bayesian inference frame work when the parameter space has millions of dimensions. This method reveals and correctly treats the anti-correlation between bias amplitudes and power spectrum, which are not taken into account in current approaches to power spectrum estimation, a 20% effect across large ranges in k space. In addition, this method results in constrained realizations of density fields obtained without assuming the power spectrum or bias parameters
MCMC for non-linear state space models using ensembles of latent sequences
Shestopaloff, Alexander Y.; Neal, Radford M.
2013-01-01
Non-linear state space models are a widely-used class of models for biological, economic, and physical processes. Fitting these models to observed data is a difficult inference problem that has no straightforward solution. We take a Bayesian approach to the inference of unknown parameters of a non-linear state model; this, in turn, requires the availability of efficient Markov Chain Monte Carlo (MCMC) sampling methods for the latent (hidden) variables and model parameters. Using the ensemble ...
Furtado-Junior, I; Abrunhosa, F A; Holanda, F C A F; Tavares, M C S
2016-06-01
Fishing selectivity of the mangrove crab Ucides cordatus in the north coast of Brazil can be defined as the fisherman's ability to capture and select individuals from a certain size or sex (or a combination of these factors) which suggests an empirical selectivity. Considering this hypothesis, we calculated the selectivity curves for males and females crabs using the logit function of the logistic model in the formulation. The Bayesian inference consisted of obtaining the posterior distribution by applying the Markov chain Monte Carlo (MCMC) method to software R using the OpenBUGS, BRugs, and R2WinBUGS libraries. The estimated results of width average carapace selection for males and females compared with previous studies reporting the average width of the carapace of sexual maturity allow us to confirm the hypothesis that most mature individuals do not suffer from fishing pressure; thus, ensuring their sustainability. PMID:26934154
Approximate bayesian parameter inference for dynamical systems in systems biology
International Nuclear Information System (INIS)
This paper proposes to use approximate instead of exact stochastic simulation algorithms for approximate Bayesian parameter inference of dynamical systems in systems biology. It first presents the mathematical framework for the description of systems biology models, especially from the aspect of a stochastic formulation as opposed to deterministic model formulations based on the law of mass action. In contrast to maximum likelihood methods for parameter inference, approximate inference method- share presented which are based on sampling parameters from a known prior probability distribution, which gradually evolves toward a posterior distribution, through the comparison of simulated data from the model to a given data set of measurements. The paper then discusses the simulation process, where an over- view is given of the different exact and approximate methods for stochastic simulation and their improvements that we propose. The exact and approximate simulators are implemented and used within approximate Bayesian parameter inference methods. Our evaluation of these methods on two tasks of parameter estimation in two different models shows that equally good results are obtained much faster when using approximate simulation as compared to using exact simulation. (Author)
A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation.
Redelings, Benjamin D; Kumagai, Seiji; Tatarenkov, Andrey; Wang, Liuyang; Sakai, Ann K; Weller, Stephen G; Culley, Theresa M; Avise, John C; Uyenoyama, Marcy K
2015-11-01
We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens sampling formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet process prior model. Our sampler is designed to accommodate additional information, including observations pertaining to the sex ratio, the intensity of inbreeding depression, and other aspects of reproduction. It can provide joint posterior distributions for the population-wide proportion of uniparental individuals, locus-specific mutation rates, and the number of generations since the most recent outcrossing event for each sampled individual. Further, estimation of all basic parameters of a given model permits estimation of functions of those parameters, including the proportion of the gene pool contributed by each sex and relative effective numbers. PMID:26374460
Bayesian Top-Down Protein Sequence Alignment with Inferred Position-Specific Gap Penalties.
Neuwald, Andrew F; Altschul, Stephen F
2016-05-01
We describe a Bayesian Markov chain Monte Carlo (MCMC) sampler for protein multiple sequence alignment (MSA) that, as implemented in the program GISMO and applied to large numbers of diverse sequences, is more accurate than the popular MSA programs MUSCLE, MAFFT, Clustal-Ω and Kalign. Features of GISMO central to its performance are: (i) It employs a "top-down" strategy with a favorable asymptotic time complexity that first identifies regions generally shared by all the input sequences, and then realigns closely related subgroups in tandem. (ii) It infers position-specific gap penalties that favor insertions or deletions (indels) within each sequence at alignment positions in which indels are invoked in other sequences. This favors the placement of insertions between conserved blocks, which can be understood as making up the proteins' structural core. (iii) It uses a Bayesian statistical measure of alignment quality based on the minimum description length principle and on Dirichlet mixture priors. Consequently, GISMO aligns sequence regions only when statistically justified. This is unlike methods based on the ad hoc, but widely used, sum-of-the-pairs scoring system, which will align random sequences. (iv) It defines a system for exploring alignment space that provides natural avenues for further experimentation through the development of new sampling strategies for more efficiently escaping from suboptimal traps. GISMO's superior performance is illustrated using 408 protein sets containing, on average, 235 sequences. These sets correspond to NCBI Conserved Domain Database alignments, which have been manually curated in the light of available crystal structures, and thus provide a means to assess alignment accuracy. GISMO fills a different niche than other MSA programs, namely identifying and aligning a conserved domain present within a large, diverse set of full length sequences. The GISMO program is available at http://gismo.igs.umaryland.edu/. PMID:27192614
Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent.
Wen, Dingqiao; Yu, Yun; Nakhleh, Luay
2016-05-01
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation. PMID:27144273
Bayesian Inference and Forecasting in the Stationary Bilinear Model
Roberto Leon-Gonzalez; Fuyu Yang
2014-01-01
A stationary bilinear (SB) model can be used to describe processes with a time-varying degree of persistence that depends on past shocks. An example of such a process is inflation. This study develops methods for Bayesian inference, model comparison, and forecasting in the SB model. Using monthly U.K. inflation data, we find that the SB model outperforms the random walk and first order autoregressive AR(1) models in terms of root mean squared forecast errors for both the one-step-ahead and th...
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W,; Chen, X.;
2013-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...
Structural damage identification using piezoelectric impedance and Bayesian inference
Shuai, Q.; Zhou, K.; Tang, J.
2015-04-01
Structural damage identification is a challenging subject in the structural health monitoring research. The piezoelectric impedance-based damage identification, which usually utilizes the matrix inverse-based optimization, may in theory identify the damage location and damage severity. However, the sensitivity matrix is oftentimes ill-conditioned in practice, since the number of unknowns may far exceed the useful measurements/inputs. In this research, a new method based on intelligent inference framework for damage identification is presented. Bayesian inference is used to directly predict damage location and severity using impedance measurement through forward prediction and comparison. Gaussian process is employed to enrich the forward analysis result, thereby reducing computational cost. Case study is carried out to illustrate the identification performance.
GPU Computing in Bayesian Inference of Realized Stochastic Volatility Model
International Nuclear Information System (INIS)
The realized stochastic volatility (RSV) model that utilizes the realized volatility as additional information has been proposed to infer volatility of financial time series. We consider the Bayesian inference of the RSV model by the Hybrid Monte Carlo (HMC) algorithm. The HMC algorithm can be parallelized and thus performed on the GPU for speedup. The GPU code is developed with CUDA Fortran. We compare the computational time in performing the HMC algorithm on GPU (GTX 760) and CPU (Intel i7-4770 3.4GHz) and find that the GPU can be up to 17 times faster than the CPU. We also code the program with OpenACC and find that appropriate coding can achieve the similar speedup with CUDA Fortran
Bayesian inference of population size history from multiple loci
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2008-10-01
Full Text Available Abstract Background Effective population size (Ne is related to genetic variability and is a basic parameter in many models of population genetics. A number of methods for inferring current and past population sizes from genetic data have been developed since JFC Kingman introduced the n-coalescent in 1982. Here we present the Extended Bayesian Skyline Plot, a non-parametric Bayesian Markov chain Monte Carlo algorithm that extends a previous coalescent-based method in several ways, including the ability to analyze multiple loci. Results Through extensive simulations we show the accuracy and limitations of inferring population size as a function of the amount of data, including recovering information about evolutionary bottlenecks. We also analyzed two real data sets to demonstrate the behavior of the new method; a single gene Hepatitis C virus data set sampled from Egypt and a 10 locus Drosophila ananassae data set representing 16 different populations. Conclusion The results demonstrate the essential role of multiple loci in recovering population size dynamics. Multi-locus data from a small number of individuals can precisely recover past bottlenecks in population size which can not be characterized by analysis of a single locus. We also demonstrate that sequence data quality is important because even moderate levels of sequencing errors result in a considerable decrease in estimation accuracy for realistic levels of population genetic variability.
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Bayesian inference for identifying interaction rules in moving animal groups.
Mann, Richard P
2011-01-01
The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed. PMID:21829657
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Inference of Gene Regulatory Network Based on Local Bayesian Networks
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan
2016-01-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Bayesian inference on the sphere beyond statistical isotropy
Das, Santanu; Souradeep, Tarun
2015-01-01
We present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method as a principled approach to assess violation of statistical isotropy (SI) in the sky maps of Cosmic Microwave Background (CMB) fluctuations. SI violation in observed CMB maps arise due to known physical effects such as Doppler boost and weak lensing; yet unknown theoretical possibilities like cosmic topology and subtle violations of the cosmological principle, as well as, expected observational artefacts of scanning the sky with a non-circular beam, masking, foreground residuals, anisotropic noise, etc. We explicitly demonstrate the recovery of the input SI violation signals with their full statistics in simulated CMB maps. Our formalism easily adapts to exploring parametric physical models with non-SI covariance, as we illustrate for the in...
Bayesian Inference of Natural Rankings in Incomplete Competition Networks
Park, Juyong
2013-01-01
Competition between a complex system's constituents and a corresponding reward mechanism based on it have profound influence on the functioning, stability, and evolution of the system. But determining the dominance hierarchy or ranking among the constituent parts from the strongest to the weakest -- essential in determining reward or penalty -- is almost always an ambiguous task due to the incomplete nature of competition networks. Here we introduce ``Natural Ranking," a desirably unambiguous ranking method applicable to a complete (full) competition network, and formulate an analytical model based on the Bayesian formula inferring the expected mean and error of the natural ranking of nodes from an incomplete network. We investigate its potential and uses in solving issues in ranking by applying to a real-world competition network of economic and social importance.
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
Metainference: A Bayesian Inference Method for Heterogeneous Systems
Bonomi, Massimiliano; Cavalli, Andrea; Vendruscolo, Michele
2015-01-01
Modelling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model, and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system populates simultaneously an ensemble of different states and experimental data are measured as averages over such states. To address this problem we present a method, called metainference, that combines Bayesian inference, which is a powerful strategy to deal with errors in experimental measurements, with the maximum entropy principle, which represents a rigorous approach to deal with experimental measurements averaged over multiple states. To illustrate the method we present its application to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to model complex systems with...
Bayesian inference of mass segregation of open clusters
Shao, Zhengyi; Chen, Li; Lin, Chien-Cheng; Zhong, Jing; Hou, Jinliang
2015-08-01
Based on the Bayesian inference (BI) method, the mixture-modeling approach is improved to combine all kinematic data, including the coordinative position, proper motion (PM) and radial velocity (RV), to separate the motion of the cluster from field stars in its area, as well as to describe the intrinsic kinematic status. Meanwhile, the membership probabilities of individual stars are determined as by product results. This method has been testified by simulation of toy models and it was found that the joint usage of multiple kinematic data can significantly reduce the missing rate of membership determination, say from ~15% for single data type to 1% for using all position, proper motion and radial velocity data.By combining kinematic data from multiple sources of photometric and redshift surveys, such as WIYN and APOGEE, M67 and NGC188 are revisited. Mass segregation is identified clearly for both of these two old open clusters, either in position or in PM spaces, since the Bayesian evidence (BE) of the model, which includes the segregation parameters, is much larger than that without it. The ongoing work is applying this method to the LAMOST released data which contains a large amount of RVs cover ~200 nearby open clusters. If the coming GAIA data can be used, the accuracy of tangential velocity will be largely improved and the intrinsic kinematics of open clusters can be well investigated, though they are usually less than 1 km/s.
Bayesian Fusion Algorithm for Inferring Trust in Wireless Sensor Networks
Directory of Open Access Journals (Sweden)
Mohammad Momani
2010-07-01
Full Text Available This paper introduces a new Bayesian fusion algorithm to combine more than one trust component (data trust and communication trust to infer the overall trust between nodes. This research work proposes that one trust component is not enough when deciding on whether or not to trust a specific node in a wireless sensor network. This paper discusses and analyses the results from the communication trust component (binary and the data trust component (continuous and proves that either component by itself, can mislead the network and eventually cause a total breakdown of the network. As a result of this, new algorithms are needed to combine more than one trust component to infer the overall trust. The proposed algorithm is simple and generic as it allows trust components to be added and deleted easily. Simulation results demonstrate that a node is highly trustworthy provided that both trust components simultaneously confirm its trustworthiness and conversely, a node is highly untrustworthy if its untrustworthiness is asserted by both components.
vs. a polynomial chaos-based MCMC
Siripatana, Adil
2014-08-01
Bayesian Inference of Manning\\'s n coefficient in a Storm Surge Model Framework: comparison between Kalman lter and polynomial based method Adil Siripatana Conventional coastal ocean models solve the shallow water equations, which describe the conservation of mass and momentum when the horizontal length scale is much greater than the vertical length scale. In this case vertical pressure gradients in the momentum equations are nearly hydrostatic. The outputs of coastal ocean models are thus sensitive to the bottom stress terms de ned through the formulation of Manning\\'s n coefficients. This thesis considers the Bayesian inference problem of the Manning\\'s n coefficient in the context of storm surge based on the coastal ocean ADCIRC model. In the first part of the thesis, we apply an ensemble-based Kalman filter, the singular evolutive interpolated Kalman (SEIK) filter to estimate both a constant Manning\\'s n coefficient and a 2-D parameterized Manning\\'s coefficient on one ideal and one of more realistic domain using observation system simulation experiments (OSSEs). We study the sensitivity of the system to the ensemble size. we also access the benefits from using an in ation factor on the filter performance. To study the limitation of the Guassian restricted assumption on the SEIK lter, 5 we also implemented in the second part of this thesis a Markov Chain Monte Carlo (MCMC) method based on a Generalized Polynomial chaos (gPc) approach for the estimation of the 1-D and 2-D Mannning\\'s n coe cient. The gPc is used to build a surrogate model that imitate the ADCIRC model in order to make the computational cost of implementing the MCMC with the ADCIRC model reasonable. We evaluate the performance of the MCMC-gPc approach and study its robustness to di erent OSSEs scenario. we also compare its estimates with those resulting from SEIK in term of parameter estimates and full distributions. we present a full analysis of the solution of these two methods, of the
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
On Russian Roulette Estimates for Bayesian Inference with Doubly-Intractable Likelihoods
Lyne, Anne-Marie; Girolami, Mark; Atchadé, Yves; Strathmann, Heiko; Simpson, Daniel
2013-01-01
A large number of statistical models are “doubly-intractable”: the likelihood normalising term, which is a function of the model parameters, is intractable, as well as the marginal likelihood (model evidence). This means that standard inference techniques to sample from the posterior, such as Markov chain Monte Carlo (MCMC), cannot be used. Examples include, but are not confined to, massive Gaussian Markov random fields, autologistic models and Exponential random graph models. A number of app...
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Directory of Open Access Journals (Sweden)
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM, Bayesian Connectivity Change Point Model (BCCPM, and Dynamic Bayesian Variable Partition Model (DBVPM, and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological models implementation process, significantly contributing to the hydrological predictions uncertainty, is the calibration phase in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the hydrological model structural deficiencies. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the errors variance of the (spatially and temporally) forecasted flows far exceeds the errors variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, yielding a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields a non-reliable predictive uncertainty assessment. Hence, with the aim of prevent all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As hydrological model, it has been used a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by getting the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly this is, if non-stationarity in errors variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...
Bayesian Inference for Neighborhood Filters With Application in Denoising.
Huang, Chao-Tsung
2015-11-01
Range-weighted neighborhood filters are useful and popular for their edge-preserving property and simplicity, but they are originally proposed as intuitive tools. Previous works needed to connect them to other tools or models for indirect property reasoning or parameter estimation. In this paper, we introduce a unified empirical Bayesian framework to do both directly. A neighborhood noise model is proposed to reason and infer the Yaroslavsky, bilateral, and modified non-local means filters by joint maximum a posteriori and maximum likelihood estimation. Then, the essential parameter, range variance, can be estimated via model fitting to the empirical distribution of an observable chi scale mixture variable. An algorithm based on expectation-maximization and quasi-Newton optimization is devised to perform the model fitting efficiently. Finally, we apply this framework to the problem of color-image denoising. A recursive fitting and filtering scheme is proposed to improve the image quality. Extensive experiments are performed for a variety of configurations, including different kernel functions, filter types and support sizes, color channel numbers, and noise types. The results show that the proposed framework can fit noisy images well and the range variance can be estimated successfully and efficiently. PMID:26259244
Bayesian inference and life testing plans for generalized exponential distribution
Institute of Scientific and Technical Information of China (English)
KUNDU; Debasis; PRADHAN; Biswabrata
2009-01-01
Recently generalized exponential distribution has received considerable attentions.In this paper,we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution.It is assumed that the scale and the shape parameters have independent gamma priors.The Bayes estimates of the unknown parameters cannot be obtained in the closed form.Lindley’s approximation and importance sampling technique have been suggested to compute the approximate Bayes estimates.Markov Chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals.We also provide different criteria to compare two different sampling schemes and hence to ?nd the optimal sampling schemes.It is observed that ?nding the optimum censoring procedure is a computationally expensive process.And we have recommended to use the sub-optimal censoring procedure,which can be obtained very easily.Monte Carlo simulations are performed to compare the performances of the different methods and one data analysis has been performed for illustrative purposes.
Metainference: A Bayesian inference method for heterogeneous systems.
Bonomi, Massimiliano; Camilloni, Carlo; Cavalli, Andrea; Vendruscolo, Michele
2016-01-01
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors. PMID:26844300
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Fast, fully Bayesian spatiotemporal inference for fMRI data.
Musgrove, Donald R; Hughes, John; Eberly, Lynn E
2016-04-01
We propose a spatial Bayesian variable selection method for detecting blood oxygenation level dependent activation in functional magnetic resonance imaging (fMRI) data. Typical fMRI experiments generate large datasets that exhibit complex spatial and temporal dependence. Fitting a full statistical model to such data can be so computationally burdensome that many practitioners resort to fitting oversimplified models, which can lead to lower quality inference. We develop a full statistical model that permits efficient computation. Our approach eases the computational burden in two ways. We partition the brain into 3D parcels, and fit our model to the parcels in parallel. Voxel-level activation within each parcel is modeled as regressions located on a lattice. Regressors represent the magnitude of change in blood oxygenation in response to a stimulus, while a latent indicator for each regressor represents whether the change is zero or non-zero. A sparse spatial generalized linear mixed model captures the spatial dependence among indicator variables within a parcel and for a given stimulus. The sparse SGLMM permits considerably more efficient computation than does the spatial model typically employed in fMRI. Through simulation we show that our parcellation scheme performs well in various realistic scenarios. Importantly, indicator variables on the boundary between parcels do not exhibit edge effects. We conclude by applying our methodology to data from a task-based fMRI experiment. PMID:26553916
Markov Model of Wind Power Time Series UsingBayesian Inference of Transition Matrix
Chen, Peiyuan; Berthelsen, Kasper Klitgaard; Bak-Jensen, Birgitte; Chen, Zhe
2009-01-01
This paper proposes to use Bayesian inference of transition matrix when developing a discrete Markov model of a wind speed/power time series and 95% credible interval for the model verification. The Dirichlet distribution is used as a conjugate prior for the transition matrix. Three discrete Markov models are compared, i.e. the basic Markov model, the Bayesian Markov model and the birth-and-death Markov model. The proposed Bayesian Markov model shows the best accuracy in modeling the autocorr...
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
Onogi, Akio; Iwata, Hiroyoshi
2016-01-01
Genome-wide regression using a number of genome-wide markers as predictors is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression which we named VIGoR (variational Bayesian inference for genome-wide regression). Variational Bayesian inference is computationally much faster than widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods, and is provided as a command line program packa...
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
Bayesian inference of the demographic history of chimpanzees.
Wegmann, Daniel; Excoffier, Laurent
2010-06-01
Due to an almost complete absence of fossil record, the evolutionary history of chimpanzees has only been studied recently on the basis of genetic data. Although the general topology of the chimpanzee phylogeny is well established, uncertainties remain concerning the size of current and past populations, the occurrence of bottlenecks or population expansions, or about divergence times and migrations rates between subspecies. Here, we present a novel attempt at globally inferring the detailed evolution of the Pan genus based on approximate Bayesian computation, an approach preferentially applied to complex models where the likelihood cannot be computed analytically. Based on two microsatellite and DNA sequence data sets and adjusting simulated data for local levels of inbreeding and patterns of missing data, we find support for several new features of chimpanzee evolution as compared with previous studies based on smaller data sets and simpler evolutionary models. We find that the central chimpanzees are certainly the oldest population of all P. troglodytes subspecies and that the other two P. t. subspecies diverged from the central chimpanzees by founder events. We also find an older divergence time (1.6 million years [My]) between common chimpanzee and Bonobos than previous studies (0.9-1.3 My), but this divergence appears to have been very progressive with the maintenance of relatively high levels of gene flow between the ancestral chimpanzee population and the Bonobos. Finally, we could also confirm the existence of strong unidirectional gene flow from the western into the central chimpanzee. These results show that interesting and innovative features of chimpanzee history emerge when considering their whole evolutionary history in a single analysis, rather than relying on simpler models involving several comparisons of pairs of populations. PMID:20118191
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
Bayesian networks inference algorithm to implement Dempster Shafer theory in reliability analysis
International Nuclear Information System (INIS)
This paper deals with the use of Bayesian networks to compute system reliability. The reliability analysis problem is described and the usual methods for quantitative reliability analysis are presented within a case study. Some drawbacks that justify the use of Bayesian networks are identified. The basic concepts of the Bayesian networks application to reliability analysis are introduced and a model to compute the reliability for the case study is presented. Dempster Shafer theory to treat epistemic uncertainty in reliability analysis is then discussed and its basic concepts that can be applied thanks to the Bayesian network inference algorithm are introduced. Finally, it is shown, with a numerical example, how Bayesian networks' inference algorithms compute complex system reliability and what the Dempster Shafer theory can provide to reliability analysis
Online query answering with differential privacy: a utility-driven approach using Bayesian inference
Xiao, Yonghui
2012-01-01
Data privacy issues frequently and increasingly arise for data sharing and data analysis tasks. In this paper, we study the problem of online query answering under the rigorous differential privacy model. The existing interactive mechanisms for differential privacy can only support a limited number of queries before the accumulated cost of privacy reaches a certain bound. This limitation has greatly hindered their applicability, especially in the scenario where multiple users legitimately need to pose a large number of queries. To minimize the privacy cost and extend the life span of a system, we propose a utility-driven mechanism for online query answering using Bayesian statistical inference. The key idea is to keep track of the query history and use Bayesian inference to answer a new query using previous query answers. The Bayesian inference algorithm provides both optimal point estimation and optimal interval estimation. We formally quantify the error of the inference result to determine if it satisfies t...
A Non-Parametric Bayesian Method for Inferring Hidden Causes
Wood, Frank; Griffiths, Thomas; Ghahramani, Zoubin
2012-01-01
We present a non-parametric Bayesian approach to structure learning with hidden causes. Previous Bayesian treatments of this problem define a prior over the number of hidden causes and use algorithms such as reversible jump Markov chain Monte Carlo to move between solutions. In contrast, we assume that the number of hidden causes is unbounded, but only a finite number influence observable variables. This makes it possible to use a Gibbs sampler to approximate the distribution over causal stru...
Protein NMR Structure Refinement based on Bayesian Inference
Ikeya, Teppei; Ikeda, Shiro; Kigawa, Takanori; Ito, Yutaka; Güntert, Peter
2016-03-01
Nuclear Magnetic Resonance (NMR) spectroscopy is a tool to investigate threedimensional (3D) structures and dynamics of biomacromolecules at atomic resolution in solution or more natural environments such as living cells. Since NMR data are principally only spectra with peak signals, it is required to properly deduce structural information from the sparse experimental data with their imperfections and uncertainty, and to visualize 3D conformations by NMR structure calculation. In order to efficiently analyse the data, Rieping et al. proposed a new structure calculation method based on Bayes’ theorem. We implemented a similar approach into the program CYANA with some modifications. It allows us to handle automatic NOE cross peak assignments in unambiguous and ambiguous usages, and to create a prior distribution based on a physical force field with the generalized Born implicit water model. The sampling scheme for obtaining the posterior is performed by a hybrid Monte Carlo algorithm combined with Markov chain Monte Carlo (MCMC) by the Gibbs sampler, and molecular dynamics simulation (MD) for obtaining a canonical ensemble of conformations. Since it is not trivial to search the entire function space particularly for exploring the conformational prior due to the extraordinarily large conformation space of proteins, the replica exchange method is performed, in which several MCMC calculations with different temperatures run in parallel as replicas. It is shown with simulated data or randomly deleted experimental peaks that the new structure calculation method can provide accurate structures even with less peaks, especially compared with the conventional method. In particular, it dramatically improves in-cell structures of the proteins GB1 and TTHA1718 using exclusively information obtained in living Escherichia coli (E. coli) cells.
Sythesis of MCMC and Belief Propagation
Energy Technology Data Exchange (ETDEWEB)
Ahn, Sungsoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Shin, Jinwoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea)
2016-05-27
Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast, empirically very successful, however in general lacking control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach which allows to express the BP error as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.
The Lumiere Project: Bayesian User Modeling for Inferring the Goals and Needs of Software Users
Horvitz, Eric J.; Breese, John S.; Heckerman, David; Hovel, David; Rommelse, Koos
2013-01-01
The Lumiere Project centers on harnessing probability and utility to provide assistance to computer software users. We review work on Bayesian user models that can be employed to infer a users needs by considering a user's background, actions, and queries. Several problems were tackled in Lumiere research, including (1) the construction of Bayesian models for reasoning about the time-varying goals of computer users from their observed actions and queries, (2) gaining access to a stream of eve...
Bayesian inference of BWR model parameters by Markov chain Monte Carlo
International Nuclear Information System (INIS)
In this paper, the Markov chain Monte Carlo approach to Bayesian inference is applied for estimating the parameters of a reduced-order model of the dynamics of a boiling water reactor system. A Bayesian updating strategy is devised to progressively refine the estimates, as newly measured data become available. Finally, the technique is used for detecting parameter changes during the system lifetime, e.g. due to component degradation
Jannson, Tomasz; Wang, Wenjian; Hodelin, Juan; Forrester, Thomas; Romanov, Volodymyr; Kostrzewski, Andrew
2016-05-01
In this paper, Bayesian Binary Sensing (BBS) is discussed as an effective tool for Bayesian Inference (BI) evaluation in interdisciplinary areas such as ISR (and, C3I), Homeland Security, QC, medicine, defense, and many others. In particular, Hilbertian Sine (HS) as an absolute measure of BI, is introduced, while avoiding relativity of decision threshold identification, as in the case of traditional measures of BI, related to false positives and false negatives.
Learning Bayesian networks for discrete data
Liang, Faming
2009-02-01
Bayesian networks have received much attention in the recent literature. In this article, we propose an approach to learn Bayesian networks using the stochastic approximation Monte Carlo (SAMC) algorithm. Our approach has two nice features. Firstly, it possesses the self-adjusting mechanism and thus avoids essentially the local-trap problem suffered by conventional MCMC simulation-based approaches in learning Bayesian networks. Secondly, it falls into the class of dynamic importance sampling algorithms; the network features can be inferred by dynamically weighted averaging the samples generated in the learning process, and the resulting estimates can have much lower variation than the single model-based estimates. The numerical results indicate that our approach can mix much faster over the space of Bayesian networks than the conventional MCMC simulation-based approaches. © 2008 Elsevier B.V. All rights reserved.
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Directory of Open Access Journals (Sweden)
Raj Kumar
2012-12-01
Full Text Available In this paper, we have illustrated the suitability of Gumbel Model for software reliability data. The model parameters are estimated using likelihood based inferential procedure: classical as well as Bayesian. The quasi Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of Gumbel model are obtained using Markov Chain Monte Carlo(MCMC simulation method in OpenBUGS(established software for Bayesian analysis using Markov Chain Monte Carlo methods. The R functions are developed to study the statistical properties, model validation and comparison tools of the model and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
J. Foulds; L. Boyles; C. DuBois; P. Smyth; M. Welling
2013-01-01
There has been an explosion in the amount of digital text information available in recent years, leading to challenges of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it fea
Institute of Scientific and Technical Information of China (English)
LIU; Jianfeng; ZHANG; Yuan; ZHANG; Qin; WANG; Lixian; ZHANG; Jigang
2006-01-01
It is a challenging issue to map Quantitative Trait Loci (QTL) underlying complex discrete traits, which usually show discontinuous distribution and less information, using conventional statistical methods. Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure in mapping QTL for complex binary traits, which provides a complete posterior distribution for QTL parameters using all prior information. As a consequence, Bayesian estimates of all interested variables can be obtained straightforwardly basing on their posterior samples simulated by the MCMC algorithm. In our study, utilities of Bayesian-MCMC are demonstrated using simulated several animal outbred full-sib families with different family structures for a complex binary trait underlied by both a QTL and polygene. Under the Identity-by-Descent-Based variance component random model, three samplers basing on MCMC, including Gibbs sampling, Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns so that the QTL parameters were obtained by Bayesian statistical inferring. The results showed that Bayesian-MCMC approach could work well and robust under different family structures and QTL effects. As family size increases and the number of family decreases, the accuracy of the parameter estimates will be improved. When the true QTL has a small effect, using outbred population experiment design with large family size is the optimal mapping strategy.
A localization model to localize multiple sources using Bayesian inference
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and using a maximal likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in azimuthal direction of the source(s).
Bayesian inference for a wavefront model of the Neolithisation of Europe
Baggaley, Andrew W; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-01-01
We consider a wavefront model for the spread of Neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from Southern and Western Europe. Our wavefront model allows for both an isotropic background spread (incorporating the effects of local geography), and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wavefront, allowing us to simulate the times of the first arrival at any site orders of magnitude more efficiently than traditional PDE approaches. We adopt a Bayesian approach to inference and use Gaussian process emulators to facilitate further increases in efficiency in the inference scheme, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and also infer a parameter specifying the magnitude of this uncertainty. We obtain a magnitude for the background spread of order 1 ...
What is the `relevant population' in Bayesian forensic inference?
Brümmer, Niko; de Villiers, Edward
2014-01-01
In works discussing the Bayesian paradigm for presenting forensic evidence in court, the concept of a `relevant population' is often mentioned, without a clear definition of what is meant, and without recommendations of how to select such populations. This note is to try to better understand this concept. Our analysis is intended to be general enough to be applicable to different forensic technologies and we shall consider both DNA profiling and speaker recognition as examples.
Robust Bayesian inference in Iq-Spherical models
Osiewalski, Jacek; Mark F.J. Steel
1992-01-01
The class of multivariate lq-spherical distributions is introduced and defined through their isodensity surfaces. We prove that, under a Jeffreys' type improper prior on the scale parameter, posterior inference on the location parameters is the same for all lq-spherical sampling models with common q. This gives us perfect inference robustness with respect to any departures from the reference case of independent sampling from the exponential power distribution.
Cornuet, Jean-Marie; Santos, Filipe; Beaumont, Mark A; Robert, Christian P.; Marin, Jean-Michel; Balding, David J.; Guillemaud, Thomas; Estoup, Arnaud
2008-01-01
Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in...
deBInfer: Bayesian inference for dynamical models of biological systems in R
Boersch-Supan, Philipp H.; Johnson, Leah R
2016-01-01
1. Differential equations (DEs) are commonly used to model the temporal evolution of biological systems, but statistical methods for comparing DE models to data and for parameter inference are relatively poorly developed. This is especially problematic in the context of biological systems where observations are often noisy and only a small number of time points may be available. 2. Bayesian approaches offer a coherent framework for parameter inference that can account for multiple sources of ...
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Strategies for MCMC computation in quantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibánez, N.; Sorensen, Daniel
2006-01-01
sparse inverse. Maximum likelihood inference and Bayesian inference for the linear mixed model are well-studied topics (Sorensen and Gianola, 2002). Regarding Bayesian inference, with appropriate choice of priors, the full conditional distributions are standard distributions and Gibbs sampling can be...
Perkins, Simon; Zwart, Jonathan; Natarajan, Iniyan; Smirnov, Oleg
2015-01-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. Chi-squared values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and chi-squared calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple chi-squared values. Only modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is ea...
Probabilistic Modelling of Fatigue Life of Composite Laminates Using Bayesian Inference
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der
2014-01-01
. Model parameters are estimated by Bayesian inference. The reference data used consists of constant-amplitude fatigue test results for a multi-directional laminate subjected to seven different load ratios. The paper describes the modelling techniques and the parameter estimation procedure, supported by...
Markov Model of Wind Power Time Series UsingBayesian Inference of Transition Matrix
DEFF Research Database (Denmark)
Chen, Peiyuan; Berthelsen, Kasper Klitgaard; Bak-Jensen, Birgitte;
2009-01-01
This paper proposes to use Bayesian inference of transition matrix when developing a discrete Markov model of a wind speed/power time series and 95% credible interval for the model verification. The Dirichlet distribution is used as a conjugate prior for the transition matrix. Three discrete Markov...
Internal dosimetry of uranium isotopes using bayesian inference methods
International Nuclear Information System (INIS)
A group of personnel at Los Alamos National Laboratory is routinely monitored for the presence of uranium isotopes by urine bioassay. Samples are analysed by alpha spectroscopy, and the results are examined for evidence of an intake of uranium. Because the measurement uncertainties are often comparable to the quantities of material we wish to detect, statistical considerations are crucial for the proper interpretation of the data. The problem is further complicated by the significant, but highly non-uniform, presence of uranium in local drinking water and, in some cases, food supply. Software originally developed for internal dosimetry of plutonium has been adapted to the problem of uranium dosimetry. The software uses an unfolding algorithm to calculate an approximate Bayesian solution to the problem of characterising any intakes which may have occurred, given the history of urine bioassay results for each individual in the monitored population. The program uses biokinetic models from ICRP Publications 68 and later, and a prior probability distribution derived empirically from the body of uranium bioassay data collected at Los Alamos over the operating history of the Laboratory. For each individual, the software creates a posterior probability distribution of intake quantity and solubility type as a function of time. From this distribution, estimates are made of the cumulative committed dose (CEDE) to each individual. Results of the method are compared with those obtained using an earlier classical (non-Bayesian) algorithm for uranium dosimetry. We also discuss the problem of distinguishing occupational intakes from intake of environmental uranium, within a Bayesian framework. (author)
Using SAS PROC MCMC for Item Response Theory Models
Ames, Allison J.; Samonte, Kelli
2015-01-01
Interest in using Bayesian methods for estimating item response theory models has grown at a remarkable rate in recent years. This attentiveness to Bayesian estimation has also inspired a growth in available software such as WinBUGS, R packages, BMIRT, MPLUS, and SAS PROC MCMC. This article intends to provide an accessible overview of Bayesian…
Matrix-free Large Scale Bayesian inference in cosmology
Jasche, Jens
2014-01-01
In this work we propose a new matrix-free implementation of the Wiener sampler which is traditionally applied to high dimensional analysis when signal covariances are unknown. Specifically, the proposed method addresses the problem of jointly inferring a high dimensional signal and its corresponding covariance matrix from a set of observations. Our method implements a Gibbs sampling adaptation of the previously presented messenger approach, permitting to cast the complex multivariate inference problem into a sequence of uni-variate random processes. In this fashion, the traditional requirement of inverting high dimensional matrices is completely eliminated from the inference process, resulting in an efficient algorithm that is trivial to implement. Using cosmic large scale structure data as a showcase, we demonstrate the capabilities of our Gibbs sampling approach by performing a joint analysis of three dimensional density fields and corresponding power-spectra from Gaussian mock catalogues. These tests clear...
Bayesian Inference on Predictors of Sex of the Baby
Scarpa, Bruno
2016-01-01
It is well known that the sex ratio at birth is a biological constant, being about 106 boys to 100 girls. However couples have always wanted to know and decide in advance the sex of a newborn. For example, a large number of papers appeared connecting biometrical variables, such as length of follicular phase in the woman menstrual cycle or timing of intercourse acts to the sex of new baby. In this paper, we propose a Bayesian model to validate some of these theories by using an independent dat...
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid; Scavino, Marco; Szabó, Barna; Tempone, Raúl
2016-06-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Peiris, Hiranya V
2016-01-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometri...
Approximate Bayesian inference in semi-mechanistic models
Aderhold, Andrej; Husmeier, Dirk; Grzegorczyk, Marco
2016-01-01
Inference of interaction networks represented by systems of differential equations is a challenging problem in many scientific disciplines. In the present article, we follow a semi-mechanistic modelling approach based on gradient matching. We investigate the extent to which key factors, including the kinetic model, statistical formulation and numerical methods, impact upon performance at network reconstruction. We emphasize general lessons for computational statisticians when faced with the c...
Solving #SAT and Bayesian Inference with Backtracking Search
Bacchus, Fahiem; Dalmao, Shannon; Pitassi, Toniann
2014-01-01
Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search when augmented with a simple memoization scheme (caching) can solve any sum-of-produc...
Wu, Yuefeng; Hooker, Giles
2013-01-01
This paper introduces a hierarchical framework to incorporate Hellinger distance methods into Bayesian analysis. We propose to modify a prior over non-parametric densities with the exponential of twice the Hellinger distance between a candidate and a parametric density. By incorporating a prior over the parameters of the second density, we arrive at a hierarchical model in which a non-parametric model is placed between parameters and the data. The parameters of the family can then be estimate...
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Mortlock, Daniel J.; Peiris, Hiranya V.
2016-08-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometric errors and parameter degeneracies, the redshift and type distributions can be recovered robustly thanks to the hierarchical nature of the model, which is not possible with common photometric redshift estimation techniques. As a result, redshift uncertainties can be fully propagated in cosmological analyses for the first time, fulfilling an essential requirement for the current and future generations of surveys.
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT; Marzouk, Youssef [MIT
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Bayesian Spatial Modelling with R-INLA
Finn Lindgren; Håvard Rue
2015-01-01
The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic...
Dorn, C; Khan, A; Heng, K; Alibert, Y; Helled, R; Rivoldini, A; Benz, W
2016-01-01
We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmosp...
Ball, William T; Egerton, Jack S; Haigh, Joanna D
2014-01-01
We investigate the relationship between spectral solar irradiance (SSI) and ozone in the tropical upper stratosphere. We find that solar cycle (SC) changes in ozone can be well approximated by considering the ozone response to SSI changes in a small number individual wavelength bands between 176 and 310 nm, operating independently of each other. Additionally, we find that the ozone varies approximately linearly with changes in the SSI. Using these facts, we present a Bayesian formalism for inferring SC SSI changes and uncertainties from measured SC ozone profiles. Bayesian inference is a powerful, mathematically self-consistent method of considering both the uncertainties of the data and additional external information to provide the best estimate of parameters being estimated. Using this method, we show that, given measurement uncertainties in both ozone and SSI datasets, it is not currently possible to distinguish between observed or modelled SSI datasets using available estimates of ozone change profiles, ...
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
Bayesian inference of models and hyper-parameters for robust optic-flow estimation
Héas, Patrick; Herzet, Cédric; Memin, Etienne
2012-01-01
International audience Selecting optimal models and hyper-parameters is crucial for accurate optic-flow estimation. This paper provides a solution to the problem in a generic Bayesian framework. The method is based on a conditional model linking the image intensity function, the unknown velocity field, hyper-parameters and the prior and likelihood motion models. Inference is performed on each of the three-level of this so-defined hierarchical model by maximization of marginalized \\textit{a...
GNU MCSim : bayesian statistical inference for SBML-coded systems biology models
Bois, Frédéric Y.
2009-01-01
International audience Statistical inference about the parameter values of complex models, such as the ones routinely developed in systems biology, is efficiently performed through Bayesian numerical techniques. In that framework, prior information and multiple levels of uncertainty can be seamlessly integrated. GNU MCSim was precisely developed to achieve those aims, in a general non-linear differential context. Starting with version 5.3.0, GNU MCSim reads in and simulates Systems Biology...
Bayesian inferences of the thermal properties of a wall using temperature and heat flux measurements
Iglesias, Marco; Sawlan, Zaid; Scavino, Marco; Tempone, Raul; Wood, Christopher
2016-01-01
We develop a hierarchical Bayesian inference method to estimate the thermal resistance and the volumetric heat capacity of a wall. These thermal properties are essential for accurate building energy simulations that are needed to make effective energy-saving policies. We apply our methodology to an experimental case study conducted in an environmental chamber, where measurements are recorded every minute from temperature probes and heat flux sensors placed on both sides of a solid brick wall ...
Seyed Taghi Akhavan Niaki; Mohammad Saber Fallah Nezhad
2007-01-01
In order to design a decision-making framework in production environments, in this study, we use both the stochastic dynamic programming and Bayesian inference concepts. Using the posterior probability of the production process to be in state λ (the hazard rate of defective products), first we formulate the problem into a stochastic dynamic programming model. Next, we derive some properties for the optimal value of the objective function. Then, we propose a solution algorithm. At the end, the...
Constraining East Antarctic mass trends using a Bayesian inference approach
Martin-Español, Alba; Bamber, Jonathan L.
2016-04-01
East Antarctica is an order of magnitude larger than its western neighbour and the Greenland ice sheet. It has the greatest potential to contribute to sea level rise of any source, including non-glacial contributors. It is, however, the most challenging ice mass to constrain because of a range of factors including the relative paucity of in-situ observations and the poor signal to noise ratio of Earth Observation data such as satellite altimetry and gravimetry. A recent study using satellite radar and laser altimetry (Zwally et al. 2015) concluded that the East Antarctic Ice Sheet (EAIS) had been accumulating mass at a rate of 136±28 Gt/yr for the period 2003-08. Here, we use a Bayesian hierarchical model, which has been tested on, and applied to, the whole of Antarctica, to investigate the impact of different assumptions regarding the origin of elevation changes of the EAIS. We combined GRACE, satellite laser and radar altimeter data and GPS measurements to solve simultaneously for surface processes (primarily surface mass balance, SMB), ice dynamics and glacio-isostatic adjustment over the period 2003-13. The hierarchical model partitions mass trends between SMB and ice dynamics based on physical principles and measures of statistical likelihood. Without imposing the division between these processes, the model apportions about a third of the mass trend to ice dynamics, +18 Gt/yr, and two thirds, +39 Gt/yr, to SMB. The total mass trend for that period for the EAIS was 57±20 Gt/yr. Over the period 2003-08, we obtain an ice dynamic trend of 12 Gt/yr and a SMB trend of 15 Gt/yr, with a total mass trend of 27 Gt/yr. We then imposed the condition that the surface mass balance is tightly constrained by the regional climate model RACMO2.3 and allowed height changes due to ice dynamics to occur in areas of low surface velocities (<10 m/yr) , such as those in the interior of East Antarctica (a similar condition as used in Zwally 2015). The model must find a solution that
Adaptability and phenotypic stability of common bean genotypes through Bayesian inference.
Corrêa, A M; Teodoro, P E; Gonçalves, M C; Barroso, L M A; Nascimento, M; Santos, A; Torres, F E
2016-01-01
This study used Bayesian inference to investigate the genotype x environment interaction in common bean grown in Mato Grosso do Sul State, and it also evaluated the efficiency of using informative and minimally informative a priori distributions. Six trials were conducted in randomized blocks, and the grain yield of 13 common bean genotypes was assessed. To represent the minimally informative a priori distributions, a probability distribution with high variance was used, and a meta-analysis concept was adopted to represent the informative a priori distributions. Bayes factors were used to conduct comparisons between the a priori distributions. The Bayesian inference was effective for the selection of upright common bean genotypes with high adaptability and phenotypic stability using the Eberhart and Russell method. Bayes factors indicated that the use of informative a priori distributions provided more accurate results than minimally informative a priori distributions. According to Bayesian inference, the EMGOPA-201, BAMBUÍ, CNF 4999, CNF 4129 A 54, and CNFv 8025 genotypes had specific adaptability to favorable environments, while the IAPAR 14 and IAC CARIOCA ETE genotypes had specific adaptability to unfavorable environments. PMID:27173270
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
International Nuclear Information System (INIS)
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
A novel multimode process monitoring method integrating LDRSKM with Bayesian inference
Institute of Scientific and Technical Information of China (English)
Shi-jin REN; Yin LIANG; Xiang-jun ZHAO; Mao-yun YANG
2015-01-01
A local discriminant regularized soft k-means (LDRSKM) method with Bayesian inference is proposed for multimode process monitoring. LDRSKM extends the regularized soft k-means algorithm by exploiting the local and non-local geometric information of the data and generalized linear discriminant analysis to provide a better and more meaningful data partition. LDRSKM can perform clustering and subspace selection simultaneously, enhancing the separability of data residing in different clusters. With the data partition obtained, kernel support vector data description (KSVDD) is used to establish the monitoring statistics and control limits. Two Bayesian inference based global fault detection indicators are then developed using the local monitoring results associated with principal and residual subspaces. Based on clustering analysis, Bayesian inference and mani-fold learning methods, the within and cross-mode correlations, and local geometric information can be exploited to enhance monitoring performances for nonlinear and non-Gaussian processes. The effectiveness and efficiency of the proposed method are evaluated using the Tennessee Eastman benchmark process.
Trans-dimensional Bayesian inference for large sequential data sets
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
Barentsen, Geert; Drew, Janet E; Sale, Stuart E
2012-01-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main sequence ages, masses, accretion rates, and extinctions to be estimated using two widely available photometric survey databases (IPHAS r/i/Halpha and 2MASS J-band magnitudes.) Because our approach does not rely on spectroscopy, it can easily be applied to homogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov Chain Monte Carlo (MCMC) method and provide Python source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2...
Inferring activities from context measurements using Bayesian inference and random utility models
Hurtubia, Ricardo; Bierlaire, Michel; Flötteröd, Gunnar
2009-01-01
Smartphones collect a wealth of information about their users. This includes GPS tracks and the MAC addresses of devices around the user, and it can go as far as taking visual and acoustic samples of the user's environment. We present a framework to identify a smartphone user's activities in a Bayesian setting. As prior information, we us a random utility model that accounts for the type of activity a user is likely to perform at any given location and time; this model was estimated for the w...
Directory of Open Access Journals (Sweden)
Marcelo Costa Souza
2004-10-01
, in which the parameters are regarded as fixed quantities, not assuming changes in time. This work aimed at fitting of autoregressive models with order 2, AR(2, specified in the form of dynamic linear models using Bayesian inference. Monte Carlo Markov Chain (MCMC was used to obtain the estimates, via Gibbs Sampler and Forward Filtering Backward Sampling (FFBS. To evaluate the fitting, two chains with 8000 iterations each, and three different series sizes, with 200, 500 and 800 observations were sampled. The Canadian lynx series (NICHOLLS and QUIN, 1982, was fitted with different discount factors (0.90, 0.95 and 0.99, and the resulting mean square error was used to compare to the fitting using classical inference. A better fit for the model with discount equal to 0.99 was observed. One-step ahead forecasts were done to check the estimates obtained for the updated and the backward sampled series. To the latter, the fitting was better and mean square error lower. In general, it was observed a good fit of the AR(2 dynamic models via Bayesian inference, and this gives a better understanding of the fitting in different situations, both simulated and real.
Feroz, Farhan
2007-01-01
In performing a Bayesian analysis of astronomical data, two difficult problems often emerge. First, in estimating the parameters of some model for the data, the resulting posterior distribution may be multimodal or exhibit pronounced (curving) degeneracies, which can cause problems for traditional MCMC sampling methods. Second, in selecting between a set of competing models, calculation of the Bayesian evidence for each model is computationally expensive. The nested sampling method introduced by Skilling (2004), has greatly reduced the computational expense of calculating evidences and also produces posterior inferences as a by-product. This method has been applied successfully in cosmological applications by Mukherjee et al. (2006), but their implementation was efficient only for unimodal distributions without pronounced degeneracies. Shaw et al. (2007), recently introduced a clustered nested sampling method which is significantly more efficient in sampling from multimodal posteriors and also determines the ...
AGNfitter: SED-fitting code for AGN and galaxies from a MCMC approach
Calistro Rivera, Gabriela; Lusso, Elisabeta; Hennawi, Joseph F.; Hogg, David W.
2016-07-01
AGNfitter is a fully Bayesian MCMC method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) and galaxies from the sub-mm to the UV; it enables robust disentanglement of the physical processes responsible for the emission of sources. Written in Python, AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates.
Gaffney, Jim A; Sonnad, Vijay; Libby, Stephen B
2013-01-01
First principles microphysics models are essential to the design and analysis of high energy density physics experiments. Using experimental data to investigate the underlying physics is also essential, particularly when simulations and experiments are not consistent with each other. This is a difficult task, due to the large number of physical models that play a role, and due to the complex (and as a result, noisy) nature of the experiments. This results in a large number of parameters that make any inference a daunting task; it is also very important to consistently treat both experimental and prior understanding of the problem. In this paper we present a Bayesian method that includes both these effects, and allows the inference of a set of modifiers which have been constructed to give information about microphysics models from experimental data. We pay particular attention to radiation transport models. The inference takes into account a large set of experimental parameters and an estimate of the prior kno...
Uncertainty Analysis in Fatigue Life Prediction of Gas Turbine Blades Using Bayesian Inference
Li, Yan-Feng; Zhu, Shun-Peng; Li, Jing; Peng, Weiwen; Huang, Hong-Zhong
2015-12-01
This paper investigates Bayesian model selection for fatigue life estimation of gas turbine blades considering model uncertainty and parameter uncertainty. Fatigue life estimation of gas turbine blades is a critical issue for the operation and health management of modern aircraft engines. Since lots of life prediction models have been presented to predict the fatigue life of gas turbine blades, model uncertainty and model selection among these models have consequently become an important issue in the lifecycle management of turbine blades. In this paper, fatigue life estimation is carried out by considering model uncertainty and parameter uncertainty simultaneously. It is formulated as the joint posterior distribution of a fatigue life prediction model and its model parameters using Bayesian inference method. Bayes factor is incorporated to implement the model selection with the quantified model uncertainty. Markov Chain Monte Carlo method is used to facilitate the calculation. A pictorial framework and a step-by-step procedure of the Bayesian inference method for fatigue life estimation considering model uncertainty are presented. Fatigue life estimation of a gas turbine blade is implemented to demonstrate the proposed method.
Planetary micro-rover operations on Mars using a Bayesian framework for inference and control
Post, Mark A.; Li, Junquan; Quine, Brendan M.
2016-03-01
With the recent progress toward the application of commercially-available hardware to small-scale space missions, it is now becoming feasible for groups of small, efficient robots based on low-power embedded hardware to perform simple tasks on other planets in the place of large-scale, heavy and expensive robots. In this paper, we describe design and programming of the Beaver micro-rover developed for Northern Light, a Canadian initiative to send a small lander and rover to Mars to study the Martian surface and subsurface. For a small, hardware-limited rover to handle an uncertain and mostly unknown environment without constant management by human operators, we use a Bayesian network of discrete random variables as an abstraction of expert knowledge about the rover and its environment, and inference operations for control. A framework for efficient construction and inference into a Bayesian network using only the C language and fixed-point mathematics on embedded hardware has been developed for the Beaver to make intelligent decisions with minimal sensor data. We study the performance of the Beaver as it probabilistically maps a simple outdoor environment with sensor models that include uncertainty. Results indicate that the Beaver and other small and simple robotic platforms can make use of a Bayesian network to make intelligent decisions in uncertain planetary environments.
Directory of Open Access Journals (Sweden)
Michael J McGeachie
2014-06-01
Full Text Available Bayesian Networks (BN have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
Bayesian inference of solar and stellar magnetic fields in the weak-field approximation
Ramos, A Asensio
2011-01-01
The weak-field approximation is one of the simplest models that allows us to relate the observed polarization induced by the Zeeman effect with the magnetic field vector present on the plasma of interest. It is usually applied for diagnosing magnetic fields in the solar and stellar atmospheres. A fully Bayesian approach to the inference of magnetic properties in unresolved structures is presented. The analytical expression for the marginal posterior distribution is obtained, from which we can obtain statistically relevant information about the model parameters. The role of a-priori information is discussed and a hierarchical procedure is presented that gives robust results that are almost insensitive to the precise election of the prior. The strength of the formalism is demonstrated through an application to IMaX data. Bayesian methods can optimally exploit data from filter-polarimeters given the scarcity of spectral information as compared with spectro-polarimeters. The effect of noise and how it degrades ou...
Holzinger, M.
2014-09-01
This paper introduces and discusses a method to rigorously classify and prioritize UCTs using Bayesian inference and admissible regions. A detailed derivation and discussion of the methodology is given, followed by a generalized definition of prioritization parameters. Several example prioritization parameters including `time left to detect,' 'zero-effort miss,' and `effective albedo-area' are motivated and given. A number of illustrative applications with optical UCTs are examined to demonstrate information that can be extracted from each observation. Finally, the information extracted from each UCT is then compared and approaches to observation prioritization discussed.
DIP -- Diagnostics for Insufficiencies of Posterior calculations in Bayesian signal inference
Dorn, Sebastian; lin, Torsten A Enß
2013-01-01
We present an error-diagnostic validation method for posterior distributions in Bayesian signal inference. It transfers deviations from the correct posterior into characteristic deviations from a uniform distribution of a quantity constructed for this purpose. We show that this method is able to reveal and discriminate several kinds of numerical and approximation errors. For this we present a number of analytical examples of posteriors with incorrect variance, skewness, position of the maximum, or normalization. We show further how this test can be applied to multidimensional signals.
Hadwin, Paul J.; Sipkens, T. A.; Thomson, K. A.; Liu, F.; Daun, K. J.
2016-01-01
Auto-correlated laser-induced incandescence (AC-LII) infers the soot volume fraction (SVF) of soot particles by comparing the spectral incandescence from laser-energized particles to the pyrometrically inferred peak soot temperature. This calculation requires detailed knowledge of model parameters such as the absorption function of soot, which may vary with combustion chemistry, soot age, and the internal structure of the soot. This work presents a Bayesian methodology to quantify such uncertainties. This technique treats the additional "nuisance" model parameters, including the soot absorption function, as stochastic variables and incorporates the current state of knowledge of these parameters into the inference process through maximum entropy priors. While standard AC-LII analysis provides a point estimate of the SVF, Bayesian techniques infer the posterior probability density, which will allow scientists and engineers to better assess the reliability of AC-LII inferred SVFs in the context of environmental regulations and competing diagnostics.
International Nuclear Information System (INIS)
Occurrence of hazardous accident in nuclear power plants and industrial units usually lead to release of radioactive materials and pollutants in environment. These materials and pollutants can be transported to a far downstream by the wind flow. In this paper, we implemented an atmospheric dispersion code to solve the inverse problem. Having received and detected the pollutants in one region, we may estimate the rate and location of the unknown source. For the modeling, one needs a model with ability of atmospheric dispersion calculation. Furthermore, it is required to implement a mathematical approach to infer the source location and the related rates. In this paper the AERMOD software and Bayesian inference along the Markov Chain Monte Carlo have been applied. Implementing, Bayesian approach and Markov Chain Monte Carlo for the aforementioned subject is not a new approach, but the AERMOD model coupled with the said methods is a new and well known regulatory software, and enhances the reliability of outcomes. To evaluate the method, an example is considered by defining pollutants concentration in a specific region and then obtaining the source location and intensity by a direct calculation. The result of the calculation estimates the average source location at a distance of 7km with an accuracy of 5m which is good enough to support the ability of the proposed algorithm.
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
Perkins, S. J.; Marais, P. C.; Zwart, J. T. L.; Natarajan, I.; Tasse, C.; Smirnov, O.
2015-09-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. χ2 values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and χ2 calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple χ2 values. Modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is easy to extend and implement different pipelines. At present, Montblanc supports point and Gaussian morphologies, but is designed for easy addition of new source profiles. Montblanc's RIME implementation is performant: On an NVIDIA K40, it is approximately 250 times faster than MEQTREES on a dual hexacore Intel E5-2620v2 CPU. Compared to the OSKAR simulator's GPU-implemented RIME components it is 7.7 and 12 times faster on the same K40 for single and double-precision floating point respectively. However, OSKAR's RIME implementation is more general than Montblanc's BIRO-tailored RIME. Theoretical analysis of Montblanc's dominant CUDA kernel suggests that it is memory bound. In practice, profiling shows that is balanced between compute and memory, as much of the data required by the problem is retained in L1 and L2 caches.
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
International Nuclear Information System (INIS)
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Bickel, David R
2011-01-01
In statistical practice, whether a Bayesian or frequentist approach is used in inference depends not only on the availability of prior information but also on the attitude taken toward partial prior information, with frequentists tending to be more cautious than Bayesians. The proposed framework defines that attitude in terms of a specified amount of caution, thereby enabling data analysis at the level of caution desired and on the basis of any prior information. The caution parameter represents the attitude toward partial prior information in much the same way as a loss function represents the attitude toward risk. When there is very little prior information and nonzero caution, the resulting inferences correspond to those of the candidate confidence intervals and p-values that are most similar to the credible intervals and hypothesis probabilities of the specified Bayesian posterior. On the other hand, in the presence of a known physical distribution of the parameter, inferences are based only on the corres...
Dagne, Getachew; Huang, Yangxin
2012-01-01
Censored data are characteristics of many bioassays in HIV/AIDS studies where assays may not be sensitive enough to determine gradations in viral load determination among those below a detectable threshold. Not accounting for such left-censoring appropriately can lead to biased parameter estimates in most data analysis. To properly adjust for left-censoring, this paper presents an extension of the Tobit model for fitting nonlinear dynamic mixed-effects models with skew distributions. Such extensions allow one to specify the conditional distributions for viral load response to account for left-censoring, skewness and heaviness in the tails of the distributions of the response variable. A Bayesian modeling approach via Markov Chain Monte Carlo (MCMC) algorithm is used to estimate model parameters. The proposed methods are illustrated using real data from an HIV/AIDS study. PMID:22992288
Dagne, Getachew; Huang, Yangxin
2016-01-01
Censored data are characteristics of many bioassays in HIV/AIDS studies where assays may not be sensitive enough to determine gradations in viral load determination among those below a detectable threshold. Not accounting for such left-censoring appropriately can lead to biased parameter estimates in most data analysis. To properly adjust for left-censoring, this paper presents an extension of the Tobit model for fitting nonlinear dynamic mixed-effects models with skew distributions. Such extensions allow one to specify the conditional distributions for viral load response to account for left-censoring, skewness and heaviness in the tails of the distributions of the response variable. A Bayesian modeling approach via Markov Chain Monte Carlo (MCMC) algorithm is used to estimate model parameters. The proposed methods are illustrated using real data from an HIV/AIDS study. PMID:22992288
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-ecient approach for damage determination that quanti es uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modi cations are made to the traditional Bayesian inference approach to provide signi cant computational speedup. First, an ecient surrogate model is constructed using sparse grid interpolation to replace a costly nite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modi ed using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more eciently. Numerical examples demonstrate that the proposed framework e ectively provides damage estimates with uncertainty quanti cation and can yield orders of magnitude speedup over standard Bayesian approaches.
TYPE Ia SUPERNOVA LIGHT-CURVE INFERENCE: HIERARCHICAL BAYESIAN ANALYSIS IN THE NEAR-INFRARED
International Nuclear Information System (INIS)
We present a comprehensive statistical analysis of the properties of Type Ia supernova (SN Ia) light curves in the near-infrared using recent data from Peters Automated InfraRed Imaging TELescope and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction, and intrinsic variations, for principled and coherent statistical inference. SN Ia light-curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR data set. The logical structure of the hierarchical model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient Markov Chain Monte Carlo algorithm exploiting the conditional probabilistic structure using Gibbs sampling. We apply this framework to the JHKs SN Ia light-curve data. A new light-curve model captures the observed J-band light-curve shape variations. The marginal intrinsic variances in peak absolute magnitudes are σ(MJ) = 0.17 ± 0.03, σ(MH) = 0.11 ± 0.03, and σ(MKs) = 0.19 ± 0.04. We describe the first quantitative evidence for correlations between the NIR absolute magnitudes and J-band light-curve shapes, and demonstrate their utility for distance estimation. The average residual in the Hubble diagram for the training set SNe at cz > 2000kms-1 is 0.10 mag. The new application of bootstrap cross-validation to SN Ia light-curve inference tests the sensitivity of the statistical model fit to the finite sample and estimates the prediction error at 0.15 mag. These results demonstrate that SN Ia NIR light curves are as effective as corrected optical light curves, and, because they are less vulnerable to dust absorption, they have great potential as precise and accurate cosmological distance indicators.
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Alsing, Justin; Jaffe, Andrew H
2016-01-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the CFHTLenS weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data we perform a 2-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline $\\Lambda$CDM model we constrain $S_8 = \\sigma_8(\\Omega_\\mathrm{m}/0.3)^{0.5} = 0.67 ^{\\scriptscriptstyle+ 0.03 }_{\\scriptscriptstyle- 0.03 }$ $(68\\%)$, consistent with previous CFHTLenS analysis but in tension with Planck. Adding neutrino m...
von Nessi, G T
2012-01-01
A new method, based on Bayesian analysis, is presented which unifies the inference of plasma equilibria parameters in a Tokamak with the ability to quantify differences between inferred equilibria and Grad-Shafranov force-balance solutions. At the heart of this technique is the new method of observation splitting, which allows multiple forward models to be associated with a single diagnostic observation. This new idea subsequently provides a means by which the the space of GS solutions can be efficiently characterised via a prior distribution. Moreover, by folding force-balance directly into one set of forward models and utilising simple Biot-Savart responses in another, the Bayesian inference of the plasma parameters itself produces an evidence (a normalisation constant of the inferred posterior distribution) which is sensitive to the relative consistency between both sets of models. This evidence can then be used to help determine the relative accuracy of the tested force-balance model across several discha...
A Bayesian method for inferring transmission chains in a partially observed epidemic.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Ray, Jaideep
2008-10-01
We present a Bayesian approach for estimating transmission chains and rates in the Abakaliki smallpox epidemic of 1967. The epidemic affected 30 individuals in a community of 74; only the dates of appearance of symptoms were recorded. Our model assumes stochastic transmission of the infections over a social network. Distinct binomial random graphs model intra- and inter-compound social connections, while disease transmission over each link is treated as a Poisson process. Link probabilities and rate parameters are objects of inference. Dates of infection and recovery comprise the remaining unknowns. Distributions for smallpox incubation and recovery periods are obtained from historical data. Using Markov chain Monte Carlo, we explore the joint posterior distribution of the scalar parameters and provide an expected connectivity pattern for the social graph and infection pathway.
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
Validi, AbdoulAhad
2013-01-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector valued sep...
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs. It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations. Results The program package is freely available under the GNU General Public Licence (GPL from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Hill, T; Minier, V; Burton, M G; Cunningham, M R
2008-01-01
Concatenating data from the millimetre regime to the infrared, we have performed spectral energy distribution modelling for 227 of the 405 millimetre continuum sources of Hill et al. (2005) which are thought to contain young massive stars in the earliest stages of their formation. Three main parameters are extracted from the fits: temperature, mass and luminosity. The method employed was Bayesian inference, which allows a statistically probable range of suitable values for each parameter to be drawn for each individual protostellar candidate. This is the first application of this method to massive star formation. The cumulative distribution plots of the SED modelled parameters in this work indicate that collectively, the sources without methanol maser and/or radio continuum associations (MM-only cores) display similar characteristics to those of high mass star formation regions. Attributing significance to the marginal distinctions between the MM-only cores and the high-mass star formation sample we draw hypo...
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr.
2016-07-01
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Optimal modeling of 1D azimuth correlations in the context of Bayesian inference
De Kock, Michiel B; Trainor, Thomas A
2015-01-01
Analysis and interpretation of spectrum and correlation data from high-energy nuclear collisions is currently controversial because two opposing physics narratives derive contradictory implications from the same data-one narrative claiming collision dynamics is dominated by dijet production and projectile-nucleon fragmentation, the other claiming collision dynamics is dominated by a dense, flowing QCD medium. Opposing interpretations seem to be supported by alternative data models, and current model-comparison schemes are unable to distinguish between them. There is clearly need for a convincing new methodology to break the deadlock. In this study we introduce Bayesian Inference (BI) methods applied to angular correlation data as a basis to evaluate competing data models. For simplicity the data considered are projections of 2D angular correlations onto 1D azimuth from three centrality classes of 200 GeV Au-Au collisions. We consider several data models typical of current model choices, including Fourier seri...
cosmoabc: Likelihood-free inference via Population Monte Carlo Approximate Bayesian Computation
Ishida, E E O; Penna-Lima, M; Cisewski, J; de Souza, R S; Trindade, A M M; Cameron, E
2015-01-01
Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogues. Here we present cosmoabc, a Python ABC sampler featuring a Population Monte Carlo (PMC) variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code is very flexible and can be easily coupled to an external simulator, while allowing to incorporate arbitrary distance and prior functions. As an example of practical application, we coupled cosmoabc with the numcosmo library and demonstrate how it can be used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy clusters number counts without computing the likelihood function. cosmoabc is published under the GPLv3 license on PyPI and GitHub and documentation is availabl...
Giuntini, Michael E.; Giuntini, Ronald E.
1991-01-01
A Bayesian inference process for system logistical planning is presented which provides a method for incorporating actual failures with prediction data for an ongoing and improving reliability estimates. The process uses the Weibull distribution, and provides a means for examining and updating logistical and maintenance support needs.
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models
Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba
2009-01-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
Bayesian inference of biochemical kinetic parameters using the linear noise approximation
Directory of Open Access Journals (Sweden)
Finkenstädt Bärbel
2009-10-01
Full Text Available Abstract Background Fluorescent and luminescent gene reporters allow us to dynamically quantify changes in molecular species concentration over time on the single cell level. The mathematical modeling of their interaction through multivariate dynamical models requires the deveopment of effective statistical methods to calibrate such models against available data. Given the prevalence of stochasticity and noise in biochemical systems inference for stochastic models is of special interest. In this paper we present a simple and computationally efficient algorithm for the estimation of biochemical kinetic parameters from gene reporter data. Results We use the linear noise approximation to model biochemical reactions through a stochastic dynamic model which essentially approximates a diffusion model by an ordinary differential equation model with an appropriately defined noise process. An explicit formula for the likelihood function can be derived allowing for computationally efficient parameter estimation. The proposed algorithm is embedded in a Bayesian framework and inference is performed using Markov chain Monte Carlo. Conclusion The major advantage of the method is that in contrast to the more established diffusion approximation based methods the computationally costly methods of data augmentation are not necessary. Our approach also allows for unobserved variables and measurement error. The application of the method to both simulated and experimental data shows that the proposed methodology provides a useful alternative to diffusion approximation based methods.
ESTIMATE OF THE HYPSOMETRIC RELATIONSHIP WITH NONLINEAR MODELS FITTED BY EMPIRICAL BAYESIAN METHODS
Directory of Open Access Journals (Sweden)
Monica Fabiana Bento Moreira
2015-09-01
Full Text Available In this paper we propose a Bayesian approach to solve the inference problem with restriction on parameters, regarding to nonlinear models used to represent the hypsometric relationship in clones of Eucalyptus sp. The Bayesian estimates are calculated using Monte Carlo Markov Chain (MCMC method. The proposed method was applied to different groups of actual data from which two were selected to show the results. These results were compared to the results achieved by the minimum square method, highlighting the superiority of the Bayesian approach, since this approach always generate the biologically consistent results for hipsometric relationship.
Storz, Jay F; Beaumont, Mark A; Alberts, Susan C
2002-11-01
The purpose of this study was to test for evidence that savannah baboons (Papio cynocephalus) underwent a population expansion in concert with a hypothesized expansion of African human and chimpanzee populations during the late Pleistocene. The rationale is that any type of environmental event sufficient to cause simultaneous population expansions in African humans and chimpanzees would also be expected to affect other codistributed mammals. To test for genetic evidence of population expansion or contraction, we performed a coalescent analysis of multilocus microsatellite data using a hierarchical Bayesian model. Markov chain Monte Carlo (MCMC) simulations were used to estimate the posterior probability density of demographic and genealogical parameters. The model was designed to allow interlocus variation in mutational and demographic parameters, which made it possible to detect aberrant patterns of variation at individual loci that could result from heterogeneity in mutational dynamics or from the effects of selection at linked sites. Results of the MCMC simulations were consistent with zero variance in demographic parameters among loci, but there was evidence for a 10- to 20-fold difference in mutation rate between the most slowly and most rapidly evolving loci. Results of the model provided strong evidence that savannah baboons have undergone a long-term historical decline in population size. The mode of the highest posterior density for the joint distribution of current and ancestral population size indicated a roughly eightfold contraction over the past 1,000 to 250,000 years. These results indicate that savannah baboons apparently did not share a common demographic history with other codistributed primate species. PMID:12411607
A Hamiltonian Monte–Carlo method for Bayesian inference of supermassive black hole binaries
International Nuclear Information System (INIS)
We investigate the use of a Hamiltonian Monte–Carlo to map out the posterior density function for supermassive black hole binaries. While previous Markov Chain Monte–Carlo (MCMC) methods, such as Metropolis–Hastings MCMC, have been successfully employed for a number of different gravitational wave sources, these methods are essentially random walk algorithms. The Hamiltonian Monte–Carlo treats the inverse likelihood surface as a ‘gravitational potential’ and by introducing canonical positions and momenta, dynamically evolves the Markov chain by solving Hamilton's equations of motion. This method is not as widely used as other MCMC algorithms due to the necessity of calculating gradients of the log-likelihood, which for most applications results in a bottleneck that makes the algorithm computationally prohibitive. We circumvent this problem by using accepted initial phase-space trajectory points to analytically fit for each of the individual gradients. Eliminating the waveform generation needed for the numerical derivatives reduces the total number of required templates for a 106 iteration chain from ∼109 to ∼106. The result is in an implementation of the Hamiltonian Monte–Carlo that is faster, and more efficient by a factor of approximately the dimension of the parameter space, than a Hessian MCMC. (paper)
Reversible Jump MCMC Simulated Annealing for Neural Networks
Andrieu, Christophe; De Freitas, Nando; Doucet, Arnaud
2013-01-01
We propose a novel reversible jump Markov chain Monte Carlo (MCMC) simulated annealing algorithm to optimize radial basis function (RBF) networks. This algorithm enables us to maximize the joint posterior distribution of the network parameters and the number of basis functions. It performs a global search in the joint space of the parameters and number of parameters, thereby surmounting the problem of local minima. We also show that by calibrating a Bayesian model, we can obtain the classical...
Bazin, Eric; Dawson, Kevin J; Beaumont, Mark A
2010-06-01
We address the problem of finding evidence of natural selection from genetic data, accounting for the confounding effects of demographic history. In the absence of natural selection, gene genealogies should all be sampled from the same underlying distribution, often approximated by a coalescent model. Selection at a particular locus will lead to a modified genealogy, and this motivates a number of recent approaches for detecting the effects of natural selection in the genome as "outliers" under some models. The demographic history of a population affects the sampling distribution of genealogies, and therefore the observed genotypes and the classification of outliers. Since we cannot see genealogies directly, we have to infer them from the observed data under some model of mutation and demography. Thus the accuracy of an outlier-based approach depends to a greater or a lesser extent on the uncertainty about the demographic and mutational model. A natural modeling framework for this type of problem is provided by Bayesian hierarchical models, in which parameters, such as mutation rates and selection coefficients, are allowed to vary across loci. It has proved quite difficult computationally to implement fully probabilistic genealogical models with complex demographies, and this has motivated the development of approximations such as approximate Bayesian computation (ABC). In ABC the data are compressed into summary statistics, and computation of the likelihood function is replaced by simulation of data under the model. In a hierarchical setting one may be interested both in hyperparameters and parameters, and there may be very many of the latter--for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, which, if used naively, leads to a consequent difficulty in conditional density estimation. We develop a general method for applying
Raue, Andreas; Theis, Fabian Joachim; Timmer, Jens
2012-01-01
Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain t...
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J
2015-10-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Directory of Open Access Journals (Sweden)
Dario Cuevas Rivera
2015-10-01
Full Text Available The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena.
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Pullen, Nick; Morris, Richard J
2014-01-01
Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design. PMID:24523891
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Directory of Open Access Journals (Sweden)
Nick Pullen
Full Text Available Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
Bayesian Inference of the Evolution of a Phenotype Distribution on a Phylogenetic Tree
Ansari, M. Azim; Didelot, Xavier
2016-01-01
The distribution of a phenotype on a phylogenetic tree is often a quantity of interest. Many phenotypes have imperfect heritability, so that a measurement of the phenotype for an individual can be thought of as a single realization from the phenotype distribution of that individual. If all individuals in a phylogeny had the same phenotype distribution, measured phenotypes would be randomly distributed on the tree leaves. This is, however, often not the case, implying that the phenotype distribution evolves over time. Here we propose a new model based on this principle of evolving phenotype distribution on the branches of a phylogeny, which is different from ancestral state reconstruction where the phenotype itself is assumed to evolve. We develop an efficient Bayesian inference method to estimate the parameters of our model and to test the evidence for changes in the phenotype distribution. We use multiple simulated data sets to show that our algorithm has good sensitivity and specificity properties. Since our method identifies branches on the tree on which the phenotype distribution has changed, it is able to break down a tree into components for which this distribution is unique and constant. We present two applications of our method, one investigating the association between HIV genetic variation and human leukocyte antigen and the other studying host range distribution in a lineage of Salmonella enterica, and we discuss many other potential applications. PMID:27412711
International Nuclear Information System (INIS)
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector-valued separated representation-based model, in comparison to the available scalar-valued case, leads to a significant reduction in the cost of approximation by an order of magnitude equal to the vector size. The performance of the method is studied through its application to three numerical examples including a 41-dimensional elliptic PDE and a 21-dimensional cavity flow
International Nuclear Information System (INIS)
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sarnpliig the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surf ace, within a sphere centered on some location in cortex. The number and radi of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented
Energy Technology Data Exchange (ETDEWEB)
George, J.S.; Schmidt, D.M.; Wood, C.C.
1999-02-01
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sarnpliig the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surf ace, within a sphere centered on some location in cortex. The number and radi of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented.
Wang, Dong; Tsui, Kwok-Leung; Zhou, Qiang
2016-05-01
Rolling element bearings are commonly used in machines to provide support for rotating shafts. Bearing failures may cause unexpected machine breakdowns and increase economic cost. To prevent machine breakdowns and reduce unnecessary economic loss, bearing faults should be detected as early as possible. Because wavelet transform can be used to highlight impulses caused by localized bearing faults, wavelet transform has been widely investigated and proven to be one of the most effective and efficient methods for bearing fault diagnosis. In this paper, a new Gauss-Hermite integration based Bayesian inference method is proposed to estimate the posterior distribution of wavelet parameters. The innovations of this paper are illustrated as follows. Firstly, a non-linear state space model of wavelet parameters is constructed to describe the relationship between wavelet parameters and hypothetical measurements. Secondly, the joint posterior probability density function of wavelet parameters and hypothetical measurements is assumed to follow a joint Gaussian distribution so as to generate Gaussian perturbations for the state space model. Thirdly, Gauss-Hermite integration is introduced to analytically predict and update moments of the joint Gaussian distribution, from which optimal wavelet parameters are derived. At last, an optimal wavelet filtering is conducted to extract bearing fault features and thus identify localized bearing faults. Two instances are investigated to illustrate how the proposed method works. Two comparisons with the fast kurtogram are used to demonstrate that the proposed method can achieve better visual inspection performances than the fast kurtogram.
Duggento, Andrea; McClintock, Peter V E; Stefanovska, Aneta
2012-01-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. (Phys. Rev. Lett. 109 024101, 2012) introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time- evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically-generated data, data from an analog electronic circuit, and cardio-respiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
Sudano, Mateus José; Crespilho, André Maciel; Fernandes, Claudia Barbosa; Junior, Alicio Martins; Papa, Frederico Ozanam; Rodrigues, Josemar; Machado, Rui; Landim-Alvarenga, Fernanda Da Cruz
2011-01-01
The objective of this experiment was to test in vitro embryo production (IVP) as a tool to estimate fertility performance in zebu bulls using Bayesian inference statistics. Oocytes were matured and fertilized in vitro using sperm cells from three different Zebu bulls (V, T, and G). The three bulls presented similar results with regard to pronuclear formation and blastocyst formation rates. However, the cleavage rates were different between bulls. The estimated conception rates based on combined data of cleavage and blastocyst formation were very similar to the true conception rates observed for the same bulls after a fixed-time artificial insemination program. Moreover, even when we used cleavage rate data only or blastocyst formation data only, the estimated conception rates were still close to the true conception rates. We conclude that Bayesian inference is an effective statistical procedure to estimate in vivo bull fertility using data from IVP. PMID:21547211
Hug, Sabine Carolin
2015-01-01
In this thesis we use differential equations for mathematically representing biological processes. For this we have to infer the associated parameters for fitting the differential equations to measurement data. If the structure of the ODE itself is uncertain, model selection methods have to be applied. We refine several existing Bayesian methods, ranging from an adaptive scheme for the computation of high-dimensional integrals to multi-chain Metropolis-Hastings algorithms for high-dimensional...
A surrogate model enables a Bayesian approach to the inverse problem of scatterometry
International Nuclear Information System (INIS)
Scatterometry is an indirect optical method for the determination of photomask geometry parameters from scattered light intensities by solving an inverse problem. The Bayesian approach is a powerful method to solve the inverse problem. In the Bayesian framework estimates of parameters and associated uncertainties are obtained from posterior distributions. The determination the probability distribution is typically based on Markov chain Monte Carlo (MCMC) methods. However, in scatterometry the evaluation of MCMC steps require solutions of partial differential equations that are computationally expensive and application of MCMC methods is thus impractical. In this article we introduce a surrogate model for scatterometry based on polynomial chaos that can be treated by Bayesian inference. We compare the results of the surrogate model with rigorous finite element simulations and demonstrate its convergence. The accuracy reaches a value of lower than one percent for a sufficient fine mesh and the speed up amounts more than two order of magnitudes. Furthermore, we apply the surrogate model to MCMC calculations and we reconstruct geometry parameters of a photomask
MCMC for Wind Power Simulation
Papaefthymiou, G.; Klöckl, B.
2008-01-01
This paper contributes a Markov chain Monte Carlo (MCMC) method for the direct generation of synthetic time series of wind power output. It is shown that obtaining a stochastic model directly in the wind power domain leads to reduced number of states and to lower order of the Markov chain at equal p
Bayesian inference along Markov Chain Monte Carlo approach for PWR core loading pattern optimization
International Nuclear Information System (INIS)
Highlights: ► The BIMCMC method performs very well and is comparable to GA and PSO techniques. ► The potential of the technique is very well for optimization. ► It is observed that the performance of the method is quite adequate. ► The BIMCMC is very easy to implement. -- Abstract: Despite remarkable progress in optimization procedures, inherent complexities in nuclear reactor structure and strong interdependence among the fundamental indices namely, economic, neutronic, thermo-hydraulic and environmental effects make it necessary to evaluate the most efficient arrangement of a reactor core. In this paper a reactor core reloading technique based on Bayesian inference along Markov Chain Monte Carlo, BIMCMC, is addressed in the context of obtaining an optimal configuration of fuel assemblies in reactor cores. The Markov Chain Monte Carlo with Metropolis–Hastings algorithm has been applied for sampling variable and its acceptance. The proposed algorithm can be used for in-core fuel management optimization problems in pressurized water reactors. Considerable work has been expended for loading pattern optimization, but no preferred approach has yet emerged. To evaluate the proposed technique, increasing the effective multiplication factor Keff of a WWER-1000 core along flattening power with keeping power peaking factor below a specific limit as a first test case and flattening of power as a second test case are considered as objective functions; although other variables such as burn up and cycle length can also be taken into account. The results, convergence rate and reliability of the new method are compared to published data resulting from particle swarm optimization and genetic algorithm; the outcome is quite promising and demonstrating the potential of the technique very well for optimization applications in the nuclear engineering field.
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%.
International Nuclear Information System (INIS)
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture the observed locality of interactions. Traditional self-propelled particle models fail to capture the fine scale dynamics of the system. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics, while maintaining a biologically plausible perceptual range. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
2012-01-01
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
HASSET: a probability event tree tool to evaluate future volcanic scenarios using Bayesian inference
Sobradelo, Rosa; Bartolini, Stefania; Martí, Joan
2014-01-01
Event tree structures constitute one of the most useful and necessary tools in modern volcanology for assessment of hazards from future volcanic scenarios (those that culminate in an eruptive event as well as those that do not). They are particularly relevant for evaluation of long- and short-term probabilities of occurrence of possible volcanic scenarios and their potential impacts on urbanized areas. In this paper, we introduce Hazard Assessment Event Tree (HASSET), a probability tool, built on an event tree structure that uses Bayesian inference to estimate the probability of occurrence of a future volcanic scenario and to evaluate the most relevant sources of uncertainty from the corresponding volcanic system. HASSET includes hazard assessment of noneruptive and nonmagmatic volcanic scenarios, that is, episodes of unrest that do not evolve into volcanic eruption but have an associated volcanic hazard (e.g., sector collapse and phreatic explosion), as well as unrest episodes triggered by external triggers rather than the magmatic system alone. Additionally, HASSET introduces the Delta method to assess precision of the probability estimates, by reporting a 1 standard deviation variability interval around the expected value for each scenario. HASSET is presented as a free software package in the form of a plug-in for the open source geographic information system Quantum Gis (QGIS), providing a graphically supported computation of the event tree structure in an interactive and user-friendly way. We also include further in-depth explanations for each node together with an application of HASSET to Teide-Pico Viejo volcanic complex (Spain).
MCMC joint separation and segmentation of hidden Markov fields
Snoussi, H; Snoussi, Hichem; Mohammad-Djafari, Ali
2002-01-01
In this contribution, we consider the problem of the blind separation of noisy instantaneously mixed images. The images are modelized by hidden Markov fields with unknown parameters. Given the observed images, we give a Bayesian formulation and we propose to solve the resulting data augmentation problem by implementing a Monte Carlo Markov Chain (MCMC) procedure. We separate the unknown variables into two categories: 1. The parameters of interest which are the mixing matrix, the noise covariance and the parameters of the sources distributions. 2. The hidden variables which are the unobserved sources and the unobserved pixels classification labels. The proposed algorithm provides in the stationary regime samples drawn from the posterior distributions of all the variables involved in the problem leading to a flexibility in the cost function choice. We discuss and characterize some problems of non identifiability and degeneracies of the parameters likelihood and the behavior of the MCMC algorithm in this case. F...
Directory of Open Access Journals (Sweden)
Fang-Rong Yan
Full Text Available This article provides a fully bayesian approach for modeling of single-dose and complete pharmacokinetic data in a population pharmacokinetic (PK model. To overcome the impact of outliers and the difficulty of computation, a generalized linear model is chosen with the hypothesis that the errors follow a multivariate Student t distribution which is a heavy-tailed distribution. The aim of this study is to investigate and implement the performance of the multivariate t distribution to analyze population pharmacokinetic data. Bayesian predictive inferences and the Metropolis-Hastings algorithm schemes are used to process the intractable posterior integration. The precision and accuracy of the proposed model are illustrated by the simulating data and a real example of theophylline data.
Analysis of trend changes in Northern African palaeo-climate by using Bayesian inference
Schütz, Nadine; Trauth, Martin H.; Holschneider, Matthias
2010-05-01
Climate variability of Northern Africa is of high interest due to climate-evolutionary linkages under study. The reconstruction of the palaeo-climate over long time scales, including the expected linkages (> 3 Ma), is mainly accessible by proxy data from deep sea drilling cores. By concentrating on published data sets, we try to decipher rhythms and trends to detect correlations between different proxy time series by advanced mathematical methods. Our preliminary data is dust concentration, as an indicator for climatic changes such as humidity, from the ODP sites 659, 721 and 967 situated around Northern Africa. Our interest is in challenging the available time series with advanced statistical methods to detect significant trend changes and to compare different model assumptions. For that purpose, we want to avoid the rescaling of the time axis to obtain equidistant time steps for filtering methods. Additionally we demand an plausible description of the errors for the estimated parameters, in terms of confidence intervals. Finally, depending on what model we restrict on, we also want an insight in the parameter structure of the assumed models. To gain this information, we focus on Bayesian inference by formulating the problem as a linear mixed model, so that the expectation and deviation are of linear structure. By using the Bayesian method we can formulate the posteriori density as a function of the model parameters and calculate this probability density in the parameter space. Depending which parameters are of interest, we analytically and numerically marginalize the posteriori with respect to the remaining parameters of less interest. We apply a simple linear mixed model to calculate the posteriori densities of the ODP sites 659 and 721 concerning the last 5 Ma at maximum. From preliminary calculations on these data sets, we can confirm results gained by the method of breakfit regression combined with block bootstrapping ([1]). We obtain a significant change
Directory of Open Access Journals (Sweden)
Edson Sandoval-Castellanos
Full Text Available Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC stands among the most promising methods due to its simple theoretical fundament and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis under the presence of aleatory and epistemic un- certainty, and develops a rigorous methodology for efficient refinement of epistemic un- certainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refine- ment methodology, where epistemic variables for refinement are not identified all-at-once. Instead, only one variable is first identified, and then, Bayesian inference and global sensi- tivity calculations are repeated to identify the next important variable. This procedure is continued until all 4 variables are identified and the refinement in the system-level perfor- mance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Knuth, K. H.
2001-05-01
We consider the application of Bayesian inference to the study of self-organized structures in complex adaptive systems. In particular, we examine the distribution of elements, agents, or processes in systems dominated by hierarchical structure. We demonstrate that results obtained by Caianiello [1] on Hierarchical Modular Systems (HMS) can be found by applying Jaynes' Principle of Group Invariance [2] to a few key assumptions about our knowledge of hierarchical organization. Subsequent application of the Principle of Maximum Entropy allows inferences to be made about specific systems. The utility of the Bayesian method is considered by examining both successes and failures of the hierarchical model. We discuss how Caianiello's original statements suffer from the Mind Projection Fallacy [3] and we restate his assumptions thus widening the applicability of the HMS model. The relationship between inference and statistical physics, described by Jaynes [4], is reiterated with the expectation that this realization will aid the field of complex systems research by moving away from often inappropriate direct application of statistical mechanics to a more encompassing inferential methodology.
Gustafson, Paul
2014-01-01
Partially identified models are characterized by the distribution of observables being compatible with a set of values for the target parameter, rather than a single value. This set is often referred to as an identification region. From a non-Bayesian point of view, the identification region is the object revealed to the investigator in the limit of increasing sample size. Conversely, a Bayesian analysis provides the identification region plus the limiting posterior distribution over this reg...
Yu, Jianbo
2015-12-01
Prognostics is much efficient to achieve zero-downtime performance, maximum productivity and proactive maintenance of machines. Prognostics intends to assess and predict the time evolution of machine health degradation so that machine failures can be predicted and prevented. A novel prognostics system is developed based on the data-model-fusion scheme using the Bayesian inference-based self-organizing map (SOM) and an integration of logistic regression (LR) and high-order particle filtering (HOPF). In this prognostics system, a baseline SOM is constructed to model the data distribution space of healthy machine under an assumption that predictable fault patterns are not available. Bayesian inference-based probability (BIP) derived from the baseline SOM is developed as a quantification indication of machine health degradation. BIP is capable of offering failure probability for the monitored machine, which has intuitionist explanation related to health degradation state. Based on those historic BIPs, the constructed LR and its modeling noise constitute a high-order Markov process (HOMP) to describe machine health propagation. HOPF is used to solve the HOMP estimation to predict the evolution of the machine health in the form of a probability density function (PDF). An on-line model update scheme is developed to adapt the Markov process changes to machine health dynamics quickly. The experimental results on a bearing test-bed illustrate the potential applications of the proposed system as an effective and simple tool for machine health prognostics.
Bayesian inference of nanoparticle-broadened x-ray line profiles
Armstrong, N; Cline, J P; Bonevich, J
2003-01-01
A single and self-contained method for determining the crystallite-size distribution and shape from experimental x-ray line profile data is presented. We have shown that the crystallite-size distribution can be determined without assuming a functional form for the size distribution, determining instead the size distribution with the least assumptions by applying the Bayesian/MaxEnt method. The Bayesian/MaxEnt method is tested using both simulated and experimental CeO$_{2}$ data. The results demonstrate that the proposed method can determine size distributions, while making the least number of assumptions. The comparison of the Bayesian/MaxEnt results from experimental CeO$_2$ with TEM results is favorable
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. PMID:25404574
Emery, A. F.; Valenti, E.; Bardot, D.
2007-01-01
Parameter estimation is generally based upon the maximum likelihood approach and often involves regularization. Typically it is desired that the results be unbiased and of minimum variance. However, it is often better to accept biased estimates that have minimum mean square error. Bayesian inference is an attractive approach that achieves this goal and incorporates regularization automatically. More importantly, it permits us to analyse experiments in which both the system response and the independent variables (time, sensor position, experimental conditions, etc) are corrupted by noise and in which the model includes nuisance variables. This paper describes the use of Bayesian inference for an apparently simple experiment which is, in fact, fundamentally difficult and is compounded by a nuisance variable. By presenting this analysis we hope that members of the inverse community will see the value of applying Bayesian inference.
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Ge, L.; Asseldonk, van, N.; Valeeva, N.I.; Hennen, W.H.G.J.; Bergevoet, R.H.M.
2011-01-01
Efficient policy intervention to reduce antibiotic use in livestock production requires knowledge about the rationale underlying antibiotic usage. Animal health status and management quality are considered the two most important factors that influence farmersâ¿¿ decision-making concerning antibiotic use. Information on these two factors is therefore crucial in designing incentive mechanisms. In this paper, a Bayesian belief network (BBN) is built to represent the knowledge on how these factor...
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Adaptive multiscale MCMC algorithm for uncertainty quantification in seismic parameter estimation
Tan, Xiaosi
2014-08-05
Formulating an inverse problem in a Bayesian framework has several major advantages (Sen and Stoffa, 1996). It allows finding multiple solutions subject to flexible a priori information and performing uncertainty quantification in the inverse problem. In this paper, we consider Bayesian inversion for the parameter estimation in seismic wave propagation. The Bayes\\' theorem allows writing the posterior distribution via the likelihood function and the prior distribution where the latter represents our prior knowledge about physical properties. One of the popular algorithms for sampling this posterior distribution is Markov chain Monte Carlo (MCMC), which involves making proposals and calculating their acceptance probabilities. However, for large-scale problems, MCMC is prohibitevely expensive as it requires many forward runs. In this paper, we propose a multilevel MCMC algorithm that employs multilevel forward simulations. Multilevel forward simulations are derived using Generalized Multiscale Finite Element Methods that we have proposed earlier (Efendiev et al., 2013a; Chung et al., 2013). Our overall Bayesian inversion approach provides a substantial speed-up both in the process of the sampling via preconditioning using approximate posteriors and the computation of the forward problems for different proposals by using the adaptive nature of multiscale methods. These aspects of the method are discussed n the paper. This paper is motivated by earlier work of M. Sen and his collaborators (Hong and Sen, 2007; Hong, 2008) who proposed the development of efficient MCMC techniques for seismic applications. In the paper, we present some preliminary numerical results.
Linden, Daniel W; Roloff, Gary J
2015-08-01
Pilot studies are often used to design short-term research projects and long-term ecological monitoring programs, but data are sometimes discarded when they do not match the eventual survey design. Bayesian hierarchical modeling provides a convenient framework for integrating multiple data sources while explicitly separating sample variation into observation and ecological state processes. Such an approach can better estimate state uncertainty and improve inferences from short-term studies in dynamic systems. We used a dynamic multistate occupancy model to estimate the probabilities of occurrence and nesting for white-headed woodpeckers Picoides albolarvatus in recent harvest units within managed forests of northern California, USA. Our objectives were to examine how occupancy states and state transitions were related to forest management practices, and how the probabilities changed over time. Using Gibbs variable selection, we made inferences using multiple model structures and generated model-averaged estimates. Probabilities of white-headed woodpecker occurrence and nesting were high in 2009 and 2010, and the probability that nesting persisted at a site was positively related to the snag density in harvest units. Prior-year nesting resulted in higher probabilities of subsequent occurrence and nesting. We demonstrate the benefit of forest management practices that increase the density of retained snags in harvest units for providing white-headed woodpecker nesting habitat. While including an additional year of data from our pilot study did not drastically alter management recommendations, it changed the interpretation of the mechanism behind the observed dynamics. Bayesian hierarchical modeling has the potential to maximize the utility of studies based on small sample sizes while fully accounting for measurement error and both estimation and model uncertainty, thereby improving the ability of observational data to inform conservation and management strategies
Bayesian inference for data assimilation using Least-Squares Finite Element methods
International Nuclear Information System (INIS)
It has recently been observed that Least-Squares Finite Element methods (LS-FEMs) can be used to assimilate experimental data into approximations of PDEs in a natural way, as shown by Heyes et al. in the case of incompressible Navier-Stokes flow. The approach was shown to be effective without regularization terms, and can handle substantial noise in the experimental data without filtering. Of great practical importance is that - unlike other data assimilation techniques - it is not significantly more expensive than a single physical simulation. However the method as presented so far in the literature is not set in the context of an inverse problem framework, so that for example the meaning of the final result is unclear. In this paper it is shown that the method can be interpreted as finding a maximum a posteriori (MAP) estimator in a Bayesian approach to data assimilation, with normally distributed observational noise, and a Bayesian prior based on an appropriate norm of the governing equations. In this setting the method may be seen to have several desirable properties: most importantly discretization and modelling error in the simulation code does not affect the solution in limit of complete experimental information, so these errors do not have to be modelled statistically. Also the Bayesian interpretation better justifies the choice of the method, and some useful generalizations become apparent. The technique is applied to incompressible Navier-Stokes flow in a pipe with added velocity data, where its effectiveness, robustness to noise, and application to inverse problems is demonstrated.
DEFF Research Database (Denmark)
Møller, Jesper; Rasmussen, Jakob Gulddahl
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model, i.e. each new point is generated given the previous points. Under this model...... previous points is such that the dependent cluster point is likely to occur closely to a previous cluster point. We demonstrate the flexibility of the model for producing point patterns with linear structures, and propose to use the model as the likelihood in a Bayesian setting when analyzing a spatial...
Directory of Open Access Journals (Sweden)
LiMin Wang
2013-01-01
Full Text Available The problem of extracting knowledge from a relational database for probabilistic reasoning is still unsolved. On the basis of a three-phase learning framework, we propose the integration of a Bayesian network (BN with the functional dependency (FD discovery technique. Association rule analysis is employed to discover FDs and expert knowledge encoded within a BN; that is, key relationships between attributes are emphasized. Moreover, the BN can be updated by using an expert-driven annotation process wherein redundant nodes and edges are removed. Experimental results show the effectiveness and efficiency of the proposed approach.
Martin, Summer L; Stohs, Stephen M; Moore, Jeffrey E
2015-03-01
Fisheries bycatch is a global threat to marine megafauna. Environmental laws require bycatch assessment for protected species, but this is difficult when bycatch is rare. Low bycatch rates, combined with low observer coverage, may lead to biased, imprecise estimates when using standard ratio estimators. Bayesian model-based approaches incorporate uncertainty, produce less volatile estimates, and enable probabilistic evaluation of estimates relative to management thresholds. Here, we demonstrate a pragmatic decision-making process that uses Bayesian model-based inferences to estimate the probability of exceeding management thresholds for bycatch in fisheries with model rates of rare-event bycatch and mortality using Bayesian Markov chain Monte Carlo estimation methods and 20 years of observer data; (2) predict unobserved counts of bycatch and mortality; (3) infer expected annual mortality; (4) determine probabilities of mortality exceeding regulatory thresholds; and (5) classify the fishery as having low, medium, or high bycatch impact using those probabilities. We focused on leatherback sea turtles (Dermochelys coriacea) and humpback whales (Megaptera novaeangliae). Candidate models included Poisson or zero-inflated Poisson likelihood, fishing effort, and a bycatch rate that varied with area, time, or regulatory regime. Regulatory regime had the strongest effect on leatherback bycatch, with the highest levels occurring prior to a regulatory change. Area had the strongest effect on humpback bycatch. Cumulative bycatch estimates for the 20-year period were 104-242 leatherbacks (52-153 deaths) and 6-50 humpbacks (0-21 deaths). The probability of exceeding a regulatory threshold under the U.S. Marine Mammal Protection Act (Potential Biological Removal, PBR) of 0.113 humpback deaths was 0.58, warranting a "medium bycatch impact" classification of the fishery. No PBR thresholds exist for leatherbacks, but the probability of exceeding an anticipated level of two deaths
Propriety conditions for the Bayesian autologistic model – Inference for histone modifications
Mitra, Riten; Müller, Peter; Ji, Yuan
2013-01-01
Motivated by inference for a set of histone modifications we consider an improper prior for an autologistic model. We state sufficient conditions for posterior propriety under a constant prior on the coefficients of an autologistic model. We use known results for a multinomial logistic regression to prove posterior propriety under the autologistic model. The conditions are easily verified.
Bayesian Inference in Hidden Markov Random Fields for Binary Data Defined on Large Lattices
Friel, N.; Pettitt, A.N.; Reeves, R.; Wit, E.
2009-01-01
Hidden Markov random fields represent a complex hierarchical model, where the hidden latent process is an undirected graphical structure. Performing inference for such models is difficult primarily because the likelihood of the hidden states is often unavailable. The main contribution of this articl
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error-situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Blanc, Guillermo A; Vogt, Frédéric P A; Dopita, Michael A
2014-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of HII regions and star-forming galaxies using strong nebular emission lines (SEL). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photo-ionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics the method is flexible and not tied to a particular photo-ionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extra-galactic HII regions we assess the performance of commonly used SEL abundance diagnostics. W...
Inversion of hierarchical Bayesian models using Gaussian processes.
Lomakina, Ekaterina I; Paliwal, Saee; Diaconescu, Andreea O; Brodersen, Kay H; Aponte, Eduardo A; Buhmann, Joachim M; Stephan, Klaas E
2015-09-01
Over the past decade, computational approaches to neuroimaging have increasingly made use of hierarchical Bayesian models (HBMs), either for inferring on physiological mechanisms underlying fMRI data (e.g., dynamic causal modelling, DCM) or for deriving computational trajectories (from behavioural data) which serve as regressors in general linear models. However, an unresolved problem is that standard methods for inverting the hierarchical Bayesian model are either very slow, e.g. Markov Chain Monte Carlo Methods (MCMC), or are vulnerable to local minima in non-convex optimisation problems, such as variational Bayes (VB). This article considers Gaussian process optimisation (GPO) as an alternative approach for global optimisation of sufficiently smooth and efficiently evaluable objective functions. GPO avoids being trapped in local extrema and can be computationally much more efficient than MCMC. Here, we examine the benefits of GPO for inverting HBMs commonly used in neuroimaging, including DCM for fMRI and the Hierarchical Gaussian Filter (HGF). Importantly, to achieve computational efficiency despite high-dimensional optimisation problems, we introduce a novel combination of GPO and local gradient-based search methods. The utility of this GPO implementation for DCM and HGF is evaluated against MCMC and VB, using both synthetic data from simulations and empirical data. Our results demonstrate that GPO provides parameter estimates with equivalent or better accuracy than the other techniques, but at a fraction of the computational cost required for MCMC. We anticipate that GPO will prove useful for robust and efficient inversion of high-dimensional and nonlinear models of neuroimaging data. PMID:26048619
MCMC with Strings and Branes: The Suburban Algorithm
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we introduce a general suite of Markov chain Monte Carlo (MCMC) "suburban samplers" (i.e., spread out Metropolis). The suburban algorithm involves an ensemble of statistical agents connected together by a random network. Performance of the collective in reaching a fast and accurate inference depends primarily on the average number of nearest neighbor connections. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthink" takes over, and the performance of the sampler declines.
Stochastic model updating utilizing Bayesian approach and Gaussian process model
Wan, Hua-Ping; Ren, Wei-Xin
2016-03-01
Stochastic model updating (SMU) has been increasingly applied in quantifying structural parameter uncertainty from responses variability. SMU for parameter uncertainty quantification refers to the problem of inverse uncertainty quantification (IUQ), which is a nontrivial task. Inverse problem solved with optimization usually brings about the issues of gradient computation, ill-conditionedness, and non-uniqueness. Moreover, the uncertainty present in response makes the inverse problem more complicated. In this study, Bayesian approach is adopted in SMU for parameter uncertainty quantification. The prominent strength of Bayesian approach for IUQ problem is that it solves IUQ problem in a straightforward manner, which enables it to avoid the previous issues. However, when applied to engineering structures that are modeled with a high-resolution finite element model (FEM), Bayesian approach is still computationally expensive since the commonly used Markov chain Monte Carlo (MCMC) method for Bayesian inference requires a large number of model runs to guarantee the convergence. Herein we reduce computational cost in two aspects. On the one hand, the fast-running Gaussian process model (GPM) is utilized to approximate the time-consuming high-resolution FEM. On the other hand, the advanced MCMC method using delayed rejection adaptive Metropolis (DRAM) algorithm that incorporates local adaptive strategy with global adaptive strategy is employed for Bayesian inference. In addition, we propose the use of the powerful variance-based global sensitivity analysis (GSA) in parameter selection to exclude non-influential parameters from calibration parameters, which yields a reduced-order model and thus further alleviates the computational burden. A simulated aluminum plate and a real-world complex cable-stayed pedestrian bridge are presented to illustrate the proposed framework and verify its feasibility.
BayesWave: Bayesian Inference for Gravitational Wave Bursts and Instrument Glitches
Cornish, Neil J
2014-01-01
A central challenge in Gravitational Wave Astronomy is identifying weak signals in the presence of non-stationary and non-Gaussian noise. The separation of gravitational wave signals from noise requires good models for both. When accurate signal models are available, such as for binary Neutron star systems, it is possible to make robust detection statements even when the noise is poorly understood. In contrast, searches for "un-modeled" transient signals are strongly impacted by the methods used to characterize the noise. Here we take a Bayesian approach and introduce a multi-component, variable dimension, parameterized noise model that explicitly accounts for non-stationarity and non-Gaussianity in data from interferometric gravitational wave detectors. Instrumental transients (glitches) and burst sources of gravitational waves are modeled using a Morlet-Gabor continuous wavelet basis. The number and placement of the wavelets is determined by a trans-dimensional Reversible Jump Markov Chain Monte Carlo algor...
Bayesian Inference for LISA Pathfinder using Markov Chain Monte Carlo Methods
Ferraioli, Luigi; Plagnol, Eric
2012-01-01
We present a parameter estimation procedure based on a Bayesian framework by applying a Markov Chain Monte Carlo algorithm to the calibration of the dynamical parameters of a space based gravitational wave detector. The method is based on the Metropolis-Hastings algorithm and a two-stage annealing treatment in order to ensure an effective exploration of the parameter space at the beginning of the chain. We compare two versions of the algorithm with an application to a LISA Pathfinder data analysis problem. The two algorithms share the same heating strategy but with one moving in coordinate directions using proposals from a multivariate Gaussian distribution, while the other uses the natural logarithm of some parameters and proposes jumps in the eigen-space of the Fisher Information matrix. The algorithm proposing jumps in the eigen-space of the Fisher Information matrix demonstrates a higher acceptance rate and a slightly better convergence towards the equilibrium parameter distributions in the application to...
Multi-Pitch Estimation and Tracking Using Bayesian Inference in Block Sparsity
DEFF Research Database (Denmark)
Karimian-Azari, Sam; Jakobsson, Andreas; Jensen, Jesper Rindom;
2015-01-01
In this paper, we consider the problem of multi-pitch estimation and tracking of an unknown number of harmonic audio sources. The regularized least-squares is a solution for simultaneous sparse source selection and parameter estimation. Exploiting block sparsity, the method allows for reliable...... tracking of the found sources, without posing detailed a priori assumptions of the number of harmonics for each source. The method incorporates a Bayesian prior and assigns data-dependent regularization coefficients to efficiently incorporate both earlier and future data blocks in the tracking of estimates....... In comparison with fix regularization coefficients, the simulation results, using both real and synthetic audio signals, confirm the performance of the proposed method....
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
Barentsen, Geert; Vink, J. S.; Drew, J. E.; Sale, S. E.
2013-03-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main-sequence ages, masses, accretion rates and extinctions to be estimated using two widely available photometric survey data bases (Isaac Newton Telescope Photometric Hα Survey r'/Hα/i' and Two Micron All Sky Survey J-band magnitudes). Because our approach does not rely on spectroscopy, it can easily be applied to ho-mogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov chain Monte Carlo method and provide PYTHON source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2264 (Cone Nebula), arriving at a median age of 3.0 Myr, an accretion fraction of 20 ± 2 per cent and a median accretion rate of 10-8.4 M⊙ yr-1. The Bayesian analysis formulated in this work delivers results which are in agreement with spectroscopic studies already in the literature, but achieves this with great efficiency by depending only on photometry. It is a significant step forward from previous photometric studies because the probabilistic approach ensures that nuisance parameters, such as extinction and distance, are fully included in the analysis with a clear picture on any degeneracies.
Bayesian large-scale structure inference: initial conditions and the cosmic web
Leclercq, Florent
2014-01-01
We describe an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the large-scale structure of the inhomogeneous Universe. Our algorithm explores the joint posterior distribution of the many millions of parameters involved via efficient Hamiltonian Markov Chain Monte Carlo sampling. We describe its application to the Sloan Digital Sky Survey data release 7 and an additional non-linear filtering step. We illustrate the use of our findings for cosmic web analysis: identification of structures via tidal shear analysis and inference of dark matter voids.
Zubillaga, María; Skewes, Oscar; Soto, Nicolás; Rabinovich, Jorge E.; Colchero, Fernando
2014-01-01
Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe) is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanaco/km2, with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of previous winter temperature while sheep density has a strong negative effect on the guanaco population growth. We conclude that there are significant density dependent processes and that climate as well as competition with domestic species have important effects determining the population size of guanacos, with important implications for management and conservation. PMID:25514510
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. PMID:22437758
Directory of Open Access Journals (Sweden)
María Zubillaga
Full Text Available Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanaco/km2, with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of previous winter temperature while sheep density has a strong negative effect on the guanaco population growth. We conclude that there are significant density dependent processes and that climate as well as competition with domestic species have important effects determining the population size of guanacos, with important implications for management and conservation.
Fajardo, Alvaro; Soñora, Martín; Moreno, Pilar; Moratorio, Gonzalo; Cristina, Juan
2016-10-01
Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May of 2016, the World Health Organization warns over spread of ZIKV beyond this region. Detailed studies on the mode of evolution of ZIKV strains are extremely important for our understanding of the emergence and spread of ZIKV populations. In order to gain insight into these matters, a Bayesian coalescent Markov Chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The results of these studies revealed a mean rate of evolution of 1.20 × 10(-3) nucleotide substitutions per site per year (s/s/y) for ZIKV strains enrolled in this study. Several variants isolated in China are grouped together with all strains isolated in Latin America. Another genetic group composed exclusively by Chinese strains were also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that other factors besides viral genetic differences may play a role for the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672-1676, 2016. © 2016 Wiley Periodicals, Inc. PMID:27278855
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Miki, Kenji [NASA Glenn Research Center, OAI, 22800 Cedar Point Rd, Cleveland, OH 44142 (United States); Panesi, Marco, E-mail: mpanesi@illinois.edu [Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, 306 Talbot Lab, 104 S. Wright St., Urbana, IL 61801 (United States); Prudhomme, Serge [Département de mathématiques et de génie industriel, Ecole Polytechnique de Montréal, C.P. 6079, succ. Centre-ville, Montréal, QC, H3C 3A7 (Canada)
2015-10-01
The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji
2015-10-01
© 2015 Elsevier Inc. The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Coping with Trial-to-Trial Variability of Event Related Signals: A Bayesian Inference Approach
Ding, Mingzhou; Chen, Youghong; Knuth, Kevin H.; Bressler, Steven L.; Schroeder, Charles E.
2005-01-01
In electro-neurophysiology, single-trial brain responses to a sensory stimulus or a motor act are commonly assumed to result from the linear superposition of a stereotypic event-related signal (e.g. the event-related potential or ERP) that is invariant across trials and some ongoing brain activity often referred to as noise. To extract the signal, one performs an ensemble average of the brain responses over many identical trials to attenuate the noise. To date, h s simple signal-plus-noise (SPN) model has been the dominant approach in cognitive neuroscience. Mounting empirical evidence has shown that the assumptions underlying this model may be overly simplistic. More realistic models have been proposed that account for the trial-to-trial variability of the event-related signal as well as the possibility of multiple differentially varying components within a given ERP waveform. The variable-signal-plus-noise (VSPN) model, which has been demonstrated to provide the foundation for separation and characterization of multiple differentially varying components, has the potential to provide a rich source of information for questions related to neural functions that complement the SPN model. Thus, being able to estimate the amplitude and latency of each ERP component on a trial-by-trial basis provides a critical link between the perceived benefits of the VSPN model and its many concrete applications. In this paper we describe a Bayesian approach to deal with this issue and the resulting strategy is referred to as the differentially Variable Component Analysis (dVCA). We compare the performance of dVCA on simulated data with Independent Component Analysis (ICA) and analyze neurobiological recordings from monkeys performing cognitive tasks.
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Ghattas, Omar [The University of Texas at Austin
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Dead or gone? Bayesian inference on mortality for the dispersing sex.
Barthold, Julia A; Packer, Craig; Loveridge, Andrew J; Macdonald, David W; Colchero, Fernando
2016-07-01
Estimates of age-specific mortality are regularly used in ecology, evolution, and conservation research. However, estimating mortality of the dispersing sex, in species where one sex undergoes natal dispersal, is difficult. This is because it is often unclear whether members of the dispersing sex that disappear from monitored areas have died or dispersed. Here, we develop an extension of a multievent model that imputes dispersal state (i.e., died or dispersed) for uncertain records of the dispersing sex as a latent state and estimates age-specific mortality and dispersal parameters in a Bayesian hierarchical framework. To check the performance of our model, we first conduct a simulation study. We then apply our model to a long-term data set of African lions. Using these data, we further study how well our model estimates mortality of the dispersing sex by incrementally reducing the level of uncertainty in the records of male lions. We achieve this by taking advantage of an expert's indication on the likely fate of each missing male (i.e., likely died or dispersed). We find that our model produces accurate mortality estimates for simulated data of varying sample sizes and proportions of uncertain male records. From the empirical study, we learned that our model provides similar mortality estimates for different levels of uncertainty in records. However, a sensitivity of the mortality estimates to varying uncertainty is, as can be expected, detectable. We conclude that our model provides a solution to the challenge of estimating mortality of the dispersing sex in species with data deficiency due to natal dispersal. Given the utility of sex-specific mortality estimates in biological and conservation research, and the virtual ubiquity of sex-biased dispersal, our model will be useful to a wide variety of applications. PMID:27547322
Bayesian inference of baseline fertility and treatment effects via a crop yield-fertility model.
Chen, Hungyen; Yamagishi, Junko; Kishino, Hirohisa
2014-01-01
To effectively manage soil fertility, knowledge is needed of how a crop uses nutrients from fertilizer applied to the soil. Soil quality is a combination of biological, chemical and physical properties and is hard to assess directly because of collective and multiple functional effects. In this paper, we focus on the application of these concepts to agriculture. We define the baseline fertility of soil as the level of fertility that a crop can acquire for growth from the soil. With this strict definition, we propose a new crop yield-fertility model that enables quantification of the process of improving baseline fertility and the effects of treatments solely from the time series of crop yields. The model was modified from Michaelis-Menten kinetics and measured the additional effects of the treatments given the baseline fertility. Using more than 30 years of experimental data, we used the Bayesian framework to estimate the improvements in baseline fertility and the effects of fertilizer and farmyard manure (FYM) on maize (Zea mays), barley (Hordeum vulgare), and soybean (Glycine max) yields. Fertilizer contributed the most to the barley yield and FYM contributed the most to the soybean yield among the three crops. The baseline fertility of the subsurface soil was very low for maize and barley prior to fertilization. In contrast, the baseline fertility in this soil approximated half-saturated fertility for the soybean crop. The long-term soil fertility was increased by adding FYM, but the effect of FYM addition was reduced by the addition of fertilizer. Our results provide evidence that long-term soil fertility under continuous farming was maintained, or increased, by the application of natural nutrients compared with the application of synthetic fertilizer. PMID:25405353
Akutekwe, Arinze; Seker, Huseyin
2015-08-01
Comprehensive understanding of gene regulatory networks (GRNs) is a major challenge in systems biology. Most methods for modeling and inferring the dynamics of GRNs, such as those based on state space models, vector autoregressive models and G1DBN algorithm, assume linear dependencies among genes. However, this strong assumption does not make for true representation of time-course relationships across the genes, which are inherently nonlinear. Nonlinear modeling methods such as the S-systems and causal structure identification (CSI) have been proposed, but are known to be statistically inefficient and analytically intractable in high dimensions. To overcome these limitations, we propose an optimized ensemble approach based on support vector regression (SVR) and dynamic Bayesian networks (DBNs). The method called SVR-DBN, uses nonlinear kernels of the SVR to infer the temporal relationships among genes within the DBN framework. The two-stage ensemble is further improved by SVR parameter optimization using Particle Swarm Optimization. Results on eight insilico-generated datasets, and two real world datasets of Drosophila Melanogaster and Escherichia Coli, show that our method outperformed the G1DBN algorithm by a total average accuracy of 12%. We further applied our method to model the time-course relationships of ovarian carcinoma. From our results, four hub genes were discovered. Stratified analysis further showed that the expression levels Prostrate differentiation factor and BTG family member 2 genes, were significantly increased by the cisplatin and oxaliplatin platinum drugs; while expression levels of Polo-like kinase and Cyclin B1 genes, were both decreased by the platinum drugs. These hub genes might be potential biomarkers for ovarian carcinoma. PMID:26738192
Pardo, Mario A; Gerrodette, Tim; Beier, Emilio; Gendron, Diane; Forney, Karin A; Chivers, Susan J; Barlow, Jay; Palacios, Daniel M
2015-01-01
We inferred the population densities of blue whales (Balaenoptera musculus) and short-beaked common dolphins (Delphinus delphis) in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT). Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge). Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more southern portion of the
Directory of Open Access Journals (Sweden)
Mario A Pardo
Full Text Available We inferred the population densities of blue whales (Balaenoptera musculus and short-beaked common dolphins (Delphinus delphis in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT. Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge. Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more
A Bayesian Predictive Discriminant Analysis with Screened Data
Directory of Open Access Journals (Sweden)
Hea-Jung Kim
2015-09-01
Full Text Available In the application of discriminant analysis, a situation sometimes arises where individual measurements are screened by a multidimensional screening scheme. For this situation, a discriminant analysis with screened populations is considered from a Bayesian viewpoint, and an optimal predictive rule for the analysis is proposed. In order to establish a flexible method to incorporate the prior information of the screening mechanism, we propose a hierarchical screened scale mixture of normal (HSSMN model, which makes provision for flexible modeling of the screened observations. An Markov chain Monte Carlo (MCMC method using the Gibbs sampler and the Metropolis–Hastings algorithm within the Gibbs sampler is used to perform a Bayesian inference on the HSSMN models and to approximate the optimal predictive rule. A simulation study is given to demonstrate the performance of the proposed predictive discrimination procedure.
Sobradelo, Rosa; Martí, Joan
2015-01-01
One of the most challenging aspects of managing a volcanic crisis is the interpretation of the monitoring data, so as to anticipate to the evolution of the unrest and implement timely mitigation actions. An unrest episode may include different stages or time intervals of increasing activity that may or may not precede a volcanic eruption, depending on the causes of the unrest (magmatic, geothermal or tectonic). Therefore, one of the main goals in monitoring volcanic unrest is to forecast whether or not such increase of activity will end up with an eruption, and if this is the case, how, when, and where this eruption will take place. As an alternative method to expert elicitation for assessing and merging monitoring data and relevant past information, we present a probabilistic method to transform precursory activity into the probability of experiencing a significant variation by the next time interval (i.e. the next step in the unrest), given its preceding evolution, and by further estimating the probability of the occurrence of a particular eruptive scenario combining monitoring and past data. With the 1991 Pinatubo volcanic crisis as a reference, we have developed such a method to assess short-term volcanic hazard using Bayesian inference.
Ata, Metin; Kitaura, Francisco-Shu; Müller, Volker
2015-02-01
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from halo catalogues. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motions for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution function of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the ARGO code. We have tested our algorithm with the Bolshoi N-body simulation at redshift z = 0, inferring the underlying dark matter density field from subsamples of the halo catalogue with biases smaller and larger than one. Our method shows that we can draw closely unbiased samples (compatible within 1-σ) from the posterior distribution up to scales of about k ˜ 1 h Mpc-1 in terms of power-spectra and cell-to-cell correlations. We find that a Poisson likelihood including a scale-dependent non-linear deterministic bias can yield reconstructions with power spectra deviating more than 10 per cent at k = 0.2 h Mpc-1. Our reconstruction algorithm is especially suited for emission line galaxy data for which a complex non-linear stochastic biasing treatment beyond Poissonity becomes indispensable.
Ata, Metin; Müller, Volker
2014-01-01
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from galaxy redshift data. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motions for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution function of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the Argo code. We have tested our algorithm with the Bolshoi N-body simulation, inferring the underlying dark matter density field from a subsample of the halo catalogue. Our method shows that we can draw closely unbiased samples (compatible within 1-$\\sigma$) from the posterior distribution up to scales of about k~1 h/Mpc in terms of power-spectra and cell-to-cell correlations. We find that a Poisson likelihood yields reconstructions with p...
International Nuclear Information System (INIS)
A novel approach is presented in this paper for improving anisotropic diffusion PDE models, based on the Perona–Malik equation. A solution is proposed from an engineering perspective to adaptively estimate the parameters of the regularizing function in this equation. The goal of such a new adaptive diffusion scheme is to better preserve edges when the anisotropic diffusion PDE models are applied to image enhancement tasks. The proposed adaptive parameter estimation in the anisotropic diffusion PDE model involves self-organizing maps and Bayesian inference to define edge probabilities accurately. The proposed modifications attempt to capture not only simple edges but also difficult textural edges and incorporate their probability in the anisotropic diffusion model. In the context of the application of PDE models to image processing such adaptive schemes are closely related to the discrete image representation problem and the investigation of more suitable discretization algorithms using constraints derived from image processing theory. The proposed adaptive anisotropic diffusion model illustrates these concepts when it is numerically approximated by various discretization schemes in a database of magnetic resonance images (MRI), where it is shown to be efficient in image filtering and restoration applications
D'Agostini, G
2010-01-01
Triggered by a recent interesting New Scientist article on the too frequent incorrect use of probabilistic evidence in courts, I introduce the basic concepts of probabilistic inference with a toy model, and discuss several important issues that need to be understood in order to extend the basic reasoning to real life cases. In particular, I emphasize the often neglected point that degrees of beliefs are updated not by `bare facts' alone, but by all available information pertaining to them, including how they have been acquired. In this light I show that, contrary to what claimed in that article, there was no "probabilistic pitfall" in the Columbo's episode pointed as example of "bad mathematics" yielding "rough justice". Instead, such a criticism could have a `negative reaction' to the article itself and to the use of Bayesian reasoning in courts, as well as in all other places in which probabilities need to be assessed and decisions need to be made. Anyway, besides introductory/recreational aspects, the pape...
Sobradelo, Rosa; Bartolini, Stefania; Martí, Joan
2014-05-01
Event tree structures constitute one of the most useful and necessary tools of modern volcanology to assess the volcanic hazard of future volcanic scenarios. They are particularly relevant to evaluate long- and short-term probabilities of occurrence of possible volcanic scenarios and their potential impacts on urbanized areas. Here we introduce HASSET, a Hazard Assessment Event Tree probability tool, built on an event tree structure that uses Bayesian inference to estimate the probability of occurrence of a future volcanic scenario, and to evaluate the most relevant sources of uncertainty from the corresponding volcanic system. HASSET includes hazard assessment of non-eruptive and non-magmatic volcanic scenarios, that is, episodes of unrest that do not evolve into volcanic eruption but have an associated volcanic hazard (eg. sector collapse and phreatic explosion), as well as those with external triggers as primary sources of unrest (as opposed to magmatic unrest alone). Additionally, HASSET introduces the Delta method to assess how precise the probability estimates are, by reporting a one standard deviation variability interval around the expected value for each scenario. HASSET is presented as a free software package in the form of a plugin for the open source geographic information system Quantum Gis (QGIS), providing a graphically supported computation of the event tree structure in an interactive and user-friendly way. We also include an example of HASSET applied to Teide-Pico Viejo volcanic complex (Spain).
Directory of Open Access Journals (Sweden)
Heringstad Bjørg
2010-07-01
Full Text Available Abstract Background In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (covariance components within an animal threshold model framework. Methods In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (covariance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviance from the mid-parent mean being completely confounded with a single residual on the underlying liability scale. For threshold models, residual variance on the underlying scale is not identifiable. Hence, variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full relationship matrix, but genetic (covariance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents. Results When applied to simulated data sets, the standard animal threshold model failed to produce useful results since samples of genetic variance always drifted towards infinity, while the new algorithm produced proper parameter estimates essentially identical to the results from a sire-dam model (given the fact that no individual records exist for the parents. Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to
Marquez, Maria Jose
2012-01-01
Calibration is nowadays one of the most important processes involved in the extraction of valuable data from measurements. The current availability of an optimum data cube measured from a heterogeneous set of instruments and surveys relies on a systematic and robust approach in the corresponding measurement analysis. In that sense, the inference of configurable instrument parameters can considerably increase the quality of the data obtained. This paper proposes a solution based on Bayesian in...
DeLannoy, Gabrielle J. M.; Reichle, Rolf H.; Vrugt, Jasper A.
2013-01-01
Uncertainties in L-band (1.4 GHz) radiative transfer modeling (RTM) affect the simulation of brightness temperatures (Tb) over land and the inversion of satellite-observed Tb into soil moisture retrievals. In particular, accurate estimates of the microwave soil roughness, vegetation opacity and scattering albedo for large-scale applications are difficult to obtain from field studies and often lack an uncertainty estimate. Here, a Markov Chain Monte Carlo (MCMC) simulation method is used to determine satellite-scale estimates of RTM parameters and their posterior uncertainty by minimizing the misfit between long-term averages and standard deviations of simulated and observed Tb at a range of incidence angles, at horizontal and vertical polarization, and for morning and evening overpasses. Tb simulations are generated with the Goddard Earth Observing System (GEOS-5) and confronted with Tb observations from the Soil Moisture Ocean Salinity (SMOS) mission. The MCMC algorithm suggests that the relative uncertainty of the RTM parameter estimates is typically less than 25 of the maximum a posteriori density (MAP) parameter value. Furthermore, the actual root-mean-square-differences in long-term Tb averages and standard deviations are found consistent with the respective estimated total simulation and observation error standard deviations of m3.1K and s2.4K. It is also shown that the MAP parameter values estimated through MCMC simulation are in close agreement with those obtained with Particle Swarm Optimization (PSO).
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA approach proposed by Rue, Martino, and Chopin (2009 is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindstrm 2011, one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial mod- els, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Bayesian inference of a lake water quality model by emulating its posterior density
Dietzel, A.; Reichert, P.
2014-10-01
We use a Gaussian stochastic process emulator to interpolate the posterior probability density of a computationally demanding application of the biogeochemical-ecological lake model BELAMO to accelerate statistical inference of deterministic model and error model parameters. The deterministic model consists of a mechanistic description of key processes influencing the mass balance of nutrients, dissolved oxygen, organic particles, and phytoplankton and zooplankton in the lake. This model is complemented by a Gaussian stochastic process to describe the remaining model bias and by Normal, independent observation errors. A small subsample of the Markov chain representing the posterior of the model parameters is propagated through the full model to get model predictions and uncertainty estimates. We expect this approximation to be more accurate at only slightly higher computational costs compared to using a Normal approximation to the posterior probability density and linear error propagation to the results as we did in an earlier paper. The performance of the two techniques is compared for a didactical example as well as for the lake model. As expected, for the didactical example, the use of the emulator led to posterior marginals of the model parameters that are closer to those calculated by Markov chain simulation using the full model than those based on the Normal approximation. For the lake model, the new technique proved applicable without an excessive increase in computational requirements, but we faced challenges in the choice of the design data set for emulator calibration. As the posterior is a scalar function of the parameters, the suggested technique is an alternative to the emulation of a potentially more complex, structured output of the simulation model that allows for the use of a less case-specific emulator. This is at the cost that still the full model has to be used for prediction (which can be done with a smaller, approximately independent subsample
Equifinality of formal (DREAM) and informal (GLUE) bayesian approaches in hydrologic modeling?
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Robinson, Bruce A [Los Alamos National Laboratory; Ter Braak, Cajo J F [NON LANL; Gupta, Hoshin V [NON LANL
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Boers, Niklas; Goswami, Bedartha; Chekroun, Mickael; Svensson, Anders; Rousseau, Denis-Didier; Ghil, Michael
2016-04-01
In the recent past, empirical stochastic models have been successfully applied to model a wide range of climatic phenomena [1,2]. In addition to enhancing our understanding of the geophysical systems under consideration, multilayer stochastic models (MSMs) have been shown to be solidly grounded in the Mori-Zwanzig formalism of statistical physics [3]. They are also well-suited for predictive purposes, e.g., for the El Niño Southern Oscillation [4] and the Madden-Julian Oscillation [5]. In general, these models are trained on a given time series under consideration, and then assumed to reproduce certain dynamical properties of the underlying natural system. Most existing approaches are based on least-squares fitting to determine optimal model parameters, which does not allow for an uncertainty estimation of these parameters. This approach significantly limits the degree to which dynamical characteristics of the time series can be safely inferred from the model. Here, we are specifically interested in fitting low-dimensional stochastic models to time series obtained from paleoclimatic proxy records, such as the oxygen isotope ratio and dust concentration of the NGRIP record [6]. The time series derived from these records exhibit substantial dating uncertainties, in addition to the proxy measurement errors. In particular, for time series of this kind, it is crucial to obtain uncertainty estimates for the final model parameters. Following [7], we first propose a statistical procedure to shift dating uncertainties from the time axis to the proxy axis of layer-counted paleoclimatic records. Thereafter, we show how Maximum Likelihood Estimation in combination with Markov Chain Monte Carlo parameter sampling can be employed to translate all uncertainties present in the original proxy time series to uncertainties of the parameter estimates of the stochastic model. We compare time series simulated by the empirical model to the original time series in terms of standard
Sraj, Ihab
2014-11-01
Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning\\'s n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning\\'s n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the G. eoC. law model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte-Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey, PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
Bayesian MCMC estimation of the rose of directions
Czech Academy of Sciences Publication Activity Database
Prokešová, Michaela
2003-01-01
Roč. 39, č. 6 (2003), s. 703-718. ISSN 0023-5954 R&D Projects: GA AV ČR IAA1075201; GA ČR GA201/03/0946 Institutional research plan: CEZ:AV0Z1075907 Keywords : rose of directions * space of compact sets Subject RIV: BA - General Mathematics Impact factor: 0.319, year: 2003
MCMC and variational approaches for Bayesian inversion in diffraction imaging
Ayasso, Hacheme; Duchêne, Bernard; Mohammad-Djafari, Ali
2015-01-01
International audience The term “diffraction imaging” is meant, herein, in the sense of an “inverse scattering problem” where the goal is to build up an image of an unknown object from measurements of the scattered field that results from its interaction with a known probing wave. This type of problem occurs in many imaging and non-destructive testing applications. It corresponds to the situation where looking for a good trade-off between the image resolution and the penetration of the inc...
A Very Simple Safe-Bayesian Random Forest.
Quadrianto, Novi; Ghahramani, Zoubin
2015-06-01
Random forests works by averaging several predictions of de-correlated trees. We show a conceptually radical approach to generate a random forest: random sampling of many trees from a prior distribution, and subsequently performing a weighted ensemble of predictive probabilities. Our approach uses priors that allow sampling of decision trees even before looking at the data, and a power likelihood that explores the space spanned by combination of decision trees. While each tree performs Bayesian inference to compute its predictions, our aggregation procedure uses the power likelihood rather than the likelihood and is therefore strictly speaking not Bayesian. Nonetheless, we refer to it as a Bayesian random forest but with a built-in safety. The safeness comes as it has good predictive performance even if the underlying probabilistic model is wrong. We demonstrate empirically that our Safe-Bayesian random forest outperforms MCMC or SMC based Bayesian decision trees in term of speed and accuracy, and achieves competitive performance to entropy or Gini optimised random forest, yet is very simple to construct. PMID:26357350
Marquez, Maria Jose
2012-01-01
Calibration is nowadays one of the most important processes involved in the extraction of valuable data from measurements. The current availability of an optimum data cube measured from a heterogeneous set of instruments and surveys relies on a systematic and robust approach in the corresponding measurement analysis. In that sense, the inference of configurable instrument parameters can considerably increase the quality of the data obtained. This paper proposes a solution based on Bayesian inference for the estimation of the configurable parameters relevant to the signal to noise ratio. The information obtained by the resolution of this problem can be handled in a very useful way if it is considered as part of an adaptive loop for the overall measurement strategy, in such a way that the outcome of this parametric inference leads to an increase in the knowledge of a model comparison problem in the context of the measurement interpretation. The context of this problem is the multi-wavelength measurements coming...
Griffiths, Thomas L.; Tenenbaum, Joshua B.
2011-01-01
Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…
Vehicle Trajectory Estimation Using Spatio-Temporal MCMC
Directory of Open Access Journals (Sweden)
Francois Bardet
2010-01-01
Full Text Available This paper presents an algorithm for modeling and tracking vehicles in video sequences within one integrated framework. Most of the solutions are based on sequential methods that make inference according to current information. In contrast, we propose a deferred logical inference method that makes a decision according to a sequence of observations, thus processing a spatio-temporal search on the whole trajectory. One of the drawbacks of deferred logical inference methods is that the solution space of hypotheses grows exponentially related to the depth of observation. Our approach takes into account both the kinematic model of the vehicle and a driver behavior model in order to reduce the space of the solutions. The resulting proposed state model explains the trajectory with only 11 parameters. The solution space is then sampled with a Markov Chain Monte Carlo (MCMC that uses a model-driven proposal distribution in order to control random walk behavior. We demonstrate our method on real video sequences from which we have ground truth provided by a RTK GPS (Real-Time Kinematic GPS. Experimental results show that the proposed algorithm outperforms a sequential inference solution (particle filter.
International Nuclear Information System (INIS)
The estimation of lower atmospheric refractivity from radar sea clutter (RFC) is a complicated nonlinear optimization problem. This paper deals with the RFC problem in a Bayesian framework. It uses the unbiased Markov Chain Monte Carlo (MCMC) sampling technique, which can provide accurate posterior probability distributions of the estimated refractivity parameters by using an electromagnetic split-step fast Fourier transform terrain parabolic equation propagation model within a Bayesian inversion framework. In contrast to the global optimization algorithm, the Bayesian—MCMC can obtain not only the approximate solutions, but also the probability distributions of the solutions, that is, uncertainty analyses of solutions. The Bayesian—MCMC algorithm is implemented on the simulation radar sea-clutter data and the real radar sea-clutter data. Reference data are assumed to be simulation data and refractivity profiles are obtained using a helicopter. The inversion algorithm is assessed (i) by comparing the estimated refractivity profiles from the assumed simulation and the helicopter sounding data; (ii) the one-dimensional (1D) and two-dimensional (2D) posterior probability distribution of solutions. (geophysics, astronomy, and astrophysics)
Intracluster Moves for Constrained Discrete-Space MCMC
Hamze, Firas
2012-01-01
This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution of the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of active variables, say the number of 1's. We motivate the need for this algorithm with examples from statistical physics and probabilistic inference. Unlike previous algorithms proposed to sample from binary distributions with these constraints, the new algorithm allows for large moves in state space and tends to propose them such that they are energetically favourable. The algorithm is demonstrated on three Boltzmann machines of varying difficulty: A ferromagnetic Ising model (with positive potentials), a restricted Boltzmann machine with learned Gabor-like filters as potentials, and a challenging three-dimensional spi...
Christley, Scott; Emr, Bryanna; Ghosh, Auyon; Satalin, Josh; Gatto, Louis; Vodovotz, Yoram; Nieman, Gary F.; An, Gary
2013-06-01
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
International Nuclear Information System (INIS)
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
MCMC Analysis of biases in the interpretation of disk galaxy kinematics
Aquino-Ortíz, E.; Valenzuela, O.; Cano-Díaz, M.; Sánchez-Sánchez, S. F.; Hernández-Toledo, H.
2016-06-01
The new generation of galaxy surveys like SAMI, CALIFA and MaNGA opens up the possibility of studying simultaneously properties of galaxies such as spiral arms, bars, disk geometry and orientation, stellar and gas mass distribution, 2D kinematics, etc. The previous task involves exploring a complicated multi-dimensional parameter space. Puglielli et al. (2010) introduced Bayesian statistics and MCMC (Monte Carlo Markov Chain) techniques to construct dynamical models of spiral galaxies. In our study we used synthetic velocity fields that include non-circular motions and assume different disk orientations in order to produce mock observations. We apply popular reconstruction techniques in order to estimate the geometrical disk parameters, systemic velocities, rotation curve shape and maximum circular velocity which are crucial to construct the scaling relations. We conclude that a detailed analysis of kinematics in galaxies using MCMC technique will be reflected in accurate estimations of galaxy properties and more robust scalings relations, otherwise physical conclusions may be importantly biased.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-01-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400--407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305--320]. The application of the trajectory averaging estimator to other stochastic approximation MCMC algorithms, for example, a stochastic approximation MLE al...
Bayesian Concordance Correlation Coefficient with Application to Repeatedly Measured Data
Directory of Open Access Journals (Sweden)
Atanu BHATTACHARJEE
2015-10-01
Full Text Available Objective: In medical research, Lin's classical concordance correlation coefficient (CCC is frequently applied to evaluate the similarity of the measurements produced by different raters or methods on the same subjects. It is particularly useful for continuous data. The objective of this paper is to propose the Bayesian counterpart to compute CCC for continuous data. Material and Methods: A total of 33 patients of astrocytoma brain treated in the Department of Radiation Oncology at Malabar Cancer Centre is enrolled in this work. It is a continuous data of tumor volume and tumor size repeatedly measured during baseline pretreatment workup and post surgery follow-ups for all patients. The tumor volume and tumor size are measured separately by MRI and CT scan. The agreement of measurement between MRI and CT scan is calculated through CCC. The statistical inference is performed through Markov Chain Monte Carlo (MCMC technique. Results: Bayesian CCC is found suitable to get prominent evidence for test statistics to explore the relation between concordance measurements. The posterior mean estimates and 95% credible interval of CCC on tumor size and tumor volume are observed with 0.96(0.87,0.99 and 0.98(0.95,0.99 respectively. Conclusion: The Bayesian inference is adopted for development of the computational algorithm. The approach illustrated in this work provides the researchers an opportunity to find out the most appropriate model for specific data and apply CCC to fulfill the desired hypothesis.
International Nuclear Information System (INIS)
We present a systematic implementation of the recently developed Z-method for computing melting points of solids, augmented by a Bayesian analysis of the data obtained from molecular dynamics simulations. The use of Bayesian inference allows us to extract valuable information from limited data, reducing the computational cost of drawing the isochoric curve. From this Bayesian Z-method we obtain posterior distributions for the melting temperature Tm, the critical superheating temperature TLS and the slopes dT/dE of the liquid and solid phases. The method therefore gives full quantification of the errors in the prediction of the melting point. This procedure is applied to the estimation of the melting point of Ti2GaN (one of the so-called MAX phases), a complex, laminar material, by density functional theory molecular dynamics, finding an estimate Tm of 2591.61 ± 89.61 K, which is in good agreement with melting points of similar ceramics. (paper)
Inferring Genotype of DNA Molecular Marker by Bayesian Theorem%应用贝叶斯理论推断DNA分子标记基因型
Institute of Scientific and Technical Information of China (English)
莫惠栋; 姜长鉴
2002-01-01
引入贝叶斯理论用以从DNA分子标记的表现型(电泳谱带)推断其基因型(DNA来源).结果表明,根据标记座位独立假定而确定的遗传信息不完全标记的基因型概率,与根据邻近的遗传信息完全标记的基因型和有关重组率算得的相应贝叶斯概率,通常都有很大的差异.所以在进行数量性状基因定位和标记辅助选择等工作之前,应当计算每一个体基因组上所有遗传信息不完全座位的有关基因型的贝叶斯概率.文中列出计算未知基因型的贝叶斯概率的详细过程,也讨论了贝叶斯概率的若干推广应用.%Bayesian theorem is applied to infer the DNA molecular marker genotype(DNA chain type) from its phenotype (electrophoresis band type). The results indicated that large differences often present in the genotype probability of a molecular marker with incomplete genetic information when it is obtained from the assumption of independence among markers as compared with that inferred from the genotypes of the flanking markers with the complete genetic information and the recombination fractions among them based on the Bayesian theorem. Therefore, before utilizing the marker information, such as in mapping quantitative trait loci (QTL), marker assisted selection (MAS) etc., Bayesian probability of the genotype for all markers with incomplete genetic information must be calculated over the whole genome for every individual. This study provides detailed procedure for the calculation of the Bayesian probability of the unknown genotype. Several extensions were also discussed for the application of the Bayesian theorem.
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops -- the Maximum Entropy and the Bayesian methods -- into a single general inference scheme.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
Olmi, L; Elia, D; Molinari, S; Pestalozzi, M; Pezzuto, S; Schisano, E; Testi, L; Thompson, M
2013-01-01
Context. Stars form in dense, dusty clumps of molecular clouds, but little is known about their origin, their evolution and their detailed physical properties. In particular, the relationship between the mass distribution of these clumps (also known as the "clump mass function", or CMF) and the stellar initial mass function (IMF), is still poorly understood. Aims. In order to better understand how the CMF evolve toward the IMF, and to discern the "true" shape of the CMF, large samples of bona-fide pre- and proto-stellar clumps are required. Two such datasets obtained from the Herschel infrared GALactic Plane Survey (Hi-GAL) have been described in paper I. Robust statistical methods are needed in order to infer the parameters describing the models used to fit the CMF, and to compare the competing models themselves. Methods. In this paper we apply Bayesian inference to the analysis of the CMF of the two regions discussed in Paper I. First, we determine the Bayesian posterior probability distribution for each of...
Pujolar, J M; Zane, L; Congiu, L
2012-06-01
The aim of our study is to examine the phylogenetic relationship, divergence times and demographic history of the five close-related Mediterranean and North-eastern Atlantic species/forms of Atherina using the full Bayesian framework for species tree estimation recently implemented in ∗BEAST. The inference is made possible by multilocus data using three mitochondrial genes (12S rRNA, 16S rRNA, control region) and one nuclear gene (rhodopsin) from multiple individuals per species available in GenBank. Bayesian phylogenetic analysis of the complete gene dataset produced a tree with strong support for the monophyly of each species, as well as high support for higher level nodes. An old origin of the Atherina group was suggested (19.2 MY), with deep split events within the Atherinidae predating the Messinian Salinity Crisis. Regional genetic substructuring was observed among populations of A. boyeri, with AMOVA and MultiDimensional Scaling suggesting the existence of five groupings (Atlantic/West Mediterranean, Adriatic, Greece, Black Sea and Tunis). The level of subdivision found might be consequence of the hydrographic isolation within the Mediterranean Sea. Bayesian inference of past demographic histories showed a clear signature of demographic expansion for the European coast populations of A. presbyter, possibly linked to post-glacial colonizations, but not for the Azores/Canary Islands, which is expected in isolated populations because of the impossibility of finding new habitats. Within the Mediterranean, signatures of recent demographic expansion were only found for the Adriatic population of A. boyeri, which could be associated with the relatively recent emergence of the Adriatic Sea. PMID:22425706
Fast MCMC sampling for hidden markov models to determine copy number variations
Directory of Open Access Journals (Sweden)
Mahmud Md Pavel
2011-11-01
Full Text Available Abstract Background Hidden Markov Models (HMM are often used for analyzing Comparative Genomic Hybridization (CGH data to identify chromosomal aberrations or copy number variations by segmenting observation sequences. For efficiency reasons the parameters of a HMM are often estimated with maximum likelihood and a segmentation is obtained with the Viterbi algorithm. This introduces considerable uncertainty in the segmentation, which can be avoided with Bayesian approaches integrating out parameters using Markov Chain Monte Carlo (MCMC sampling. While the advantages of Bayesian approaches have been clearly demonstrated, the likelihood based approaches are still preferred in practice for their lower running times; datasets coming from high-density arrays and next generation sequencing amplify these problems. Results We propose an approximate sampling technique, inspired by compression of discrete sequences in HMM computations and by kd-trees to leverage spatial relations between data points in typical data sets, to speed up the MCMC sampling. Conclusions We test our approximate sampling method on simulated and biological ArrayCGH datasets and high-density SNP arrays, and demonstrate a speed-up of 10 to 60 respectively 90 while achieving competitive results with the state-of-the art Bayesian approaches. Availability: An implementation of our method will be made available as part of the open source GHMM library from http://ghmm.org.
Lafrenière-Bérubé, Charles; Chouteau, Michel; Shamsipour, Pejman; Olivo, Gema R.
2016-04-01
Spectral induced polarization (SIP) parameters can be extracted from field or laboratory complex resistivity measurements, and even airborne or ground frequency domain electromagnetic data. With the growing interest in application of complex resistivity measurements to environmental and mineral exploration problems, there is a need for accurate and easy-to-use inversion tools to estimate SIP parameters. These parameters, which often include chargeability and relaxation time may then be studied and related to other rock attributes such as porosity or metallic grain content, in the case of mineral exploration. We present an open source program, available both as a standalone application or Python module, to estimate SIP parameters using Markov-chain Monte Carlo (MCMC) sampling. The Python language is a high level, open source language that is now widely used in scientific computing. Our program allows the user to choose between the more common Cole-Cole (Pelton), Dias, or Debye decomposition models. Simple circuits composed of resistances and constant phase elements may also be used to represent SIP data. Initial guesses are required when using more classic inversion techniques such as the least-squares formulation, and wrong estimates are often the cause of bad curve fitting. In stochastic optimization using MCMC, the effect of the starting values disappears as the simulation proceeds. Our program is then optimized to do batch inversion over large data sets with as little user-interaction as possible. Additionally, the Bayesian formulation allows the user to do quality control by fully propagating the measurement errors in the inversion process, providing an estimation of the SIP parameters uncertainty. This information is valuable when trying to relate chargeability or relaxation time to other physical properties. We test the inversion program on complex resistivity measurements of 12 core samples from the world-class gold deposit of Canadian Malartic. Results show
Rubin, Donald B.
1981-01-01
The Bayesian bootstrap is the Bayesian analogue of the bootstrap. Instead of simulating the sampling distribution of a statistic estimating a parameter, the Bayesian bootstrap simulates the posterior distribution of the parameter; operationally and inferentially the methods are quite similar. Because both methods of drawing inferences are based on somewhat peculiar model assumptions and the resulting inferences are generally sensitive to these assumptions, neither method should be applied wit...
GSEVM v.2: MCMC software to analyse genetically structured environmental variance models
DEFF Research Database (Denmark)
Ibáñez-Escriche, N; Garcia, M; Sorensen, D
2010-01-01
This note provides a description of software that allows to fit Bayesian genetically structured variance models using Markov chain Monte Carlo (MCMC). The gsevm v.2 program was written in Fortran 90. The DOS and Unix executable programs, the user's guide, and some example files are freely available...... for research purposes at http://www.bdporc.irta.es/estudis.jsp. The main feature of the program is to compute Monte Carlo estimates of marginal posterior distributions of parameters of interest. The program is quite flexible, allowing the user to fit a variety of linear models at the level of the mean...
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes and...... largely due to the availability of efficient inference algorithms for answering probabilistic queries about the states of the variables in the network. Furthermore, to support the construction of Bayesian network models, learning algorithms are also available. We give an overview of the Bayesian network...
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCEProbability and InferenceSingle-Parameter Models Introduction to Multiparameter Models Asymptotics and Connections to Non-Bayesian ApproachesHierarchical ModelsFUNDAMENTALS OF BAYESIAN DATA ANALYSISModel Checking Evaluating, Comparing, and Expanding ModelsModeling Accounting for Data Collection Decision AnalysisADVANCED COMPUTATION Introduction to Bayesian Computation Basics of Markov Chain Simulation Computationally Efficient Markov Chain Simulation Modal and Distributional ApproximationsREGRESSION MODELS Introduction to Regression Models Hierarchical Linear
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Institute of Scientific and Technical Information of China (English)
张杨; 赵继广; 陈景鹏; 王亚琪
2014-01-01
在计算初因事件频率的过程中,当初因事件获取的数据较少时,便要考虑不确定性.在概率安全评估(PSA)中,通常以概率分布的形式表示事件的不确定性.本文针对“贮罐内压过大”这个初因事件数据量获取少的问题,采用贝叶斯-马尔科夫链蒙特卡洛方法(Bayesian-MCMC方法)对其频率进行了不确定性分析,得到了初因事件频率的不确定分布图形,并分析了不确定分布图形的特点,同时与直接计算频率方法进行了结果比较,从而验证了该方法的正确性和有效性.
A Bayesian Approach to Detection of Small Low Emission Sources
Xun, Xiaolei; Carroll, Raymond J; Kuchment, Peter
2011-01-01
The article addresses the problem of detecting presence and location of a small low emission source inside of an object, when the background noise dominates. This problem arises, for instance, in some homeland security applications. The goal is to reach the signal-to-noise ratio (SNR) levels on the order of $10^{-3}$. A Bayesian approach to this problem is implemented in 2D. The method allows inference not only about the existence of the source, but also about its location. We derive Bayes factors for model selection and estimation of location based on Markov Chain Monte Carlo (MCMC) simulation. A simulation study shows that with sufficiently high total emission level, our method can effectively locate the source.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided. PMID:26019004
Métodos avanzados de muestreo : MCMC
Pascual Del Olmo, Víctor
2011-01-01
Este proyecto se propone estudiar, analizar e investigar las diferentes metodologías de generación de números aleatorios mediante técnicas avanzadas y modernas de Monte Carlo Markov Chain (MCMC). Los métodos de Monte Carlo son métodos numéricos usados para calcular, aproximar y simular expresiones o sistemas matemáticos complejos y difíciles de evaluar. Aunque estos métodos comenzaron a desarrollarse en los años cuarenta, hasta que las computadoras no se hicieron más potentes estuvieron en un...
Directory of Open Access Journals (Sweden)
Navid Feroze
2016-03-01
Full Text Available The families of mixture distributions have a wider range of applications in different fields such as fisheries, agriculture, botany, economics, medicine, psychology, electrophoresis, finance, communication theory, geology and zoology. They provide the necessary flexibility to model failure distributions of components with multiple failure modes. Mostly, the Bayesian procedure for the estimation of parameters of mixture model is described under the scheme of Type-I censoring. In particular, the Bayesian analysis for the mixture models under doubly censored samples has not been considered in the literature yet. The main objective of this paper is to develop the Bayes estimation of the inverse Weibull mixture distributions under doubly censoring. The posterior estimation has been conducted under the assumption of gamma and inverse levy using precautionary loss function and weighted squared error loss function. The comparisons among the different estimators have been made based on analysis of simulated and real life data sets.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-10-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400-407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305-320]. The application of the trajectory averaging estimator to other stochastic approximationMCMC algorithms, for example, a stochastic approximation MLE algorithm for missing data problems, is also considered in the paper. © Institute of Mathematical Statistics, 2010.
Directory of Open Access Journals (Sweden)
Kevin McNally
2012-01-01
Full Text Available There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
ABCtoolbox: a versatile toolkit for approximate Bayesian computations
Directory of Open Access Journals (Sweden)
Neuenschwander Samuel
2010-03-01
Full Text Available Abstract Background The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Results Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC. It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
Prediction of trajectory based on modified Bayesian inference%基于改进贝叶斯方法的轨迹预测算法研究
Institute of Scientific and Technical Information of China (English)
李万高; 赵雪梅; 孙德厂
2013-01-01
The existing algorithms for trajectory prediction have very low prediction accuracy when there are a limited number of available trajectories.To address this problem,the Modified Bayesian Inference (MBI) approach was proposed,which constructed the Markov model to quantify the correlation between adjacent locations.MBI decomposed historical trajectories into sub-trajectories to get more precise Markov model and the probability formula of Bayesian inference was obtained.The experimental results based on real datasets show that MBI approach is two to three times faster than the existing algorithm,and it has higher prediction accuracy and stability.MBI makes full use of the available trajectories and improves the efficiency and accuracy for the prediction of trajectory.%针对传统轨迹预测方法在历史轨迹数目有限时,预测准确度较低的问题,提出一种改进的贝叶斯推理(MBI)方法,MBI构建了马尔可夫模型来量化相邻位置的相关性,并通过对历史轨迹进行分解来获得更准确的马尔可夫模型,最后得到改进的贝叶斯推理公式.实验结果表明,MBI方法比现有方法的预测速度快2到3倍,并且有较高的准确度和稳定性.MBI方法充分利用现有轨迹信息,不仅提高了查询效率,还保证了较高的预测精度.
Pitombeira-Neto, Anselmo Ramalho; Loureiro, Carlos Felipe Grangeiro; Carvalho, Luis Eduardo
2016-01-01
Estimation of origin-destination (OD) demand plays a key role in successful transportation studies. In this paper, we consider the estimation of time-varying day-to-day OD flows given data on traffic volumes in a transportation network for a sequence of days. We propose a dynamic linear model (DLM) in order to represent the stochastic evolution of OD flows over time. DLM's are Bayesian state-space models which can capture non-stationarity. We take into account the hierarchical relationships b...
mbb_emcee: Modified Blackbody MCMC
Conley, Alexander
2016-02-01
Mbb_emcee fits modified blackbodies to photometry data using an affine invariant MCMC. It has large number of options which, for example, allow computation of the IR luminosity or dustmass as part of the fit. Carrying out a fit produces a HDF5 output file containing the results, which can either be read directly, or read back into a mbb_results object for analysis. Upper and lower limits can be imposed as well as Gaussian priors on the model parameters. These additions are useful for analyzing poorly constrained data. In addition to standard Python packages scipy, numpy, and cython, mbb_emcee requires emcee (ascl:1303.002), Astropy (ascl:1304.002), h5py, and for unit tests, nose.
Bayesian target tracking based on particle filter
Institute of Scientific and Technical Information of China (English)
无
2005-01-01
For being able to deal with the nonlinear or non-Gaussian problems, particle filters have been studied by many researchers. Based on particle filter, the extended Kalman filter (EKF) proposal function is applied to Bayesian target tracking. Markov chain Monte Carlo (MCMC) method, the resampling step, etc novel techniques are also introduced into Bayesian target tracking. And the simulation results confirm the improved particle filter with these techniques outperforms the basic one.
Degradation monitoring using probabilistic inference
Alpay, Bulent
In order to increase safety and improve economy and performance in a nuclear power plant (NPP), the source and extent of component degradations should be identified before failures and breakdowns occur. It is also crucial for the next generation of NPPs, which are designed to have a long core life and high fuel burnup to have a degradation monitoring system in order to keep the reactor in a safe state, to meet the designed reactor core lifetime and to optimize the scheduled maintenance. Model-based methods are based on determining the inconsistencies between the actual and expected behavior of the plant, and use these inconsistencies for detection and diagnostics of degradations. By defining degradation as a random abrupt change from the nominal to a constant degraded state of a component, we employed nonlinear filtering techniques based on state/parameter estimation. We utilized a Bayesian recursive estimation formulation in the sequential probabilistic inference framework and constructed a hidden Markov model to represent a general physical system. By addressing the problem of a filter's inability to estimate an abrupt change, which is called the oblivious filter problem in nonlinear extensions of Kalman filtering, and the sample impoverishment problem in particle filtering, we developed techniques to modify filtering algorithms by utilizing additional data sources to improve the filter's response to this problem. We utilized a reliability degradation database that can be constructed from plant specific operational experience and test and maintenance reports to generate proposal densities for probable degradation modes. These are used in a multiple hypothesis testing algorithm. We then test samples drawn from these proposal densities with the particle filtering estimates based on the Bayesian recursive estimation formulation with the Metropolis Hastings algorithm, which is a well-known Markov chain Monte Carlo method (MCMC). This multiple hypothesis testing
Bayesian segmentation of hyperspectral images
Mohammadpour, Adel; Mohammad-Djafari, Ali
2007-01-01
In this paper we consider the problem of joint segmentation of hyperspectral images in the Bayesian framework. The proposed approach is based on a Hidden Markov Modeling (HMM) of the images with common segmentation, or equivalently with common hidden classification label variables which is modeled by a Potts Markov Random Field. We introduce an appropriate Markov Chain Monte Carlo (MCMC) algorithm to implement the method and show some simulation results.
Bayesian segmentation of hyperspectral images
Mohammadpour, Adel; Féron, Olivier; Mohammad-Djafari, Ali
2004-11-01
In this paper we consider the problem of joint segmentation of hyperspectral images in the Bayesian framework. The proposed approach is based on a Hidden Markov Modeling (HMM) of the images with common segmentation, or equivalently with common hidden classification label variables which is modeled by a Potts Markov Random Field. We introduce an appropriate Markov Chain Monte Carlo (MCMC) algorithm to implement the method and show some simulation results.
Chain ladder method: Bayesian bootstrap versus classical bootstrap
Peters, Gareth W.; Mario V. W\\"uthrich; Shevchenko, Pavel V.
2010-01-01
The intention of this paper is to estimate a Bayesian distribution-free chain ladder (DFCL) model using approximate Bayesian computation (ABC) methodology. We demonstrate how to estimate quantities of interest in claims reserving and compare the estimates to those obtained from classical and credibility approaches. In this context, a novel numerical procedure utilising Markov chain Monte Carlo (MCMC), ABC and a Bayesian bootstrap procedure was developed in a truly distribution-free setting. T...
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, {theta}, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for {theta} may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of {theta} and, in particular, how there will always be a region around {theta} = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that {theta} is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
Renner, Susanne S; Zhang, Li-Bing
2004-06-01
Pistia stratiotes (water lettuce) and Lemna (duckweeds) are the only free-floating aquatic Araceae. The geographic origin and phylogenetic placement of these unrelated aroids present long-standing problems because of their highly modified reproductive structures and wide geographical distributions. We sampled chloroplast (trnL-trnF and rpl20-rps12 spacers, trnL intron) and mitochondrial sequences (nad1 b/c intron) for all genera implicated as close relatives of Pistia by morphological, restriction site, and sequencing data, and present a hypothesis about its geographic origin based on the consensus of trees obtained from the combined data, using Bayesian, maximum likelihood, parsimony, and distance analyses. Of the 14 genera closest to Pistia, only Alocasia, Arisaema, and Typhonium are species-rich, and the latter two were studied previously, facilitating the choice of representatives that span the roots of these genera. Results indicate that Pistia and the Seychelles endemic Protarum sechellarum are the basalmost branches in a grade comprising the tribes Colocasieae (Ariopsis, Steudnera, Remusatia, Alocasia, Colocasia), Arisaemateae (Arisaema, Pinellia), and Areae (Arum, Biarum, Dracunculus, Eminium, Helicodiceros, Theriophonum, Typhonium). Unexpectedly, all Areae genera are embedded in Typhonium, which throws new light on the geographic history of Areae. A Bayesian analysis of divergence times that explores the effects of multiple fossil and geological calibration points indicates that the Pistia lineage is 90 to 76 million years (my) old. The oldest fossils of the Pistia clade, though not Pistia itself, are 45-my-old leaves from Germany; the closest outgroup, Peltandreae (comprising a few species in Florida, the Mediterranean, and Madagascar), is known from 60-my-old leaves from Europe, Kazakhstan, North Dakota, and Tennessee. Based on the geographic ranges of close relatives, Pistia likely originated in the Tethys region, with Protarum then surviving on the
Yao, Zhewei; Hu, Zixi; Li, Jinglai
2016-07-01
Many scientific and engineering problems require to perform Bayesian inferences in function spaces, where the unknowns are of infinite dimension. In such problems, choosing an appropriate prior distribution is an important task. In particular, when the function to infer is subject to sharp jumps, the commonly used Gaussian measures become unsuitable. On the other hand, the so-called total variation (TV) prior can only be defined in a finite-dimensional setting, and does not lead to a well-defined posterior measure in function spaces. In this work we present a TV-Gaussian (TG) prior to address such problems, where the TV term is used to detect sharp jumps of the function, and the Gaussian distribution is used as a reference measure so that it results in a well-defined posterior measure in the function space. We also present an efficient Markov Chain Monte Carlo (MCMC) algorithm to draw samples from the posterior distribution of the TG prior. With numerical examples we demonstrate the performance of the TG prior and the efficiency of the proposed MCMC algorithm.
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
Konda Reddy, B.; Gnanasekaran, N.; Balaji, C.
2012-11-01
An inverse methodology is proposed to estimate thermo-physical and transport properties individually and simultaneously from in-house experimental data obtained using the transient Liquid Crystal Thermography (LCT) technique. A vertical rectangular fin made of mild steel and size of 75 × 250 × 3 (L × W × t) (all in mm) has been used. Thermochromic Liquid Crystals (TLCs) are used to obtain transient temperature distribution along the fin surface to determine the temperature dependent heat transfer coefficient, hθ and the thermal diffusivity, α of the fin. The variation of heat transfer coefficient is considered as a power law function of temperature excess (hθ = a''θb(x,t)) and is derived from the basic Nusselt number equation, Nuθ = aRabθ used for laminar natural convection for a vertical plate in ambient air. Using this functional form, the 1-D transient fin equation solved using the finite difference technique for assumed values of 'a' and 'α'. Treating the inverse problem as a one parameter estimation in 'a' or 'α' or a two parameter estimation problem in 'a' and 'α', the sum of the squares of the difference between the TLC measured and simulated temperatures are minimized with the Bayesian frame work in the inverse model to determine the point estimates for 'a' and 'α'. Two point estimates namely the (i) mean and (ii) maximum a posterior (MAP) are used to report the retrieved quantities together with the associated standard deviation.
International Nuclear Information System (INIS)
An inverse methodology is proposed to estimate thermo-physical and transport properties individually and simultaneously from in-house experimental data obtained using the transient Liquid Crystal Thermography (LCT) technique. A vertical rectangular fin made of mild steel and size of 75 × 250 × 3 (L × W × t) (all in mm) has been used. Thermochromic Liquid Crystals (TLCs) are used to obtain transient temperature distribution along the fin surface to determine the temperature dependent heat transfer coefficient, hθ and the thermal diffusivity, α of the fin. The variation of heat transfer coefficient is considered as a power law function of temperature excess (hθ = a''θb(x,t)) and is derived from the basic Nusselt number equation, Nuθ = aRabθ used for laminar natural convection for a vertical plate in ambient air. Using this functional form, the 1–D transient fin equation solved using the finite difference technique for assumed values of 'a' and 'α'. Treating the inverse problem as a one parameter estimation in 'a' or 'α' or a two parameter estimation problem in 'a' and 'α', the sum of the squares of the difference between the TLC measured and simulated temperatures are minimized with the Bayesian frame work in the inverse model to determine the point estimates for 'a' and 'α'. Two point estimates namely the (i) mean and (ii) maximum a posterior (MAP) are used to report the retrieved quantities together with the associated standard deviation.
Directory of Open Access Journals (Sweden)
Steven Kelly
Full Text Available BACKGROUND: The rapid increase in the availability of genome information has created considerable demand for both comparative and ab initio predictive bioinformatic analyses. The biology laid bare in the genomes of many organisms is often novel, presenting new challenges for bioinformatic interrogation. A paradigm for this is the collected genomes of the kinetoplastid parasites, a group which includes Trypanosoma brucei the causative agent of human African trypanosomiasis. These genomes, though outwardly simple in organisation and gene content, have historically challenged many theories for gene expression regulation in eukaryotes. METHODOLOGY/PRINCIPLE FINDINGS: Here we utilise a Bayesian approach to identify local changes in nucleotide composition in the genome of T. brucei. We show that there are several elements which are found at the starts and ends of multicopy gene arrays and that there are compositional elements that are common to all intergenic regions. We also show that there is a composition-inversion element that occurs at the position of the trans-splice site. CONCLUSIONS/SIGNIFICANCE: The nature of the elements discovered reinforces the hypothesis that context dependant RNA secondary structure has an important influence on gene expression regulation in Trypanosoma brucei.
Directory of Open Access Journals (Sweden)
Watanabe Yukito
2012-01-01
Full Text Available Abstract Background Bayesian networks (BNs have been widely used to estimate gene regulatory networks. Many BN methods have been developed to estimate networks from microarray data. However, two serious problems reduce the effectiveness of current BN methods. The first problem is that BN-based methods require huge computational time to estimate large-scale networks. The second is that the estimated network cannot have cyclic structures, even if the actual network has such structures. Results In this paper, we present a novel BN-based deterministic method with reduced computational time that allows cyclic structures. Our approach generates all the combinational triplets of genes, estimates networks of the triplets by BN, and unites the networks into a single network containing all genes. This method decreases the search space of predicting gene regulatory networks without degrading the solution accuracy compared with the greedy hill climbing (GHC method. The order of computational time is the cube of number of genes. In addition, the network estimated by our method can include cyclic structures. Conclusions We verified the effectiveness of the proposed method for all known gene regulatory networks and their expression profiles. The results demonstrate that this approach can predict regulatory networks with reduced computational time without degrading the solution accuracy compared with the GHC method.
An MCMC Circumstellar Disks Modeling Tool
Wolff, Schuyler; Perrin, Marshall D.; Mazoyer, Johan; Choquet, Elodie; Soummer, Remi; Ren, Bin; Pueyo, Laurent; Debes, John H.; Duchene, Gaspard; Pinte, Christophe; Menard, Francois
2016-01-01
We present an enhanced software framework for the Monte Carlo Markov Chain modeling of circumstellar disk observations, including spectral energy distributions and multi wavelength images from a variety of instruments (e.g. GPI, NICI, HST, WFIRST). The goal is to self-consistently and simultaneously fit a wide variety of observables in order to place constraints on the physical properties of a given disk, while also rigorously assessing the uncertainties in the derived properties. This modular code is designed to work with a collection of existing modeling tools, ranging from simple scripts to define the geometry for optically thin debris disks, to full radiative transfer modeling of complex grain structures in protoplanetary disks (using the MCFOST radiative transfer modeling code). The MCMC chain relies on direct chi squared comparison of model images/spectra to observations. We will include a discussion of how best to weight different observations in the modeling of a single disk and how to incorporate forward modeling from PCA PSF subtraction techniques. The code is open source, python, and available from github. Results for several disks at various evolutionary stages will be discussed.
On the Markov Chain Monte Carlo (MCMC) method
Indian Academy of Sciences (India)
Rajeeva L Karandikar
2006-04-01
Markov Chain Monte Carlo (MCMC) is a popular method used to generate samples from arbitrary distributions, which may be speciﬁed indirectly. In this article, we give an introduction to this method along with some examples.
International Nuclear Information System (INIS)
To study the impact of the Deepwater Horizon oil spill on photosynthesis of coastal salt marsh plants in Mississippi, we developed a hierarchical Bayesian (HB) model based on field measurements collected from July 2010 to November 2011. We sampled three locations in Davis Bayou, Mississippi (30.375°N, 88.790°W) representative of a range of oil spill impacts. Measured photosynthesis was negative (respiration only) at the heavily oiled location in July 2010 only, and rates started to increase by August 2010. Photosynthesis at the medium oiling location was lower than at the control location in July 2010 and it continued to decrease in September 2010. During winter 2010–2011, the contrast between the control and the two impacted locations was not as obvious as in the growing season of 2010. Photosynthesis increased through spring 2011 at the three locations and decreased starting with October at the control location and a month earlier (September) at the impacted locations. Using the field data, we developed an HB model. The model simulations agreed well with the measured photosynthesis, capturing most of the variability of the measured data. On the basis of the posteriors of the parameters, we found that air temperature and photosynthetic active radiation positively influenced photosynthesis whereas the leaf stress level negatively affected photosynthesis. The photosynthesis rates at the heavily impacted location had recovered to the status of the control location about 140 days after the initial impact, while the impact at the medium impact location was never severe enough to make photosynthesis significantly lower than that at the control location over the study period. The uncertainty in modeling photosynthesis rates mainly came from the individual and micro-site scales, and to a lesser extent from the leaf scale. (letter)
Institute of Scientific and Technical Information of China (English)
KUNDU Debasis; PRADHAN Biswabrata
2009-01-01
Recently generalized exponential distribution has received considerable attentions. In this paper, we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in the closed form. Lindley's approximation and importance sampling technique have been suggested to compute the approximate Bayes estimates. Markov Chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling schemes. It is observed that finding the optimum censoring procedure is a computationally expensive process. And we have recommended to use the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods and one data analysis has been performed for illustrative purposes.
Warren, Harry
2016-05-01
We report on the efforts of a multidisciplinary International Space Science Institute team that is investigating the limits of our ability to infer the physical properties of solar and stellar atmospheres from remote sensing observations. As part of this project we have estimated the uncertainties in the collisional cross sections and radiative decay rates for Fe XIII and O VII and created 1000 realizations of the CHIANTI atomic database. These perturbed atomic data are then used to analyze solar observations from the EIS spectrometer on Hinode and stellar observations from the LETG on Chandra within a Bayesian framework. For the solar case we find that the systematic errors from the atomic physics dominate the statistical uncertainties from the observations. For many cases the uncertainties are about 10 times larger when variations in the atomic data are included. This indicates the need for very accurate atomic physics. Comparisons among recent Fe XIII calculations suggest that for some transitions the collision rates are currently known well enough to measure the electron density and emission measure to about 15%.
Amo de Paz, Guillermo; Cubas, Paloma; Divakar, Pradeep K; Lumbsch, H Thorsten; Crespo, Ana
2011-01-01
There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota) by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibrations points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of parmelioid. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic condition of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these clades. However, our
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Directory of Open Access Journals (Sweden)
Cláudio Roberto Thiersch
2013-02-01
Full Text Available Neste trabalho foi considerado o modelo de Curtis para a relação hipsométrica em clones de Eucalyptus sp. com os parâmetros sujeitos a restrições. Para fazer a inferência dos parâmetros do modelo com restrições, utilizou-se uma abordagem bayesiana com densidade a priori construída empiricamente. As estimativas bayesianas são calculadas com a técnica de simulação de Monte Carlo em Cadeia de Markov (MCMC. O método proposto foi aplicado a diferentes conjuntos de dados reais, dos quais foram selecionados cinco para exemplificar os resultados. Estes foram comparados com os resultados obtidos pelo método de mínimos quadrados, destacando-se a superioridade da abordagem bayesiana proposta.The model of Curtis was considered in this work for the hypsometric relationship in Eucalyptus sp. clones with parameters submitted to restrictions. A Bayesian a priori density was empirically constructed to infer the parameters of the models with restrictions. The Bayesian estimates are calculated using Monte Carlo Markov Chain (MCMC Method. The proposed method was applied to different groups of real data from which five were selected to show the results. Those were compared to the achieved results by the minimum square method and the superiority of the Bayesian approach is highlighted.
Bayesian object classification of gold nanoparticles
Konomi, Bledar A.
2013-06-01
The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specific information about the number of objects present. Furthermore, the objects lying along the boundary render automated image analysis much more difficult. To overcome these challenges, we propose a Bayesian method based on the marked-point process representation of the objects. We derive models, both for the marks which parameterize the morphological aspects and the points which determine the location of the objects. The proposed model is an automatic image segmentation and classification procedure, which simultaneously detects the boundaries and classifies the NPs into one of the predetermined shape families. We execute the inference by sampling the posterior distribution using Markov chainMonte Carlo (MCMC) since the posterior is doubly intractable. We apply our novel method to several TEM imaging samples of gold NPs, producing the needed statistical characterization of their morphology. © Institute of Mathematical Statistics, 2013.
BEAST: Bayesian evolutionary analysis by sampling trees
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2007-11-01
Full Text Available Abstract Background The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. Results BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. Conclusion BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.
Bayesian Analysis of Inertial Confinement Fusion Experiments at the National Ignition Facility
Gaffney, J A; Sonnad, V; Libby, S B
2012-01-01
We develop a Bayesian inference method that allows the efficient determination of several interesting parameters from complicated high-energy-density experiments performed on the National Ignition Facility (NIF). The model is based on an exploration of phase space using the hydrodynamic code HYDRA. A linear model is used to describe the effect of nuisance parameters on the analysis, allowing an analytic likelihood to be derived that can be determined from a small number of HYDRA runs and then used in existing advanced statistical analysis methods. This approach is applied to a recent experiment in order to determine the carbon opacity and X-ray drive; it is found that the inclusion of prior expert knowledge and fluctuations in capsule dimensions and chemical composition significantly improve the agreement between experiment and theoretical opacity calculations. A parameterisation of HYDRA results is used to test the application of both Markov chain Monte Carlo (MCMC) and genetic algorithm (GA) techniques to e...
Bayesian Reliability Analysis of Non-Stationarity in Multi-agent Systems
Directory of Open Access Journals (Sweden)
TONT Gabriela
2013-05-01
Full Text Available The Bayesian methods provide information about the meaningful parameters in a statistical analysis obtained by combining the prior and sampling distributions to form the posterior distribution of theparameters. The desired inferences are obtained from this joint posterior. An estimation strategy for hierarchical models, where the resulting joint distribution of the associated model parameters cannotbe evaluated analytically, is to use sampling algorithms, known as Markov Chain Monte Carlo (MCMC methods, from which approximate solutions can be obtained. Both serial and parallel configurations of subcomponents are permitted. The capability of time-dependent method to describe a multi-state system is based on a case study, assessingthe operatial situation of studied system. The rationality and validity of the presented model are demonstrated via a case of study. The effect of randomness of the structural parameters is alsoexamined.
Directory of Open Access Journals (Sweden)
Guillermo Amo de Paz
Full Text Available There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibrations points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of parmelioid. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic condition of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these
A Refined MCMC Sampling from RKHS for PAC-Bayes Bound Calculation
Directory of Open Access Journals (Sweden)
Li Tang
2014-04-01
Full Text Available PAC-Bayes risk bound integrating theories of Bayesian paradigm and structure risk minimization for stochastic classifiers has been considered as a framework for deriving some of the tightest generalization bounds. A major issue in practical use of this bound is estimations of unknown prior and posterior distributions of the concept space. In this paper, by formulating the concept space as Reproducing Kernel Hilbert Space (RKHS using the kernel method, we proposed a refined Markov Chain Monte Carlo (MCMC sampling algorithm by incorporating feedback information of the simulated model over training examples for simulating posterior distributions of the concept space. Furthermore, we used a kernel density method to estimate their probability distributions in calculating the Kullback-Leibler divergence of the posterior and prior distributions. The experimental results on two artificial data sets show that the simulation is reasonable and effective in practice.
Hybrid Gibbs Sampling and MCMC for CMB Analysis at Small Angular Scales
Jewell, Jeffrey B.; Eriksen, H. K.; Wandelt, B. D.; Gorski, K. M.; Huey, G.; O'Dwyer, I. J.; Dickinson, C.; Banday, A. J.; Lawrence, C. R.
2008-01-01
A) Gibbs Sampling has now been validated as an efficient, statistically exact, and practically useful method for "low-L" (as demonstrated on WMAP temperature polarization data). B) We are extending Gibbs sampling to directly propagate uncertainties in both foreground and instrument models to total uncertainty in cosmological parameters for the entire range of angular scales relevant for Planck. C) Made possible by inclusion of foreground model parameters in Gibbs sampling and hybrid MCMC and Gibbs sampling for the low signal to noise (high-L) regime. D) Future items to be included in the Bayesian framework include: 1) Integration with Hybrid Likelihood (or posterior) code for cosmological parameters; 2) Include other uncertainties in instrumental systematics? (I.e. beam uncertainties, noise estimation, calibration errors, other).
Bayesian binary regression model: an application to in-hospital death after AMI prediction
Directory of Open Access Journals (Sweden)
Aparecida D. P. Souza
2004-08-01
Full Text Available A Bayesian binary regression model is developed to predict death of patients after acute myocardial infarction (AMI. Markov Chain Monte Carlo (MCMC methods are used to make inference and to evaluate Bayesian binary regression models. A model building strategy based on Bayes factor is proposed and aspects of model validation are extensively discussed in the paper, including the posterior distribution for the c-index and the analysis of residuals. Risk assessment, based on variables easily available within minutes of the patients' arrival at the hospital, is very important to decide the course of the treatment. The identified model reveals itself strongly reliable and accurate, with a rate of correct classification of 88% and a concordance index of 83%.Um modelo bayesiano de regressão binária é desenvolvido para predizer óbito hospitalar em pacientes acometidos por infarto agudo do miocárdio. Métodos de Monte Carlo via Cadeias de Markov (MCMC são usados para fazer inferência e validação. Uma estratégia para construção de modelos, baseada no uso do fator de Bayes, é proposta e aspectos de validação são extensivamente discutidos neste artigo, incluindo a distribuição a posteriori para o índice de concordância e análise de resíduos. A determinação de fatores de risco, baseados em variáveis disponíveis na chegada do paciente ao hospital, é muito importante para a tomada de decisão sobre o curso do tratamento. O modelo identificado se revela fortemente confiável e acurado, com uma taxa de classificação correta de 88% e um índice de concordância de 83%.
Loredo, T J
2004-01-01
I describe a framework for adaptive scientific exploration based on iterating an Observation--Inference--Design cycle that allows adjustment of hypotheses and observing protocols in response to the results of observation on-the-fly, as data are gathered. The framework uses a unified Bayesian methodology for the inference and design stages: Bayesian inference to quantify what we have learned from the available data and predict future data, and Bayesian decision theory to identify which new observations would teach us the most. When the goal of the experiment is simply to make inferences, the framework identifies a computationally efficient iterative ``maximum entropy sampling'' strategy as the optimal strategy in settings where the noise statistics are independent of signal properties. Results of applying the method to two ``toy'' problems with simulated data--measuring the orbit of an extrasolar planet, and locating a hidden one-dimensional object--show the approach can significantly improve observational eff...
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge
Institute of Scientific and Technical Information of China (English)
徐俊刚; 赵越; 陈健; 韩超
2015-01-01
Learning structure from data is one of the most important fundamental tasks of Bayesian network research. Particularly, learning optional structure of Bayesian network is a non-deterministic polynomial-time (NP) hard problem. To solve this problem, many heuristic algorithms have been proposed, and some of them learn Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms have some restrictions on the prior knowledge, such as quality restriction and use restriction. This makes it diﬃcult to use the prior knowledge well in these algorithms. In this paper, we introduce the prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) algorithm to learn the structure of the Bayesian network. Three types of prior knowledge are defined: existence of parent node, absence of parent node, and distribution knowledge including the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed method C-MCMC.
Bayesian nonparametric estimation of hazard rate in monotone Aalen model
Czech Academy of Sciences Publication Activity Database
Timková, Jana
2014-01-01
Roč. 50, č. 6 (2014), s. 849-868. ISSN 0023-5954 Institutional support: RVO:67985556 Keywords : Aalen model * Bayesian estimation * MCMC Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.541, year: 2014 http://library.utia.cas.cz/separaty/2014/SI/timkova-0438210.pdf
Bayesian estimation of parameters in a regional hydrological model
Engeland, K.; Gottschalk, L.
2002-01-01
This study evaluates the applicability of the distributed, process-oriented Ecomag model for prediction of daily streamflow in ungauged basins. The Ecomag model is applied as a regional model to nine catchments in the NOPEX area, using Bayesian statistics to estimate the posterior distribution of the model parameters conditioned on the observed streamflow. The distribution is calculated by Markov Chain Monte Carlo (MCMC) analysis. The Bayesian method requires formulation of a likelihood funct...
Bayesian estimation of parameters in a regional hydrological model
Engeland, K.; Gottschalk, L.
2002-01-01
This study evaluates the applicability of the distributed, process-oriented Ecomag model for prediction of daily streamflow in ungauged basins. The Ecomag model is applied as a regional model to nine catchments in the NOPEX area, using Bayesian statistics to estimate the posterior distribution of the model parameters conditioned on the observed streamflow. The distribution is calculated by Markov Chain Monte Carlo (MCMC) analysis. The Bayesian method requires formulation of ...
A survey of current Bayesian gene mapping method
Molitor John; Marjoram Paul; Conti David; Thomas Duncan
2004-01-01
Abstract Recently, there has been much interest in the use of Bayesian statistical methods for performing genetic analyses. Many of the computational difficulties previously associated with Bayesian analysis, such as multidimensional integration, can now be easily overcome using modern high-speed computers and Markov chain Monte Carlo (MCMC) methods. Much of this new technology has been used to perform gene mapping, especially through the use of multi-locus linkage disequilibrium techniques. ...
Bayesian default probability models
Andrlíková, Petra
2014-01-01
This paper proposes a methodology for default probability estimation for low default portfolios, where the statistical inference may become troublesome. The author suggests using logistic regression models with the Bayesian estimation of parameters. The piecewise logistic regression model and Box-Cox transformation of credit risk score is used to derive the estimates of probability of default, which extends the work by Neagu et al. (2009). The paper shows that the Bayesian models are more acc...
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
Stochastic Annealing for Variational Inference
Gultekin, San; Zhang, Aonan; Paisley, John
2015-01-01
We empirically evaluate a stochastic annealing strategy for Bayesian posterior optimization with variational inference. Variational inference is a deterministic approach to approximate posterior inference in Bayesian models in which a typically non-convex objective function is locally optimized over the parameters of the approximating distribution. We investigate an annealing method for optimizing this objective with the aim of finding a better local optimal solution and compare with determin...
Labarbe, Rudi; Janssens, Guillaume; Sterpin, Edmond
2016-09-01
In proton therapy, quantification of the proton range uncertainty is important to achieve dose distribution compliance. The promising accuracy of prompt gamma imaging (PGI) suggests the development of a mathematical framework using the range measurements to convert population based estimates of uncertainties into patient specific estimates with the purpose of plan adaptation. We present here such framework using Bayesian inference. The sources of uncertainty were modeled by three parameters: setup bias m, random setup precision r and water equivalent path length bias u. The evolution of the expectation values E(m), E(r) and E(u) during the treatment was simulated. The expectation values converged towards the true simulation parameters after 5 and 10 fractions, for E(m) and E(u), respectively. E(r) settle on a constant value slightly lower than the true value after 10 fractions. In conclusion, the simulation showed that there is enough information in the frequency distribution of the range errors measured by PGI to estimate the expectation values and the confidence interval of the model parameters by Bayesian inference. The updated model parameters were used to compute patient specific lateral and local distal margins for adaptive re-planning.
Minsley, B.J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, 'best' model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favourably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment. ?? 2011. Geophysical Journal International ?? 2011 RAS.
Target tracking in glint noise using a MCMC particle filter
Institute of Scientific and Technical Information of China (English)
Hu Hongtao; Jing Zhongliang; Li Anping; Hu Shiqiang; Tian Hongwei
2005-01-01
In radar target tracking application, the observation noise is usually non-Gaussian, which is also referred as glint noise. The performances of conventional trackers degra de severely in the presence of glint noise. An improved particle filter, Markov chain Monte Carlo particle filter (MCMC-PF), is applied to cope with radar target tracking when the measurements are perturbed by glint noise. Tracking performance of the filter is demonstrated in the present of glint noise by computer simulation.
Bayesian approach to rough set
Marwala, Tshilidzi
2007-01-01
This paper proposes an approach to training rough set models using Bayesian framework trained using Markov Chain Monte Carlo (MCMC) method. The prior probabilities are constructed from the prior knowledge that good rough set models have fewer rules. Markov Chain Monte Carlo sampling is conducted through sampling in the rough set granule space and Metropolis algorithm is used as an acceptance criteria. The proposed method is tested to estimate the risk of HIV given demographic data. The results obtained shows that the proposed approach is able to achieve an average accuracy of 58% with the accuracy varying up to 66%. In addition the Bayesian rough set give the probabilities of the estimated HIV status as well as the linguistic rules describing how the demographic parameters drive the risk of HIV.
Bayesian methods for measures of agreement
Broemeling, Lyle D
2009-01-01
Using WinBUGS to implement Bayesian inferences of estimation and testing hypotheses, Bayesian Methods for Measures of Agreement presents useful methods for the design and analysis of agreement studies. It focuses on agreement among the various players in the diagnostic process.The author employs a Bayesian approach to provide statistical inferences based on various models of intra- and interrater agreement. He presents many examples that illustrate the Bayesian mode of reasoning and explains elements of a Bayesian application, including prior information, experimental information, the likelihood function, posterior distribution, and predictive distribution. The appendices provide the necessary theoretical foundation to understand Bayesian methods as well as introduce the fundamentals of programming and executing the WinBUGS software.Taking a Bayesian approach to inference, this hands-on book explores numerous measures of agreement, including the Kappa coefficient, the G coefficient, and intraclass correlation...
Bayesian kinematic earthquake source models
Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.
2009-12-01
Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield one solution which satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high
Single channel signal component separation using Bayesian estimation
Institute of Scientific and Technical Information of China (English)
Cai Quanwei; Wei Ping; Xiao Xianci
2007-01-01
A Bayesian estimation method to separate multicomponent signals with single channel observation is presented in this paper. By using the basis function projection, the component separation becomes a problem of limited parameter estimation. Then, a Bayesian model for estimating parameters is set up. The reversible jump MCMC (Monte Carlo Markov Chain) algorithmis adopted to perform the Bayesian computation. The method can jointly estimate the parameters of each component and the component number. Simulation results demonstrate that the method has low SNR threshold and better performance.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Bayesian analysis for extreme climatic events: A review
Chu, Pao-Shin; Zhao, Xin
2011-11-01
This article reviews Bayesian analysis methods applied to extreme climatic data. We particularly focus on applications to three different problems related to extreme climatic events including detection of abrupt regime shifts, clustering tropical cyclone tracks, and statistical forecasting for seasonal tropical cyclone activity. For identifying potential change points in an extreme event count series, a hierarchical Bayesian framework involving three layers - data, parameter, and hypothesis - is formulated to demonstrate the posterior probability of the shifts throughout the time. For the data layer, a Poisson process with a gamma distributed rate is presumed. For the hypothesis layer, multiple candidate hypotheses with different change-points are considered. To calculate the posterior probability for each hypothesis and its associated parameters we developed an exact analytical formula, a Markov Chain Monte Carlo (MCMC) algorithm, and a more sophisticated reversible jump Markov Chain Monte Carlo (RJMCMC) algorithm. The algorithms are applied to several rare event series: the annual tropical cyclone or typhoon counts over the central, eastern, and western North Pacific; the annual extremely heavy rainfall event counts at Manoa, Hawaii; and the annual heat wave frequency in France. Using an Expectation-Maximization (EM) algorithm, a Bayesian clustering method built on a mixture Gaussian model is applied to objectively classify historical, spaghetti-like tropical cyclone tracks (1945-2007) over the western North Pacific and the South China Sea into eight distinct track types. A regression based approach to forecasting seasonal tropical cyclone frequency in a region is developed. Specifically, by adopting large-scale environmental conditions prior to the tropical cyclone season, a Poisson regression model is built for predicting seasonal tropical cyclone counts, and a probit regression model is alternatively developed toward a binary classification problem. With a non
Directory of Open Access Journals (Sweden)
Oliver Ratmann
2007-11-01
Full Text Available Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses a MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication-divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggests that single gene
Loredo, Thomas J.
2004-04-01
I describe a framework for adaptive scientific exploration based on iterating an Observation-Inference-Design cycle that allows adjustment of hypotheses and observing protocols in response to the results of observation on-the-fly, as data are gathered. The framework uses a unified Bayesian methodology for the inference and design stages: Bayesian inference to quantify what we have learned from the available data and predict future data, and Bayesian decision theory to identify which new observations would teach us the most. When the goal of the experiment is simply to make inferences, the framework identifies a computationally efficient iterative ``maximum entropy sampling'' strategy as the optimal strategy in settings where the noise statistics are independent of signal properties. Results of applying the method to two ``toy'' problems with simulated data-measuring the orbit of an extrasolar planet, and locating a hidden one-dimensional object-show the approach can significantly improve observational efficiency in settings that have well-defined nonlinear models. I conclude with a list of open issues that must be addressed to make Bayesian adaptive exploration a practical and reliable tool for optimizing scientific exploration.
Bayesian methods for proteomic biomarker development
Directory of Open Access Journals (Sweden)
Belinda Hernández
2015-12-01
In this review we provide an introduction to Bayesian inference and demonstrate some of the advantages of using a Bayesian framework. We summarize how Bayesian methods have been used previously in proteomics and other areas of bioinformatics. Finally, we describe some popular and emerging Bayesian models from the statistical literature and provide a worked tutorial including code snippets to show how these methods may be applied for the evaluation of proteomic biomarkers.
Directory of Open Access Journals (Sweden)
Moslem Moradi
2015-06-01
Full Text Available Here in, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical seismic inversion, as a complementary method to deterministic inversion, is perceived as contribution combination of geostatistics and seismic inversion algorithm. This method integrates information from different data sources with different scales, as prior information in Bayesian statistics. Data integration leads to a probability density function (named as a posteriori probability that can yield a model of subsurface. The Markov Chain Monte Carlo (MCMC method is used to sample the posterior probability distribution, and the subsurface model characteristics can be extracted by analyzing a set of the samples. In this study, the theory of stochastic seismic inversion in a Bayesian framework was described and applied to infer P-impedance and porosity models. The comparison between the stochastic seismic inversion and the deterministic model based seismic inversion indicates that the stochastic seismic inversion can provide more detailed information of subsurface character. Since multiple realizations are extracted by this method, an estimation of pore volume and uncertainty in the estimation were analyzed.
Decoding X-ray observations from centres of galaxy clusters using MCMC
Lakhchaura, Kiran; Sharma, Prateek
2016-01-01
Traditionally the thermodynamic profiles (gas density, temperature, etc.) of galaxy clusters are obtained by assuming spherical symmetry and modeling projected X-ray spectra in each annulus. The outer annuli contribute to the inner ones and their contribution needs to be subtracted to obtain the temperature and density of spherical shells. The usual deprojection methods lead to propagation of errors from outside to in and do not model the covariance of parameters in different radial shells. In this paper we describe a method based on a free-form model of clusters with cluster parameters (density, temperature) given in spherical shells, which we {\\it jointly} forward fit to the X-ray data by constructing a Bayesian posterior probability distribution that we sample using the MCMC technique. By systematically marginalising over the nuisance outer shells, we estimate the inner entropy profiles of clusters and fit them to various models for a sample of Chandra X-ray observations of 17 clusters. We show that the en...
Methane emission modeling with MCMC calibration for a boreal peatland
Raivonen, Maarit; Smolander, Sampo; Susiluoto, Jouni; Backman, Leif; Li, Xuefei; Markkanen, Tiina; Kleinen, Thomas; Makela, Jarmo; Aalto, Tuula; Rinne, Janne; Brovkin, Victor; Vesala, Timo
2016-04-01
Natural wetlands, particularly peatlands of the boreal latitudes, are a significant source of methane (CH4). At the moment, the emission estimates are highly uncertain. These natural emissions respond to climatic variability, so it is necessary to understand their dynamics, in order to be able to predict how they affect the greenhouse gas balance in the future. We have developed a model of CH4 production, oxidation and transport in boreal peatlands. It simulates production of CH4 as a proportion of anaerobic peat respiration, transport of CH4 and oxygen between the soil and the atmosphere via diffusion in aerenchymatous plants and in peat pores (water and air filled), ebullition and oxidation of CH4 by methanotrophic microbes. Ultimately, we aim to add the model functionality to global climate models such as the JSBACH (Reick et al., 2013), the land surface scheme of the MPI Earth System Model. We tested the model with measured methane fluxes (using eddy covariance technique) from the Siikaneva site, an oligotrophic boreal fen in southern Finland (61°49' N, 24°11' E), over years 2005-2011. To give the model estimates regional reliability, we calibrated the model using Markov chain Monte Carlo (MCMC) technique. Although the simulations and the research are still ongoing, preliminary results from the MCMC calibration can be described as very promising considering that the model is still at relatively early stage. We will present the model and its dynamics as well as results from the MCMC calibration and the comparison with Siikaneva flux data.
Liu, Y.; Pau, G. S. H.; Finsterle, S.
2015-12-01
Parameter inversion involves inferring the model parameter values based on sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources limit the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), of which the coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR) for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model by a ROM directly in a MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure
Uniform Modeling of KOIs: MCMC Data Release Notes
Rowe, Jason F.; Thompson, Susan E.
2015-01-01
The Kepler Mission used a 0.95-m aperture space-based telescope to continuously observe more than 150 000 stars for 4 years. We model and analyze most KOIs listed at the Exoplanet Archive using the Kepler data. This document describes data products related to the reported planetary parameters and uncertainties for the Kepler Objects of Interest (KOIs) based on a Markov-Chain-Monte- Carlo (MCMC) analysis. Reported parameters, uncertainties and data products can be found at the NASA Exoplanet A...
Intracluster Moves for Constrained Discrete-Space MCMC
Hamze, Firas; De Freitas, Nando
2012-01-01
This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution of the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of active variables, say the number of 1's. We motivate the need for this algorithm with examples from...
Bayesian Models of Brain and Behaviour
Penny, William
2012-01-01
This paper presents a review of Bayesian models of brain and behaviour. We first review the basic principles of Bayesian inference. This is followed by descriptions of sampling and variational methods for approximate inference, and forward and backward recursions in time for inference in dynamical models. The review of behavioural models covers work in visual processing, sensory integration, sensorimotor integration, and collective decision making. The review of brain models covers a range of...
Statistical inference an integrated Bayesianlikelihood approach
Aitkin, Murray
2010-01-01
Filling a gap in current Bayesian theory, Statistical Inference: An Integrated Bayesian/Likelihood Approach presents a unified Bayesian treatment of parameter inference and model comparisons that can be used with simple diffuse prior specifications. This novel approach provides new solutions to difficult model comparison problems and offers direct Bayesian counterparts of frequentist t-tests and other standard statistical methods for hypothesis testing.After an overview of the competing theories of statistical inference, the book introduces the Bayes/likelihood approach used throughout. It pre
DEFF Research Database (Denmark)
Ødegård, Jørgen; Meuwissen, Theo HE; Heringstad, Bjørg;
2010-01-01
relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results When applied to simulated......, residual variance on the underlying scale is not identifiable. Hence, variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full...
Directory of Open Access Journals (Sweden)
Thomas Christopher M
2011-07-01
Full Text Available Abstract Background IncP-1 plasmids are broad host range plasmids that have been found in clinical and environmental bacteria. They often carry genes for antibiotic resistance or catabolic pathways. The archetypal IncP-1 plasmid RK2 is a well-characterized biological system, with a fully sequenced and annotated genome and wide range of experimental measurements. Its central control operon, encoding two global regulators KorA and KorB, is a natural example of a negatively self-regulated operon. To increase our understanding of the regulation of this operon, we have constructed a dynamical mathematical model using Ordinary Differential Equations, and employed a Bayesian inference scheme, Markov Chain Monte Carlo (MCMC using the Metropolis-Hastings algorithm, as a way of integrating experimental measurements and a priori knowledge. We also compared MCMC and Metabolic Control Analysis (MCA as approaches for determining the sensitivity of model parameters. Results We identified two distinct sets of parameter values, with different biological interpretations, that fit and explain the experimental data. This allowed us to highlight the proportion of repressor protein as dimers as a key experimental measurement defining the dynamics of the system. Analysis of joint posterior distributions led to the identification of correlations between parameters for protein synthesis and partial repression by KorA or KorB dimers, indicating the necessary use of joint posteriors for correct parameter estimation. Using MCA, we demonstrated that the system is highly sensitive to the growth rate but insensitive to repressor monomerization rates in their selected value regions; the latter outcome was also confirmed by MCMC. Finally, by examining a series of different model refinements for partial repression by KorA or KorB dimers alone, we showed that a model including partial repression by KorA and KorB was most compatible with existing experimental data. Conclusions We
Bayesian multiple target tracking
Streit, Roy L
2013-01-01
This second edition has undergone substantial revision from the 1999 first edition, recognizing that a lot has changed in the multiple target tracking field. One of the most dramatic changes is in the widespread use of particle filters to implement nonlinear, non-Gaussian Bayesian trackers. This book views multiple target tracking as a Bayesian inference problem. Within this framework it develops the theory of single target tracking, multiple target tracking, and likelihood ratio detection and tracking. In addition to providing a detailed description of a basic particle filter that implements
Gomes, Guilherme J. C.; Vrugt, Jasper A.; Vargas, Eurípedes A.
2016-04-01
The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extent the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.
Institute of Scientific and Technical Information of China (English)
黄凯; 张晓玲
2012-01-01
贝叶斯方法是解决不确定问题的新思路,评述了以贝叶斯公式、贝叶斯统计推断及贝叶斯网络为基础的贝叶斯方法在水质评价、水环境模型参数识别、水环境管理及风险决策方面的应用.贝叶斯公式可很好地解决水质评价中监测数据、水质级别、水质标准等蕴含的不确定信息.贝叶斯统计推断耦合水环境模型为模型参数识别提供新方法,其应用难点为贝叶斯后验分布的计算.介绍了后验分布的离散贝叶斯算法、传统及改进MCMC算法.贝叶斯网络在水质评价、模型预测、水环境管理及风险决策中可同时考虑多个变量的综合作用,得到影响管理决策各因素的不确定性信息,为水环境的管理决策提供科学依据.%Bayesian methods provide new ideas for solving uncertainty problems in Water environmental system. Several Bayesian methods, such as Bayesian formula, Bayesian statistical inference and Bayesian networks, are commented on applying to water quality evaluation, parameters identification of water environment model, water environment manage ment and risk decision making. Bayesian formula can solve uncertain information of monitoring data, water quality grade and standard in water quality evaluation. Bayesian statistical inference coupling the water environmental model provides a new approach for model parameter identification. The posterior distribution calculation is the key of application of Bayes ian statistical inference. The Bayesian discrete algorithms based posterior distribution, the traditional and improved MC-MC algorithms are introduced. The application of Bayesian networks to water quality assessment, model prediction, wa ter environment management and risk decision making can take multiple variable into account simultaneously. Then the uncertain information of factors influencing on management decision making is obtained, which provides the scientific ba sis for water environmental
Bayesian analysis of heavy-tailed and long-range dependent Processes
Graves, Timothy; Watkins, Nick; Gramacy, Robert; Franzke, Christian
2014-05-01
We have used MCMC algorithms to perform a Bayesian analysis of Auto-Regressive Fractionally-Integrated Moving-Average ARFIMA(p,d,q) processes, which are capable of modelling long range dependence (e.g. Beran et al, 2013). Our principal aim is to obtain inference about the long memory parameter, d, with secondary interest in the scale and location parameters. We have developed a reversible-jump method enabling us to integrate over different model forms for the short memory component. We initially assume Gaussianity, and have tested the method on both synthetic and physical time series. We have extended the ARFIMA model by weakening the Gaussianity assumption, assuming an alpha-stable, heavy tailed, distribution for the innovations, and performing joint inference on d and alpha. We will present a study of the dependence of the posterior variance of the memory parameter d on the length of the time series considered. This will be compared with equivalent error diagnostics for other popular measures of d.
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Vedel Jensen, Eva B.
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample f...
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Jensen, Eva B. Vedel
2007-01-01
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample f...
A Bayesian MCMC method for point process models with intractable normalising constants
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
We present new methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. Our method is "on-line" as compared with alternative approaches to the problem which require "off-line" computations. Since it is needed to sim...
Directory of Open Access Journals (Sweden)
L.S. Vismara
2007-12-01
Full Text Available A dinâmica da população de plantas daninhas pode ser representada por um sistema de equações que relaciona as densidades de sementes produzidas e de plântulas em áreas de cultivo. Os valores dos parâmetros dos modelos podem ser inferidos diretamente de experimentação e análise estatística ou extraídos da literatura. O presente trabalho teve por objetivo estimar os parâmetros do modelo de densidade populacional de plantas daninhas, a partir de um experimento conduzido na área experimental da Embrapa Milho e Sorgo, Sete Lagoas, MG, via os procedimentos de inferências clássica e Bayesiana.Dynamics of weed populations can be described as a system of equations relating the produced seed and seedling densities in crop areas. The model parameter values can be either directly inferred from experimentation and statistical analysis or obtained from the literature. The objective of this work was to estimate the weed population density model parameters based on experimental field data at Embrapa Milho e Sorgo, Sete Lagoas, MG, using classic and Bayesian inferences.
An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals
Verhelst, Norman D.
2008-01-01
Uniform sampling of binary matrices with fixed margins is known as a difficult problem. Two classes of algorithms to sample from a distribution not too different from the uniform are studied in the literature: importance sampling and Markov chain Monte Carlo (MCMC). Existing MCMC algorithms converge slowly, require a long burn-in period and yield…
Topics in Bayesian statistics and maximum entropy
International Nuclear Information System (INIS)
Notions of Bayesian decision theory and maximum entropy methods are reviewed with particular emphasis on probabilistic inference and Bayesian modeling. The axiomatic approach is considered as the best justification of Bayesian analysis and maximum entropy principle applied in natural sciences. Particular emphasis is put on solving the inverse problem in digital image restoration and Bayesian modeling of neural networks. Further topics addressed briefly include language modeling, neutron scattering, multiuser detection and channel equalization in digital communications, genetic information, and Bayesian court decision-making. (author)
MCMC with Strings and Branes: The Suburban Algorithm (Extended Version)
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we develop a class of Markov chain Monte Carlo (MCMC) algorithms involving extended objects. Starting from a collection of parallel Metropolis-Hastings (MH) samplers, we place them on an auxiliary grid, and couple them together via nearest neighbor interactions. This leads to a class of "suburban samplers" (i.e., spread out Metropolis). Coupling the samplers in this way modifies the mixing rate and speed of convergence for the Markov chain, and can in many cases allow a sampler to more easily overcome free energy barriers in a target distribution. We test these general theoretical considerations by performing several numerical experiments. For suburban samplers with a fluctuating grid topology, performance is strongly correlated with the average number of neighbors. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthin...
Bayesian spatial semi-parametric modeling of HIV variation in Kenya.
Directory of Open Access Journals (Sweden)
Oscar Ngesa
Full Text Available Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of the geographical location information and also usually assume that the covariates, which are related to the response variable, have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference is based on fully Bayesian approach via Markov Chain Monte Carlo (McMC. The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in the urban areas were more likely to be infected by HIV as compared to their rural counterpart. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established. Risk of HIV infection increases with age up to the age of 40 then declines with increase in age. Men who had STI in the last 12 months were more likely to be infected with HIV. Also men who had ever used a condom were found to have higher likelihood to be infected by HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of Bayesian semi-parametric regression model in analyzing epidemiological data.
How Much Can We Learn from a Single Chromatographic Experiment? A Bayesian Perspective.
Wiczling, Paweł; Kaliszan, Roman
2016-01-01
In this work, we proposed and investigated a Bayesian inference procedure to find the desired chromatographic conditions based on known analyte properties (lipophilicity, pKa, and polar surface area) using one preliminary experiment. A previously developed nonlinear mixed effect model was used to specify the prior information about a new analyte with known physicochemical properties. Further, the prior (no preliminary data) and posterior predictive distribution (prior + one experiment) were determined sequentially to search towards the desired separation. The following isocratic high-performance reversed-phase liquid chromatographic conditions were sought: (1) retention time of a single analyte within the range of 4-6 min and (2) baseline separation of two analytes with retention times within the range of 4-10 min. The empirical posterior Bayesian distribution of parameters was estimated using the "slice sampling" Markov Chain Monte Carlo (MCMC) algorithm implemented in Matlab. The simulations with artificial analytes and experimental data of ketoprofen and papaverine were used to test the proposed methodology. The simulation experiment showed that for a single and two randomly selected analytes, there is 97% and 74% probability of obtaining a successful chromatogram using none or one preliminary experiment. The desired separation for ketoprofen and papaverine was established based on a single experiment. It was confirmed that the search for a desired separation rarely requires a large number of chromatographic analyses at least for a simple optimization problem. The proposed Bayesian-based optimization scheme is a powerful method of finding a desired chromatographic separation based on a small number of preliminary experiments. PMID:26607659
Cameron, E
2012-01-01
"Approximate Bayesian Computation" (ABC) represents a powerful methodology for the analysis of complex stochastic systems for which the likelihood of the observed data under an arbitrary set of input parameters may be entirely intractable-the latter condition rendering useless the standard machinery of tractable likelihood-based, Bayesian statistical inference (e.g. conventional Markov Chain Monte Carlo simulation; MCMC). In this article we demonstrate the potential of ABC for astronomical model analysis by application to a case study in the morphological transformation of high redshift galaxies. To this end we develop, first, a stochastic model for the competing processes of merging and secular evolution in the early Universe; and second, through an ABC-based comparison against the observed demographics of the first generation of massive (M_gal > 10^11 M_sun) galaxies (at 1.5 < z < 3) in the CANDELS/EGS dataset we derive posterior probability densities for the key parameters of this model. The "Sequent...
Elvira, Clément; Dobigeon, Nicolas
2015-01-01
Sparse representations have proven their efficiency in solving a wide class of inverse problems encountered in signal and image processing. Conversely, enforcing the information to be spread uniformly over representation coefficients exhibits relevant properties in various applications such as digital communications. Anti-sparse regularization can be naturally expressed through an $\\ell_{\\infty}$-norm penalty. This paper derives a probabilistic formulation of such representations. A new probability distribution, referred to as the democratic prior, is first introduced. Its main properties as well as three random variate generators for this distribution are derived. Then this probability distribution is used as a prior to promote anti-sparsity in a Gaussian linear inverse problem, yielding a fully Bayesian formulation of anti-sparse coding. Two Markov chain Monte Carlo (MCMC) algorithms are proposed to generate samples according to the posterior distribution. The first one is a standard Gibbs sampler. The seco...
A new approach for Bayesian model averaging
Institute of Scientific and Technical Information of China (English)
TIAN XiangJun; XIE ZhengHui; WANG AiHui; YANG XiaoChun
2012-01-01
Bayesian model averaging (BMA) is a recently proposed statistical method for calibrating forecast ensembles from numerical weather models.However,successful implementation of BMA requires accurate estimates of the weights and variances of the individual competing models in the ensemble.Two methods,namely the Expectation-Maximization (EM) and the Markov Chain Monte Carlo (MCMC) algorithms,are widely used for BMA model training.Both methods have their own respective strengths and weaknesses.In this paper,we first modify the BMA log-likelihood function with the aim of removing the additional limitation that requires that the BMA weights add to one,and then use a limited memory quasi-Newtonian algorithm for solving the nonlinear optimization problem,thereby formulating a new approach for BMA (referred to as BMA-BFGS).Several groups of multi-model soil moisture simulation experiments from three land surface models show that the performance of BMA-BFGS is similar to the MCMC method in terms of simulation accuracy,and that both are superior to the EM algorithm.On the other hand,the computational cost of the BMA-BFGS algorithm is substantially less than for MCMC and is almost equivalent to that for EM.
Decoding X-ray observations from centres of galaxy clusters using MCMC
Lakhchaura, Kiran; Saini, Tarun Deep; Sharma, Prateek
2016-08-01
Traditionally the thermodynamic profiles (gas density, temperature, etc.) of galaxy clusters are obtained by assuming spherical symmetry and modeling projected X-ray spectra in each annulus. The outer annuli contribute to the inner ones and their contribution needs to be subtracted to obtain the temperature and density of spherical shells. The usual deprojection methods lead to propagation of errors from outside to in and typically do not model the covariance of parameters in different radial shells. In this paper we describe a method based on a free-form model of clusters with cluster parameters (density, temperature) given in spherical shells, which we {\\it jointly} forward fit to the X-ray data by constructing a Bayesian posterior probability distribution that we sample using the MCMC technique. By systematically marginalising over the nuisance outer shells, we estimate the inner entropy profiles of clusters and fit them to various models for a sample of Chandra X-ray observations of 17 clusters. We show that the entropy profiles in almost all of our clusters are best described as cored power laws. A small subsample is found to be either consistent with a power law, or alternatively their cores are not fully resolved (smaller than, or about few kpc). We find marginal evidence for bimodality in the central values of entropy (and cooling time) corresponding to cool-core and non cool-core clusters. The minimum value of the ratio of the cooling time and the free-fall time (min[$t_{\\rm cool}/t_{\\rm ff}$]; correlation is much weaker with core entropy) is anti-correlated with $H\\alpha$ and radio luminosity. $H\\alpha$ emitting cold gas is absent in our clusters with min$(t_{\\rm cool}/t_{\\rm ff})\\gtrsim 10$. Our lowest core entropies are systematically and substantially lower than the values quoted by the ACCEPT sample.
Nonparametric Bayesian Classification
Coram, M A
2002-01-01
A Bayesian approach to the classification problem is proposed in which random partitions play a central role. It is argued that the partitioning approach has the capacity to take advantage of a variety of large-scale spatial structures, if they are present in the unknown regression function $f_0$. An idealized one-dimensional problem is considered in detail. The proposed nonparametric prior uses random split points to partition the unit interval into a random number of pieces. This prior is found to provide a consistent estimate of the regression function in the $\\L^p$ topology, for any $1 \\leq p < \\infty$, and for arbitrary measurable $f_0:[0,1] \\rightarrow [0,1]$. A Markov chain Monte Carlo (MCMC) implementation is outlined and analyzed. Simulation experiments are conducted to show that the proposed estimate compares favorably with a variety of conventional estimators. A striking resemblance between the posterior mean estimate and the bagged CART estimate is noted and discussed. For higher dimensions, a ...
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. PMID:24650600
Nonparametric Bayesian Modeling of Complex Networks
DEFF Research Database (Denmark)
Schmidt, Mikkel Nørgaard; Mørup, Morten
2013-01-01
Modeling structure in complex networks using Bayesian nonparametrics makes it possible to specify flexible model structures and infer the adequate model complexity from the observed data. This article provides a gentle introduction to nonparametric Bayesian modeling of complex networks: Using...... for complex networks can be derived and point out relevant literature....
Institute of Scientific and Technical Information of China (English)
左哲
2015-01-01
为了研究长输管道腐蚀泄漏及蒸气云爆炸事故的演化规律，通过对埋地管道内(外)壁腐蚀失效、燃气泄漏、气体云团扩散及蒸气云爆炸等4阶段事件进行分析，构建埋地管线腐蚀泄漏火灾的贝叶斯网络模型。研究网络结构中节点变量的取值范围及离散化方法，并基于对事故统计和专家分析判断，设定节点变量的先验概率，量化节点关联的条件概率分布。在对贝叶斯网络推理策略研究的基础上，考察节点变量对推理结果的敏感性，验证模型的合理性。结果表明，长输管道腐蚀泄漏及次生灾害事件过程具有较大的不确定性，主要体现在中间事件均具有多种状态，事故演化路径概率受模型输入条件影响较大。贝叶斯网络方法用于描述事故过程中间节点事件间的依赖关系有较大的优势，可以定量衡量事故风险的不确定性。%In order to research evolutionary laws of unconfined vapor cloud explosion ( UVCE) induced by combustible gas leak in long-distance oil and gas pipelines, Bayesian networks on buried pipelines corrosion leak fire were built by analyzing event nodes on inner and outer wall corrosion failure, combustible gas leak, the gas cloud diffusion and UVCE. The state ranges and discrete methods of node variables were studied. Priori probability and conditional probability distribution of the node variables were set by analyzing on accident statistics data and expert judgements. Bayesian network inference strategy was developed, the sensitivities of each network node variable on inference results were analyzed by researching on evolution mechanism of corrosion leak fire, and the rationality of the model was verified. The results show that there are greater uncer-tainty in the process of pipeline corrosion leaks and secondary disaster. The uncertainty presents in diverse intermediate event status value and probability of accident evolutionary
Inferring weed spatial distribution from multi-type data
Bourgeois, Agnès; Gaba, Sabrina; Munier-Jolain, Nicolas; Borgy, Benjamin; Monestiez, Pascal
2012-01-01
An accurate weed management in a context of sustainable agriculture relies on the knowledge about spatial weed distribution within fields. To improve the representation of patchy spatial distributions of weeds, several sampling strategies are used and lead to various weed measurements (abundance, count, patch boundaries). Here, we propose a hierarchical Bayesian model which includes such multitype data and which allows the interpolation of weed spatial distributions (using a MCMC algorithm). ...
Multitarget tracking with IP reversible jump MCMC-PF
Bocquel, Mélanie; Driessen, Hans; Bagchi, Arun
2013-01-01
In this paper we address the problem of tracking multiple targets based on raw measurements by means of Particle filtering. Bayesian multitarget tracking, in the Random Finite Set framework, propagates the multitarget posterior density recursively in time. Sequential Monte Carlo (SMC) approximations
Plug & Play object oriented Bayesian networks
DEFF Research Database (Denmark)
Bangsø, Olav; Flores, J.; Jensen, Finn Verner
2003-01-01
Object oriented Bayesian networks have proven themselves useful in recent years. The idea of applying an object oriented approach to Bayesian networks has extended their scope to larger domains that can be divided into autonomous but interrelated entities. Object oriented Bayesian networks have...... been shown to be quite suitable for dynamic domains as well. However, processing object oriented Bayesian networks in practice does not take advantage of their modular structure. Normally the object oriented Bayesian network is transformed into a Bayesian network and, inference is performed...... by constructing a junction tree from this network. In this paper we propose a method for translating directly from object oriented Bayesian networks to junction trees, avoiding the intermediate translation. We pursue two main purposes: firstly, to maintain the original structure organized in an instance tree...
International Nuclear Information System (INIS)
Highlights: • The usefulness of multi-year tracer data in evaluating age distribution is examined. • Bayesian MCMC approach is used to infer the parameter of age distribution forms. • Transient tracer data was found to be able to reduce the uncertainty of age distributions. • Transient tracer data can evaluate the appropriateness of presumed distributions. - Abstract: In this work, the effectiveness of transient environmental tracer data in reducing the uncertainty associated with the inference of groundwater residence time distribution was evaluated. A Bayesian Markov Chain Monte Carlo method was used to infer the parameters of presumed residence time distribution forms—exponential and gamma—using concentrations of five tracers, including CFC-11, CFC-12, CFC-113, SF6, and 85Kr. The transient tracer concentrations were synthetically generated using the residence time distributions obtained from a model of the Plœmeur aquifer in southern Brittany, France. Several measures of model adequacy, including Deviance Information Criteria, Bayes factors, and measures based on the deviation of inferred and true cumulative residence time distribution, were used to evaluate the value of groundwater age time-series. Neither of the presumed forms of residence time distributions, exponential and gamma, perfectly represent the simulated true distribution; therefore, the method was not able to show a definitive preference to one over the other in all cases. The results show that using multiple years of tracer data not only reduces the bias of inference (as defined by the difference between the expected value of a metric of inferred residence time distribution and the true value of the same metric), but also helps quantify the uncertainty more realistically. It was found that when one year of data is used, both models could almost perfectly reproduce the observed tracer data, even when the inferred residence time distributions differed substantially from the true one
A tutorial introduction to Bayesian models of cognitive development
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2010-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in...
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis Linda
2006-01-01
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for the salt and pepper noise. The inference in the model is discussed...
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for salt and pepper noise. The inference in the model is discussed in...
Bayesian Agglomerative Clustering with Coalescents
Teh, Yee Whye; Daumé III, Hal; Roy, Daniel
2009-01-01
We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman's coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over others, and demonstrate our approach in document clustering and phylolinguistics.
Raje, D.; Krishnan, R.
2012-08-01
Macroscale hydrologic models (MHMs) were developed to study changes in land surface hydrology due to changing climate over large domains, such as continents or large river basins. However, there are many sources of uncertainty introduced in MHM hydrological simulation, such as model structure error, ineffective model parameters, and low-accuracy model input or validation data. It is hence important to model the uncertainty arising in projection results from an MHM. The objective of this study is to present a Bayesian statistical inference framework for parameter uncertainty modeling of a macroscale hydrologic model. The Bayesian approach implemented using Markov Chain Monte Carlo (MCMC) methods is used in this study to model uncertainty arising from calibration parameters of the Variable Infiltration Capacity (VIC) MHM. The study examines large-scale hydrologic impacts for Indian river basins and changes in discharges for three major river basins with distinct climatic and geographic characteristics, under climate change. Observed/reanalysis meteorological variables such as precipitation, temperature and wind speed are used to drive the VIC macroscale hydrologic model. An objective function describing the fit between observed and simulated discharges at four stations is used to compute the likelihood of the parameters. An MCMC approach using the Metropolis-Hastings algorithm is used to update probability distributions of the parameters. For future hydrologic simulations, bias-corrected GCM projections of climatic variables are used. The posterior distributions of VIC parameters are used for projection of 5th and 95th percentile discharge statistics at four stations, namely, Farakka, Jamtara, Garudeshwar, and Vijayawada for an ensemble of three GCMs and three scenarios, for two time slices. Spatial differences in uncertainty projections of runoff and evapotranspiration for years 2056-2065 for the a1b scenario at the 5th and 95th percentile levels are also projected
Bayesian networks with applications in reliability analysis
Langseth, Helge
2002-01-01
A common goal of the papers in this thesis is to propose, formalize and exemplify the use of Bayesian networks as a modelling tool in reliability analysis. The papers span work in which Bayesian networks are merely used as a modelling tool (Paper I), work where models are specially designed to utilize the inference algorithms of Bayesian networks (Paper II and Paper III), and work where the focus has been on extending the applicability of Bayesian networks to very large domains (Paper IV and ...
A Bayesian hierarchical framework for modeling brain connectivity for neuroimaging data.
Chen, Shuo; Bowman, F DuBois; Mayberg, Helen S
2016-06-01
We propose a novel Bayesian hierarchical model for brain imaging data that unifies voxel-level (the most localized unit of measure) and region-level brain connectivity analyses, and yields population-level inferences. Functional connectivity generally refers to associations in brain activity between distinct locations. The first level of our model summarizes brain connectivity for cross-region voxel pairs using a two-component mixture model consisting of connected and nonconnected voxels. We use the proportion of connected voxel pairs to define a new measure of connectivity strength, which reflects the breadth of between-region connectivity. Furthermore, we evaluate the impact of clinical covariates on connectivity between region-pairs at a population level. We perform parameter estimation using Markov chain Monte Carlo (MCMC) techniques, which can be executed quickly relative to the number of model parameters. We apply our method to resting-state functional magnetic resonance imaging (fMRI) data from 32 subjects with major depression and simulated data to demonstrate the properties of our method. PMID:26501687
Applications of Bayesian approach in modelling risk of malaria-related hospital mortality
Directory of Open Access Journals (Sweden)
Simbeye Jupiter S
2008-02-01
Full Text Available Abstract Background Malaria is a major public health problem in Malawi, however, quantifying its burden in a population is a challenge. Routine hospital data provide a proxy for measuring the incidence of severe malaria and for crudely estimating morbidity rates. Using such data, this paper proposes a method to describe trends, patterns and factors associated with in-hospital mortality attributed to the disease. Methods We develop semiparametric regression models which allow joint analysis of nonlinear effects of calendar time and continuous covariates, spatially structured variation, unstructured heterogeneity, and other fixed covariates. Modelling and inference use the fully Bayesian approach via Markov Chain Monte Carlo (MCMC simulation techniques. The methodology is applied to analyse data arising from paediatric wards in Zomba district, Malawi, between 2002 and 2003. Results and Conclusion We observe that the risk of dying in hospital is lower in the dry season, and for children who travel a distance of less than 5 kms to the hospital, but increases for those who are referred to the hospital. The results also indicate significant differences in both structured and unstructured spatial effects, and the health facility effects reveal considerable differences by type of facility or practice. More importantly, our approach shows non-linearities in the effect of metrical covariates on the probability of dying in hospital. The study emphasizes that the methodological framework used provides a useful tool for analysing the data at hand and of similar structure.
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
Comparison of the Bayesian and Frequentist Approach to the Statistics
Hakala, Michal
2015-01-01
The Thesis deals with introduction to Bayesian statistics and comparing Bayesian approach with frequentist approach to statistics. Bayesian statistics is modern branch of statistics which provides an alternative comprehensive theory to the frequentist approach. Bayesian concepts provides solution for problems not being solvable by frequentist theory. In the thesis are compared definitions, concepts and quality of statistical inference. The main interest is focused on a point estimation, an in...
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Probability and Bayesian statistics
1987-01-01
This book contains selected and refereed contributions to the "Inter national Symposium on Probability and Bayesian Statistics" which was orga nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...
An Approximate Bayesian Fundamental Frequency Estimator
DEFF Research Database (Denmark)
Nielsen, Jesper Kjær; Christensen, Mads Græsbøll; Jensen, Søren Holdt
Joint fundamental frequency and model order estimation is an important problem in several applications such as speech and music processing. In this paper, we develop an approximate estimation algorithm of these quantities using Bayesian inference. The inference about the fundamental frequency and...
Draper, D.
2001-01-01
© 2012 Springer Science+Business Media, LLC. All rights reserved. Article Outline: Glossary Definition of the Subject and Introduction The Bayesian Statistical Paradigm Three Examples Comparison with the Frequentist Statistical Paradigm Future Directions Bibliography
Symbolic Probabilistic Inference with Continuous Variables
Chang, Kuo-Chu; Fung, Robert
2013-01-01
Research on Symbolic Probabilistic Inference (SPI) [2, 3] has provided an algorithm for resolving general queries in Bayesian networks. SPI applies the concept of dependency directed backward search to probabilistic inference, and is incremental with respect to both queries and observations. Unlike traditional Bayesian network inferencing algorithms, SPI algorithm is goal directed, performing only those calculations that are required to respond to queries. Research to date on SPI applies to B...
Bayesian Analysis of Multiple Populations I: Statistical and Computational Methods
Stenning, D C; Robinson, E; van Dyk, D A; von Hippel, T; Sarajedini, A; Stein, N
2016-01-01
We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations (vanDyk et al. 2009, Stein et al. 2013). Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties---age, metallicity, helium abundance, distance, absorption, and initial mass---are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and al...
A Bayesian approach to the modelling of alpha Cen A
Bazot, M; Christensen-Dalsgaard, J
2012-01-01
Determining the physical characteristics of a star is an inverse problem consisting in estimating the parameters of models for the stellar structure and evolution, knowing certain observable quantities. We use a Bayesian approach to solve this problem for alpha Cen A, which allows us to incorporate prior information on the parameters to be estimated, in order to better constrain the problem. Our strategy is based on the use of a Markov Chain Monte Carlo (MCMC) algorithm to estimate the posterior probability densities of the stellar parameters: mass, age, initial chemical composition,... We use the stellar evolutionary code ASTEC to model the star. To constrain this model both seismic and non-seismic observations were considered. Several different strategies were tested to fit these values, either using two or five free parameters in ASTEC. We are thus able to show evidence that MCMC methods become efficient with respect to more classical grid-based strategies when the number of parameters increases. The resul...
International Nuclear Information System (INIS)
Estimating uncertainties on doses from bioassay data is of interest in epidemiology studies that estimate cancer risk from occupational exposures to radionuclides. Bayesian methods provide a logical framework to calculate these uncertainties. However, occupational exposures often consist of many intakes, and this can make the Bayesian calculation computationally intractable. This paper describes a novel strategy for increasing the computational speed of the calculation by simplifying the intake pattern to a single composite intake, termed as complex intake regime (CIR). In order to assess whether this approximation is accurate and fast enough for practical purposes, the method is implemented by the Weighted Likelihood Monte Carlo Sampling (WeLMoS) method and evaluated by comparing its performance with a Markov Chain Monte Carlo (MCMC) method. The MCMC method gives the full solution (all intakes are independent), but is very computationally intensive to apply routinely. Posterior distributions of model parameter values, intakes and doses are calculated for a representative sample of plutonium workers from the United Kingdom Atomic Energy cohort using the WeLMoS method with the CIR and the MCMC method. The distributions are in good agreement: posterior means and Q 0.025 and Q 0.975 quantiles are typically within 20 %. Furthermore, the WeLMoS method using the CIR converges quickly: a typical case history takes around 10-20 min on a fast workstation, whereas the MCMC method took around 12-hr. The advantages and disadvantages of the method are discussed. (authors)
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Directory of Open Access Journals (Sweden)
F. Hartig
2013-08-01
Full Text Available Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics, and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC, another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2016-01-01
additional parameters. We propose an accurate representation of the pdfs of the frequencies by mixtures of von Mises pdfs, which yields closed-form expectations. We define the algorithm VALSE in which the estimates of the pdfs and parameters are iteratively updated. VALSE is a gridless, convergent method...
Efficient Bayesian inference for ARFIMA processes
Graves, T; Gramacy, R. B.; C. L. E. Franzke; Watkins, N. W.
2015-01-01
Many geophysical quantities, like atmospheric temperature, water levels in rivers, and wind speeds, have shown evidence of long-range dependence (LRD). LRD means that these quantities experience non-trivial temporal memory, which potentially enhances their predictability, but also hampers the detection of externally forced trends. Thus, it is important to reliably identify whether or not a system exhibits LRD. In this paper we present a modern and systematic approach to the ...
Analysis of KATRIN data using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, Christian
2011-01-01
The KATRIN (KArlsruhe TRItium Neutrino) experiment will be analyzing the tritium beta-spectrum to determine the mass of the neutrino with a sensitivity of 0.2 eV (90% C.L.). This approach to a measurement of the absolute value of the neutrino mass relies only on the principle of energy conservation...
Bayesian inference of the metazoan phylogeny
DEFF Research Database (Denmark)
Glenner, Henrik; Hansen, Anders J; Sørensen, Martin V;
2004-01-01
Metazoan phylogeny remains one of evolutionary biology's major unsolved problems. Molecular and morphological data, as well as different analytical approaches, have produced highly conflicting results due to homoplasy resulting from more than 570 million years of evolution. To date, parsimony has...
Smail, Linda
2016-06-01
The basic task of any probabilistic inference system in Bayesian networks is computing the posterior probability distribution for a subset or subsets of random variables, given values or evidence for some other variables from the same Bayesian network. Many methods and algorithms have been developed to exact and approximate inference in Bayesian networks. This work compares two exact inference methods in Bayesian networks-Lauritzen-Spiegelhalter and the successive restrictions algorithm-from the perspective of computational efficiency. The two methods were applied for comparison to a Chest Clinic Bayesian Network. Results indicate that the successive restrictions algorithm shows more computational efficiency than the Lauritzen-Spiegelhalter algorithm.
An MCMC study of general squark flavour mixing in the MSSM
Herrmann, Björn; Fuks, Benjamin; Mahmoudi, Farvah; O'Leary, Ben; Porod, Werner; Sekmen, Sezen; Strobbe, Nadja
2015-01-01
We present an extensive study of non-minimally flavour violating (NMFV) terms in the Lagrangian of the Minimal Supersymmetric Standard Model (MSSM). We impose a variety of theoretical and experimental constraints and perform a detailed scan of the parameter space by means of a Markov Chain Monte-Carlo (MCMC) setup. This represents the first study of several non-zero flavour-violating elements within the MSSM. We present the results of the MCMC scan with a special focus on the flavour-violating parameters. Based on these results, we define benchmark scenarios for future studies of NMFV effects at the LHC.
An MCMC Study of General Squark Flavour Mixing in the MSSM
Energy Technology Data Exchange (ETDEWEB)
Herrmann, Björn [Annecy, LAPTH; De Causmaecker, Karen [Intl. Solvay Inst., Brussels; Fuks, Benjamin [UPMC, Paris (main); Mahmoudi, Farvah [Lyon, Ecole Normale Superieure; O' Leary, Ben [Wurzburg U.; Porod, Werner [Wurzburg U.; Sekmen, Sezen [Kyungpook Natl. U.; Strobbe, Nadja [Fermilab
2015-10-05
We present an extensive study of non-minimally flavour violating (NMFV) terms in the Lagrangian of the Minimal Supersymmetric Standard Model (MSSM). We impose a variety of theoretical and experimental constraints and perform a detailed scan of the parameter space by means of a Markov Chain Monte-Carlo (MCMC) setup. This represents the first study of several non-zero flavour-violating elements within the MSSM. We present the results of the MCMC scan with a special focus on the flavour-violating parameters. Based on these results, we define benchmark scenarios for future studies of NMFV effects at the LHC.
Yee, Eugene
2007-04-01
Although a great deal of research effort has been focused on the forward prediction of the dispersion of contaminants (e.g., chemical and biological warfare agents) released into the turbulent atmosphere, much less work has been directed toward the inverse prediction of agent source location and strength from the measured concentration, even though the importance of this problem for a number of practical applications is obvious. In general, the inverse problem of source reconstruction is ill-posed and unsolvable without additional information. It is demonstrated that a Bayesian probabilistic inferential framework provides a natural and logically consistent method for source reconstruction from a limited number of noisy concentration data. In particular, the Bayesian approach permits one to incorporate prior knowledge about the source as well as additional information regarding both model and data errors. The latter enables a rigorous determination of the uncertainty in the inference of the source parameters (e.g., spatial location, emission rate, release time, etc.), hence extending the potential of the methodology as a tool for quantitative source reconstruction. A model (or, source-receptor relationship) that relates the source distribution to the concentration data measured by a number of sensors is formulated, and Bayesian probability theory is used to derive the posterior probability density function of the source parameters. A computationally efficient methodology for determination of the likelihood function for the problem, based on an adjoint representation of the source-receptor relationship, is described. Furthermore, we describe the application of efficient stochastic algorithms based on Markov chain Monte Carlo (MCMC) for sampling from the posterior distribution of the source parameters, the latter of which is required to undertake the Bayesian computation. The Bayesian inferential methodology for source reconstruction is validated against real
Frequentism and Bayesianism: A Python-driven Primer
VanderPlas, Jake
2014-01-01
This paper presents a brief, semi-technical comparison of the essential features of the frequentist and Bayesian approaches to statistical inference, with several illustrative examples implemented in Python. The differences between frequentism and Bayesianism fundamentally stem from differing definitions of probability, a philosophical divide which leads to distinct approaches to the solution of statistical problems as well as contrasting ways of asking and answering questions about unknown parameters. After an example-driven discussion of these differences, we briefly compare several leading Python statistical packages which implement frequentist inference using classical methods and Bayesian inference using Markov Chain Monte Carlo.
Bayesian Estimation of the GARCH(1,1) Model with Student-t Innovations
David, David; Hoogerheide, Lennart
2010-01-01
textabstractThis note presents the R package bayesGARCH (Ardia, 2007) which provides functions for the Bayesian estimation of the parsimonious and effective GARCH(1,1) model with Student-t innovations. The estimation procedure is fully automatic and thus avoids the tedious task of tuning a MCMC sampling algorithm. The usage of the package is shown in an empirical application to exchange rate logreturns.
Bayesian Approach to Handling Informative Sampling
Sikov, Anna
2015-01-01
In the case of informative sampling the sampling scheme explicitly or implicitly depends on the response variable. As a result, the sample distribution of response variable can- not be used for making inference about the population. In this research I investigate the problem of informative sampling from the Bayesian perspective. Application of the Bayesian approach permits solving the problems, which arise due to complexity of the models, being used for handling informative sampling. The main...
Dynamic Bayesian Combination of Multiple Imperfect Classifiers
Simpson, Edwin; Psorakis, Ioannis; Smith, Arfon
2012-01-01
Classifier combination methods need to make best use of the outputs of multiple, imperfect classifiers to enable higher accuracy classifications. In many situations, such as when human decisions need to be combined, the base decisions can vary enormously in reliability. A Bayesian approach to such uncertain combination allows us to infer the differences in performance between individuals and to incorporate any available prior knowledge about their abilities when training data is sparse. In this paper we explore Bayesian classifier combination, using the computationally efficient framework of variational Bayesian inference. We apply the approach to real data from a large citizen science project, Galaxy Zoo Supernovae, and show that our method far outperforms other established approaches to imperfect decision combination. We go on to analyse the putative community structure of the decision makers, based on their inferred decision making strategies, and show that natural groupings are formed. Finally we present ...
Cover Tree Bayesian Reinforcement Learning
Tziortziotis, Nikolaos; Dimitrakakis, Christos; Blekas, Konstantinos
2013-01-01
This paper proposes an online tree-based Bayesian approach for reinforcement learning. For inference, we employ a generalised context tree model. This defines a distribution on multivariate Gaussian piecewise-linear models, which can be updated in closed form. The tree structure itself is constructed using the cover tree method, which remains efficient in high dimensional spaces. We combine the model with Thompson sampling and approximate dynamic programming to obtain effective exploration po...
Quantile pyramids for Bayesian nonparametrics
2009-01-01
P\\'{o}lya trees fix partitions and use random probabilities in order to construct random probability measures. With quantile pyramids we instead fix probabilities and use random partitions. For nonparametric Bayesian inference we use a prior which supports piecewise linear quantile functions, based on the need to work with a finite set of partitions, yet we show that the limiting version of the prior exists. We also discuss and investigate an alternative model based on the so-called substitut...
Bayesian analysis of contingency tables
Gómez Villegas, Miguel A.; González Pérez, Beatriz
2005-01-01
The display of the data by means of contingency tables is used in different approaches to statistical inference, for example, to broach the test of homogeneity of independent multinomial distributions. We develop a Bayesian procedure to test simple null hypotheses versus bilateral alternatives in contingency tables. Given independent samples of two binomial distributions and taking a mixed prior distribution, we calculate the posterior probability that the proportion of successes in the first...
Bayesian modeling of ChIP-chip data using latent variables.
Wu, Mingqi
2009-10-26
BACKGROUND: The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. RESULTS: In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment) effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. CONCLUSION: The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results indicate that the
Bayesian modeling of ChIP-chip data using latent variables
Directory of Open Access Journals (Sweden)
Tian Yanan
2009-10-01
Full Text Available Abstract Background The ChIP-chip technology has been used in a wide range of biomedical studies, such as identification of human transcription factor binding sites, investigation of DNA methylation, and investigation of histone modifications in animals and plants. Various methods have been proposed in the literature for analyzing the ChIP-chip data, such as the sliding window methods, the hidden Markov model-based methods, and Bayesian methods. Although, due to the integrated consideration of uncertainty of the models and model parameters, Bayesian methods can potentially work better than the other two classes of methods, the existing Bayesian methods do not perform satisfactorily. They usually require multiple replicates or some extra experimental information to parametrize the model, and long CPU time due to involving of MCMC simulations. Results In this paper, we propose a Bayesian latent model for the ChIP-chip data. The new model mainly differs from the existing Bayesian models, such as the joint deconvolution model, the hierarchical gamma mixture model, and the Bayesian hierarchical model, in two respects. Firstly, it works on the difference between the averaged treatment and control samples. This enables the use of a simple model for the data, which avoids the probe-specific effect and the sample (control/treatment effect. As a consequence, this enables an efficient MCMC simulation of the posterior distribution of the model, and also makes the model more robust to the outliers. Secondly, it models the neighboring dependence of probes by introducing a latent indicator vector. A truncated Poisson prior distribution is assumed for the latent indicator variable, with the rationale being justified at length. Conclusion The Bayesian latent method is successfully applied to real and ten simulated datasets, with comparisons with some of the existing Bayesian methods, hidden Markov model methods, and sliding window methods. The numerical results
Kirstein, Roland
2005-01-01
This paper presents a modification of the inspection game: The ?Bayesian Monitoring? model rests on the assumption that judges are interested in enforcing compliant behavior and making correct decisions. They may base their judgements on an informative but imperfect signal which can be generated costlessly. In the original inspection game, monitoring is costly and generates a perfectly informative signal. While the inspection game has only one mixed strategy equilibrium, three Perfect Bayesia...
Macroscopic Models of Clique Tree Growth for Bayesian Networks
National Aeronautics and Space Administration — In clique tree clustering, inference consists of propagation in a clique tree compiled from a Bayesian network. In this paper, we develop an analytical approach to...
Bayesian phylogeography finds its roots.
Directory of Open Access Journals (Sweden)
Philippe Lemey
2009-09-01
Full Text Available As a key factor in endemic and epidemic dynamics, the geographical distribution of viruses has been frequently interpreted in the light of their genetic histories. Unfortunately, inference of historical dispersal or migration patterns of viruses has mainly been restricted to model-free heuristic approaches that provide little insight into the temporal setting of the spatial dynamics. The introduction of probabilistic models of evolution, however, offers unique opportunities to engage in this statistical endeavor. Here we introduce a Bayesian framework for inference, visualization and hypothesis testing of phylogeographic history. By implementing character mapping in a Bayesian software that samples time-scaled phylogenies, we enable the reconstruction of timed viral dispersal patterns while accommodating phylogenetic uncertainty. Standard Markov model inference is extended with a stochastic search variable selection procedure that identifies the parsimonious descriptions of the diffusion process. In addition, we propose priors that can incorporate geographical sampling distributions or characterize alternative hypotheses about the spatial dynamics. To visualize the spatial and temporal information, we summarize inferences using virtual globe software. We describe how Bayesian phylogeography compares with previous parsimony analysis in the investigation of the influenza A H5N1 origin and H5N1 epidemiological linkage among sampling localities. Analysis of rabies in West African dog populations reveals how virus diffusion may enable endemic maintenance through continuous epidemic cycles. From these analyses, we conclude that our phylogeographic framework will make an important asset in molecular epidemiology that can be easily generalized to infer biogeogeography from genetic data for many organisms.
Laloy, Eric; Linde, Niklas; Jacques, Diederik; Vrugt, Jasper A.
2015-06-01
We present a Bayesian inversion method for the joint inference of high-dimensional multi-Gaussian hydraulic conductivity fields and associated geostatistical parameters from indirect hydrological data. We combine Gaussian process generation via circulant embedding to decouple the variogram from grid cell specific values, with dimensionality reduction by interpolation to enable Markov chain Monte Carlo (MCMC) simulation. Using the Matérn variogram model, this formulation allows inferring the conductivity values simultaneously with the field smoothness (also called Matérn shape parameter) and other geostatistical parameters such as the mean, sill, integral scales and anisotropy direction(s) and ratio(s). The proposed dimensionality reduction method systematically honors the underlying variogram and is demonstrated to achieve better performance than the Karhunen-Loève expansion. We illustrate our inversion approach using synthetic (error corrupted) data from a tracer experiment in a fairly heterogeneous 10,000-dimensional 2-D conductivity field. A 40-times reduction of the size of the parameter space did not prevent the posterior simulations to appropriately fit the measurement data and the posterior parameter distributions to include the true geostatistical parameter values. Overall, the posterior field realizations covered a wide range of geostatistical models, questioning the common practice of assuming a fixed variogram prior to inference of the hydraulic conductivity values. Our method is shown to be more efficient than sequential Gibbs sampling (SGS) for the considered case study, particularly when implemented on a distributed computing cluster. It is also found to outperform the method of anchored distributions (MAD) for the same computational budget.
Efficient Nonparametric Bayesian Modelling with Sparse Gaussian Process Approximations
Seeger, Matthias; Lawrence, Neil; Herbrich, Ralf
2006-01-01
Sparse approximations to Bayesian inference for nonparametric Gaussian Process models scale linearly in the number of training points, allowing for the application of powerful kernel-based models to large datasets. We present a general framework based on the informative vector machine (IVM) (Lawrence et.al., 2002) and show how the complete Bayesian task of inference and learning of free hyperparameters can be performed in a practically efficient manner. Our framework allows for arbitrary like...
Algebraic methods for evaluating integrals In Bayesian statistics
Lin, Shaowei
2011-01-01
The accurate evaluation of marginal likelihood integrals is a difficult fundamental problem in Bayesian inference that has important applications in machine learning and computational biology. Following the recent success of algebraic statistics in frequentist inference and inspired by Watanabe's foundational approach to singular learning theory, the goal of this dissertation is to study algebraic, geometric and combinatorial methods for computing Bayesian integrals effectively, and to explor...
Strategies for Generating Micro Explanations for Bayesian Belief Networks
Sember, Peter; Zukerman, Ingrid
2013-01-01
Bayesian Belief Networks have been largely overlooked by Expert Systems practitioners on the grounds that they do not correspond to the human inference mechanism. In this paper, we introduce an explanation mechanism designed to generate intuitive yet probabilistically sound explanations of inferences drawn by a Bayesian Belief Network. In particular, our mechanism accounts for the results obtained due to changes in the causal and the evidential support of a node.
International Nuclear Information System (INIS)
Highlights: • A Bayesian method is used to infer the freeform (histogram) age distributions. • The value at each bin was estimated using MCMC approach. • The method was applied to a synthetic tracer dataset and two real datasets. • The cumulative age distribution is captured better than the non-cumulative. • Less uncertainty is obtained when smaller number of bins is used. - Abstract: Due to the mixing of groundwaters with different ages in aquifers, groundwater age is more appropriately represented by a distribution rather than a scalar number. To infer a groundwater age distribution from environmental tracers, a mathematical form is often assumed for the shape of the distribution and the parameters of the mathematical distribution are estimated using deterministic or stochastic inverse methods. The prescription of the mathematical form limits the exploration of the age distribution to the shapes that can be described by the selected distribution. In this paper, the use of freeform histograms as groundwater age distributions is evaluated. A Bayesian Markov Chain Monte Carlo approach is used to estimate the fraction of groundwater in each histogram bin. The method was able to capture the shape of a hypothetical gamma distribution from the concentrations of four age tracers. The number of bins that can be considered in this approach is limited based on the number of tracers available. The histogram method was also tested on tracer data sets from Holten (The Netherlands; 3H, 3He, 85Kr, 39Ar) and the La Selva Biological Station (Costa-Rica; SF6, CFCs, 3H, 4He and 14C), and compared to a number of mathematical forms. According to standard Bayesian measures of model goodness, the best mathematical distribution performs better than the histogram distributions in terms of the ability to capture the observed tracer data relative to their complexity. Among the histogram distributions, the four bin histogram performs better in most of the cases. The Monte Carlo
Towards Bayesian Deep Learning: A Survey
Wang, Hao; Yeung, Dit-Yan
2016-01-01
While perception tasks such as visual object recognition and text understanding play an important role in human intelligence, the subsequent tasks that involve inference, reasoning and planning require an even higher level of intelligence. The past few years have seen major advances in many perception tasks using deep learning models. For higher-level inference, however, probabilistic graphical models with their Bayesian nature are still more powerful and flexible. To achieve integrated intel...
Pitman Yor Diffusion Trees for Bayesian Hierarchical Clustering.
Knowles, David A; Ghahramani, Zoubin
2015-02-01
In this paper we introduce the Pitman Yor Diffusion Tree (PYDT), a Bayesian non-parametric prior over tree structures which generalises the Dirichlet Diffusion Tree [30] and removes the restriction to binary branching structure. The generative process is described and shown to result in an exchangeable distribution over data points. We prove some theoretical properties of the model including showing its construction as the continuum limit of a nested Chinese restaurant process model. We then present two alternative MCMC samplers which allow us to model uncertainty over tree structures, and a computationally efficient greedy Bayesian EM search algorithm. Both algorithms use message passing on the tree structure. The utility of the model and algorithms is demonstrated on synthetic and real world data, both continuous and binary. PMID:26353241
Bessiere, Pierre; Ahuactzin, Juan Manuel; Mekhnacha, Kamel
2013-01-01
Probability as an Alternative to Boolean LogicWhile logic is the mathematical foundation of rational reasoning and the fundamental principle of computing, it is restricted to problems where information is both complete and certain. However, many real-world problems, from financial investments to email filtering, are incomplete or uncertain in nature. Probability theory and Bayesian computing together provide an alternative framework to deal with incomplete and uncertain data. Decision-Making Tools and Methods for Incomplete and Uncertain DataEmphasizing probability as an alternative to Boolean
On the use of pseudo-likelihoods in Bayesian variable selection.
Racugno, Walter; Salvan, Alessandra; Ventura, Laura
2005-01-01
In the presence of nuisance parameters, we discuss a one-parameter Bayesian analysis based on a pseudo-likelihood assuming a default prior distribution for the parameter of interest only. Although this way to proceed cannot always be considered as orthodox in the Bayesian perspective, it is of interest to evaluate whether the use of suitable pseudo-likelihoods may be proposed for Bayesian inference. Attention is focused in the context of regression models, in particular on inference about a s...
Bayesian Averaging is Well-Temperated
DEFF Research Database (Denmark)
Hansen, Lars Kai
2000-01-01
Bayesian predictions are stochastic just like predictions of any other inference scheme that generalize from a finite sample. While a simple variational argument shows that Bayes averaging is generalization optimal given that the prior matches the teacher parameter distribution the situation is l...
Sensitivity to Sampling in Bayesian Word Learning
Xu, Fei; Tenenbaum, Joshua B.
2007-01-01
We report a new study testing our proposal that word learning may be best explained as an approximate form of Bayesian inference (Xu & Tenenbaum, in press). Children are capable of learning word meanings across a wide range of communicative contexts. In different contexts, learners may encounter different sampling processes generating the examples…
Bayesian Uncertainty Quantification for Subsurface Inversion Using a Multiscale Hierarchical Model
Mondal, Anirban
2014-07-03
We consider a Bayesian approach to nonlinear inverse problems in which the unknown quantity is a random field (spatial or temporal). The Bayesian approach contains a natural mechanism for regularization in the form of prior information, can incorporate information from heterogeneous sources and provide a quantitative assessment of uncertainty in the inverse solution. The Bayesian setting casts the inverse solution as a posterior probability distribution over the model parameters. The Karhunen-Loeve expansion is used for dimension reduction of the random field. Furthermore, we use a hierarchical Bayes model to inject multiscale data in the modeling framework. In this Bayesian framework, we show that this inverse problem is well-posed by proving that the posterior measure is Lipschitz continuous with respect to the data in total variation norm. Computational challenges in this construction arise from the need for repeated evaluations of the forward model (e.g., in the context of MCMC) and are compounded by high dimensionality of the posterior. We develop two-stage reversible jump MCMC that has the ability to screen the bad proposals in the first inexpensive stage. Numerical results are presented by analyzing simulated as well as real data from hydrocarbon reservoir. This article has supplementary material available online. © 2014 American Statistical Association and the American Society for Quality.
BAYESIAN IMAGE RESTORATION, USING CONFIGURATIONS
Directory of Open Access Journals (Sweden)
Thordis Linda Thorarinsdottir
2011-05-01
Full Text Available In this paper, we develop a Bayesian procedure for removing noise from images that can be viewed as noisy realisations of random sets in the plane. The procedure utilises recent advances in configuration theory for noise free random sets, where the probabilities of observing the different boundary configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for salt and pepper noise. The inference in the model is discussed in detail for 3 X 3 and 5 X 5 configurations and examples of the performance of the procedure are given.
Bayesian Seismology of the Sun
Gruberbauer, Michael
2013-01-01
We perform a Bayesian grid-based analysis of the solar l=0,1,2 and 3 p modes obtained via BiSON in order to deliver the first Bayesian asteroseismic analysis of the solar composition problem. We do not find decisive evidence to prefer either of the contending chemical compositions, although the revised solar abundances (AGSS09) are more probable in general. We do find indications for systematic problems in standard stellar evolution models, unrelated to the consequences of inadequate modelling of the outer layers on the higher-order modes. The seismic observables are best fit by solar models that are several hundred million years older than the meteoritic age of the Sun. Similarly, meteoritic age calibrated models do not adequately reproduce the observed seismic observables. Our results suggest that these problems will affect any asteroseismic inference that relies on a calibration to the Sun.
Bayesian priors for transiting planets
Kipping, David M
2016-01-01
As astronomers push towards discovering ever-smaller transiting planets, it is increasingly common to deal with low signal-to-noise ratio (SNR) events, where the choice of priors plays an influential role in Bayesian inference. In the analysis of exoplanet data, the selection of priors is often treated as a nuisance, with observers typically defaulting to uninformative distributions. Such treatments miss a key strength of the Bayesian framework, especially in the low SNR regime, where even weak a priori information is valuable. When estimating the parameters of a low-SNR transit, two key pieces of information are known: (i) the planet has the correct geometric alignment to transit and (ii) the transit event exhibits sufficient signal-to-noise to have been detected. These represent two forms of observational bias. Accordingly, when fitting transits, the model parameter priors should not follow the intrinsic distributions of said terms, but rather those of both the intrinsic distributions and the observational ...
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
2000-01-01
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...
Bayesian Modelling of fMRI Time Series
DEFF Research Database (Denmark)
Højen-Sørensen, Pedro; Hansen, Lars Kai; Rasmussen, Carl Edward
We present a Hidden Markov Model (HMM) for inferring the hidden psychological state (or neural activity) during single trial fMRI activation experiments with blocked task paradigms. Inference is based on Bayesian methodology, using a combination of analytical and a variety of Markov Chain Monte...
Directory of Open Access Journals (Sweden)
Abd El-Maseh, M. P
2015-11-01
Full Text Available In this paper, the Bayesian estimation for the unknown parameters for the bivariate generalized exponential (BVGE distribution under Bivariate censoring type-I samples with constant stress accelerated life testing (CSALT are discussed. The scale parameter of the lifetime distribution at constant stress levels is assumed to be an inverse power law function of the stress level. The parameters are estimated by Bayesian approach using Markov Chain Monte Carlo (MCMC method based on Gibbs sampling. Then, the numerical studies are introduced to illustrate the approach study using samples which have been generated from the BVGE distribution.
Bayesian Smoothing with Gaussian Processes Using Fourier Basis Functions in the spectralGP Package
Directory of Open Access Journals (Sweden)
Christopher J. Paciorek
2007-04-01
Full Text Available The spectral representation of stationary Gaussian processes via the Fourier basis provides a computationally efficient specification of spatial surfaces and nonparametric regression functions for use in various statistical models. I describe the representation in detail and introduce the spectralGP package in R for computations. Because of the large number of basis coefficients, some form of shrinkage is necessary; I focus on a natural Bayesian approach via a particular parameterized prior structure that approximates stationary Gaussian processes on a regular grid. I review several models from the literature for data that do not lie on a grid, suggest a simple model modification, and provide example code demonstrating MCMC sampling using the spectralGP package. I describe reasons that mixing can be slow in certain situations and provide some suggestions for MCMC techniques to improve mixing, also with example code, and some general recommendations grounded in experience.
Bayesian analysis for exponential random graph models using the adaptive exchange sampler
Jin, Ick Hoon
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the existence of intractable normalizing constants. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the issue of intractable normalizing constants encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency.
Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler.
Jin, Ick Hoon; Yuan, Ying; Liang, Faming
2013-10-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulations. The adaptive exchange sampler can be viewed as a MCMC extension of the exchange algorithm, and it generates auxiliary networks via an importance sampling procedure from an auxiliary Markov chain running in parallel. The convergence of this algorithm is established under mild conditions. The adaptive exchange sampler is illustrated using a few social networks, including the Florentine business network, molecule synthetic network, and dolphins network. The results indicate that the adaptive exchange algorithm can produce more accurate estimates than approximate exchange algorithms, while maintaining the same computational efficiency. PMID:24653788
Bayesian model selection without evidences: application to the dark energy equation-of-state
Hee, Sonke; Hobson, Mike P; Lasenby, Anthony N
2015-01-01
A method is presented for Bayesian model selection without explicitly computing evidences, by using a combined likelihood and introducing an integer model selection parameter $n$ so that Bayes factors, or more generally posterior odds ratios, may be read off directly from the posterior of $n$. If the total number of models under consideration is specified a priori, the full joint parameter space $(\\theta, n)$ of the models is of fixed dimensionality and can be explored using standard MCMC or nested sampling methods, without the need for reversible jump MCMC techniques. The posterior on $n$ is then obtained by straightforward marginalisation. We demonstrate the efficacy of our approach by application to several toy models. We then apply it to constraining the dark energy equation-of-state using a free-form reconstruction technique. We show that $\\Lambda$CDM is significantly favoured over all extensions, including the simple $w(z){=}{\\rm constant}$ model.
Approximate Bayesian Computation in Large Scale Structure: constraining the galaxy-halo connection
Hahn, ChangHoon; Vakili, Mohammadjavad; Walsh, Kilian; Hearin, Andrew P.; Hogg, David W.; Cambpell, Duncan
2016-01-01
The standard approaches to Bayesian parameter inference in large scale structure (LSS) assume a Gaussian functional form (chi-squared form) for the likelihood. They are also typically restricted to measurements such as the two point correlation function. Likelihood free inferences such as Approximate Bayesian Computation (ABC) make inference possible without assuming any functional form for the likelihood, thereby relaxing the assumptions and restrictions of the standard approach. Instead it ...
A general Bayes weibull inference model for accelerated life testing
International Nuclear Information System (INIS)
This article presents the development of a general Bayes inference model for accelerated life testing. The failure times at a constant stress level are assumed to belong to a Weibull distribution, but the specification of strict adherence to a parametric time-transformation function is not required. Rather, prior information is used to indirectly define a multivariate prior distribution for the scale parameters at the various stress levels and the common shape parameter. Using the approach, Bayes point estimates as well as probability statements for use-stress (and accelerated) life parameters may be inferred from a host of testing scenarios. The inference procedure accommodates both the interval data sampling strategy and type I censored sampling strategy for the collection of ALT test data. The inference procedure uses the well-known MCMC (Markov Chain Monte Carlo) methods to derive posterior approximations. The approach is illustrated with an example
Bayesian Query-Focused Summarization
Daumé, Hal
2009-01-01
We present BayeSum (for ``Bayesian summarization''), a model for sentence extraction in query-focused summarization. BayeSum leverages the common case in which multiple documents are relevant to a single query. Using these documents as reinforcement for query terms, BayeSum is not afflicted by the paucity of information in short queries. We show that approximate inference in BayeSum is possible on large data sets and results in a state-of-the-art summarization system. Furthermore, we show how BayeSum can be understood as a justified query expansion technique in the language modeling for IR framework.
A New Approach to Probabilistic Programming Inference
Wood, Frank; van de Meent, Jan Willem; Mansinghka, Vikash
2015-01-01
We introduce and demonstrate a new approach to inference in expressive probabilistic programming languages based on particle Markov chain Monte Carlo. Our approach is simple to implement and easy to parallelize. It applies to Turing-complete probabilistic programming languages and supports accurate inference in models that make use of complex control flow, including stochastic recursion. It also includes primitives from Bayesian nonparametric statistics. Our experiments show that this approac...
Bayesian tomographic reconstruction of microsystems
Salem, Sofia Fekih; Vabre, Alexandre; Mohammad-Djafari, Ali
2007-11-01
The microtomography by X ray transmission plays an increasingly dominating role in the study and the understanding of microsystems. Within this framework, an experimental setup of high resolution X ray microtomography was developed at CEA-List to quantify the physical parameters related to the fluids flow in microsystems. Several difficulties rise from the nature of experimental data collected on this setup: enhanced error measurements due to various physical phenomena occurring during the image formation (diffusion, beam hardening), and specificities of the setup (limited angle, partial view of the object, weak contrast). To reconstruct the object we must solve an inverse problem. This inverse problem is known to be ill-posed. It therefore needs to be regularized by introducing prior information. The main prior information we account for is that the object is composed of a finite known number of different materials distributed in compact regions. This a priori information is introduced via a Gauss-Markov field for the contrast distributions with a hidden Potts-Markov field for the class materials in the Bayesian estimation framework. The computations are done by using an appropriate Markov Chain Monte Carlo (MCMC) technique. In this paper, we present first the basic steps of the proposed algorithms. Then we focus on one of the main steps in any iterative reconstruction method which is the computation of forward and adjoint operators (projection and backprojection). A fast implementation of these two operators is crucial for the real application of the method. We give some details on the fast computation of these steps and show some preliminary results of simulations.