Soliman, Ahmed A.; Al Sobhi, Mashail M.
2015-02-01
This article deals with the problem of estimating parameters of the Gompertz distribution (GD) based on progressive first-failure censored data using Bayesian and non-Bayesian approaches. The two-sample prediction problem is considered to derive Bayesian prediction bounds for both future order statistics and future record values based on progressive first failure censored informative samples from GD. The sampling schemes such as, first-failure censoring, progressive type II censoring, type II censoring and complete sample can be obtained as special cases of the progressive first-failure censored scheme. Markov chain Monte Carlo (MCMC) method with Gibbs sampling procedure is used to compute the Bayes estimates and also to construct the corresponding credible intervals of the parameters. A simulation study has been conducted in order to compare the proposed Bayes estimators with the maximum likelihood estimators MLE. Finally, some numerical computations with real data set are presented for illustrating all the proposed inferential procedures.
Directory of Open Access Journals (Sweden)
Allen Rodrigo
2006-01-01
Full Text Available Using the structured serial coalescent with Bayesian MCMC and serial samples, we estimate population size when some demes are not sampled or are hidden, ie ghost demes. It is found that even with the presence of a ghost deme, accurate inference was possible if the parameters are estimated with the true model. However with an incorrect model, estimates were biased and can be positively misleading. We extend these results to the case where there are sequences from the ghost at the last time sample. This case can arise in HIV patients, when some tissue samples and viral sequences only become available after death. When some sequences from the ghost deme are available at the last sampling time, estimation bias is reduced and accurate estimation of parameters associated with the ghost deme is possible despite sampling bias. Migration rates for this case are also shown to be good estimates when migration values are low.
Bayesian conformational analysis of ring molecules through reversible jump MCMC
DEFF Research Database (Denmark)
Nolsøe, Kim; Kessler, Mathieu; Pérez, José;
2005-01-01
In this paper we address the problem of classifying the conformations of mmembered rings using experimental observations obtained by crystal structure analysis. We formulate a model for the data generation mechanism that consists in a multidimensional mixture model. We perform inference for the p...... for the proportions and the components in a Bayesian framework, implementing an MCMC Reversible Jumps Algorithm to obtain samples of the posterior distributions. The method is illustrated on a simulated data set and on real data corresponding to cyclo-octane structures....
Measuring the reliability of MCMC inference with bidirectional Monte Carlo
Grosse, Roger B.; Ancha, Siddharth; Roy, Daniel M.
2016-01-01
Markov chain Monte Carlo (MCMC) is one of the main workhorses of probabilistic inference, but it is notoriously hard to measure the quality of approximate posterior samples. This challenge is particularly salient in black box inference methods, which can hide details and obscure inference failures. In this work, we extend the recently introduced bidirectional Monte Carlo technique to evaluate MCMC-based posterior inference algorithms. By running annealed importance sampling (AIS) chains both ...
boa: An R Package for MCMC Output Convergence Assessment and Posterior Inference
Directory of Open Access Journals (Sweden)
Brian J. Smith
2007-10-01
Full Text Available Markov chain Monte Carlo (MCMC is the most widely used method of estimating joint posterior distributions in Bayesian analysis. The idea of MCMC is to iteratively produce parameter values that are representative samples from the joint posterior. Unlike frequentist analysis where iterative model fitting routines are monitored for convergence to a single point, MCMC output is monitored for convergence to a distribution. Thus, specialized diagnostic tools are needed in the Bayesian setting. To this end, the R package boa was created. This manuscript presents the user's manual for boa, which outlines the use of and methodology upon which the software is based. Included is a description of the menu system, data management capabilities, and statistical/graphical methods for convergence assessment and posterior inference. Throughout the manual, a linear regression example is used to illustrate the software.
Bayesian inference for OPC modeling
Burbine, Andrew; Sturtevant, John; Fryer, David; Smith, Bruce W.
2016-03-01
The use of optical proximity correction (OPC) demands increasingly accurate models of the photolithographic process. Model building and inference techniques in the data science community have seen great strides in the past two decades which make better use of available information. This paper aims to demonstrate the predictive power of Bayesian inference as a method for parameter selection in lithographic models by quantifying the uncertainty associated with model inputs and wafer data. Specifically, the method combines the model builder's prior information about each modelling assumption with the maximization of each observation's likelihood as a Student's t-distributed random variable. Through the use of a Markov chain Monte Carlo (MCMC) algorithm, a model's parameter space is explored to find the most credible parameter values. During parameter exploration, the parameters' posterior distributions are generated by applying Bayes' rule, using a likelihood function and the a priori knowledge supplied. The MCMC algorithm used, an affine invariant ensemble sampler (AIES), is implemented by initializing many walkers which semiindependently explore the space. The convergence of these walkers to global maxima of the likelihood volume determine the parameter values' highest density intervals (HDI) to reveal champion models. We show that this method of parameter selection provides insights into the data that traditional methods do not and outline continued experiments to vet the method.
Frühwirth-Schnatter, Sylvia
1990-01-01
In the paper at hand we apply it to Bayesian statistics to obtain "Fuzzy Bayesian Inference". In the subsequent sections we will discuss a fuzzy valued likelihood function, Bayes' theorem for both fuzzy data and fuzzy priors, a fuzzy Bayes' estimator, fuzzy predictive densities and distributions, and fuzzy H.P.D .-Regions. (author's abstract)
Bayesian inference and Markov chain Monte Carlo in imaging
Higdon, David M.; Bowsher, James E.
1999-05-01
Over the past 20 years, many problems in Bayesian inference that were previously intractable can now be fairly routinely dealt with using a computationally intensive technique for exploring the posterior distribution called Markov chain Monte Carlo (MCMC). Primarily because of insufficient computing capabilities, most MCMC applications have been limited to rather standard statistical models. However, with the computing power of modern workstations, a fully Bayesian approach with MCMC, is now possible for many imaging applications. Such an approach can be quite useful because it leads not only to `point' estimates of an underlying image or emission source, but it also gives a means for quantifying uncertainties regarding the image. This paper gives an overview of Bayesian image analysis and focuses on applications relevant to medical imaging. Particular focus is on prior image models and outlining MCMC methods for these models.
Computationally efficient Bayesian inference for inverse problems.
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef M.; Najm, Habib N.; Rahn, Larry A.
2007-10-01
Bayesian statistics provides a foundation for inference from noisy and incomplete data, a natural mechanism for regularization in the form of prior information, and a quantitative assessment of uncertainty in the inferred results. Inverse problems - representing indirect estimation of model parameters, inputs, or structural components - can be fruitfully cast in this framework. Complex and computationally intensive forward models arising in physical applications, however, can render a Bayesian approach prohibitive. This difficulty is compounded by high-dimensional model spaces, as when the unknown is a spatiotemporal field. We present new algorithmic developments for Bayesian inference in this context, showing strong connections with the forward propagation of uncertainty. In particular, we introduce a stochastic spectral formulation that dramatically accelerates the Bayesian solution of inverse problems via rapid evaluation of a surrogate posterior. We also explore dimensionality reduction for the inference of spatiotemporal fields, using truncated spectral representations of Gaussian process priors. These new approaches are demonstrated on scalar transport problems arising in contaminant source inversion and in the inference of inhomogeneous material or transport properties. We also present a Bayesian framework for parameter estimation in stochastic models, where intrinsic stochasticity may be intermingled with observational noise. Evaluation of a likelihood function may not be analytically tractable in these cases, and thus several alternative Markov chain Monte Carlo (MCMC) schemes, operating on the product space of the observations and the parameters, are introduced.
Computational statistics using the bBayesian Inference Engine
Weinberg, Martin D
2012-01-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel-optimised software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organise and reuse expensive derived data. I describe key concepts that illustrate the power of Bayesian inference to address these needs and outline the computational challenge. The techniques presented are based on experience gained in modelling star-counts and stellar populations, analysing the morphology of galaxy images, and performing Bayesian investigations of semi-analytic models of galaxy formation. These inference problems require advanced Markov chain Monte Carlo (MCMC) algorithms that expedite sampling, mixing, and the analysis of the Bayesian posterior distribution. The BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. By providing a variety of statistical algorithms for all phases of the inference problem, a u...
Improving the structure MCMC sampler for Bayesian networks by introducing a new edge reversal move
Grzegorczyk, Marco; Husmeier, Dirk
2008-01-01
Applications of Bayesian networks in systems biology are computationally demanding due to the large number of model parameters. Conventional MCMC schemes based on proposal moves in structure space tend to be too slow in mixing and convergence, and have recently been superseded by proposal moves in t
BigFoot: Bayesian alignment and phylogenetic footprinting with MCMC
Directory of Open Access Journals (Sweden)
Miklós István
2009-08-01
Full Text Available Abstract Background We have previously combined statistical alignment and phylogenetic footprinting to detect conserved functional elements without assuming a fixed alignment. Considering a probability-weighted distribution of alignments removes sensitivity to alignment errors, properly accommodates regions of alignment uncertainty, and increases the accuracy of functional element prediction. Our method utilized standard dynamic programming hidden markov model algorithms to analyze up to four sequences. Results We present a novel approach, implemented in the software package BigFoot, for performing phylogenetic footprinting on greater numbers of sequences. We have developed a Markov chain Monte Carlo (MCMC approach which samples both sequence alignments and locations of slowly evolving regions. We implement our method as an extension of the existing StatAlign software package and test it on well-annotated regions controlling the expression of the even-skipped gene in Drosophila and the α-globin gene in vertebrates. The results exhibit how adding additional sequences to the analysis has the potential to improve the accuracy of functional predictions, and demonstrate how BigFoot outperforms existing alignment-based phylogenetic footprinting techniques. Conclusion BigFoot extends a combined alignment and phylogenetic footprinting approach to analyze larger amounts of sequence data using MCMC. Our approach is robust to alignment error and uncertainty and can be applied to a variety of biological datasets. The source code and documentation are publicly available for download from http://www.stats.ox.ac.uk/~satija/BigFoot/
Inference algorithms and learning theory for Bayesian sparse factor analysis
International Nuclear Information System (INIS)
Bayesian sparse factor analysis has many applications; for example, it has been applied to the problem of inferring a sparse regulatory network from gene expression data. We describe a number of inference algorithms for Bayesian sparse factor analysis using a slab and spike mixture prior. These include well-established Markov chain Monte Carlo (MCMC) and variational Bayes (VB) algorithms as well as a novel hybrid of VB and Expectation Propagation (EP). For the case of a single latent factor we derive a theory for learning performance using the replica method. We compare the MCMC and VB/EP algorithm results with simulated data to the theoretical prediction. The results for MCMC agree closely with the theory as expected. Results for VB/EP are slightly sub-optimal but show that the new algorithm is effective for sparse inference. In large-scale problems MCMC is infeasible due to computational limitations and the VB/EP algorithm then provides a very useful computationally efficient alternative.
Murakami, Yohei; Takada, Shoji
2013-01-01
When model parameters in systems biology are not available from experiments, they need to be inferred so that the resulting simulation reproduces the experimentally known phenomena. For the purpose, Bayesian statistics with Markov chain Monte Carlo (MCMC) is a useful method. Conventional MCMC needs likelihood to evaluate a posterior distribution of acceptable parameters, while the approximate Bayesian computation (ABC) MCMC evaluates posterior distribution with use of qualitative fitness measure. However, none of these algorithms can deal with mixture of quantitative, i.e., likelihood, and qualitative fitness measures simultaneously. Here, to deal with this mixture, we formulated Bayesian formula for hybrid fitness measures (HFM). Then we implemented it to MCMC (MCMC-HFM). We tested MCMC-HFM first for a kinetic toy model with a positive feedback. Inferring kinetic parameters mainly related to the positive feedback, we found that MCMC-HFM reliably infer them using both qualitative and quantitative fitness measures. Then, we applied the MCMC-HFM to an apoptosis signal transduction network previously proposed. For kinetic parameters related to implicit positive feedbacks, which are important for bistability and irreversibility of the output, the MCMC-HFM reliably inferred these kinetic parameters. In particular, some kinetic parameters that have experimental estimates were inferred without using these data and the results were consistent with experiments. Moreover, for some parameters, the mixed use of quantitative and qualitative fitness measures narrowed down the acceptable range of parameters.
Inference in hybrid Bayesian networks
DEFF Research Database (Denmark)
Lanseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael;
2009-01-01
Since the 1980s, Bayesian Networks (BNs) have become increasingly popular for building statistical models of complex systems. This is particularly true for boolean systems, where BNs often prove to be a more efficient modelling framework than traditional reliability-techniques (like fault trees...... decade's research on inference in hybrid Bayesian networks. The discussions are linked to an example model for estimating human reliability....
Partial Order MCMC for Structure Discovery in Bayesian Networks
Niinimaki, Teppo; Koivisto, Mikko
2012-01-01
We present a new Markov chain Monte Carlo method for estimating posterior probabilities of structural features in Bayesian networks. The method draws samples from the posterior distribution of partial orders on the nodes; for each sampled partial order, the conditional probabilities of interest are computed exactly. We give both analytical and empirical results that suggest the superiority of the new method compared to previous methods, which sample either directed acyclic graphs or linear orders on the nodes.
Perception, illusions and Bayesian inference.
Nour, Matthew M; Nour, Joseph M
2015-01-01
Descriptive psychopathology makes a distinction between veridical perception and illusory perception. In both cases a perception is tied to a sensory stimulus, but in illusions the perception is of a false object. This article re-examines this distinction in light of new work in theoretical and computational neurobiology, which views all perception as a form of Bayesian statistical inference that combines sensory signals with prior expectations. Bayesian perceptual inference can solve the 'inverse optics' problem of veridical perception and provides a biologically plausible account of a number of illusory phenomena, suggesting that veridical and illusory perceptions are generated by precisely the same inferential mechanisms.
Bayesian inference tools for inverse problems
Mohammad-Djafari, Ali
2013-08-01
In this paper, first the basics of Bayesian inference with a parametric model of the data is presented. Then, the needed extensions are given when dealing with inverse problems and in particular the linear models such as Deconvolution or image reconstruction in Computed Tomography (CT). The main point to discuss then is the prior modeling of signals and images. A classification of these priors is presented, first in separable and Markovien models and then in simple or hierarchical with hidden variables. For practical applications, we need also to consider the estimation of the hyper parameters. Finally, we see that we have to infer simultaneously on the unknowns, the hidden variables and the hyper parameters. Very often, the expression of this joint posterior law is too complex to be handled directly. Indeed, rarely we can obtain analytical solutions to any point estimators such the Maximum A posteriori (MAP) or Posterior Mean (PM). Three main tools are then can be used: Laplace approximation (LAP), Markov Chain Monte Carlo (MCMC) and Bayesian Variational Approximations (BVA). To illustrate all these aspects, we will consider a deconvolution problem where we know that the input signal is sparse and propose to use a Student-t prior for that. Then, to handle the Bayesian computations with this model, we use the property of Student-t which is modelling it via an infinite mixture of Gaussians, introducing thus hidden variables which are the variances. Then, the expression of the joint posterior of the input signal samples, the hidden variables (which are here the inverse variances of those samples) and the hyper-parameters of the problem (for example the variance of the noise) is given. From this point, we will present the joint maximization by alternate optimization and the three possible approximation methods. Finally, the proposed methodology is applied in different applications such as mass spectrometry, spectrum estimation of quasi periodic biological signals and
Probability biases as Bayesian inference
Directory of Open Access Journals (Sweden)
Andre; C. R. Martins
2006-11-01
Full Text Available In this article, I will show how several observed biases in human probabilistic reasoning can be partially explained as good heuristics for making inferences in an environment where probabilities have uncertainties associated to them. Previous results show that the weight functions and the observed violations of coalescing and stochastic dominance can be understood from a Bayesian point of view. We will review those results and see that Bayesian methods should also be used as part of the explanation behind other known biases. That means that, although the observed errors are still errors under the be understood as adaptations to the solution of real life problems. Heuristics that allow fast evaluations and mimic a Bayesian inference would be an evolutionary advantage, since they would give us an efficient way of making decisions. %XX In that sense, it should be no surprise that humans reason with % probability as it has been observed.
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...
Bayesian inference for Hawkes processes
DEFF Research Database (Denmark)
Rasmussen, Jakob Gulddahl
2013-01-01
The Hawkes process is a practically and theoretically important class of point processes, but parameter-estimation for such a process can pose various problems. In this paper we explore and compare two approaches to Bayesian inference. The first approach is based on the so-called conditional...
Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
S. Ahn; A. Korattikara; N. Liu; S. Rajan; M. Welling
2015-01-01
Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid ovrfitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian
Bayesian inference on proportional elections.
Brunello, Gabriel Hideki Vatanabe; Nakano, Eduardo Yoshio
2015-01-01
Polls for majoritarian voting systems usually show estimates of the percentage of votes for each candidate. However, proportional vote systems do not necessarily guarantee the candidate with the most percentage of votes will be elected. Thus, traditional methods used in majoritarian elections cannot be applied on proportional elections. In this context, the purpose of this paper was to perform a Bayesian inference on proportional elections considering the Brazilian system of seats distribution. More specifically, a methodology to answer the probability that a given party will have representation on the chamber of deputies was developed. Inferences were made on a Bayesian scenario using the Monte Carlo simulation technique, and the developed methodology was applied on data from the Brazilian elections for Members of the Legislative Assembly and Federal Chamber of Deputies in 2010. A performance rate was also presented to evaluate the efficiency of the methodology. Calculations and simulations were carried out using the free R statistical software. PMID:25786259
Bayesian Inference for Radio Observations
Lochner, Michelle; Zwart, Jonathan T L; Smirnov, Oleg; Bassett, Bruce A; Oozeer, Nadeem; Kunz, Martin
2015-01-01
(Abridged) New telescopes like the Square Kilometre Array (SKA) will push into a new sensitivity regime and expose systematics, such as direction-dependent effects, that could previously be ignored. Current methods for handling such systematics rely on alternating best estimates of instrumental calibration and models of the underlying sky, which can lead to inaccurate uncertainty estimates and biased results because such methods ignore any correlations between parameters. These deconvolution algorithms produce a single image that is assumed to be a true representation of the sky, when in fact it is just one realisation of an infinite ensemble of images compatible with the noise in the data. In contrast, here we report a Bayesian formalism that simultaneously infers both systematics and science. Our technique, Bayesian Inference for Radio Observations (BIRO), determines all parameters directly from the raw data, bypassing image-making entirely, by sampling from the joint posterior probability distribution. Thi...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Chavira, Mark; Darwiche, Adnan
2004-01-01
We describe a system for exact inference with relational Bayesian networks as defined in the publicly available \\primula\\ tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference by evaluating...... and differentiating these circuits in time linear in their size. We report on experimental results showing the successful compilation, and efficient inference, on relational Bayesian networks whose {\\primula}--generated propositional instances have thousands of variables, and whose jointrees have clusters...
Nonparametric Bayesian inference in biostatistics
Müller, Peter
2015-01-01
As chapters in this book demonstrate, BNP has important uses in clinical sciences and inference for issues like unknown partitions in genomics. Nonparametric Bayesian approaches (BNP) play an ever expanding role in biostatistical inference from use in proteomics to clinical trials. Many research problems involve an abundance of data and require flexible and complex probability models beyond the traditional parametric approaches. As this book's expert contributors show, BNP approaches can be the answer. Survival Analysis, in particular survival regression, has traditionally used BNP, but BNP's potential is now very broad. This applies to important tasks like arrangement of patients into clinically meaningful subpopulations and segmenting the genome into functionally distinct regions. This book is designed to both review and introduce application areas for BNP. While existing books provide theoretical foundations, this book connects theory to practice through engaging examples and research questions. Chapters c...
New inference strategies for solving Markov Decision Processes using reversible jump MCMC
Hoffman, Matthias; de Freitas, Nando; Doucet, Arnaud
2012-01-01
In this paper we build on previous work which uses inferences techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.
Fast MCMC sampling for Markov jump processes and continuous time Bayesian networks
Rao, Vinayak
2012-01-01
Markov jump processes and continuous time Bayesian networks are important classes of continuous time dynamical systems. In this paper, we tackle the problem of inferring unobserved paths in these models by introducing a fast auxiliary variable Gibbs sampler. Our approach is based on the idea of uniformization, and sets up a Markov chain over paths by sampling a finite set of virtual jump times and then running a standard hidden Markov model forward filtering-backward sampling algorithm over states at the set of extant and virtual jump times. We demonstrate significant computational benefits over a state-of-the-art Gibbs sampler on a number of continuous time Bayesian networks.
Parallel local approximation MCMC for expensive models
Conrad, Patrick; Davis, Andrew; Marzouk, Youssef; Pillai, Natesh; Smith, Aaron
2016-01-01
Performing Bayesian inference via Markov chain Monte Carlo (MCMC) can be exceedingly expensive when posterior evaluations invoke the evaluation of a computationally expensive model, such as a system of partial differential equations. In recent work [Conrad et al. JASA 2015, arXiv:1402.1694] we described a framework for constructing and refining local approximations of such models during an MCMC simulation. These posterior--adapted approximations harness regularity of the model to reduce the c...
Compiling Relational Bayesian Networks for Exact Inference
DEFF Research Database (Denmark)
Jaeger, Manfred; Darwiche, Adnan; Chavira, Mark
2006-01-01
We describe in this paper a system for exact inference with relational Bayesian networks as defined in the publicly available PRIMULA tool. The system is based on compiling propositional instances of relational Bayesian networks into arithmetic circuits and then performing online inference...... by evaluating and differentiating these circuits in time linear in their size. We report on experimental results showing successful compilation and efficient inference on relational Bayesian networks, whose PRIMULA--generated propositional instances have thousands of variables, and whose jointrees have clusters...
An application of Bayesian inference for solar-like pulsators
Benomar, O.
2008-12-01
As the amount of data collected by space-borne asteroseismic instruments (such as CoRoT and Kepler) increases drastically, it will be useful to have automated processes to extract a maximum of information from these data. The use of a Bayesian approach could be very help- ful for this goal. Only a few attempts have been made in this way (e.g. Brewer et al. 2007). We propose to use Markov Chain Monte Carlo simulations (MCMC) with Metropolis-Hasting (MH) based algorithms to infer the main stellar oscillation parameters from the power spec- trum, in the case of solar-like pulsators. Given a number of modes to be fitted, the algorithm is able to give the best set of parameters (frequency, linewidth, amplitude, rotational split- ting) corresponding to a chosen input model. We illustrate this algorithm with one of the first CoRoT targets: HD 49933.
AGNfitter: A Bayesian MCMC approach to fitting spectral energy distributions of AGN
Rivera, Gabriela Calistro; Hennawi, Joseph F; Hogg, David W
2016-01-01
We present AGNfitter, a publicly available open-source algorithm implementing a fully Bayesian Markov Chain Monte Carlo method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) from the sub-mm to the UV, allowing one to robustly disentangle the physical processes responsible for their emission. AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates. We tested AGNfitter's performace on real data by fitting the SEDs of a sample...
Bayesian Inference on Gravitational Waves
Directory of Open Access Journals (Sweden)
Asad Ali
2015-12-01
Full Text Available The Bayesian approach is increasingly becoming popular among the astrophysics data analysis communities. However, the Pakistan statistics communities are unaware of this fertile interaction between the two disciplines. Bayesian methods have been in use to address astronomical problems since the very birth of the Bayes probability in eighteenth century. Today the Bayesian methods for the detection and parameter estimation of gravitational waves have solid theoretical grounds with a strong promise for the realistic applications. This article aims to introduce the Pakistan statistics communities to the applications of Bayesian Monte Carlo methods in the analysis of gravitational wave data with an overview of the Bayesian signal detection and estimation methods and demonstration by a couple of simplified examples.
Bayesian inference for Markov jump processes with informative observations.
Golightly, Andrew; Wilkinson, Darren J
2015-04-01
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis. PMID:25720091
An Intuitive Dashboard for Bayesian Network Inference
International Nuclear Information System (INIS)
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++
An Intuitive Dashboard for Bayesian Network Inference
Reddy, Vikas; Charisse Farr, Anna; Wu, Paul; Mengersen, Kerrie; Yarlagadda, Prasad K. D. V.
2014-03-01
Current Bayesian network software packages provide good graphical interface for users who design and develop Bayesian networks for various applications. However, the intended end-users of these networks may not necessarily find such an interface appealing and at times it could be overwhelming, particularly when the number of nodes in the network is large. To circumvent this problem, this paper presents an intuitive dashboard, which provides an additional layer of abstraction, enabling the end-users to easily perform inferences over the Bayesian networks. Unlike most software packages, which display the nodes and arcs of the network, the developed tool organises the nodes based on the cause-and-effect relationship, making the user-interaction more intuitive and friendly. In addition to performing various types of inferences, the users can conveniently use the tool to verify the behaviour of the developed Bayesian network. The tool has been developed using QT and SMILE libraries in C++.
Kernel Bayesian Inference with Posterior Regularization
Song, Yang; Jun ZHU; Ren, Yong
2016-01-01
We propose a vector-valued regression problem whose solution is equivalent to the reproducing kernel Hilbert space (RKHS) embedding of the Bayesian posterior distribution. This equivalence provides a new understanding of kernel Bayesian inference. Moreover, the optimization problem induces a new regularization for the posterior embedding estimator, which is faster and has comparable performance to the squared regularization in kernel Bayes' rule. This regularization coincides with a former th...
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process. PMID:27121574
Tactile length contraction as Bayesian inference.
Tong, Jonathan; Ngo, Vy; Goldreich, Daniel
2016-08-01
To perceive, the brain must interpret stimulus-evoked neural activity. This is challenging: The stochastic nature of the neural response renders its interpretation inherently uncertain. Perception would be optimized if the brain used Bayesian inference to interpret inputs in light of expectations derived from experience. Bayesian inference would improve perception on average but cause illusions when stimuli violate expectation. Intriguingly, tactile, auditory, and visual perception are all prone to length contraction illusions, characterized by the dramatic underestimation of the distance between punctate stimuli delivered in rapid succession; the origin of these illusions has been mysterious. We previously proposed that length contraction illusions occur because the brain interprets punctate stimulus sequences using Bayesian inference with a low-velocity expectation. A novel prediction of our Bayesian observer model is that length contraction should intensify if stimuli are made more difficult to localize. Here we report a tactile psychophysical study that tested this prediction. Twenty humans compared two distances on the forearm: a fixed reference distance defined by two taps with 1-s temporal separation and an adjustable comparison distance defined by two taps with temporal separation t ≤ 1 s. We observed significant length contraction: As t was decreased, participants perceived the two distances as equal only when the comparison distance was made progressively greater than the reference distance. Furthermore, the use of weaker taps significantly enhanced participants' length contraction. These findings confirm the model's predictions, supporting the view that the spatiotemporal percept is a best estimate resulting from a Bayesian inference process.
Type Ia Supernova Light Curve Inference: Hierarchical Bayesian Analysis in the Near Infrared
Mandel, Kaisey S; Friedman, Andrew S; Kirshner, Robert P
2009-01-01
We present a comprehensive statistical analysis of the properties of Type Ia SN light curves in the near infrared using recent data from PAIRITEL and the literature. We construct a hierarchical Bayesian framework, incorporating several uncertainties including photometric error, peculiar velocities, dust extinction and intrinsic variations, for coherent statistical inference. SN Ia light curve inferences are drawn from the global posterior probability of parameters describing both individual supernovae and the population conditioned on the entire SN Ia NIR dataset. The logical structure of the hierarchical Bayesian model is represented by a directed acyclic graph. Fully Bayesian analysis of the model and data is enabled by an efficient MCMC algorithm exploiting the conditional structure using Gibbs sampling. We apply this framework to the JHK_s SN Ia light curve data. A new light curve model captures the observed J-band light curve shape variations. The intrinsic variances in peak absolute magnitudes are: sigm...
Variational Bayesian Inference of Line Spectra
DEFF Research Database (Denmark)
Badiu, Mihai Alin; Hansen, Thomas Lundgaard; Fleury, Bernard Henri
2016-01-01
In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid; and the coeffici......In this paper, we address the fundamental problem of line spectral estimation in a Bayesian framework. We target model order and parameter estimation via variational inference in a probabilistic model in which the frequencies are continuous-valued, i.e., not restricted to a grid......; and the coefficients are governed by a Bernoulli-Gaussian prior model turning model order selection into binary sequence detection. Unlike earlier works which retain only point estimates of the frequencies, we undertake a more complete Bayesian treatment by estimating the posterior probability density functions (pdfs...
Decision generation tools and Bayesian inference
Jannson, Tomasz; Wang, Wenjian; Forrester, Thomas; Kostrzewski, Andrew; Veeris, Christian; Nielsen, Thomas
2014-05-01
Digital Decision Generation (DDG) tools are important software sub-systems of Command and Control (C2) systems and technologies. In this paper, we present a special type of DDGs based on Bayesian Inference, related to adverse (hostile) networks, including such important applications as terrorism-related networks and organized crime ones.
Computational statistics using the Bayesian Inference Engine
Weinberg, Martin D.
2013-09-01
This paper introduces the Bayesian Inference Engine (BIE), a general parallel, optimized software package for parameter inference and model selection. This package is motivated by the analysis needs of modern astronomical surveys and the need to organize and reuse expensive derived data. The BIE is the first platform for computational statistics designed explicitly to enable Bayesian update and model comparison for astronomical problems. Bayesian update is based on the representation of high-dimensional posterior distributions using metric-ball-tree based kernel density estimation. Among its algorithmic offerings, the BIE emphasizes hybrid tempered Markov chain Monte Carlo schemes that robustly sample multimodal posterior distributions in high-dimensional parameter spaces. Moreover, the BIE implements a full persistence or serialization system that stores the full byte-level image of the running inference and previously characterized posterior distributions for later use. Two new algorithms to compute the marginal likelihood from the posterior distribution, developed for and implemented in the BIE, enable model comparison for complex models and data sets. Finally, the BIE was designed to be a collaborative platform for applying Bayesian methodology to astronomy. It includes an extensible object-oriented and easily extended framework that implements every aspect of the Bayesian inference. By providing a variety of statistical algorithms for all phases of the inference problem, a scientist may explore a variety of approaches with a single model and data implementation. Additional technical details and download details are available from http://www.astro.umass.edu/bie. The BIE is distributed under the GNU General Public License.
Koepke, C.; Irving, J.
2015-12-01
Bayesian solutions to inverse problems in near-surface geophysics and hydrology have gained increasing popularity as a means of estimating not only subsurface model parameters, but also their corresponding uncertainties that can be used in probabilistic forecasting and risk analysis. In particular, Markov-chain-Monte-Carlo (MCMC) methods have attracted much recent attention as a means of statistically sampling from the Bayesian posterior distribution. In this regard, two approaches are commonly used to improve the computational tractability of the Bayesian-MCMC approach: (i) Forward models involving a simplification of the underlying physics are employed, which offer a significant reduction in the time required to calculate data, but generally at the expense of model accuracy, and (ii) the model parameter space is represented using a limited set of spatially correlated basis functions as opposed to a more intuitive high-dimensional pixel-based parameterization. It has become well understood that model inaccuracies resulting from (i) can lead to posterior parameter distributions that are highly biased and overly confident. Further, when performing model reduction as described in (ii), it is not clear how the prior distribution for the basis weights should be defined because simple (e.g., Gaussian or uniform) priors that may be suitable for a pixel-based parameterization may result in a strong prior bias when used for the weights. To address the issue of model error resulting from known forward model approximations, we generate a set of error training realizations and analyze them with principal component analysis (PCA) in order to generate a sparse basis. The latter is used in the MCMC inversion to remove the main model-error component from the residuals. To improve issues related to prior bias when performing model reduction, we also use a training realization approach, but this time models are simulated from the prior distribution and analyzed using independent
Bayesian inference of the metazoan phylogeny
DEFF Research Database (Denmark)
Glenner, Henrik; Hansen, Anders J; Sørensen, Martin V;
2004-01-01
been the only feasible combined approach but is highly sensitive to long-branch attraction. Recent development of stochastic models for discrete morphological characters and computationally efficient methods for Bayesian inference has enabled combined molecular and morphological data analysis...... with rigorous statistical approaches less prone to such inconsistencies. We present the first statistically founded analysis of a metazoan data set based on a combination of morphological and molecular data and compare the results with a traditional parsimony analysis. Interestingly, the Bayesian analyses...... such as the ecdysozoans and lophotrochozoans. Parsimony, on the contrary, shows conflicting results, with morphology being congruent to the Bayesian results and the molecular data set producing peculiarities that are largely reflected in the combined analysis....
Bayesianism and inference to the best explanation
Directory of Open Access Journals (Sweden)
Valeriano IRANZO
2008-01-01
Full Text Available Bayesianism and Inference to the best explanation (IBE are two different models of inference. Recently there has been some debate about the possibility of “bayesianizing” IBE. Firstly I explore several alternatives to include explanatory considerations in Bayes’s Theorem. Then I distinguish two different interpretations of prior probabilities: “IBE-Bayesianism” (IBE-Bay and “frequentist-Bayesianism” (Freq-Bay. After detailing the content of the latter, I propose a rule for assessing the priors. I also argue that Freq-Bay: (i endorses a role for explanatory value in the assessment of scientific hypotheses; (ii avoids a purely subjectivist reading of prior probabilities; and (iii fits better than IBE-Bayesianism with two basic facts about science, i.e., the prominent role played by empirical testing and the existence of many scientific theories in the past that failed to fulfil their promises and were subsequently abandoned.
Constrained bayesian inference of project performance models
Sunmola, Funlade
2013-01-01
Project performance models play an important role in the management of project success. When used for monitoring projects, they can offer predictive ability such as indications of possible delivery problems. Approaches for monitoring project performance relies on available project information including restrictions imposed on the project, particularly the constraints of cost, quality, scope and time. We study in this paper a Bayesian inference methodology for project performance modelling in ...
Using Alien Coins to Test Whether Simple Inference Is Bayesian
Cassey, Peter; Hawkins, Guy E.; Donkin, Chris; Brown, Scott D.
2016-01-01
Reasoning and inference are well-studied aspects of basic cognition that have been explained as statistically optimal Bayesian inference. Using a simplified experimental design, we conducted quantitative comparisons between Bayesian inference and human inference at the level of individuals. In 3 experiments, with more than 13,000 participants, we…
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream. PMID:27047326
Bayesian Estimation and Inference Using Stochastic Electronics.
Thakur, Chetan Singh; Afshar, Saeed; Wang, Runchun M; Hamilton, Tara J; Tapson, Jonathan; van Schaik, André
2016-01-01
In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker), demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM) to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise) probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND), we show how inference can be performed in a Directed Acyclic Graph (DAG) using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC) technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.
Trans-Dimensional Bayesian Inference for Gravitational Lens Substructures
Brewer, Brendon J; Lewis, Geraint F
2015-01-01
We introduce a Bayesian solution to the problem of inferring the density profile of strong gravitational lenses when the lens galaxy may contain multiple dark or faint substructures. The source and lens models are based on a superposition of an unknown number of non-negative basis functions (or "blobs") whose form was chosen with speed as a primary criterion. The prior distribution for the blobs' properties is specified hierarchically, so the mass function of substructures is a natural output of the method. We use reversible jump Markov Chain Monte Carlo (MCMC) within Diffusive Nested Sampling (DNS) to sample the posterior distribution and evaluate the marginal likelihood of the model, including the summation over the unknown number of blobs in the source and the lens. We demonstrate the method on a simulated data set with a single substructure, which is recovered well with moderate uncertainties. We also apply the method to the g-band image of the "Cosmic Horseshoe" system, and find some hints of potential s...
Geometric ergodicity of a hybrid sampler for Bayesian inference of phylogenetic branch lengths.
Spade, David A; Herbei, Radu; Kubatko, Laura S
2015-10-01
One of the fundamental goals in phylogenetics is to make inferences about the evolutionary pattern among a group of individuals, such as genes or species, using present-day genetic material. This pattern is represented by a phylogenetic tree, and as computational methods have caught up to the statistical theory, Bayesian methods of making inferences about phylogenetic trees have become increasingly popular. Bayesian inference of phylogenetic trees requires sampling from intractable probability distributions. Common methods of sampling from these distributions include Markov chain Monte Carlo (MCMC) and Sequential Monte Carlo (SMC) methods, and one way that both of these methods can proceed is by first simulating a tree topology and then taking a sample from the posterior distribution of the branch lengths given the tree topology and the data set. In many MCMC methods, it is difficult to verify that the underlying Markov chain is geometrically ergodic, and thus, it is necessary to rely on output-based convergence diagnostics in order to assess convergence on an ad hoc basis. These diagnostics suffer from several important limitations, so in an effort to circumvent these limitations, this work establishes geometric convergence for a particular Markov chain that is used to sample branch lengths under a fairly general class of nucleotide substitution models and provides a numerical method for estimating the time this Markov chain takes to converge.
Bayesian Posterior Distributions Without Markov Chains
Cole, Stephen R.; Chu, Haitao; Greenland, Sander; Hamra, Ghassan; Richardson, David B.
2012-01-01
Bayesian posterior parameter distributions are often simulated using Markov chain Monte Carlo (MCMC) methods. However, MCMC methods are not always necessary and do not help the uninitiated understand Bayesian inference. As a bridge to understanding Bayesian inference, the authors illustrate a transparent rejection sampling method. In example 1, they illustrate rejection sampling using 36 cases and 198 controls from a case-control study (1976–1983) assessing the relation between residential ex...
Human collective intelligence as distributed Bayesian inference
Krafft, Peter M; Pan, Wei; Della Penna, Nicolás; Altshuler, Yaniv; Shmueli, Erez; Tenenbaum, Joshua B; Pentland, Alex
2016-01-01
Collective intelligence is believed to underly the remarkable success of human society. The formation of accurate shared beliefs is one of the key components of human collective intelligence. How are accurate shared beliefs formed in groups of fallible individuals? Answering this question requires a multiscale analysis. We must understand both the individual decision mechanisms people use, and the properties and dynamics of those mechanisms in the aggregate. As of yet, mathematical tools for such an approach have been lacking. To address this gap, we introduce a new analytical framework: We propose that groups arrive at accurate shared beliefs via distributed Bayesian inference. Distributed inference occurs through information processing at the individual level, and yields rational belief formation at the group level. We instantiate this framework in a new model of human social decision-making, which we validate using a dataset we collected of over 50,000 users of an online social trading platform where inves...
Bayesian Inference Methods for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand
2013-01-01
This thesis deals with sparse Bayesian learning (SBL) with application to radio channel estimation. As opposed to the classical approach for sparse signal representation, we focus on the problem of inferring complex signals. Our investigations within SBL constitute the basis for the development...... complex prior representation achieve improved sparsity representations in low signalto- noise ratio as opposed to state-of-the-art sparse estimators. This result is of particular importance for the applicability of the algorithms in the field of channel estimation. We then derive various iterative...... and computational complexity. We also analyze the impact of transceiver filters on the sparseness of the channel response, and propose a dictionary design that permits the deployment of sparse inference methods in conditions of low bandwidth....
MultiNest: an efficient and robust Bayesian inference tool for cosmology and particle physics
Feroz, F; Bridges, M
2008-01-01
We present further development and the first public release of our multimodal nested sampling algorithm, called MultiNest. This Bayesian inference tool calculates the evidence, with an associated error estimate, and produces posterior samples from distributions that may contain multiple modes and pronounced (curving) degeneracies in high dimensions. The developments presented here lead to further substantial improvements in sampling efficiency and robustness, as compared to the original algorithm presented in Feroz & Hobson (2008), which itself significantly outperformed existing MCMC techniques in a wide range of astrophysical inference problems. The accuracy and economy of the MultiNest algorithm is demonstrated by application to two toy problems and to a cosmological inference problem focussing on the extension of the vanilla $\\Lambda$CDM model to include spatial curvature and a varying equation of state for dark energy. The MultiNest software, which is fully parallelized using MPI and includes an inte...
Bayesian Inference of a Multivariate Regression Model
Directory of Open Access Journals (Sweden)
Marick S. Sinay
2014-01-01
Full Text Available We explore Bayesian inference of a multivariate linear regression model with use of a flexible prior for the covariance structure. The commonly adopted Bayesian setup involves the conjugate prior, multivariate normal distribution for the regression coefficients and inverse Wishart specification for the covariance matrix. Here we depart from this approach and propose a novel Bayesian estimator for the covariance. A multivariate normal prior for the unique elements of the matrix logarithm of the covariance matrix is considered. Such structure allows for a richer class of prior distributions for the covariance, with respect to strength of beliefs in prior location hyperparameters, as well as the added ability, to model potential correlation amongst the covariance structure. The posterior moments of all relevant parameters of interest are calculated based upon numerical results via a Markov chain Monte Carlo procedure. The Metropolis-Hastings-within-Gibbs algorithm is invoked to account for the construction of a proposal density that closely matches the shape of the target posterior distribution. As an application of the proposed technique, we investigate a multiple regression based upon the 1980 High School and Beyond Survey.
Universal Darwinism as a process of Bayesian inference
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment". Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description clo...
Integrating distributed Bayesian inference and reinforcement learning for sensor management
C. Grappiolo; S. Whiteson; G. Pavlin; B. Bakker
2009-01-01
This paper introduces a sensor management approach that integrates distributed Bayesian inference (DBI) and reinforcement learning (RL). DBI is implemented using distributed perception networks (DPNs), a multiagent approach to performing efficient inference, while RL is used to automatically discove
Porter, Edward K
2014-01-01
With the advance in computational resources, Bayesian inference is increasingly becoming the standard tool of practise in GW astronomy. However, algorithms such as Markov Chain Monte Carlo (MCMC) require a large number of iterations to guarantee convergence to the target density. Each chain demands a large number of evaluations of the likelihood function, and in the case of a Hessian MCMC, calculations of the Fisher information matrix for use as a proposal distribution. As each iteration requires the generation of at least one gravitational waveform, we very quickly reach a point of exclusion for current Bayesian algorithms, especially for low mass systems where the length of the waveforms is large and the waveform generation time is on the order of seconds. This suddenly demands a timescale of many weeks for a single MCMC. As each likelihood and Fisher information matrix calculation requires the evaluation of noise-weighted scalar products, we demonstrate that by using the linearity of integration, and the f...
A Fast Iterative Bayesian Inference Algorithm for Sparse Channel Estimation
DEFF Research Database (Denmark)
Pedersen, Niels Lovmand; Manchón, Carles Navarro; Fleury, Bernard Henri
2013-01-01
representation of the Bessel K probability density function; a highly efficient, fast iterative Bayesian inference method is then applied to the proposed model. The resulting estimator outperforms other state-of-the-art Bayesian and non-Bayesian estimators, either by yielding lower mean squared estimation error...
Bayesian electron density inference from JET lithium beam emission spectra using Gaussian processes
Kwak, Sehyun; Brix, M; Ghim, Y -c
2016-01-01
A Bayesian model to infer edge electron density profiles is developed for the JET lithium beam emission spectroscopy system, measuring Li I line radiation using 26 channels with ~1 cm spatial resolution and 10~20 ms temporal resolution. The density profile is modelled using a Gaussian process prior, and the uncertainty of the density profile is calculated by a Markov Chain Monte Carlo (MCMC) scheme. From the spectra measured by the transmission grating spectrometer, the Li line intensities are extracted, and modelled as a function of the plasma density by a multi-state model which describes the relevant processes between neutral lithium beam atoms and plasma particles. The spectral model fully takes into account interference filter and instrument effects, that are separately estimated, again using Gaussian processes. The line intensities are inferred based on a spectral model consistent with the measured spectra within their uncertainties, which includes photon statistics and electronic noise. Our newly devel...
Quantum-Like Representation of Non-Bayesian Inference
Asano, M.; Basieva, I.; Khrennikov, A.; Ohya, M.; Tanaka, Y.
2013-01-01
This research is related to the problem of "irrational decision making or inference" that have been discussed in cognitive psychology. There are some experimental studies, and these statistical data cannot be described by classical probability theory. The process of decision making generating these data cannot be reduced to the classical Bayesian inference. For this problem, a number of quantum-like coginitive models of decision making was proposed. Our previous work represented in a natural way the classical Bayesian inference in the frame work of quantum mechanics. By using this representation, in this paper, we try to discuss the non-Bayesian (irrational) inference that is biased by effects like the quantum interference. Further, we describe "psychological factor" disturbing "rationality" as an "environment" correlating with the "main system" of usual Bayesian inference.
Mira Antonietta; Tenconi Paolo; Bressanini Dario
2003-01-01
We propose a general purpose variance reduction technique for MCMC estimators. The idea is obtained by combining standard variance reduction principles known for regular Monte Carlo simulations (Ripley, 1987) and the Zero-Variance principle introduced in the physics literature (Assaraf and Caffarel, 1999). The potential of the new idea is illustrated with some toy examples and an application to Bayesian estimation
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature. PMID:27375438
Universal Darwinism As a Process of Bayesian Inference.
Campbell, John O
2016-01-01
Many of the mathematical frameworks describing natural selection are equivalent to Bayes' Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus, natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians). As Bayesian inference can always be cast in terms of (variational) free energy minimization, natural selection can be viewed as comprising two components: a generative model of an "experiment" in the external world environment, and the results of that "experiment" or the "surprise" entailed by predicted and actual outcomes of the "experiment." Minimization of free energy implies that the implicit measure of "surprise" experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Universal Darwinism as a process of Bayesian inference
Directory of Open Access Journals (Sweden)
John Oberon Campbell
2016-06-01
Full Text Available Many of the mathematical frameworks describing natural selection are equivalent to Bayes’ Theorem, also known as Bayesian updating. By definition, a process of Bayesian Inference is one which involves a Bayesian update, so we may conclude that these frameworks describe natural selection as a process of Bayesian inference. Thus natural selection serves as a counter example to a widely-held interpretation that restricts Bayesian Inference to human mental processes (including the endeavors of statisticians. As Bayesian inference can always be cast in terms of (variational free energy minimization, natural selection can be viewed as comprising two components: a generative model of an ‘experiment’ in the external world environment, and the results of that 'experiment' or the 'surprise' entailed by predicted and actual outcomes of the ‘experiment’. Minimization of free energy implies that the implicit measure of 'surprise' experienced serves to update the generative model in a Bayesian manner. This description closely accords with the mechanisms of generalized Darwinian process proposed both by Dawkins, in terms of replicators and vehicles, and Campbell, in terms of inferential systems. Bayesian inference is an algorithm for the accumulation of evidence-based knowledge. This algorithm is now seen to operate over a wide range of evolutionary processes, including natural selection, the evolution of mental models and cultural evolutionary processes, notably including science itself. The variational principle of free energy minimization may thus serve as a unifying mathematical framework for universal Darwinism, the study of evolutionary processes operating throughout nature.
Bayesian multimodel inference for dose-response studies
Link, W.A.; Albers, P.H.
2007-01-01
Statistical inference in dose?response studies is model-based: The analyst posits a mathematical model of the relation between exposure and response, estimates parameters of the model, and reports conclusions conditional on the model. Such analyses rarely include any accounting for the uncertainties associated with model selection. The Bayesian inferential system provides a convenient framework for model selection and multimodel inference. In this paper we briefly describe the Bayesian paradigm and Bayesian multimodel inference. We then present a family of models for multinomial dose?response data and apply Bayesian multimodel inferential methods to the analysis of data on the reproductive success of American kestrels (Falco sparveriuss) exposed to various sublethal dietary concentrations of methylmercury.
Bayesian Inference and Online Learning in Poisson Neuronal Networks.
Huang, Yanping; Rao, Rajesh P N
2016-08-01
Motivated by the growing evidence for Bayesian computation in the brain, we show how a two-layer recurrent network of Poisson neurons can perform both approximate Bayesian inference and learning for any hidden Markov model. The lower-layer sensory neurons receive noisy measurements of hidden world states. The higher-layer neurons infer a posterior distribution over world states via Bayesian inference from inputs generated by sensory neurons. We demonstrate how such a neuronal network with synaptic plasticity can implement a form of Bayesian inference similar to Monte Carlo methods such as particle filtering. Each spike in a higher-layer neuron represents a sample of a particular hidden world state. The spiking activity across the neural population approximates the posterior distribution over hidden states. In this model, variability in spiking is regarded not as a nuisance but as an integral feature that provides the variability necessary for sampling during inference. We demonstrate how the network can learn the likelihood model, as well as the transition probabilities underlying the dynamics, using a Hebbian learning rule. We present results illustrating the ability of the network to perform inference and learning for arbitrary hidden Markov models.
Bayesian Inference and Optimal Design in the Sparse Linear Model
Seeger, Matthias; Steinke, Florian; Tsuda, Koji
2007-01-01
The sparse linear model has seen many successful applications in Statistics, Machine Learning, and Computational Biology, such as identification of gene regulatory networks from micro-array expression data. Prior work has either approximated Bayesian inference by expensive Markov chain Monte Carlo, or replaced it by point estimation. We show how to obtain a good approximation to Bayesian analysis efficiently, using the Expectation Propagation method. We also address the problems of optimal de...
Kernel Approximate Bayesian Computation for Population Genetic Inferences
Nakagome, Shigeki; Fukumizu, Kenji; Mano, Shuhei
2012-01-01
Approximate Bayesian computation (ABC) is a likelihood-free approach for Bayesian inferences based on a rejection algorithm method that applies a tolerance of dissimilarity between summary statistics from observed and simulated data. Although several improvements to the algorithm have been proposed, none of these improvements avoid the following two sources of approximation: 1) lack of sufficient statistics: sampling is not from the true posterior density given data but from an approximate po...
Approximate Bayesian inference for complex ecosystems
Michael P H Stumpf
2014-01-01
Mathematical models have been central to ecology for nearly a century. Simple models of population dynamics have allowed us to understand fundamental aspects underlying the dynamics and stability of ecological systems. What has remained a challenge, however, is to meaningfully interpret experimental or observational data in light of mathematical models. Here, we review recent developments, notably in the growing field of approximate Bayesian computation (ABC), that allow us to calibrate mathe...
Nonparametric Bayesian inference of the microcanonical stochastic block model
Peixoto, Tiago P
2016-01-01
A principled approach to characterize the hidden modular structure of networks is to formulate generative models, and then infer their parameters from data. When the desired structure is composed of modules or "communities", a suitable choice for this task is the stochastic block model (SBM), where nodes are divided into groups, and the placement of edges is conditioned on the group memberships. Here, we present a nonparametric Bayesian method to infer the modular structure of empirical networks, including the number of modules and their hierarchical organization. We focus on a microcanonical variant of the SBM, where the structure is imposed via hard constraints. We show how this simple model variation allows simultaneously for two important improvements over more traditional inference approaches: 1. Deeper Bayesian hierarchies, with noninformative priors replaced by sequences of priors and hyperpriors, that not only remove limitations that seriously degrade the inference on large networks, but also reveal s...
Methods for Bayesian power spectrum inference with galaxy surveys
Jasche, Jens
2013-01-01
We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves over previous Bayesian methods by performing a joint inference of the three dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate sub samples. The method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal to noise regimes by using a determini...
A Bayesian Approach to Protein Inference Problem in Shotgun Proteomics
Li, Yong Fuga; Arnold, Randy J.; Li, Yixue; Radivojac, Predrag; Sheng, Quanhu; Tang, Haixu
2009-01-01
The protein inference problem represents a major challenge in shotgun proteomics. In this article, we describe a novel Bayesian approach to address this challenge by incorporating the predicted peptide detectabilities as the prior probabilities of peptide identification. We propose a rigorious probabilistic model for protein inference and provide practical algoritmic solutions to this problem. We used a complex synthetic protein mixture to test our method and obtained promising results.
Fast Bayesian inference of optical trap stiffness and particle diffusion
Bera, Sudipta; Singh, Rajesh; Ghosh, Dipanjan; Kundu, Avijit; Banerjee, Ayan; Adhikari, R
2016-01-01
Bayesian inference provides a principled way of estimating the parameters of a stochastic process that is observed discretely in time. The overdamped Brownian motion of a particle confined in an optical trap is generally modelled by the Ornstein-Uhlenbeck process and can be observed directly in experiment. Here we present Bayesian methods for inferring the parameters of this process, the trap stiffness and the particle diffusion coefficient, that use exact likelihoods and sufficient statistics to arrive at simple expressions for the maximum a posteriori estimates. This obviates the need for Monte Carlo sampling and yields methods that are both fast and accurate. We apply these to experimental data and demonstrate their advantage over commonly used non-Bayesian fitting methods.
Bayesian Inference in Monte-Carlo Tree Search
Tesauro, Gerald; Segal, Richard
2012-01-01
Monte-Carlo Tree Search (MCTS) methods are drawing great interest after yielding breakthrough results in computer Go. This paper proposes a Bayesian approach to MCTS that is inspired by distributionfree approaches such as UCT [13], yet significantly differs in important respects. The Bayesian framework allows potentially much more accurate (Bayes-optimal) estimation of node values and node uncertainties from a limited number of simulation trials. We further propose propagating inference in the tree via fast analytic Gaussian approximation methods: this can make the overhead of Bayesian inference manageable in domains such as Go, while preserving high accuracy of expected-value estimates. We find substantial empirical outperformance of UCT in an idealized bandit-tree test environment, where we can obtain valuable insights by comparing with known ground truth. Additionally we rigorously prove on-policy and off-policy convergence of the proposed methods.
Stan: A Probabilistic Programming Language for Bayesian Inference and Optimization
Gelman, Andrew; Lee, Daniel; Guo, Jiqiang
2015-01-01
Stan is a free and open-source C++ program that performs Bayesian inference or optimization for arbitrary user-specified models and can be called from the command line, R, Python, Matlab, or Julia and has great promise for fitting large and complex statistical models in many areas of application. We discuss Stan from users' and developers'…
Bayesian inference of chemical kinetic models from proposed reactions
Galagali, Nikhil
2015-02-01
© 2014 Elsevier Ltd. Bayesian inference provides a natural framework for combining experimental data with prior knowledge to develop chemical kinetic models and quantify the associated uncertainties, not only in parameter values but also in model structure. Most existing applications of Bayesian model selection methods to chemical kinetics have been limited to comparisons among a small set of models, however. The significant computational cost of evaluating posterior model probabilities renders traditional Bayesian methods infeasible when the model space becomes large. We present a new framework for tractable Bayesian model inference and uncertainty quantification using a large number of systematically generated model hypotheses. The approach involves imposing point-mass mixture priors over rate constants and exploring the resulting posterior distribution using an adaptive Markov chain Monte Carlo method. The posterior samples are used to identify plausible models, to quantify rate constant uncertainties, and to extract key diagnostic information about model structure-such as the reactions and operating pathways most strongly supported by the data. We provide numerical demonstrations of the proposed framework by inferring kinetic models for catalytic steam and dry reforming of methane using available experimental data.
Dimension-independent likelihood-informed MCMC
Cui, Tiangang
2015-10-08
Many Bayesian inference problems require exploring the posterior distribution of high-dimensional parameters that represent the discretization of an underlying function. This work introduces a family of Markov chain Monte Carlo (MCMC) samplers that can adapt to the particular structure of a posterior distribution over functions. Two distinct lines of research intersect in the methods developed here. First, we introduce a general class of operator-weighted proposal distributions that are well defined on function space, such that the performance of the resulting MCMC samplers is independent of the discretization of the function. Second, by exploiting local Hessian information and any associated low-dimensional structure in the change from prior to posterior distributions, we develop an inhomogeneous discretization scheme for the Langevin stochastic differential equation that yields operator-weighted proposals adapted to the non-Gaussian structure of the posterior. The resulting dimension-independent and likelihood-informed (DILI) MCMC samplers may be useful for a large class of high-dimensional problems where the target probability measure has a density with respect to a Gaussian reference measure. Two nonlinear inverse problems are used to demonstrate the efficiency of these DILI samplers: an elliptic PDE coefficient inverse problem and path reconstruction in a conditioned diffusion.
Bayesian Inference in the Modern Design of Experiments
DeLoach, Richard
2008-01-01
This paper provides an elementary tutorial overview of Bayesian inference and its potential for application in aerospace experimentation in general and wind tunnel testing in particular. Bayes Theorem is reviewed and examples are provided to illustrate how it can be applied to objectively revise prior knowledge by incorporating insights subsequently obtained from additional observations, resulting in new (posterior) knowledge that combines information from both sources. A logical merger of Bayesian methods and certain aspects of Response Surface Modeling is explored. Specific applications to wind tunnel testing, computational code validation, and instrumentation calibration are discussed.
Energy Technology Data Exchange (ETDEWEB)
Zhang, Guannan [ORNL; Webster, Clayton G [ORNL; Gunzburger, Max D [ORNL
2012-09-01
Although Bayesian analysis has become vital to the quantification of prediction uncertainty in groundwater modeling, its application has been hindered due to the computational cost associated with numerous model executions needed for exploring the posterior probability density function (PPDF) of model parameters. This is particularly the case when the PPDF is estimated using Markov Chain Monte Carlo (MCMC) sampling. In this study, we develop a new approach that improves computational efficiency of Bayesian inference by constructing a surrogate system based on an adaptive sparse-grid high-order stochastic collocation (aSG-hSC) method. Unlike previous works using first-order hierarchical basis, we utilize a compactly supported higher-order hierar- chical basis to construct the surrogate system, resulting in a significant reduction in the number of computational simulations required. In addition, we use hierarchical surplus as an error indi- cator to determine adaptive sparse grids. This allows local refinement in the uncertain domain and/or anisotropic detection with respect to the random model parameters, which further improves computational efficiency. Finally, we incorporate a global optimization technique and propose an iterative algorithm for building the surrogate system for the PPDF with multiple significant modes. Once the surrogate system is determined, the PPDF can be evaluated by sampling the surrogate system directly with very little computational cost. The developed method is evaluated first using a simple analytical density function with multiple modes and then using two synthetic groundwater reactive transport models. The groundwater models represent different levels of complexity; the first example involves coupled linear reactions and the second example simulates nonlinear ura- nium surface complexation. The results show that the aSG-hSC is an effective and efficient tool for Bayesian inference in groundwater modeling in comparison with conventional
Bayesian inference data evaluation and decisions
Harney, Hanns Ludwig
2016-01-01
This new edition offers a comprehensive introduction to the analysis of data using Bayes rule. It generalizes Gaussian error intervals to situations in which the data follow distributions other than Gaussian. This is particularly useful when the observed parameter is barely above the background or the histogram of multiparametric data contains many empty bins, so that the determination of the validity of a theory cannot be based on the chi-squared-criterion. In addition to the solutions of practical problems, this approach provides an epistemic insight: the logic of quantum mechanics is obtained as the logic of unbiased inference from counting data. New sections feature factorizing parameters, commuting parameters, observables in quantum mechanics, the art of fitting with coherent and with incoherent alternatives and fitting with multinomial distribution. Additional problems and examples help deepen the knowledge. Requiring no knowledge of quantum mechanics, the book is written on introductory level, with man...
Directory of Open Access Journals (Sweden)
Yufei Huang
2007-06-01
Full Text Available Reverse engineering of genetic regulatory networks from time series microarray data are investigated. We propose a dynamic Bayesian networks (DBNs modeling and a full Bayesian learning scheme. The proposed DBN directly models the continuous expression levels and also is associated with parameters that indicate the degree as well as the type of regulations. To learn the network from data, we proposed a reversible jump Markov chain Monte Carlo (RJMCMC algorithm. The RJMCMC algorithm can provide not only more accurate inference results than the deterministic alternative algorithms but also an estimate of the a posteriori probabilities (APPs of the network topology. The estimated APPs provide useful information on the confidence of the inferred results and can also be used for efficient Bayesian data integration. The proposed approach is tested on yeast cell cycle microarray data and the results are compared with the KEGG pathway map.
Parsing optical scanned 3D data by Bayesian inference
Xiong, Hanwei; Xu, Jun; Xu, Chenxi; Pan, Ming
2015-10-01
Optical devices are always used to digitize complex objects to get their shapes in form of point clouds. The results have no semantic meaning about the objects, and tedious process is indispensable to segment the scanned data to get meanings. The reason for a person to perceive an object correctly is the usage of knowledge, so Bayesian inference is used to the goal. A probabilistic And-Or-Graph is used as a unified framework of representation, learning, and recognition for a large number of object categories, and a probabilistic model defined on this And-Or-Graph is learned from a relatively small training set per category. Given a set of 3D scanned data, the Bayesian inference constructs a most probable interpretation of the object, and a semantic segment is obtained from the part decomposition. Some examples are given to explain the method.
Revealing ecological networks using Bayesian network inference algorithms.
Milns, Isobel; Beale, Colin M; Smith, V Anne
2010-07-01
Understanding functional relationships within ecological networks can help reveal keys to ecosystem stability or fragility. Revealing these relationships is complicated by the difficulties of isolating variables or performing experimental manipulations within a natural ecosystem, and thus inferences are often made by matching models to observational data. Such models, however, require assumptions-or detailed measurements-of parameters such as birth and death rate, encounter frequency, territorial exclusion, and predation success. Here, we evaluate the use of a Bayesian network inference algorithm, which can reveal ecological networks based upon species and habitat abundance alone. We test the algorithm's performance and applicability on observational data of avian communities and habitat in the Peak District National Park, United Kingdom. The resulting networks correctly reveal known relationships among habitat types and known interspecific relationships. In addition, the networks produced novel insights into ecosystem structure and identified key species with high connectivity. Thus, Bayesian networks show potential for becoming a valuable tool in ecosystem analysis. PMID:20715607
Bayesian inference model for fatigue life of laminated composites
DEFF Research Database (Denmark)
Dimitrov, Nikolay Krasimirov; Kiureghian, Armen Der; Berggreen, Christian
2016-01-01
A probabilistic model for estimating the fatigue life of laminated composite plates is developed. The model is based on lamina-level input data, making it possible to predict fatigue properties for a wide range of laminate configurations. Model parameters are estimated by Bayesian inference....... The reference data used consists of constant-amplitude cycle test results for four laminates with different layup configurations. The paper describes the modeling techniques and the parameter estimation procedure, supported by an illustrative application....
Bayesian Inference for Smoking Cessation with a Latent Cure State
Luo, Sheng; Crainiceanu, Ciprian M.; Thomas A Louis; Chatterjee, Nilanjan
2009-01-01
We present a Bayesian approach to modeling dynamic smoking addiction behavior processes when cure is not directly observed due to censoring. Subject-specific probabilities model the stochastic transitions among three behavioral states: smoking, transient quitting, and permanent quitting (absorbent state). A multivariate normal distribution for random effects is used to account for the potential correlation among the subject-specific transition probabilities. Inference is conducted using a Bay...
Exemplar models as a mechanism for performing Bayesian inference.
Shi, Lei; Griffiths, Thomas L; Feldman, Naomi H; Sanborn, Adam N
2010-08-01
Probabilistic models have recently received much attention as accounts of human cognition. However, most research in which probabilistic models have been used has been focused on formulating the abstract problems behind cognitive tasks and their optimal solutions, rather than on mechanisms that could implement these solutions. Exemplar models are a successful class of psychological process models in which an inventory of stored examples is used to solve problems such as identification, categorization, and function learning. We show that exemplar models can be used to perform a sophisticated form of Monte Carlo approximation known as importance sampling and thus provide a way to perform approximate Bayesian inference. Simulations of Bayesian inference in speech perception, generalization along a single dimension, making predictions about everyday events, concept learning, and reconstruction from memory show that exemplar models can often account for human performance with only a few exemplars, for both simple and relatively complex prior distributions. These results suggest that exemplar models provide a possible mechanism for implementing at least some forms of Bayesian inference. PMID:20702863
Halo detection via large-scale Bayesian inference
Merson, Alexander I.; Jasche, Jens; Abdalla, Filipe B.; Lahav, Ofer; Wandelt, Benjamin; Jones, D. Heath; Colless, Matthew
2016-08-01
We present a proof-of-concept of a novel and fully Bayesian methodology designed to detect haloes of different masses in cosmological observations subject to noise and systematic uncertainties. Our methodology combines the previously published Bayesian large-scale structure inference algorithm, HAmiltonian Density Estimation and Sampling algorithm (HADES), and a Bayesian chain rule (the Blackwell-Rao estimator), which we use to connect the inferred density field to the properties of dark matter haloes. To demonstrate the capability of our approach, we construct a realistic galaxy mock catalogue emulating the wide-area 6-degree Field Galaxy Survey, which has a median redshift of approximately 0.05. Application of HADES to the catalogue provides us with accurately inferred three-dimensional density fields and corresponding quantification of uncertainties inherent to any cosmological observation. We then use a cosmological simulation to relate the amplitude of the density field to the probability of detecting a halo with mass above a specified threshold. With this information, we can sum over the HADES density field realisations to construct maps of detection probabilities and demonstrate the validity of this approach within our mock scenario. We find that the probability of successful detection of haloes in the mock catalogue increases as a function of the signal to noise of the local galaxy observations. Our proposed methodology can easily be extended to account for more complex scientific questions and is a promising novel tool to analyse the cosmic large-scale structure in observations.
MCMC for non-linear state space models using ensembles of latent sequences
Shestopaloff, Alexander Y.; Neal, Radford M.
2013-01-01
Non-linear state space models are a widely-used class of models for biological, economic, and physical processes. Fitting these models to observed data is a difficult inference problem that has no straightforward solution. We take a Bayesian approach to the inference of unknown parameters of a non-linear state model; this, in turn, requires the availability of efficient Markov Chain Monte Carlo (MCMC) sampling methods for the latent (hidden) variables and model parameters. Using the ensemble ...
Bayesian Inference of the Evolution of HBV/E
Andernach, Iris E.; Hunewald, Oliver E.; Muller, Claude P.
2013-01-01
Despite its wide spread and high prevalence in sub-Saharan Africa, hepatitis B virus genotype E (HBV/E) has a surprisingly low genetic diversity, indicating an only recent emergence of this genotype in the general African population. Here, we performed extensive phylogeographic analyses, including Bayesian MCMC modeling. Our results indicate a mutation rate of 1.9×10−4 substitutions per site and year (s/s/y) and confirm a recent emergence of HBV/E, most likely within the last 130 years, and only after the transatlantic slave-trade had come to an end. Our analyses suggest that HBV/E originated from the area of Nigeria, before rapidly spreading throughout sub-Saharan Africa. Interestingly, viral strains found in Haiti seem to be the result of multiple introductions only in the second half of the 20th century, corroborating an absence of a significant number of HBV/E strains in West Africa several centuries ago. Our results confirm that the hyperendemicity of HBV(E) in today's Africa is a recent phenomenon and likely the result of dramatic changes in the routes of viral transmission in a relatively recent past. PMID:24312336
Andrade, Daniel
2012-01-01
We present a new method to propagate lower bounds on conditional probability distributions in conventional Bayesian networks. Our method guarantees to provide outer approximations of the exact lower bounds. A key advantage is that we can use any available algorithms and tools for Bayesian networks in order to represent and infer lower bounds. This new method yields results that are provable exact for trees with binary variables, and results which are competitive to existing approximations in credal networks for all other network structures. Our method is not limited to a specific kind of network structure. Basically, it is also not restricted to a specific kind of inference, but we restrict our analysis to prognostic inference in this article. The computational complexity is superior to that of other existing approaches.
Methods for Bayesian Power Spectrum Inference with Galaxy Surveys
Jasche, Jens; Wandelt, Benjamin D.
2013-12-01
We derive and implement a full Bayesian large scale structure inference method aiming at precision recovery of the cosmological power spectrum from galaxy redshift surveys. Our approach improves upon previous Bayesian methods by performing a joint inference of the three-dimensional density field, the cosmological power spectrum, luminosity dependent galaxy biases, and corresponding normalizations. We account for all joint and correlated uncertainties between all inferred quantities. Classes of galaxies with different biases are treated as separate subsamples. This method therefore also allows the combined analysis of more than one galaxy survey. In particular, it solves the problem of inferring the power spectrum from galaxy surveys with non-trivial survey geometries by exploring the joint posterior distribution with efficient implementations of multiple block Markov chain and Hybrid Monte Carlo methods. Our Markov sampler achieves high statistical efficiency in low signal-to-noise regimes by using a deterministic reversible jump algorithm. This approach reduces the correlation length of the sampler by several orders of magnitude, turning the otherwise numerically unfeasible problem of joint parameter exploration into a numerically manageable task. We test our method on an artificial mock galaxy survey, emulating characteristic features of the Sloan Digital Sky Survey data release 7, such as its survey geometry and luminosity-dependent biases. These tests demonstrate the numerical feasibility of our large scale Bayesian inference frame work when the parameter space has millions of dimensions. This method reveals and correctly treats the anti-correlation between bias amplitudes and power spectrum, which are not taken into account in current approaches to power spectrum estimation, a 20% effect across large ranges in k space. In addition, this method results in constrained realizations of density fields obtained without assuming the power spectrum or bias parameters
Approximate bayesian parameter inference for dynamical systems in systems biology
International Nuclear Information System (INIS)
This paper proposes to use approximate instead of exact stochastic simulation algorithms for approximate Bayesian parameter inference of dynamical systems in systems biology. It first presents the mathematical framework for the description of systems biology models, especially from the aspect of a stochastic formulation as opposed to deterministic model formulations based on the law of mass action. In contrast to maximum likelihood methods for parameter inference, approximate inference method- share presented which are based on sampling parameters from a known prior probability distribution, which gradually evolves toward a posterior distribution, through the comparison of simulated data from the model to a given data set of measurements. The paper then discusses the simulation process, where an over- view is given of the different exact and approximate methods for stochastic simulation and their improvements that we propose. The exact and approximate simulators are implemented and used within approximate Bayesian parameter inference methods. Our evaluation of these methods on two tasks of parameter estimation in two different models shows that equally good results are obtained much faster when using approximate simulation as compared to using exact simulation. (Author)
A Bayesian Approach to Inferring Rates of Selfing and Locus-Specific Mutation.
Redelings, Benjamin D; Kumagai, Seiji; Tatarenkov, Andrey; Wang, Liuyang; Sakai, Ann K; Weller, Stephen G; Culley, Theresa M; Avise, John C; Uyenoyama, Marcy K
2015-11-01
We present a Bayesian method for characterizing the mating system of populations reproducing through a mixture of self-fertilization and random outcrossing. Our method uses patterns of genetic variation across the genome as a basis for inference about reproduction under pure hermaphroditism, gynodioecy, and a model developed to describe the self-fertilizing killifish Kryptolebias marmoratus. We extend the standard coalescence model to accommodate these mating systems, accounting explicitly for multilocus identity disequilibrium, inbreeding depression, and variation in fertility among mating types. We incorporate the Ewens sampling formula (ESF) under the infinite-alleles model of mutation to obtain a novel expression for the likelihood of mating system parameters. Our Markov chain Monte Carlo (MCMC) algorithm assigns locus-specific mutation rates, drawn from a common mutation rate distribution that is itself estimated from the data using a Dirichlet process prior model. Our sampler is designed to accommodate additional information, including observations pertaining to the sex ratio, the intensity of inbreeding depression, and other aspects of reproduction. It can provide joint posterior distributions for the population-wide proportion of uniparental individuals, locus-specific mutation rates, and the number of generations since the most recent outcrossing event for each sampled individual. Further, estimation of all basic parameters of a given model permits estimation of functions of those parameters, including the proportion of the gene pool contributed by each sex and relative effective numbers. PMID:26374460
Simulation based bayesian econometric inference: principles and some recent computational advances.
L.F. Hoogerheide (Lennart); H.K. van Dijk (Herman); R.D. van Oest (Rutger)
2007-01-01
textabstractIn this paper we discuss several aspects of simulation based Bayesian econometric inference. We start at an elementary level on basic concepts of Bayesian analysis; evaluating integrals by simulation methods is a crucial ingredient in Bayesian inference. Next, the most popular and well-
DEFF Research Database (Denmark)
Møller, Jesper
.1 with the title ‘Inference'.) This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods using Markov chain Monte Carlo (MCMC) simulations. Due to space limitations the focus...
A full bayesian approach for boolean genetic network inference.
Directory of Open Access Journals (Sweden)
Shengtong Han
Full Text Available Boolean networks are a simple but efficient model for describing gene regulatory systems. A number of algorithms have been proposed to infer Boolean networks. However, these methods do not take full consideration of the effects of noise and model uncertainty. In this paper, we propose a full Bayesian approach to infer Boolean genetic networks. Markov chain Monte Carlo algorithms are used to obtain the posterior samples of both the network structure and the related parameters. In addition to regular link addition and removal moves, which can guarantee the irreducibility of the Markov chain for traversing the whole network space, carefully constructed mixture proposals are used to improve the Markov chain Monte Carlo convergence. Both simulations and a real application on cell-cycle data show that our method is more powerful than existing methods for the inference of both the topology and logic relations of the Boolean network from observed data.
Bayesian Inference of Reticulate Phylogenies under the Multispecies Network Coalescent.
Wen, Dingqiao; Yu, Yun; Nakhleh, Luay
2016-05-01
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation. PMID:27144273
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Bayesian Inference for Structured Spike and Slab Priors
DEFF Research Database (Denmark)
Andersen, Michael Riis; Winther, Ole; Hansen, Lars Kai
2014-01-01
Sparse signal recovery addresses the problem of solving underdetermined linear inverse problems subject to a sparsity constraint. We propose a novel prior formulation, the structured spike and slab prior, which allows to incorporate a priori knowledge of the sparsity pattern by imposing a spatial...... Gaussian process on the spike and slab probabilities. Thus, prior information on the structure of the sparsity pattern can be encoded using generic covariance functions. Furthermore, we provide a Bayesian inference scheme for the proposed model based on the expectation propagation framework. Using...
Progress on Bayesian Inference of the Fast Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W,; Chen, X.;
2013-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and weight functions that describe the phase space...
Towards Bayesian Inference of the Fast-Ion Distribution Function
DEFF Research Database (Denmark)
Stagner, L.; Heidbrink, W.W.; Salewski, Mirko
2012-01-01
The fast-ion distribution function (DF) has a complicated dependence on several phase-space variables. The standard analysis procedure in energetic particle research is to compute the DF theoretically, use that DF in forward modeling to predict diagnostic signals, then compare with measured data....... However, when theory and experiment disagree (for one or more diagnostics), it is unclear how to proceed. Bayesian statistics provides a framework to infer the DF, quantify errors, and reconcile discrepant diagnostic measurements. Diagnostic errors and ``weight functions" that describe the phase space...
Applying Bayesian Inference to Galileon Solutions of the Muon Problem
Lamm, Henry
2016-01-01
We derive corrections to atomic energy levels from disformal couplings in Galileon theories. Through Bayesian inference, we constrain the cut-off radii and Galileon scale via these corrections. To connect different atomic systems, we assume the various cut-off radii related by a 1-parameter family of solutions. This introduces a new parameter $\\alpha$ which is also constrained. In this model, we predict shifts to muonic helium of $\\delta E_{He^3}=1.97^{+9.28}_{-1.87}$ meV and $\\delta E_{He^4}=1.69^{+9.25}_{-1.61}$ meV.
Structural damage identification using piezoelectric impedance and Bayesian inference
Shuai, Q.; Zhou, K.; Tang, J.
2015-04-01
Structural damage identification is a challenging subject in the structural health monitoring research. The piezoelectric impedance-based damage identification, which usually utilizes the matrix inverse-based optimization, may in theory identify the damage location and damage severity. However, the sensitivity matrix is oftentimes ill-conditioned in practice, since the number of unknowns may far exceed the useful measurements/inputs. In this research, a new method based on intelligent inference framework for damage identification is presented. Bayesian inference is used to directly predict damage location and severity using impedance measurement through forward prediction and comparison. Gaussian process is employed to enrich the forward analysis result, thereby reducing computational cost. Case study is carried out to illustrate the identification performance.
Inferring on the intentions of others by hierarchical Bayesian learning.
Directory of Open Access Journals (Sweden)
Andreea O Diaconescu
2014-09-01
Full Text Available Inferring on others' (potentially time-varying intentions is a fundamental problem during many social transactions. To investigate the underlying mechanisms, we applied computational modeling to behavioral data from an economic game in which 16 pairs of volunteers (randomly assigned to "player" or "adviser" roles interacted. The player performed a probabilistic reinforcement learning task, receiving information about a binary lottery from a visual pie chart. The adviser, who received more predictive information, issued an additional recommendation. Critically, the game was structured such that the adviser's incentives to provide helpful or misleading information varied in time. Using a meta-Bayesian modeling framework, we found that the players' behavior was best explained by the deployment of hierarchical learning: they inferred upon the volatility of the advisers' intentions in order to optimize their predictions about the validity of their advice. Beyond learning, volatility estimates also affected the trial-by-trial variability of decisions: participants were more likely to rely on their estimates of advice accuracy for making choices when they believed that the adviser's intentions were presently stable. Finally, our model of the players' inference predicted the players' interpersonal reactivity index (IRI scores, explicit ratings of the advisers' helpfulness and the advisers' self-reports on their chosen strategy. Overall, our results suggest that humans (i employ hierarchical generative models to infer on the changing intentions of others, (ii use volatility estimates to inform decision-making in social interactions, and (iii integrate estimates of advice accuracy with non-social sources of information. The Bayesian framework presented here can quantify individual differences in these mechanisms from simple behavioral readouts and may prove useful in future clinical studies of maladaptive social cognition.
Bayesian inference for generalized linear models for spiking neurons
Directory of Open Access Journals (Sweden)
Sebastian Gerwinn
2010-05-01
Full Text Available Generalized Linear Models (GLMs are commonly used statistical methods for modelling the relationship between neural population activity and presented stimuli. When the dimension of the parameter space is large, strong regularization has to be used in order to fit GLMs to datasets of realistic size without overfitting. By imposing properly chosen priors over parameters, Bayesian inference provides an effective and principled approach for achieving regularization. Here we show how the posterior distribution over model parameters of GLMs can be approximated by a Gaussian using the Expectation Propagation algorithm. In this way, we obtain an estimate of the posterior mean and posterior covariance, allowing us to calculate Bayesian confidence intervals that characterize the uncertainty about the optimal solution. From the posterior we also obtain a different point estimate, namely the posterior mean as opposed to the commonly used maximum a posteriori estimate. We systematically compare the different inference techniques on simulated as well as on multi-electrode recordings of retinal ganglion cells, and explore the effects of the chosen prior and the performance measure used. We find that good performance can be achieved by choosing an Laplace prior together with the posterior mean estimate.
Bayesian inference of population size history from multiple loci
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2008-10-01
Full Text Available Abstract Background Effective population size (Ne is related to genetic variability and is a basic parameter in many models of population genetics. A number of methods for inferring current and past population sizes from genetic data have been developed since JFC Kingman introduced the n-coalescent in 1982. Here we present the Extended Bayesian Skyline Plot, a non-parametric Bayesian Markov chain Monte Carlo algorithm that extends a previous coalescent-based method in several ways, including the ability to analyze multiple loci. Results Through extensive simulations we show the accuracy and limitations of inferring population size as a function of the amount of data, including recovering information about evolutionary bottlenecks. We also analyzed two real data sets to demonstrate the behavior of the new method; a single gene Hepatitis C virus data set sampled from Egypt and a 10 locus Drosophila ananassae data set representing 16 different populations. Conclusion The results demonstrate the essential role of multiple loci in recovering population size dynamics. Multi-locus data from a small number of individuals can precisely recover past bottlenecks in population size which can not be characterized by analysis of a single locus. We also demonstrate that sequence data quality is important because even moderate levels of sequencing errors result in a considerable decrease in estimation accuracy for realistic levels of population genetic variability.
Bayesian Inference for Signal-Based Seismic Monitoring
Moore, D.
2015-12-01
Traditional seismic monitoring systems rely on discrete detections produced by station processing software, discarding significant information present in the original recorded signal. SIG-VISA (Signal-based Vertically Integrated Seismic Analysis) is a system for global seismic monitoring through Bayesian inference on seismic signals. By modeling signals directly, our forward model is able to incorporate a rich representation of the physics underlying the signal generation process, including source mechanisms, wave propagation, and station response. This allows inference in the model to recover the qualitative behavior of recent geophysical methods including waveform matching and double-differencing, all as part of a unified Bayesian monitoring system that simultaneously detects and locates events from a global network of stations. We demonstrate recent progress in scaling up SIG-VISA to efficiently process the data stream of global signals recorded by the International Monitoring System (IMS), including comparisons against existing processing methods that show increased sensitivity from our signal-based model and in particular the ability to locate events (including aftershock sequences that can tax analyst processing) precisely from waveform correlation effects. We also provide a Bayesian analysis of an alleged low-magnitude event near the DPRK test site in May 2010 [1] [2], investigating whether such an event could plausibly be detected through automated processing in a signal-based monitoring system. [1] Zhang, Miao and Wen, Lianxing. "Seismological Evidence for a Low-Yield Nuclear Test on 12 May 2010 in North Korea". Seismological Research Letters, January/February 2015. [2] Richards, Paul. "A Seismic Event in North Korea on 12 May 2010". CTBTO SnT 2015 oral presentation, video at https://video-archive.ctbto.org/index.php/kmc/preview/partner_id/103/uiconf_id/4421629/entry_id/0_ymmtpps0/delivery/http
Bayesian inference for identifying interaction rules in moving animal groups.
Mann, Richard P
2011-01-01
The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed. PMID:21829657
Bayesian inference for identifying interaction rules in moving animal groups.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available The emergence of similar collective patterns from different self-propelled particle models of animal groups points to a restricted set of "universal" classes for these patterns. While universality is interesting, it is often the fine details of animal interactions that are of biological importance. Universality thus presents a challenge to inferring such interactions from macroscopic group dynamics since these can be consistent with many underlying interaction models. We present a Bayesian framework for learning animal interaction rules from fine scale recordings of animal movements in swarms. We apply these techniques to the inverse problem of inferring interaction rules from simulation models, showing that parameters can often be inferred from a small number of observations. Our methodology allows us to quantify our confidence in parameter fitting. For example, we show that attraction and alignment terms can be reliably estimated when animals are milling in a torus shape, while interaction radius cannot be reliably measured in such a situation. We assess the importance of rate of data collection and show how to test different models, such as topological and metric neighbourhood models. Taken together our results both inform the design of experiments on animal interactions and suggest how these data should be best analysed.
Inference of Gene Regulatory Network Based on Local Bayesian Networks.
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Wei, Ze-Gang; Chen, Luonan
2016-08-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Inference of Gene Regulatory Network Based on Local Bayesian Networks
Liu, Fei; Zhang, Shao-Wu; Guo, Wei-Feng; Chen, Luonan
2016-01-01
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and gain deep insights into biological processes at a network level. During past decades, numerous computational approaches have been introduced for inferring the GRNs. However, many of them still suffer from various problems, e.g., Bayesian network (BN) methods cannot handle large-scale networks due to their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positive/negative problems. To overcome the limitations, in this work we present a novel algorithm, namely local Bayesian network (LBN), to infer GRNs from gene expression data by using the network decomposition strategy and false-positive edge elimination scheme. Specifically, LBN algorithm first uses conditional mutual information (CMI) to construct an initial network or GRN, which is decomposed into a number of local networks or GRNs. Then, BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space from all possible GRN structures. Integrating these local BNs forms a tentative network or GRN by performing CMI, which reduces redundant regulations in the GRN and thus alleviates the false positive problem. The final network or GRN can be obtained by iteratively performing CMI and local BN on the tentative network. In the iterative process, the false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from DREAM challenge as well as the SOS DNA repair network in E.coli, our results suggest that LBN outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI) significantly, with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduce
Bayesian large-scale structure inference and cosmic web analysis
Leclercq, Florent
2015-01-01
Surveys of the cosmic large-scale structure carry opportunities for building and testing cosmological theories about the origin and evolution of the Universe. This endeavor requires appropriate data assimilation tools, for establishing the contact between survey catalogs and models of structure formation. In this thesis, we present an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the cosmic web: the BORG algorithm infers the primordial density fluctuations and produces physical reconstructions of the dark matter distribution that underlies observed galaxies, by assimilating the survey data into a cosmological structure formation model. The method, based on Bayesian probability theory, provides accurate means of uncertainty quantification. We demonstrate the application of BORG to the Sloan Digital Sky Survey data and describe the primordial and late-time large-scale structure in the observed volume. We show how the approach has led to the fi...
Bayesian Inference Applied to the Electromagnetic Inverse Problem
Schmidt, D M; Wood, C C; Schmidt, David M.; George, John S.
1998-01-01
We present a new approach to the electromagnetic inverse problem that explicitly addresses the ambiguity associated with its ill-posed character. Rather than calculating a single ``best'' solution according to some criterion, our approach produces a large number of likely solutions that both fit the data and any prior information that is used. While the range of the different likely results is representative of the ambiguity in the inverse problem even with prior information present, features that are common across a large number of the different solutions can be identified and are associated with a high degree of probability. This approach is implemented and quantified within the formalism of Bayesian inference which combines prior information with that from measurement in a common framework using a single measure. To demonstrate this approach, a general neural activation model is constructed that includes a variable number of extended regions of activation and can incorporate a great deal of prior informati...
Unsupervised Transient Light Curve Analysis Via Hierarchical Bayesian Inference
Sanders, Nathan; Soderberg, Alicia
2014-01-01
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometr...
Bayesian Inference of Natural Rankings in Incomplete Competition Networks
Park, Juyong
2013-01-01
Competition between a complex system's constituents and a corresponding reward mechanism based on it have profound influence on the functioning, stability, and evolution of the system. But determining the dominance hierarchy or ranking among the constituent parts from the strongest to the weakest -- essential in determining reward or penalty -- is almost always an ambiguous task due to the incomplete nature of competition networks. Here we introduce ``Natural Ranking," a desirably unambiguous ranking method applicable to a complete (full) competition network, and formulate an analytical model based on the Bayesian formula inferring the expected mean and error of the natural ranking of nodes from an incomplete network. We investigate its potential and uses in solving issues in ranking by applying to a real-world competition network of economic and social importance.
Bayesian inference on the sphere beyond statistical isotropy
Das, Santanu; Souradeep, Tarun
2015-01-01
We present a general method for Bayesian inference of the underlying covariance structure of random fields on a sphere. We employ the Bipolar Spherical Harmonic (BipoSH) representation of general covariance structure on the sphere. We illustrate the efficacy of the method as a principled approach to assess violation of statistical isotropy (SI) in the sky maps of Cosmic Microwave Background (CMB) fluctuations. SI violation in observed CMB maps arise due to known physical effects such as Doppler boost and weak lensing; yet unknown theoretical possibilities like cosmic topology and subtle violations of the cosmological principle, as well as, expected observational artefacts of scanning the sky with a non-circular beam, masking, foreground residuals, anisotropic noise, etc. We explicitly demonstrate the recovery of the input SI violation signals with their full statistics in simulated CMB maps. Our formalism easily adapts to exploring parametric physical models with non-SI covariance, as we illustrate for the in...
vs. a polynomial chaos-based MCMC
Siripatana, Adil
2014-08-01
Bayesian Inference of Manning\\'s n coefficient in a Storm Surge Model Framework: comparison between Kalman lter and polynomial based method Adil Siripatana Conventional coastal ocean models solve the shallow water equations, which describe the conservation of mass and momentum when the horizontal length scale is much greater than the vertical length scale. In this case vertical pressure gradients in the momentum equations are nearly hydrostatic. The outputs of coastal ocean models are thus sensitive to the bottom stress terms de ned through the formulation of Manning\\'s n coefficients. This thesis considers the Bayesian inference problem of the Manning\\'s n coefficient in the context of storm surge based on the coastal ocean ADCIRC model. In the first part of the thesis, we apply an ensemble-based Kalman filter, the singular evolutive interpolated Kalman (SEIK) filter to estimate both a constant Manning\\'s n coefficient and a 2-D parameterized Manning\\'s coefficient on one ideal and one of more realistic domain using observation system simulation experiments (OSSEs). We study the sensitivity of the system to the ensemble size. we also access the benefits from using an in ation factor on the filter performance. To study the limitation of the Guassian restricted assumption on the SEIK lter, 5 we also implemented in the second part of this thesis a Markov Chain Monte Carlo (MCMC) method based on a Generalized Polynomial chaos (gPc) approach for the estimation of the 1-D and 2-D Mannning\\'s n coe cient. The gPc is used to build a surrogate model that imitate the ADCIRC model in order to make the computational cost of implementing the MCMC with the ADCIRC model reasonable. We evaluate the performance of the MCMC-gPc approach and study its robustness to di erent OSSEs scenario. we also compare its estimates with those resulting from SEIK in term of parameter estimates and full distributions. we present a full analysis of the solution of these two methods, of the
Rajabi, Mohammad Mahdi; Ataie-Ashtiani, Behzad
2016-05-01
Bayesian inference has traditionally been conceived as the proper framework for the formal incorporation of expert knowledge in parameter estimation of groundwater models. However, conventional Bayesian inference is incapable of taking into account the imprecision essentially embedded in expert provided information. In order to solve this problem, a number of extensions to conventional Bayesian inference have been introduced in recent years. One of these extensions is 'fuzzy Bayesian inference' which is the result of integrating fuzzy techniques into Bayesian statistics. Fuzzy Bayesian inference has a number of desirable features which makes it an attractive approach for incorporating expert knowledge in the parameter estimation process of groundwater models: (1) it is well adapted to the nature of expert provided information, (2) it allows to distinguishably model both uncertainty and imprecision, and (3) it presents a framework for fusing expert provided information regarding the various inputs of the Bayesian inference algorithm. However an important obstacle in employing fuzzy Bayesian inference in groundwater numerical modeling applications is the computational burden, as the required number of numerical model simulations often becomes extremely exhaustive and often computationally infeasible. In this paper, a novel approach of accelerating the fuzzy Bayesian inference algorithm is proposed which is based on using approximate posterior distributions derived from surrogate modeling, as a screening tool in the computations. The proposed approach is first applied to a synthetic test case of seawater intrusion (SWI) in a coastal aquifer. It is shown that for this synthetic test case, the proposed approach decreases the number of required numerical simulations by an order of magnitude. Then the proposed approach is applied to a real-world test case involving three-dimensional numerical modeling of SWI in Kish Island, located in the Persian Gulf. An expert
Bayesian inference for partially identified models exploring the limits of limited data
Gustafson, Paul
2015-01-01
Introduction Identification What Is against Us? What Is for Us? Some Simple Examples of Partially Identified ModelsThe Road Ahead The Structure of Inference in Partially Identified Models Bayesian Inference The Structure of Posterior Distributions in PIMs Computational Strategies Strength of Bayesian Updating, Revisited Posterior MomentsCredible Intervals Evaluating the Worth of Inference Partial Identification versus Model Misspecification The Siren Call of Identification Comp
Bayesian Inference for Functional Dynamics Exploring in fMRI Data
Directory of Open Access Journals (Sweden)
Xuan Guo
2016-01-01
Full Text Available This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM, Bayesian Connectivity Change Point Model (BCCPM, and Dynamic Bayesian Variable Partition Model (DBVPM, and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Bayesian Inference for Functional Dynamics Exploring in fMRI Data.
Guo, Xuan; Liu, Bing; Chen, Le; Chen, Guantao; Pan, Yi; Zhang, Jing
2016-01-01
This paper aims to review state-of-the-art Bayesian-inference-based methods applied to functional magnetic resonance imaging (fMRI) data. Particularly, we focus on one specific long-standing challenge in the computational modeling of fMRI datasets: how to effectively explore typical functional interactions from fMRI time series and the corresponding boundaries of temporal segments. Bayesian inference is a method of statistical inference which has been shown to be a powerful tool to encode dependence relationships among the variables with uncertainty. Here we provide an introduction to a group of Bayesian-inference-based methods for fMRI data analysis, which were designed to detect magnitude or functional connectivity change points and to infer their functional interaction patterns based on corresponding temporal boundaries. We also provide a comparison of three popular Bayesian models, that is, Bayesian Magnitude Change Point Model (BMCPM), Bayesian Connectivity Change Point Model (BCCPM), and Dynamic Bayesian Variable Partition Model (DBVPM), and give a summary of their applications. We envision that more delicate Bayesian inference models will be emerging and play increasingly important roles in modeling brain functions in the years to come.
Hernández, Mario R.; Francés, Félix
2015-04-01
One phase of the hydrological models implementation process, significantly contributing to the hydrological predictions uncertainty, is the calibration phase in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the hydrological model structural deficiencies. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the errors variance of the (spatially and temporally) forecasted flows far exceeds the errors variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, yielding a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields a non-reliable predictive uncertainty assessment. Hence, with the aim of prevent all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As hydrological model, it has been used a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by getting the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly this is, if non-stationarity in errors variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the
Bayesian analysis of Markov point processes
DEFF Research Database (Denmark)
Berthelsen, Kasper Klitgaard; Møller, Jesper
2006-01-01
Recently Møller, Pettitt, Berthelsen and Reeves introduced a new MCMC methodology for drawing samples from a posterior distribution when the likelihood function is only specified up to a normalising constant. We illustrate the method in the setting of Bayesian inference for Markov point processes...
Bayesian inference and life testing plans for generalized exponential distribution
Institute of Scientific and Technical Information of China (English)
KUNDU; Debasis; PRADHAN; Biswabrata
2009-01-01
Recently generalized exponential distribution has received considerable attentions.In this paper,we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution.It is assumed that the scale and the shape parameters have independent gamma priors.The Bayes estimates of the unknown parameters cannot be obtained in the closed form.Lindley’s approximation and importance sampling technique have been suggested to compute the approximate Bayes estimates.Markov Chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals.We also provide different criteria to compare two different sampling schemes and hence to ?nd the optimal sampling schemes.It is observed that ?nding the optimum censoring procedure is a computationally expensive process.And we have recommended to use the sub-optimal censoring procedure,which can be obtained very easily.Monte Carlo simulations are performed to compare the performances of the different methods and one data analysis has been performed for illustrative purposes.
Metainference: A Bayesian inference method for heterogeneous systems.
Bonomi, Massimiliano; Camilloni, Carlo; Cavalli, Andrea; Vendruscolo, Michele
2016-01-01
Modeling a complex system is almost invariably a challenging task. The incorporation of experimental observations can be used to improve the quality of a model and thus to obtain better predictions about the behavior of the corresponding system. This approach, however, is affected by a variety of different errors, especially when a system simultaneously populates an ensemble of different states and experimental data are measured as averages over such states. To address this problem, we present a Bayesian inference method, called "metainference," that is able to deal with errors in experimental measurements and with experimental measurements averaged over multiple states. To achieve this goal, metainference models a finite sample of the distribution of models using a replica approach, in the spirit of the replica-averaging modeling based on the maximum entropy principle. To illustrate the method, we present its application to a heterogeneous model system and to the determination of an ensemble of structures corresponding to the thermal fluctuations of a protein molecule. Metainference thus provides an approach to modeling complex systems with heterogeneous components and interconverting between different states by taking into account all possible sources of errors. PMID:26844300
Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis
Dezfuli, Homayoon; Kelly, Dana; Smith, Curtis; Vedros, Kurt; Galyean, William
2009-01-01
This document, Bayesian Inference for NASA Probabilistic Risk and Reliability Analysis, is intended to provide guidelines for the collection and evaluation of risk and reliability-related data. It is aimed at scientists and engineers familiar with risk and reliability methods and provides a hands-on approach to the investigation and application of a variety of risk and reliability data assessment methods, tools, and techniques. This document provides both: A broad perspective on data analysis collection and evaluation issues. A narrow focus on the methods to implement a comprehensive information repository. The topics addressed herein cover the fundamentals of how data and information are to be used in risk and reliability analysis models and their potential role in decision making. Understanding these topics is essential to attaining a risk informed decision making environment that is being sought by NASA requirements and procedures such as 8000.4 (Agency Risk Management Procedural Requirements), NPR 8705.05 (Probabilistic Risk Assessment Procedures for NASA Programs and Projects), and the System Safety requirements of NPR 8715.3 (NASA General Safety Program Requirements).
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
DEFF Research Database (Denmark)
Paluszewski, Martin; Hamelryck, Thomas Wim
2010-01-01
Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs). It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations...
Bayesian approaches to spatial inference: Modelling and computational challenges and solutions
Moores, Matthew; Mengersen, Kerrie
2014-12-01
We discuss a range of Bayesian modelling approaches for spatial data and investigate some of the associated computational challenges. This paper commences with a brief review of Bayesian mixture models and Markov random fields, with enabling computational algorithms including Markov chain Monte Carlo (MCMC) and integrated nested Laplace approximation (INLA). Following this, we focus on the Potts model as a canonical approach, and discuss the challenge of estimating the inverse temperature parameter that controls the degree of spatial smoothing. We compare three approaches to addressing the doubly intractable nature of the likelihood, namely pseudo-likelihood, path sampling and the exchange algorithm. These techniques are applied to satellite data used to analyse water quality in the Great Barrier Reef.
Bayesian inference of nonlinear unsteady aerodynamics from aeroelastic limit cycle oscillations
Sandhu, Rimple; Poirel, Dominique; Pettit, Chris; Khalil, Mohammad; Sarkar, Abhijit
2016-07-01
A Bayesian model selection and parameter estimation algorithm is applied to investigate the influence of nonlinear and unsteady aerodynamic loads on the limit cycle oscillation (LCO) of a pitching airfoil in the transitional Reynolds number regime. At small angles of attack, laminar boundary layer trailing edge separation causes negative aerodynamic damping leading to the LCO. The fluid-structure interaction of the rigid, but elastically mounted, airfoil and nonlinear unsteady aerodynamics is represented by two coupled nonlinear stochastic ordinary differential equations containing uncertain parameters and model approximation errors. Several plausible aerodynamic models with increasing complexity are proposed to describe the aeroelastic system leading to LCO. The likelihood in the posterior parameter probability density function (pdf) is available semi-analytically using the extended Kalman filter for the state estimation of the coupled nonlinear structural and unsteady aerodynamic model. The posterior parameter pdf is sampled using a parallel and adaptive Markov Chain Monte Carlo (MCMC) algorithm. The posterior probability of each model is estimated using the Chib-Jeliazkov method that directly uses the posterior MCMC samples for evidence (marginal likelihood) computation. The Bayesian algorithm is validated through a numerical study and then applied to model the nonlinear unsteady aerodynamic loads using wind-tunnel test data at various Reynolds numbers.
Elsheikh, Ahmed H.
2014-02-01
An efficient Bayesian calibration method based on the nested sampling (NS) algorithm and non-intrusive polynomial chaos method is presented. Nested sampling is a Bayesian sampling algorithm that builds a discrete representation of the posterior distributions by iteratively re-focusing a set of samples to high likelihood regions. NS allows representing the posterior probability density function (PDF) with a smaller number of samples and reduces the curse of dimensionality effects. The main difficulty of the NS algorithm is in the constrained sampling step which is commonly performed using a random walk Markov Chain Monte-Carlo (MCMC) algorithm. In this work, we perform a two-stage sampling using a polynomial chaos response surface to filter out rejected samples in the Markov Chain Monte-Carlo method. The combined use of nested sampling and the two-stage MCMC based on approximate response surfaces provides significant computational gains in terms of the number of simulation runs. The proposed algorithm is applied for calibration and model selection of subsurface flow models. © 2013.
Bayesian inference of the demographic history of chimpanzees.
Wegmann, Daniel; Excoffier, Laurent
2010-06-01
Due to an almost complete absence of fossil record, the evolutionary history of chimpanzees has only been studied recently on the basis of genetic data. Although the general topology of the chimpanzee phylogeny is well established, uncertainties remain concerning the size of current and past populations, the occurrence of bottlenecks or population expansions, or about divergence times and migrations rates between subspecies. Here, we present a novel attempt at globally inferring the detailed evolution of the Pan genus based on approximate Bayesian computation, an approach preferentially applied to complex models where the likelihood cannot be computed analytically. Based on two microsatellite and DNA sequence data sets and adjusting simulated data for local levels of inbreeding and patterns of missing data, we find support for several new features of chimpanzee evolution as compared with previous studies based on smaller data sets and simpler evolutionary models. We find that the central chimpanzees are certainly the oldest population of all P. troglodytes subspecies and that the other two P. t. subspecies diverged from the central chimpanzees by founder events. We also find an older divergence time (1.6 million years [My]) between common chimpanzee and Bonobos than previous studies (0.9-1.3 My), but this divergence appears to have been very progressive with the maintenance of relatively high levels of gene flow between the ancestral chimpanzee population and the Bonobos. Finally, we could also confirm the existence of strong unidirectional gene flow from the western into the central chimpanzee. These results show that interesting and innovative features of chimpanzee history emerge when considering their whole evolutionary history in a single analysis, rather than relying on simpler models involving several comparisons of pairs of populations. PMID:20118191
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
Energy Technology Data Exchange (ETDEWEB)
Sanders, N. E.; Soderberg, A. M. [Harvard-Smithsonian Center for Astrophysics, 60 Garden Street, Cambridge, MA 02138 (United States); Betancourt, M., E-mail: nsanders@cfa.harvard.edu [Department of Statistics, University of Warwick, Coventry CV4 7AL (United Kingdom)
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. We present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.
DEFF Research Database (Denmark)
Heller, Rasmus; Chikhi, Lounes; Siegismund, Hans
2013-01-01
when it is violated. Among the most widely applied demographic inference methods are Bayesian skyline plots (BSPs), which are used across a range of biological fields. Violations of the panmixia assumption are to be expected in many biological systems, but the consequences for skyline plot inferences...
Sythesis of MCMC and Belief Propagation
Energy Technology Data Exchange (ETDEWEB)
Ahn, Sungsoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea); Chertkov, Michael [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Shin, Jinwoo [Korea Advanced Institute of Science and Technology, Daejeon (South Korea)
2016-05-27
Markov Chain Monte Carlo (MCMC) and Belief Propagation (BP) are the most popular algorithms for computational inference in Graphical Models (GM). In principle, MCMC is an exact probabilistic method which, however, often suffers from exponentially slow mixing. In contrast, BP is a deterministic method, which is typically fast, empirically very successful, however in general lacking control of accuracy over loopy graphs. In this paper, we introduce MCMC algorithms correcting the approximation error of BP, i.e., we provide a way to compensate for BP errors via a consecutive BP-aware MCMC. Our framework is based on the Loop Calculus (LC) approach which allows to express the BP error as a sum of weighted generalized loops. Although the full series is computationally intractable, it is known that a truncated series, summing up all 2-regular loops, is computable in polynomial-time for planar pair-wise binary GMs and it also provides a highly accurate approximation empirically. Motivated by this, we first propose a polynomial-time approximation MCMC scheme for the truncated series of general (non-planar) pair-wise binary models. Our main idea here is to use the Worm algorithm, known to provide fast mixing in other (related) problems, and then design an appropriate rejection scheme to sample 2-regular loops. Furthermore, we also design an efficient rejection-free MCMC scheme for approximating the full series. The main novelty underlying our design is in utilizing the concept of cycle basis, which provides an efficient decomposition of the generalized loops. In essence, the proposed MCMC schemes run on transformed GM built upon the non-trivial BP solution, and our experiments show that this synthesis of BP and MCMC outperforms both direct MCMC and bare BP schemes.
Bayesian parameter inference and model selection by population annealing in systems biology.
Murakami, Yohei
2014-01-01
Parameter inference and model selection are very important for mathematical modeling in systems biology. Bayesian statistics can be used to conduct both parameter inference and model selection. Especially, the framework named approximate Bayesian computation is often used for parameter inference and model selection in systems biology. However, Monte Carlo methods needs to be used to compute Bayesian posterior distributions. In addition, the posterior distributions of parameters are sometimes almost uniform or very similar to their prior distributions. In such cases, it is difficult to choose one specific value of parameter with high credibility as the representative value of the distribution. To overcome the problems, we introduced one of the population Monte Carlo algorithms, population annealing. Although population annealing is usually used in statistical mechanics, we showed that population annealing can be used to compute Bayesian posterior distributions in the approximate Bayesian computation framework. To deal with un-identifiability of the representative values of parameters, we proposed to run the simulations with the parameter ensemble sampled from the posterior distribution, named "posterior parameter ensemble". We showed that population annealing is an efficient and convenient algorithm to generate posterior parameter ensemble. We also showed that the simulations with the posterior parameter ensemble can, not only reproduce the data used for parameter inference, but also capture and predict the data which was not used for parameter inference. Lastly, we introduced the marginal likelihood in the approximate Bayesian computation framework for Bayesian model selection. We showed that population annealing enables us to compute the marginal likelihood in the approximate Bayesian computation framework and conduct model selection depending on the Bayes factor.
Jannson, Tomasz; Wang, Wenjian; Hodelin, Juan; Forrester, Thomas; Romanov, Volodymyr; Kostrzewski, Andrew
2016-05-01
In this paper, Bayesian Binary Sensing (BBS) is discussed as an effective tool for Bayesian Inference (BI) evaluation in interdisciplinary areas such as ISR (and, C3I), Homeland Security, QC, medicine, defense, and many others. In particular, Hilbertian Sine (HS) as an absolute measure of BI, is introduced, while avoiding relativity of decision threshold identification, as in the case of traditional measures of BI, related to false positives and false negatives.
Learning Bayesian networks for discrete data
Liang, Faming
2009-02-01
Bayesian networks have received much attention in the recent literature. In this article, we propose an approach to learn Bayesian networks using the stochastic approximation Monte Carlo (SAMC) algorithm. Our approach has two nice features. Firstly, it possesses the self-adjusting mechanism and thus avoids essentially the local-trap problem suffered by conventional MCMC simulation-based approaches in learning Bayesian networks. Secondly, it falls into the class of dynamic importance sampling algorithms; the network features can be inferred by dynamically weighted averaging the samples generated in the learning process, and the resulting estimates can have much lower variation than the single model-based estimates. The numerical results indicate that our approach can mix much faster over the space of Bayesian networks than the conventional MCMC simulation-based approaches. © 2008 Elsevier B.V. All rights reserved.
Sparse Bayesian Inference and the Temperature Structure of the Solar Corona
Warren, Harry P; Crump, Nicholas A
2016-01-01
Measuring the temperature structure of the solar atmosphere is critical to understanding how it is heated to high temperatures. Unfortunately, the temperature of the upper atmosphere cannot be observed directly, but must be inferred from spectrally resolved observations of individual emission lines that span a wide range of temperatures. Such observations are "inverted" to determine the distribution of plasma temperatures along the line of sight. This inversion is ill-posed and, in the absence of regularization, tends to produce wildly oscillatory solutions. We introduce the application of sparse Bayesian inference to the problem of inferring the temperature structure of the solar corona. Within a Bayesian framework a preference for solutions that utilize a minimum number of basis functions can be encoded into the prior and many ad hoc assumptions can be avoided. We demonstrate the efficacy of the Bayesian approach by considering a test library of 40 assumed temperature distributions.
Nonparametric Bayesian inference for multidimensional compound Poisson processes
S. Gugushvili; F. van der Meulen; P. Spreij
2015-01-01
Given a sample from a discretely observed multidimensional compound Poisson process, we study the problem of nonparametric estimation of its jump size density r0 and intensity λ0. We take a nonparametric Bayesian approach to the problem and determine posterior contraction rates in this context, whic
Bayesian inference using WBDev: a tutorial for social scientists
R. Wetzels; M.D. Lee; E.-J. Wagenmakers
2010-01-01
Over the last decade, the popularity of Bayesian data analysis in the empirical sciences has greatly increased. This is partly due to the availability of WinBUGS, a free and flexible statistical software package that comes with an array of predefined functions and distributions, allowing users to bu
Analysis of Gumbel Model for Software Reliability Using Bayesian Paradigm
Directory of Open Access Journals (Sweden)
Raj Kumar
2012-12-01
Full Text Available In this paper, we have illustrated the suitability of Gumbel Model for software reliability data. The model parameters are estimated using likelihood based inferential procedure: classical as well as Bayesian. The quasi Newton-Raphson algorithm is applied to obtain the maximum likelihood estimates and associated probability intervals. The Bayesian estimates of the parameters of Gumbel model are obtained using Markov Chain Monte Carlo(MCMC simulation method in OpenBUGS(established software for Bayesian analysis using Markov Chain Monte Carlo methods. The R functions are developed to study the statistical properties, model validation and comparison tools of the model and the output analysis of MCMC samples generated from OpenBUGS. Details of applying MCMC to parameter estimation for the Gumbel model are elaborated and a real software reliability data set is considered to illustrate the methods of inference discussed in this paper.
Stochastic Collapsed Variational Bayesian Inference for Latent Dirichlet Allocation
J. Foulds; L. Boyles; C. DuBois; P. Smyth; M. Welling
2013-01-01
There has been an explosion in the amount of digital text information available in recent years, leading to challenges of scale for traditional inference algorithms for topic models. Recent advances in stochastic variational inference algorithms for latent Dirichlet allocation (LDA) have made it fea
Fancher, Chris M.; Han, Zhen; Levin, Igor; Page, Katharine; Reich, Brian J.; Smith, Ralph C.; Wilson, Alyson G.; Jones, Jacob L.
2016-01-01
A Bayesian inference method for refining crystallographic structures is presented. The distribution of model parameters is stochastically sampled using Markov chain Monte Carlo. Posterior probability distributions are constructed for all model parameters to properly quantify uncertainty by appropriately modeling the heteroskedasticity and correlation of the error structure. The proposed method is demonstrated by analyzing a National Institute of Standards and Technology silicon standard reference material. The results obtained by Bayesian inference are compared with those determined by Rietveld refinement. Posterior probability distributions of model parameters provide both estimates and uncertainties. The new method better estimates the true uncertainties in the model as compared to the Rietveld method. PMID:27550221
Institute of Scientific and Technical Information of China (English)
LIU; Jianfeng; ZHANG; Yuan; ZHANG; Qin; WANG; Lixian; ZHANG; Jigang
2006-01-01
It is a challenging issue to map Quantitative Trait Loci (QTL) underlying complex discrete traits, which usually show discontinuous distribution and less information, using conventional statistical methods. Bayesian-Markov chain Monte Carlo (Bayesian-MCMC) approach is the key procedure in mapping QTL for complex binary traits, which provides a complete posterior distribution for QTL parameters using all prior information. As a consequence, Bayesian estimates of all interested variables can be obtained straightforwardly basing on their posterior samples simulated by the MCMC algorithm. In our study, utilities of Bayesian-MCMC are demonstrated using simulated several animal outbred full-sib families with different family structures for a complex binary trait underlied by both a QTL and polygene. Under the Identity-by-Descent-Based variance component random model, three samplers basing on MCMC, including Gibbs sampling, Metropolis algorithm and reversible jump MCMC, were implemented to generate the joint posterior distribution of all unknowns so that the QTL parameters were obtained by Bayesian statistical inferring. The results showed that Bayesian-MCMC approach could work well and robust under different family structures and QTL effects. As family size increases and the number of family decreases, the accuracy of the parameter estimates will be improved. When the true QTL has a small effect, using outbred population experiment design with large family size is the optimal mapping strategy.
Efficient variational inference in large-scale Bayesian compressed sensing
Papandreou, George
2011-01-01
We study linear models under heavy-tailed priors from a probabilistic viewpoint. Instead of computing a single sparse most probable (MAP) solution as in standard compressed sensing, the focus in the Bayesian framework shifts towards capturing the full posterior distribution on the latent variables, which allows quantifying the estimation uncertainty and learning model parameters using maximum likelihood. The exact posterior distribution under the sparse linear model is intractable and we concentrate on a number of alternative variational Bayesian techniques to approximate it. Repeatedly computing Gaussian variances turns out to be a key requisite for all these approximations and constitutes the main computational bottleneck in applying variational techniques in large-scale problems. We leverage on the recently proposed Perturb-and-MAP algorithm for drawing exact samples from Gaussian Markov random fields (GMRF). The main technical contribution of our paper is to show that estimating Gaussian variances using a...
A localization model to localize multiple sources using Bayesian inference
Dunham, Joshua Rolv
Accurate localization of a sound source in a room setting is important in both psychoacoustics and architectural acoustics. Binaural models have been proposed to explain how the brain processes and utilizes the interaural time differences (ITDs) and interaural level differences (ILDs) of sound waves arriving at the ears of a listener in determining source location. Recent work shows that applying Bayesian methods to this problem is proving fruitful. In this thesis, pink noise samples are convolved with head-related transfer functions (HRTFs) and compared to combinations of one and two anechoic speech signals convolved with different HRTFs or binaural room impulse responses (BRIRs) to simulate room positions. Through exhaustive calculation of Bayesian posterior probabilities and using a maximal likelihood approach, model selection will determine the number of sources present, and parameter estimation will result in azimuthal direction of the source(s).
What is the `relevant population' in Bayesian forensic inference?
Brümmer, Niko; de Villiers, Edward
2014-01-01
In works discussing the Bayesian paradigm for presenting forensic evidence in court, the concept of a `relevant population' is often mentioned, without a clear definition of what is meant, and without recommendations of how to select such populations. This note is to try to better understand this concept. Our analysis is intended to be general enough to be applicable to different forensic technologies and we shall consider both DNA profiling and speaker recognition as examples.
Robust Bayesian inference in Iq-Spherical models
Osiewalski, Jacek; Mark F.J. Steel
1992-01-01
The class of multivariate lq-spherical distributions is introduced and defined through their isodensity surfaces. We prove that, under a Jeffreys' type improper prior on the scale parameter, posterior inference on the location parameters is the same for all lq-spherical sampling models with common q. This gives us perfect inference robustness with respect to any departures from the reference case of independent sampling from the exponential power distribution.
Bayesian inference for a wavefront model of the Neolithisation of Europe
Baggaley, Andrew W; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-01-01
We consider a wavefront model for the spread of Neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from Southern and Western Europe. Our wavefront model allows for both an isotropic background spread (incorporating the effects of local geography), and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wavefront, allowing us to simulate the times of the first arrival at any site orders of magnitude more efficiently than traditional PDE approaches. We adopt a Bayesian approach to inference and use Gaussian process emulators to facilitate further increases in efficiency in the inference scheme, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and also infer a parameter specifying the magnitude of this uncertainty. We obtain a magnitude for the background spread of order 1 ...
Cornuet, Jean-Marie; Santos, Filipe; Beaumont, Mark A; Robert, Christian P.; Marin, Jean-Michel; Balding, David J.; Guillemaud, Thomas; Estoup, Arnaud
2008-01-01
Summary: Genetic data obtained on population samples convey information about their evolutionary history. Inference methods can extract part of this information but they require sophisticated statistical techniques that have been made available to the biologist community (through computer programs) only for simple and standard situations typically involving a small number of samples. We propose here a computer program (DIY ABC) for inference based on approximate Bayesian computation (ABC), in...
Directory of Open Access Journals (Sweden)
Moritz eBoos
2016-05-01
Full Text Available Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modelling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities by two (likelihoods design. Five computational models of cognitive processes were compared with the observed behaviour. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model’s success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modelling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modelling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Boos, Moritz; Seer, Caroline; Lange, Florian; Kopp, Bruno
2016-01-01
Cognitive determinants of probabilistic inference were examined using hierarchical Bayesian modeling techniques. A classic urn-ball paradigm served as experimental strategy, involving a factorial two (prior probabilities) by two (likelihoods) design. Five computational models of cognitive processes were compared with the observed behavior. Parameter-free Bayesian posterior probabilities and parameter-free base rate neglect provided inadequate models of probabilistic inference. The introduction of distorted subjective probabilities yielded more robust and generalizable results. A general class of (inverted) S-shaped probability weighting functions had been proposed; however, the possibility of large differences in probability distortions not only across experimental conditions, but also across individuals, seems critical for the model's success. It also seems advantageous to consider individual differences in parameters of probability weighting as being sampled from weakly informative prior distributions of individual parameter values. Thus, the results from hierarchical Bayesian modeling converge with previous results in revealing that probability weighting parameters show considerable task dependency and individual differences. Methodologically, this work exemplifies the usefulness of hierarchical Bayesian modeling techniques for cognitive psychology. Theoretically, human probabilistic inference might be best described as the application of individualized strategic policies for Bayesian belief revision.
Bayesian Inference of Empirical Coefficient for Foundation Settlement
Institute of Scientific and Technical Information of China (English)
LI Zhen-yu; WANG Yong-he; YANG Guo-lin
2009-01-01
A new approach based on Bayesian theory is proposed to determine the empirical coefficient in soil settlement calculation. Prior distribution is assumed to be uniform in [0.2,1.4]. Posterior density function is developed in the condition of prior distribution combined with the information of observed samples at four locations on a passenger dedicated line. The results show that the posterior distribution of the empirical coefficient obeys Gaussian distribution. The mean value of the empirical coefficient decreases gradually with the increasing of the load on ground, and variance variation shows no regularity.
Bayesian Inference on Predictors of Sex of the Baby
Scarpa, Bruno
2016-01-01
It is well known that the sex ratio at birth is a biological constant, being about 106 boys to 100 girls. However couples have always wanted to know and decide in advance the sex of a newborn. For example, a large number of papers appeared connecting biometrical variables, such as length of follicular phase in the woman menstrual cycle or timing of intercourse acts to the sex of new baby. In this paper, we propose a Bayesian model to validate some of these theories by using an independent dat...
Perkins, Simon; Zwart, Jonathan; Natarajan, Iniyan; Smirnov, Oleg
2015-01-01
We present Montblanc, a GPU implementation of the Radio interferometer measurement equation (RIME) in support of the Bayesian inference for radio observations (BIRO) technique. BIRO uses Bayesian inference to select sky models that best match the visibilities observed by a radio interferometer. To accomplish this, BIRO evaluates the RIME multiple times, varying sky model parameters to produce multiple model visibilities. Chi-squared values computed from the model and observed visibilities are used as likelihood values to drive the Bayesian sampling process and select the best sky model. As most of the elements of the RIME and chi-squared calculation are independent of one another, they are highly amenable to parallel computation. Additionally, Montblanc caters for iterative RIME evaluation to produce multiple chi-squared values. Only modified model parameters are transferred to the GPU between each iteration. We implemented Montblanc as a Python package based upon NVIDIA's CUDA architecture. As such, it is ea...
Bayesian Inference and Prediction in an M/G/1 with Optional Second Service
Mohammadi, A.; Salehi-Rad, M. R.
2012-01-01
In this article, we exploit the Bayesian inference and prediction for an M/G/1 queuing model with optional second re-service. In this model, a service unit attends customers arriving following a Poisson process and demanding service according to a general distribution and some of customers need to r
Markov Model of Wind Power Time Series UsingBayesian Inference of Transition Matrix
DEFF Research Database (Denmark)
Chen, Peiyuan; Berthelsen, Kasper Klitgaard; Bak-Jensen, Birgitte;
2009-01-01
This paper proposes to use Bayesian inference of transition matrix when developing a discrete Markov model of a wind speed/power time series and 95% credible interval for the model verification. The Dirichlet distribution is used as a conjugate prior for the transition matrix. Three discrete Markov...
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo
2016-02-23
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference and model comparison for metallic fatigue data
Babuška, Ivo; Sawlan, Zaid; Scavino, Marco; Szabó, Barna; Tempone, Raúl
2016-06-01
In this work, we present a statistical treatment of stress-life (S-N) data drawn from a collection of records of fatigue experiments that were performed on 75S-T6 aluminum alloys. Our main objective is to predict the fatigue life of materials by providing a systematic approach to model calibration, model selection and model ranking with reference to S-N data. To this purpose, we consider fatigue-limit models and random fatigue-limit models that are specially designed to allow the treatment of the run-outs (right-censored data). We first fit the models to the data by maximum likelihood methods and estimate the quantiles of the life distribution of the alloy specimen. To assess the robustness of the estimation of the quantile functions, we obtain bootstrap confidence bands by stratified resampling with respect to the cycle ratio. We then compare and rank the models by classical measures of fit based on information criteria. We also consider a Bayesian approach that provides, under the prior distribution of the model parameters selected by the user, their simulation-based posterior distributions. We implement and apply Bayesian model comparison methods, such as Bayes factor ranking and predictive information criteria based on cross-validation techniques under various a priori scenarios.
Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis
Stahl, Eli A.; Wegmann, Daniel; Trynka, Gosia; Gutierrez-Achury, Javier; Do, Ron; Voight, Benjamin F.; Kraft, Peter; Chen, Robert; Kallberg, Henrik J.; Kurreeman, Fina A. S.; Kathiresan, Sekar; Wijmenga, Cisca; Gregersen, Peter K.; Alfredsson, Lars; Siminovitch, Katherine A.; Worthington, Jane; de Bakker, Paul I. W.; Raychaudhuri, Soumya; Plenge, Robert M.
2012-01-01
The genetic architectures of common, complex diseases are largely uncharacterized. We modeled the genetic architecture underlying genome-wide association study (GWAS) data for rheumatoid arthritis and developed a new method using polygenic risk-score analyses to infer the total liability-scale varia
Shah, Abhik; Woolf, Peter
2009-06-01
In this paper, we introduce pebl, a Python library and application for learning Bayesian network structure from data and prior knowledge that provides features unmatched by alternative software packages: the ability to use interventional data, flexible specification of structural priors, modeling with hidden variables and exploitation of parallel processing. PMID:20161541
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Peiris, Hiranya V
2016-01-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometri...
Approximate Bayesian inference in semi-mechanistic models
Aderhold, Andrej; Husmeier, Dirk; Grzegorczyk, Marco
2016-01-01
Inference of interaction networks represented by systems of differential equations is a challenging problem in many scientific disciplines. In the present article, we follow a semi-mechanistic modelling approach based on gradient matching. We investigate the extent to which key factors, including the kinetic model, statistical formulation and numerical methods, impact upon performance at network reconstruction. We emphasize general lessons for computational statisticians when faced with the c...
Solving #SAT and Bayesian Inference with Backtracking Search
Bacchus, Fahiem; Dalmao, Shannon; Pitassi, Toniann
2014-01-01
Inference in Bayes Nets (BAYES) is an important problem with numerous applications in probabilistic reasoning. Counting the number of satisfying assignments of a propositional formula (#SAT) is a closely related problem of fundamental theoretical importance. Both these problems, and others, are members of the class of sum-of-products (SUMPROD) problems. In this paper we show that standard backtracking search when augmented with a simple memoization scheme (caching) can solve any sum-of-produc...
Directory of Open Access Journals (Sweden)
A. A. Zolotin
2015-07-01
Full Text Available Posteriori inference is one of the three kinds of probabilistic-logic inferences in the probabilistic graphical models theory and the base for processing of knowledge patterns with probabilistic uncertainty using Bayesian networks. The paper deals with a task of local posteriori inference description in algebraic Bayesian networks that represent a class of probabilistic graphical models by means of matrix-vector equations. The latter are essentially based on the use of tensor product of matrices, Kronecker degree and Hadamard product. Matrix equations for calculating posteriori probabilities vectors within posteriori inference in knowledge patterns with quanta propositions are obtained. Similar equations of the same type have already been discussed within the confines of the theory of algebraic Bayesian networks, but they were built only for the case of posteriori inference in the knowledge patterns on the ideals of conjuncts. During synthesis and development of matrix-vector equations on quanta propositions probability vectors, a number of earlier results concerning normalizing factors in posteriori inference and assignment of linear projective operator with a selector vector was adapted. We consider all three types of incoming evidences - deterministic, stochastic and inaccurate - combined with scalar and interval estimation of probability truth of propositional formulas in the knowledge patterns. Linear programming problems are formed. Their solution gives the desired interval values of posterior probabilities in the case of inaccurate evidence or interval estimates in a knowledge pattern. That sort of description of a posteriori inference gives the possibility to extend the set of knowledge pattern types that we can use in the local and global posteriori inference, as well as simplify complex software implementation by use of existing third-party libraries, effectively supporting submission and processing of matrices and vectors when
Wu, Yuefeng; Hooker, Giles
2013-01-01
This paper introduces a hierarchical framework to incorporate Hellinger distance methods into Bayesian analysis. We propose to modify a prior over non-parametric densities with the exponential of twice the Hellinger distance between a candidate and a parametric density. By incorporating a prior over the parameters of the second density, we arrive at a hierarchical model in which a non-parametric model is placed between parameters and the data. The parameters of the family can then be estimate...
Hierarchical Bayesian inference of galaxy redshift distributions from photometric surveys
Leistedt, Boris; Mortlock, Daniel J.; Peiris, Hiranya V.
2016-08-01
Accurately characterizing the redshift distributions of galaxies is essential for analysing deep photometric surveys and testing cosmological models. We present a technique to simultaneously infer redshift distributions and individual redshifts from photometric galaxy catalogues. Our model constructs a piecewise constant representation (effectively a histogram) of the distribution of galaxy types and redshifts, the parameters of which are efficiently inferred from noisy photometric flux measurements. This approach can be seen as a generalization of template-fitting photometric redshift methods and relies on a library of spectral templates to relate the photometric fluxes of individual galaxies to their redshifts. We illustrate this technique on simulated galaxy survey data, and demonstrate that it delivers correct posterior distributions on the underlying type and redshift distributions, as well as on the individual types and redshifts of galaxies. We show that even with uninformative priors, large photometric errors and parameter degeneracies, the redshift and type distributions can be recovered robustly thanks to the hierarchical nature of the model, which is not possible with common photometric redshift estimation techniques. As a result, redshift uncertainties can be fully propagated in cosmological analyses for the first time, fulfilling an essential requirement for the current and future generations of surveys.
An empirical Bayesian approach for model-based inference of cellular signaling networks
Directory of Open Access Journals (Sweden)
Klinke David J
2009-11-01
Full Text Available Abstract Background A common challenge in systems biology is to infer mechanistic descriptions of biological process given limited observations of a biological system. Mathematical models are frequently used to represent a belief about the causal relationships among proteins within a signaling network. Bayesian methods provide an attractive framework for inferring the validity of those beliefs in the context of the available data. However, efficient sampling of high-dimensional parameter space and appropriate convergence criteria provide barriers for implementing an empirical Bayesian approach. The objective of this study was to apply an Adaptive Markov chain Monte Carlo technique to a typical study of cellular signaling pathways. Results As an illustrative example, a kinetic model for the early signaling events associated with the epidermal growth factor (EGF signaling network was calibrated against dynamic measurements observed in primary rat hepatocytes. A convergence criterion, based upon the Gelman-Rubin potential scale reduction factor, was applied to the model predictions. The posterior distributions of the parameters exhibited complicated structure, including significant covariance between specific parameters and a broad range of variance among the parameters. The model predictions, in contrast, were narrowly distributed and were used to identify areas of agreement among a collection of experimental studies. Conclusion In summary, an empirical Bayesian approach was developed for inferring the confidence that one can place in a particular model that describes signal transduction mechanisms and for inferring inconsistencies in experimental measurements.
Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Willcox, Karen [MIT; Marzouk, Youssef [MIT
2013-11-12
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimization) Project focused on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimization and inversion methods. The project was a collaborative effort among MIT, the University of Texas at Austin, Georgia Institute of Technology, and Sandia National Laboratories. The research was directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. The MIT--Sandia component of the SAGUARO Project addressed the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas--Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to-observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as ``reduce then sample'' and ``sample then reduce.'' In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to
Bayesian Spatial Modelling with R-INLA
Finn Lindgren; Håvard Rue
2015-01-01
The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA) approach proposed by Rue, Martino, and Chopin (2009) is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized) linear mixed to spatial and spatio-temporal models. Combined with the stochastic...
Luo, Sheng; Ma, Junsheng; Kieburtz, Karl D
2013-09-30
Many randomized clinical trials collect multivariate longitudinal measurements in different scales, for example, binary, ordinal, and continuous. Multilevel item response models are used to evaluate the global treatment effects across multiple outcomes while accounting for all sources of correlation. Continuous measurements are often assumed to be normally distributed. But the model inference is not robust when the normality assumption is violated because of heavy tails and outliers. In this article, we develop a Bayesian method for multilevel item response models replacing the normal distributions with symmetric heavy-tailed normal/independent distributions. The inference is conducted using a Bayesian framework via Markov Chain Monte Carlo simulation implemented in BUGS language. Our proposed method is evaluated by simulation studies and is applied to Earlier versus Later Levodopa Therapy in Parkinson's Disease study, a motivating clinical trial assessing the effect of Levodopa therapy on the Parkinson's disease progression rate. PMID:23494809
VIGoR: Variational Bayesian Inference for Genome-Wide Regression
Directory of Open Access Journals (Sweden)
Akio Onogi
2016-04-01
Full Text Available Genome-wide regression using a number of genome-wide markers as predictors is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression which we named VIGoR (variational Bayesian inference for genome-wide regression. Variational Bayesian inference is computationally much faster than widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods, and is provided as a command line program package for Linux/Mac, and as a cross-platform R package. In addition to model fitting, cross-validation and hyperparameter tuning using cross-validation can be automatically performed by modifying a single argument. VIGoR is available at https://github.com/Onogi/VIGoR. The R package is also available at https://cran.r-project.org/web/packages/VIGoR/index.html.
Dorn, C; Khan, A; Heng, K; Alibert, Y; Helled, R; Rivoldini, A; Benz, W
2016-01-01
We aim to present a generalized Bayesian inference method for constraining interiors of super Earths and sub-Neptunes. Our methodology succeeds in quantifying the degeneracy and correlation of structural parameters for high dimensional parameter spaces. Specifically, we identify what constraints can be placed on composition and thickness of core, mantle, ice, ocean, and atmospheric layers given observations of mass, radius, and bulk refractory abundance constraints (Fe, Mg, Si) from observations of the host star's photospheric composition. We employed a full probabilistic Bayesian inference analysis that formally accounts for observational and model uncertainties. Using a Markov chain Monte Carlo technique, we computed joint and marginal posterior probability distributions for all structural parameters of interest. We included state-of-the-art structural models based on self-consistent thermodynamics of core, mantle, high-pressure ice, and liquid water. Furthermore, we tested and compared two different atmosp...
Ball, William T; Egerton, Jack S; Haigh, Joanna D
2014-01-01
We investigate the relationship between spectral solar irradiance (SSI) and ozone in the tropical upper stratosphere. We find that solar cycle (SC) changes in ozone can be well approximated by considering the ozone response to SSI changes in a small number individual wavelength bands between 176 and 310 nm, operating independently of each other. Additionally, we find that the ozone varies approximately linearly with changes in the SSI. Using these facts, we present a Bayesian formalism for inferring SC SSI changes and uncertainties from measured SC ozone profiles. Bayesian inference is a powerful, mathematically self-consistent method of considering both the uncertainties of the data and additional external information to provide the best estimate of parameters being estimated. Using this method, we show that, given measurement uncertainties in both ozone and SSI datasets, it is not currently possible to distinguish between observed or modelled SSI datasets using available estimates of ozone change profiles, ...
Prudhomme, Serge
2015-09-17
Parameter estimation for complex models using Bayesian inference is usually a very costly process as it requires a large number of solves of the forward problem. We show here how the construction of adaptive surrogate models using a posteriori error estimates for quantities of interest can significantly reduce the computational cost in problems of statistical inference. As surrogate models provide only approximations of the true solutions of the forward problem, it is nevertheless necessary to control these errors in order to construct an accurate reduced model with respect to the observables utilized in the identification of the model parameters. Effectiveness of the proposed approach is demonstrated on a numerical example dealing with the Spalart–Allmaras model for the simulation of turbulent channel flows. In particular, we illustrate how Bayesian model selection using the adapted surrogate model in place of solving the coupled nonlinear equations leads to the same quality of results while requiring fewer nonlinear PDE solves.
DEFF Research Database (Denmark)
Møller, Jesper
2010-01-01
Chapter 9: This contribution concerns statistical inference for parametric models used in stochastic geometry and based on quick and simple simulation free procedures as well as more comprehensive methods based on a maximum likelihood or Bayesian approach combined with markov chain Monte Carlo...
Directory of Open Access Journals (Sweden)
Benjamin W. Y. Lo
2013-01-01
Full Text Available Objective. The novel clinical prediction approach of Bayesian neural networks with fuzzy logic inferences is created and applied to derive prognostic decision rules in cerebral aneurysmal subarachnoid hemorrhage (aSAH. Methods. The approach of Bayesian neural networks with fuzzy logic inferences was applied to data from five trials of Tirilazad for aneurysmal subarachnoid hemorrhage (3551 patients. Results. Bayesian meta-analyses of observational studies on aSAH prognostic factors gave generalizable posterior distributions of population mean log odd ratios (ORs. Similar trends were noted in Bayesian and linear regression ORs. Significant outcome predictors include normal motor response, cerebral infarction, history of myocardial infarction, cerebral edema, history of diabetes mellitus, fever on day 8, prior subarachnoid hemorrhage, admission angiographic vasospasm, neurological grade, intraventricular hemorrhage, ruptured aneurysm size, history of hypertension, vasospasm day, age and mean arterial pressure. Heteroscedasticity was present in the nontransformed dataset. Artificial neural networks found nonlinear relationships with 11 hidden variables in 1 layer, using the multilayer perceptron model. Fuzzy logic decision rules (centroid defuzzification technique denoted cut-off points for poor prognosis at greater than 2.5 clusters. Discussion. This aSAH prognostic system makes use of existing knowledge, recognizes unknown areas, incorporates one's clinical reasoning, and compensates for uncertainty in prognostication.
Bayesian Statistical Inference in Ion-Channel Models with Exact Missed Event Correction.
Epstein, Michael; Calderhead, Ben; Girolami, Mark A; Sivilotti, Lucia G
2016-07-26
The stochastic behavior of single ion channels is most often described as an aggregated continuous-time Markov process with discrete states. For ligand-gated channels each state can represent a different conformation of the channel protein or a different number of bound ligands. Single-channel recordings show only whether the channel is open or shut: states of equal conductance are aggregated, so transitions between them have to be inferred indirectly. The requirement to filter noise from the raw signal further complicates the modeling process, as it limits the time resolution of the data. The consequence of the reduced bandwidth is that openings or shuttings that are shorter than the resolution cannot be observed; these are known as missed events. Postulated models fitted using filtered data must therefore explicitly account for missed events to avoid bias in the estimation of rate parameters and therefore assess parameter identifiability accurately. In this article, we present the first, to our knowledge, Bayesian modeling of ion-channels with exact missed events correction. Bayesian analysis represents uncertain knowledge of the true value of model parameters by considering these parameters as random variables. This allows us to gain a full appreciation of parameter identifiability and uncertainty when estimating values for model parameters. However, Bayesian inference is particularly challenging in this context as the correction for missed events increases the computational complexity of the model likelihood. Nonetheless, we successfully implemented a two-step Markov chain Monte Carlo method that we called "BICME", which performs Bayesian inference in models of realistic complexity. The method is demonstrated on synthetic and real single-channel data from muscle nicotinic acetylcholine channels. We show that parameter uncertainty can be characterized more accurately than with maximum-likelihood methods. Our code for performing inference in these ion channel
Dorazio, R.M.; Johnson, F.A.
2003-01-01
Bayesian inference and decision theory may be used in the solution of relatively complex problems of natural resource management, owing to recent advances in statistical theory and computing. In particular, Markov chain Monte Carlo algorithms provide a computational framework for fitting models of adequate complexity and for evaluating the expected consequences of alternative management actions. We illustrate these features using an example based on management of waterfowl habitat.
GNU MCSim : bayesian statistical inference for SBML-coded systems biology models
Bois, Frédéric Y.
2009-01-01
International audience Statistical inference about the parameter values of complex models, such as the ones routinely developed in systems biology, is efficiently performed through Bayesian numerical techniques. In that framework, prior information and multiple levels of uncertainty can be seamlessly integrated. GNU MCSim was precisely developed to achieve those aims, in a general non-linear differential context. Starting with version 5.3.0, GNU MCSim reads in and simulates Systems Biology...
Bayesian inference of models and hyper-parameters for robust optic-flow estimation
Héas, Patrick; Herzet, Cédric; Memin, Etienne
2012-01-01
International audience Selecting optimal models and hyper-parameters is crucial for accurate optic-flow estimation. This paper provides a solution to the problem in a generic Bayesian framework. The method is based on a conditional model linking the image intensity function, the unknown velocity field, hyper-parameters and the prior and likelihood motion models. Inference is performed on each of the three-level of this so-defined hierarchical model by maximization of marginalized \\textit{a...
Bayesian inferences of the thermal properties of a wall using temperature and heat flux measurements
Iglesias, Marco; Sawlan, Zaid; Scavino, Marco; Tempone, Raul; Wood, Christopher
2016-01-01
We develop a hierarchical Bayesian inference method to estimate the thermal resistance and the volumetric heat capacity of a wall. These thermal properties are essential for accurate building energy simulations that are needed to make effective energy-saving policies. We apply our methodology to an experimental case study conducted in an environmental chamber, where measurements are recorded every minute from temperature probes and heat flux sensors placed on both sides of a solid brick wall ...
Constraining East Antarctic mass trends using a Bayesian inference approach
Martin-Español, Alba; Bamber, Jonathan L.
2016-04-01
East Antarctica is an order of magnitude larger than its western neighbour and the Greenland ice sheet. It has the greatest potential to contribute to sea level rise of any source, including non-glacial contributors. It is, however, the most challenging ice mass to constrain because of a range of factors including the relative paucity of in-situ observations and the poor signal to noise ratio of Earth Observation data such as satellite altimetry and gravimetry. A recent study using satellite radar and laser altimetry (Zwally et al. 2015) concluded that the East Antarctic Ice Sheet (EAIS) had been accumulating mass at a rate of 136±28 Gt/yr for the period 2003-08. Here, we use a Bayesian hierarchical model, which has been tested on, and applied to, the whole of Antarctica, to investigate the impact of different assumptions regarding the origin of elevation changes of the EAIS. We combined GRACE, satellite laser and radar altimeter data and GPS measurements to solve simultaneously for surface processes (primarily surface mass balance, SMB), ice dynamics and glacio-isostatic adjustment over the period 2003-13. The hierarchical model partitions mass trends between SMB and ice dynamics based on physical principles and measures of statistical likelihood. Without imposing the division between these processes, the model apportions about a third of the mass trend to ice dynamics, +18 Gt/yr, and two thirds, +39 Gt/yr, to SMB. The total mass trend for that period for the EAIS was 57±20 Gt/yr. Over the period 2003-08, we obtain an ice dynamic trend of 12 Gt/yr and a SMB trend of 15 Gt/yr, with a total mass trend of 27 Gt/yr. We then imposed the condition that the surface mass balance is tightly constrained by the regional climate model RACMO2.3 and allowed height changes due to ice dynamics to occur in areas of low surface velocities (<10 m/yr) , such as those in the interior of East Antarctica (a similar condition as used in Zwally 2015). The model must find a solution that
Trans-dimensional Bayesian inference for large sequential data sets
Mandolesi, E.; Dettmer, J.; Dosso, S. E.; Holland, C. W.
2015-12-01
This work develops a sequential Monte Carlo method to infer seismic parameters of layered seabeds from large sequential reflection-coefficient data sets. The approach provides parameter estimates and uncertainties along survey tracks with the goal to aid in the detection of unexploded ordnance in shallow water. The sequential data are acquired by a moving platform with source and receiver array towed close to the seabed. This geometry requires consideration of spherical reflection coefficients, computed efficiently by massively parallel implementation of the Sommerfeld integral via Levin integration on a graphics processing unit. The seabed is parametrized with a trans-dimensional model to account for changes in the environment (i.e. changes in layering) along the track. The method combines advanced Markov chain Monte Carlo methods (annealing) with particle filtering (resampling). Since data from closely-spaced source transmissions (pings) often sample similar environments, the solution from one ping can be utilized to efficiently estimate the posterior for data from subsequent pings. Since reflection-coefficient data are highly informative, the likelihood function can be extremely peaked, resulting in little overlap between posteriors of adjacent pings. This is addressed by adding bridging distributions (via annealed importance sampling) between pings for more efficient transitions. The approach assumes the environment to be changing slowly enough to justify the local 1D parametrization. However, bridging allows rapid changes between pings to be addressed and we demonstrate the method to be stable in such situations. Results are in terms of trans-D parameter estimates and uncertainties along the track. The algorithm is examined for realistic simulated data along a track and applied to a dataset collected by an autonomous underwater vehicle on the Malta Plateau, Mediterranean Sea. [Work supported by the SERDP, DoD.
A novel multimode process monitoring method integrating LDRSKM with Bayesian inference
Institute of Scientific and Technical Information of China (English)
Shi-jin REN; Yin LIANG; Xiang-jun ZHAO; Mao-yun YANG
2015-01-01
A local discriminant regularized soft k-means (LDRSKM) method with Bayesian inference is proposed for multimode process monitoring. LDRSKM extends the regularized soft k-means algorithm by exploiting the local and non-local geometric information of the data and generalized linear discriminant analysis to provide a better and more meaningful data partition. LDRSKM can perform clustering and subspace selection simultaneously, enhancing the separability of data residing in different clusters. With the data partition obtained, kernel support vector data description (KSVDD) is used to establish the monitoring statistics and control limits. Two Bayesian inference based global fault detection indicators are then developed using the local monitoring results associated with principal and residual subspaces. Based on clustering analysis, Bayesian inference and mani-fold learning methods, the within and cross-mode correlations, and local geometric information can be exploited to enhance monitoring performances for nonlinear and non-Gaussian processes. The effectiveness and efficiency of the proposed method are evaluated using the Tennessee Eastman benchmark process.
International Nuclear Information System (INIS)
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
Energy Technology Data Exchange (ETDEWEB)
Kang, Seong Keun; Seong, Poong Hyun [KAIST, Daejeon (Korea, Republic of)
2014-08-15
Bayesian methodology has been used widely used in various research fields. It is method of inference using Bayes' rule to update the estimation of probability for the certain hypothesis when additional evidences are acquired. According to the current researches, malfunction of nuclear power plant can be detected by using this Bayesian inference which consistently piles up the newly incoming data and updates its estimation. However, those researches are based on the assumption that people are doing like computer perfectly, which can be criticized and may cause a problem in real world application. Studies in cognitive psychology indicates that when the amount of information becomes larger, people can't save the whole data because people have limited memory capacity which is well known as working memory, and also they have attention problem. The purpose of this paper is to consider the psychological factors and confirm how much this working memory and attention will affect the resulted estimation based on the Bayesian inference. To confirm this, experiment on human is needed, and the tool of experiment is Compact Nuclear Simulator (CNS)
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
Barentsen, Geert; Drew, Janet E; Sale, Stuart E
2012-01-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main sequence ages, masses, accretion rates, and extinctions to be estimated using two widely available photometric survey databases (IPHAS r/i/Halpha and 2MASS J-band magnitudes.) Because our approach does not rely on spectroscopy, it can easily be applied to homogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov Chain Monte Carlo (MCMC) method and provide Python source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2...
AGNfitter: SED-fitting code for AGN and galaxies from a MCMC approach
Calistro Rivera, Gabriela; Lusso, Elisabeta; Hennawi, Joseph F.; Hogg, David W.
2016-07-01
AGNfitter is a fully Bayesian MCMC method to fit the spectral energy distributions (SEDs) of active galactic nuclei (AGN) and galaxies from the sub-mm to the UV; it enables robust disentanglement of the physical processes responsible for the emission of sources. Written in Python, AGNfitter makes use of a large library of theoretical, empirical, and semi-empirical models to characterize both the nuclear and host galaxy emission simultaneously. The model consists of four physical emission components: an accretion disk, a torus of AGN heated dust, stellar populations, and cold dust in star forming regions. AGNfitter determines the posterior distributions of numerous parameters that govern the physics of AGN with a fully Bayesian treatment of errors and parameter degeneracies, allowing one to infer integrated luminosities, dust attenuation parameters, stellar masses, and star formation rates.
Unraveling multiple changes in complex climate time series using Bayesian inference
Berner, Nadine; Trauth, Martin H.; Holschneider, Matthias
2016-04-01
Change points in time series are perceived as heterogeneities in the statistical or dynamical characteristics of observations. Unraveling such transitions yields essential information for the understanding of the observed system. The precise detection and basic characterization of underlying changes is therefore of particular importance in environmental sciences. We present a kernel-based Bayesian inference approach to investigate direct as well as indirect climate observations for multiple generic transition events. In order to develop a diagnostic approach designed to capture a variety of natural processes, the basic statistical features of central tendency and dispersion are used to locally approximate a complex time series by a generic transition model. A Bayesian inversion approach is developed to robustly infer on the location and the generic patterns of such a transition. To systematically investigate time series for multiple changes occurring at different temporal scales, the Bayesian inversion is extended to a kernel-based inference approach. By introducing basic kernel measures, the kernel inference results are composed into a proxy probability to a posterior distribution of multiple transitions. Thus, based on a generic transition model a probability expression is derived that is capable to indicate multiple changes within a complex time series. We discuss the method's performance by investigating direct and indirect climate observations. The approach is applied to environmental time series (about 100 a), from the weather station in Tuscaloosa, Alabama, and confirms documented instrumentation changes. Moreover, the approach is used to investigate a set of complex terrigenous dust records from the ODP sites 659, 721/722 and 967 interpreted as climate indicators of the African region of the Plio-Pleistocene period (about 5 Ma). The detailed inference unravels multiple transitions underlying the indirect climate observations coinciding with established
Temperature-emissivity separation for LWIR sensing using MCMC
Ash, Joshua N.; Meola, Joseph
2016-05-01
Signal processing for long-wave infrared (LWIR) sensing is made complicated by unknown surface temperatures in a scene which impact measured radiance through temperature-dependent black-body radiation of in-scene objects. The unknown radiation levels give rise to the temperature-emissivity separation (TES) problem describing the intrinsic ambiguity between an object's temperature and emissivity. In this paper we present a novel Bayesian TES algorithm that produces a probabilistic posterior estimate of a material's unknown temperature and emissivity. The statistical uncertainty characterization provided by the algorithm is important for subsequent signal processing tasks such as classification and sensor fusion. The algorithm is based on Markov chain Monte Carlo (MCMC) methods and exploits conditional linearity to achieve efficient block-wise Gibbs sampling for rapid inference. In contrast to existing work, the algorithm optimally incorporates prior knowledge about inscene materials via Bayesian priors which may optionally be learned using training data and a material database. Examples demonstrate up to an order of magnitude reduction in error compared to classical filter-based TES methods.
Directory of Open Access Journals (Sweden)
Marcelo Costa Souza
2004-10-01
, in which the parameters are regarded as fixed quantities, not assuming changes in time. This work aimed at fitting of autoregressive models with order 2, AR(2, specified in the form of dynamic linear models using Bayesian inference. Monte Carlo Markov Chain (MCMC was used to obtain the estimates, via Gibbs Sampler and Forward Filtering Backward Sampling (FFBS. To evaluate the fitting, two chains with 8000 iterations each, and three different series sizes, with 200, 500 and 800 observations were sampled. The Canadian lynx series (NICHOLLS and QUIN, 1982, was fitted with different discount factors (0.90, 0.95 and 0.99, and the resulting mean square error was used to compare to the fitting using classical inference. A better fit for the model with discount equal to 0.99 was observed. One-step ahead forecasts were done to check the estimates obtained for the updated and the backward sampled series. To the latter, the fitting was better and mean square error lower. In general, it was observed a good fit of the AR(2 dynamic models via Bayesian inference, and this gives a better understanding of the fitting in different situations, both simulated and real.
Truth, models, model sets, AIC, and multimodel inference: a Bayesian perspective
Barker, Richard J.; Link, William A.
2015-01-01
Statistical inference begins with viewing data as realizations of stochastic processes. Mathematical models provide partial descriptions of these processes; inference is the process of using the data to obtain a more complete description of the stochastic processes. Wildlife and ecological scientists have become increasingly concerned with the conditional nature of model-based inference: what if the model is wrong? Over the last 2 decades, Akaike's Information Criterion (AIC) has been widely and increasingly used in wildlife statistics for 2 related purposes, first for model choice and second to quantify model uncertainty. We argue that for the second of these purposes, the Bayesian paradigm provides the natural framework for describing uncertainty associated with model choice and provides the most easily communicated basis for model weighting. Moreover, Bayesian arguments provide the sole justification for interpreting model weights (including AIC weights) as coherent (mathematically self consistent) model probabilities. This interpretation requires treating the model as an exact description of the data-generating mechanism. We discuss the implications of this assumption, and conclude that more emphasis is needed on model checking to provide confidence in the quality of inference.
Gaffney, Jim A; Sonnad, Vijay; Libby, Stephen B
2013-01-01
First principles microphysics models are essential to the design and analysis of high energy density physics experiments. Using experimental data to investigate the underlying physics is also essential, particularly when simulations and experiments are not consistent with each other. This is a difficult task, due to the large number of physical models that play a role, and due to the complex (and as a result, noisy) nature of the experiments. This results in a large number of parameters that make any inference a daunting task; it is also very important to consistently treat both experimental and prior understanding of the problem. In this paper we present a Bayesian method that includes both these effects, and allows the inference of a set of modifiers which have been constructed to give information about microphysics models from experimental data. We pay particular attention to radiation transport models. The inference takes into account a large set of experimental parameters and an estimate of the prior kno...
Planetary micro-rover operations on Mars using a Bayesian framework for inference and control
Post, Mark A.; Li, Junquan; Quine, Brendan M.
2016-03-01
With the recent progress toward the application of commercially-available hardware to small-scale space missions, it is now becoming feasible for groups of small, efficient robots based on low-power embedded hardware to perform simple tasks on other planets in the place of large-scale, heavy and expensive robots. In this paper, we describe design and programming of the Beaver micro-rover developed for Northern Light, a Canadian initiative to send a small lander and rover to Mars to study the Martian surface and subsurface. For a small, hardware-limited rover to handle an uncertain and mostly unknown environment without constant management by human operators, we use a Bayesian network of discrete random variables as an abstraction of expert knowledge about the rover and its environment, and inference operations for control. A framework for efficient construction and inference into a Bayesian network using only the C language and fixed-point mathematics on embedded hardware has been developed for the Beaver to make intelligent decisions with minimal sensor data. We study the performance of the Beaver as it probabilistically maps a simple outdoor environment with sensor models that include uncertainty. Results indicate that the Beaver and other small and simple robotic platforms can make use of a Bayesian network to make intelligent decisions in uncertain planetary environments.
Directory of Open Access Journals (Sweden)
Michael J McGeachie
2014-06-01
Full Text Available Bayesian Networks (BN have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
McGeachie, Michael J; Chang, Hsun-Hsien; Weiss, Scott T
2014-06-01
Bayesian Networks (BN) have been a popular predictive modeling formalism in bioinformatics, but their application in modern genomics has been slowed by an inability to cleanly handle domains with mixed discrete and continuous variables. Existing free BN software packages either discretize continuous variables, which can lead to information loss, or do not include inference routines, which makes prediction with the BN impossible. We present CGBayesNets, a BN package focused around prediction of a clinical phenotype from mixed discrete and continuous variables, which fills these gaps. CGBayesNets implements Bayesian likelihood and inference algorithms for the conditional Gaussian Bayesian network (CGBNs) formalism, one appropriate for predicting an outcome of interest from, e.g., multimodal genomic data. We provide four different network learning algorithms, each making a different tradeoff between computational cost and network likelihood. CGBayesNets provides a full suite of functions for model exploration and verification, including cross validation, bootstrapping, and AUC manipulation. We highlight several results obtained previously with CGBayesNets, including predictive models of wood properties from tree genomics, leukemia subtype classification from mixed genomic data, and robust prediction of intensive care unit mortality outcomes from metabolomic profiles. We also provide detailed example analysis on public metabolomic and gene expression datasets. CGBayesNets is implemented in MATLAB and available as MATLAB source code, under an Open Source license and anonymous download at http://www.cgbayesnets.com.
BioEM: GPU-accelerated computing of Bayesian inference of electron microscopy images
Cossio, Pilar; Baruffa, Fabio; Rampp, Markus; Lindenstruth, Volker; Hummer, Gerhard
2016-01-01
In cryo-electron microscopy (EM), molecular structures are determined from large numbers of projection images of individual particles. To harness the full power of this single-molecule information, we use the Bayesian inference of EM (BioEM) formalism. By ranking structural models using posterior probabilities calculated for individual images, BioEM in principle addresses the challenge of working with highly dynamic or heterogeneous systems not easily handled in traditional EM reconstruction. However, the calculation of these posteriors for large numbers of particles and models is computationally demanding. Here we present highly parallelized, GPU-accelerated computer software that performs this task efficiently. Our flexible formulation employs CUDA, OpenMP, and MPI parallelization combined with both CPU and GPU computing. The resulting BioEM software scales nearly ideally both on pure CPU and on CPU+GPU architectures, thus enabling Bayesian analysis of tens of thousands of images in a reasonable time. The g...
Evidence Cross-Validation and Bayesian Inference of MAST Plasma Equilibria
von Nessi, G T; Svensson, J; Appel, L
2011-01-01
In this paper, current profiles for plasma discharges on the Mega-Ampere Spherical Tokamak (MAST) are directly calculated from pickup coil, flux loop and Motional-Stark Effect (MSE) observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the Joint-European Tokamak (JET) [J. Svensson and A. Werner. Current tomography for axisymmetric plasmas. {\\em Plasma Physics and Controlled Fusion}, 50(8):085002, 2008]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly...
DIP -- Diagnostics for Insufficiencies of Posterior calculations in Bayesian signal inference
Dorn, Sebastian; lin, Torsten A Enß
2013-01-01
We present an error-diagnostic validation method for posterior distributions in Bayesian signal inference. It transfers deviations from the correct posterior into characteristic deviations from a uniform distribution of a quantity constructed for this purpose. We show that this method is able to reveal and discriminate several kinds of numerical and approximation errors. For this we present a number of analytical examples of posteriors with incorrect variance, skewness, position of the maximum, or normalization. We show further how this test can be applied to multidimensional signals.
Bayesian inference of the initial conditions from large-scale structure surveys
Leclercq, Florent
2016-10-01
Analysis of three-dimensional cosmological surveys has the potential to answer outstanding questions on the initial conditions from which structure appeared, and therefore on the very high energy physics at play in the early Universe. We report on recently proposed statistical data analysis methods designed to study the primordial large-scale structure via physical inference of the initial conditions in a fully Bayesian framework, and applications to the Sloan Digital Sky Survey data release 7. We illustrate how this approach led to a detailed characterization of the dynamic cosmic web underlying the observed galaxy distribution, based on the tidal environment.
Bayesian inference of the initial conditions from large-scale structure surveys
Leclercq, Florent
2014-01-01
Analysis of three-dimensional cosmological surveys has the potential to answer outstanding questions on the initial conditions from which structure appeared, and therefore on the very high energy physics at play in the early Universe. We report on recently proposed statistical data analysis methods designed to study the primordial large-scale structure via physical inference of the initial conditions in a fully Bayesian framework, and applications to the Sloan Digital Sky Survey data release 7. We illustrate how this approach led to a detailed characterization of the dynamic cosmic web underlying the observed galaxy distribution, based on the tidal environment.
DEFF Research Database (Denmark)
Møller, Jesper; Jacobsen, Robert Dahl
We introduce a promising alternative to the usual hidden Markov tree model for Gaussian wavelet coefficients, where their variances are specified by the hidden states and take values in a finite set. In our new model, the hidden states have a similar dependence structure but they are jointly...... Gaussian, and the wavelet coefficients have log-variances equal to the hidden states. We argue why this provides a flexible model where frequentist and Bayesian inference procedures become tractable for estimation of parameters and hidden states. Our methodology is illustrated for denoising and edge...
Bayesian functional integral method for inferring continuous data from discrete measurements.
Heuett, William J; Miller, Bernard V; Racette, Susan B; Holloszy, John O; Chow, Carson C; Periwal, Vipul
2012-02-01
Inference of the insulin secretion rate (ISR) from C-peptide measurements as a quantification of pancreatic β-cell function is clinically important in diseases related to reduced insulin sensitivity and insulin action. ISR derived from C-peptide concentration is an example of nonparametric Bayesian model selection where a proposed ISR time-course is considered to be a "model". An inferred value of inaccessible continuous variables from discrete observable data is often problematic in biology and medicine, because it is a priori unclear how robust the inference is to the deletion of data points, and a closely related question, how much smoothness or continuity the data actually support. Predictions weighted by the posterior distribution can be cast as functional integrals as used in statistical field theory. Functional integrals are generally difficult to evaluate, especially for nonanalytic constraints such as positivity of the estimated parameters. We propose a computationally tractable method that uses the exact solution of an associated likelihood function as a prior probability distribution for a Markov-chain Monte Carlo evaluation of the posterior for the full model. As a concrete application of our method, we calculate the ISR from actual clinical C-peptide measurements in human subjects with varying degrees of insulin sensitivity. Our method demonstrates the feasibility of functional integral Bayesian model selection as a practical method for such data-driven inference, allowing the data to determine the smoothing timescale and the width of the prior probability distribution on the space of models. In particular, our model comparison method determines the discrete time-step for interpolation of the unobservable continuous variable that is supported by the data. Attempts to go to finer discrete time-steps lead to less likely models. PMID:22325261
Evidence cross-validation and Bayesian inference of MAST plasma equilibria
Energy Technology Data Exchange (ETDEWEB)
Nessi, G. T. von; Hole, M. J. [Research School of Physical Sciences and Engineering, Australian National University, Canberra ACT 0200 (Australia); Svensson, J. [Max-Planck-Institut fuer Plasmaphysik, D-17491 Greifswald (Germany); Appel, L. [EURATOM/CCFE Fusion Association, Culham Science Centre, Abingdon, Oxon OX14 3DB (United Kingdom)
2012-01-15
In this paper, current profiles for plasma discharges on the mega-ampere spherical tokamak are directly calculated from pickup coil, flux loop, and motional-Stark effect observations via methods based in the statistical theory of Bayesian analysis. By representing toroidal plasma current as a series of axisymmetric current beams with rectangular cross-section and inferring the current for each one of these beams, flux-surface geometry and q-profiles are subsequently calculated by elementary application of Biot-Savart's law. The use of this plasma model in the context of Bayesian analysis was pioneered by Svensson and Werner on the joint-European tokamak [Svensson and Werner,Plasma Phys. Controlled Fusion 50(8), 085002 (2008)]. In this framework, linear forward models are used to generate diagnostic predictions, and the probability distribution for the currents in the collection of plasma beams was subsequently calculated directly via application of Bayes' formula. In this work, we introduce a new diagnostic technique to identify and remove outlier observations associated with diagnostics falling out of calibration or suffering from an unidentified malfunction. These modifications enable a good agreement between Bayesian inference of the last-closed flux-surface with other corroborating data, such as that from force balance considerations using EFIT++[Appel et al., ''A unified approach to equilibrium reconstruction'' Proceedings of the 33rd EPS Conference on Plasma Physics (Rome, Italy, 2006)]. In addition, this analysis also yields errors on the plasma current profile and flux-surface geometry as well as directly predicting the Shafranov shift of the plasma core.
Sraj, Ihab
2015-10-22
This paper addresses model dimensionality reduction for Bayesian inference based on prior Gaussian fields with uncertainty in the covariance function hyper-parameters. The dimensionality reduction is traditionally achieved using the Karhunen-Loève expansion of a prior Gaussian process assuming covariance function with fixed hyper-parameters, despite the fact that these are uncertain in nature. The posterior distribution of the Karhunen-Loève coordinates is then inferred using available observations. The resulting inferred field is therefore dependent on the assumed hyper-parameters. Here, we seek to efficiently estimate both the field and covariance hyper-parameters using Bayesian inference. To this end, a generalized Karhunen-Loève expansion is derived using a coordinate transformation to account for the dependence with respect to the covariance hyper-parameters. Polynomial Chaos expansions are employed for the acceleration of the Bayesian inference using similar coordinate transformations, enabling us to avoid expanding explicitly the solution dependence on the uncertain hyper-parameters. We demonstrate the feasibility of the proposed method on a transient diffusion equation by inferring spatially-varying log-diffusivity fields from noisy data. The inferred profiles were found closer to the true profiles when including the hyper-parameters’ uncertainty in the inference formulation.
Understanding the Scalability of Bayesian Network Inference Using Clique Tree Growth Curves
Mengshoel, Ole J.
2010-01-01
One of the main approaches to performing computation in Bayesian networks (BNs) is clique tree clustering and propagation. The clique tree approach consists of propagation in a clique tree compiled from a Bayesian network, and while it was introduced in the 1980s, there is still a lack of understanding of how clique tree computation time depends on variations in BN size and structure. In this article, we improve this understanding by developing an approach to characterizing clique tree growth as a function of parameters that can be computed in polynomial time from BNs, specifically: (i) the ratio of the number of a BN s non-root nodes to the number of root nodes, and (ii) the expected number of moral edges in their moral graphs. Analytically, we partition the set of cliques in a clique tree into different sets, and introduce a growth curve for the total size of each set. For the special case of bipartite BNs, there are two sets and two growth curves, a mixed clique growth curve and a root clique growth curve. In experiments, where random bipartite BNs generated using the BPART algorithm are studied, we systematically increase the out-degree of the root nodes in bipartite Bayesian networks, by increasing the number of leaf nodes. Surprisingly, root clique growth is well-approximated by Gompertz growth curves, an S-shaped family of curves that has previously been used to describe growth processes in biology, medicine, and neuroscience. We believe that this research improves the understanding of the scaling behavior of clique tree clustering for a certain class of Bayesian networks; presents an aid for trade-off studies of clique tree clustering using growth curves; and ultimately provides a foundation for benchmarking and developing improved BN inference and machine learning algorithms.
Bickel, David R
2011-01-01
In statistical practice, whether a Bayesian or frequentist approach is used in inference depends not only on the availability of prior information but also on the attitude taken toward partial prior information, with frequentists tending to be more cautious than Bayesians. The proposed framework defines that attitude in terms of a specified amount of caution, thereby enabling data analysis at the level of caution desired and on the basis of any prior information. The caution parameter represents the attitude toward partial prior information in much the same way as a loss function represents the attitude toward risk. When there is very little prior information and nonzero caution, the resulting inferences correspond to those of the candidate confidence intervals and p-values that are most similar to the credible intervals and hypothesis probabilities of the specified Bayesian posterior. On the other hand, in the presence of a known physical distribution of the parameter, inferences are based only on the corres...
Dagne, Getachew; Huang, Yangxin
2016-01-01
Censored data are characteristics of many bioassays in HIV/AIDS studies where assays may not be sensitive enough to determine gradations in viral load determination among those below a detectable threshold. Not accounting for such left-censoring appropriately can lead to biased parameter estimates in most data analysis. To properly adjust for left-censoring, this paper presents an extension of the Tobit model for fitting nonlinear dynamic mixed-effects models with skew distributions. Such extensions allow one to specify the conditional distributions for viral load response to account for left-censoring, skewness and heaviness in the tails of the distributions of the response variable. A Bayesian modeling approach via Markov Chain Monte Carlo (MCMC) algorithm is used to estimate model parameters. The proposed methods are illustrated using real data from an HIV/AIDS study. PMID:22992288
Dagne, Getachew; Huang, Yangxin
2012-01-01
Censored data are characteristics of many bioassays in HIV/AIDS studies where assays may not be sensitive enough to determine gradations in viral load determination among those below a detectable threshold. Not accounting for such left-censoring appropriately can lead to biased parameter estimates in most data analysis. To properly adjust for left-censoring, this paper presents an extension of the Tobit model for fitting nonlinear dynamic mixed-effects models with skew distributions. Such extensions allow one to specify the conditional distributions for viral load response to account for left-censoring, skewness and heaviness in the tails of the distributions of the response variable. A Bayesian modeling approach via Markov Chain Monte Carlo (MCMC) algorithm is used to estimate model parameters. The proposed methods are illustrated using real data from an HIV/AIDS study. PMID:22992288
Probabilistic Damage Characterization Using the Computationally-Efficient Bayesian Approach
Warner, James E.; Hochhalter, Jacob D.
2016-01-01
This work presents a computationally-ecient approach for damage determination that quanti es uncertainty in the provided diagnosis. Given strain sensor data that are polluted with measurement errors, Bayesian inference is used to estimate the location, size, and orientation of damage. This approach uses Bayes' Theorem to combine any prior knowledge an analyst may have about the nature of the damage with information provided implicitly by the strain sensor data to form a posterior probability distribution over possible damage states. The unknown damage parameters are then estimated based on samples drawn numerically from this distribution using a Markov Chain Monte Carlo (MCMC) sampling algorithm. Several modi cations are made to the traditional Bayesian inference approach to provide signi cant computational speedup. First, an ecient surrogate model is constructed using sparse grid interpolation to replace a costly nite element model that must otherwise be evaluated for each sample drawn with MCMC. Next, the standard Bayesian posterior distribution is modi ed using a weighted likelihood formulation, which is shown to improve the convergence of the sampling process. Finally, a robust MCMC algorithm, Delayed Rejection Adaptive Metropolis (DRAM), is adopted to sample the probability distribution more eciently. Numerical examples demonstrate that the proposed framework e ectively provides damage estimates with uncertainty quanti cation and can yield orders of magnitude speedup over standard Bayesian approaches.
Inference of reactive transport model parameters using a Bayesian multivariate approach
Carniato, Luca; Schoups, Gerrit; van de Giesen, Nick
2014-08-01
Parameter estimation of subsurface transport models from multispecies data requires the definition of an objective function that includes different types of measurements. Common approaches are weighted least squares (WLS), where weights are specified a priori for each measurement, and weighted least squares with weight estimation (WLS(we)) where weights are estimated from the data together with the parameters. In this study, we formulate the parameter estimation task as a multivariate Bayesian inference problem. The WLS and WLS(we) methods are special cases in this framework, corresponding to specific prior assumptions about the residual covariance matrix. The Bayesian perspective allows for generalizations to cases where residual correlation is important and for efficient inference by analytically integrating out the variances (weights) and selected covariances from the joint posterior. Specifically, the WLS and WLS(we) methods are compared to a multivariate (MV) approach that accounts for specific residual correlations without the need for explicit estimation of the error parameters. When applied to inference of reactive transport model parameters from column-scale data on dissolved species concentrations, the following results were obtained: (1) accounting for residual correlation between species provides more accurate parameter estimation for high residual correlation levels whereas its influence for predictive uncertainty is negligible, (2) integrating out the (co)variances leads to an efficient estimation of the full joint posterior with a reduced computational effort compared to the WLS(we) method, and (3) in the presence of model structural errors, none of the methods is able to identify the correct parameter values.
Albert, Carlo; Ulzega, Simone; Stoop, Ruedi
2016-04-01
Parameter inference is a fundamental problem in data-driven modeling. Given observed data that is believed to be a realization of some parameterized model, the aim is to find parameter values that are able to explain the observed data. In many situations, the dominant sources of uncertainty must be included into the model for making reliable predictions. This naturally leads to stochastic models. Stochastic models render parameter inference much harder, as the aim then is to find a distribution of likely parameter values. In Bayesian statistics, which is a consistent framework for data-driven learning, this so-called posterior distribution can be used to make probabilistic predictions. We propose a novel, exact, and very efficient approach for generating posterior parameter distributions for stochastic differential equation models calibrated to measured time series. The algorithm is inspired by reinterpreting the posterior distribution as a statistical mechanics partition function of an object akin to a polymer, where the measurements are mapped on heavier beads compared to those of the simulated data. To arrive at distribution samples, we employ a Hamiltonian Monte Carlo approach combined with a multiple time-scale integration. A separation of time scales naturally arises if either the number of measurement points or the number of simulation points becomes large. Furthermore, at least for one-dimensional problems, we can decouple the harmonic modes between measurement points and solve the fastest part of their dynamics analytically. Our approach is applicable to a wide range of inference problems and is highly parallelizable.
Alsing, Justin; Jaffe, Andrew H
2016-01-01
We apply two Bayesian hierarchical inference schemes to infer shear power spectra, shear maps and cosmological parameters from the CFHTLenS weak lensing survey - the first application of this method to data. In the first approach, we sample the joint posterior distribution of the shear maps and power spectra by Gibbs sampling, with minimal model assumptions. In the second approach, we sample the joint posterior of the shear maps and cosmological parameters, providing a new, accurate and principled approach to cosmological parameter inference from cosmic shear data. As a first demonstration on data we perform a 2-bin tomographic analysis to constrain cosmological parameters and investigate the possibility of photometric redshift bias in the CFHTLenS data. Under the baseline $\\Lambda$CDM model we constrain $S_8 = \\sigma_8(\\Omega_\\mathrm{m}/0.3)^{0.5} = 0.67 ^{\\scriptscriptstyle+ 0.03 }_{\\scriptscriptstyle- 0.03 }$ $(68\\%)$, consistent with previous CFHTLenS analysis but in tension with Planck. Adding neutrino m...
Elshall, A. S.; Ye, M.; Niu, G. Y.; Barron-Gafford, G.
2015-12-01
Models in biogeoscience involve uncertainties in observation data, model inputs, model structure, model processes and modeling scenarios. To accommodate for different sources of uncertainty, multimodal analysis such as model combination, model selection, model elimination or model discrimination are becoming more popular. To illustrate theoretical and practical challenges of multimodal analysis, we use an example about microbial soil respiration modeling. Global soil respiration releases more than ten times more carbon dioxide to the atmosphere than all anthropogenic emissions. Thus, improving our understanding of microbial soil respiration is essential for improving climate change models. This study focuses on a poorly understood phenomena, which is the soil microbial respiration pulses in response to episodic rainfall pulses (the "Birch effect"). We hypothesize that the "Birch effect" is generated by the following three mechanisms. To test our hypothesis, we developed and assessed five evolving microbial-enzyme models against field measurements from a semiarid Savannah that is characterized by pulsed precipitation. These five model evolve step-wise such that the first model includes none of these three mechanism, while the fifth model includes the three mechanisms. The basic component of Bayesian multimodal analysis is the estimation of marginal likelihood to rank the candidate models based on their overall likelihood with respect to observation data. The first part of the study focuses on using this Bayesian scheme to discriminate between these five candidate models. The second part discusses some theoretical and practical challenges, which are mainly the effect of likelihood function selection and the marginal likelihood estimation methods on both model ranking and Bayesian model averaging. The study shows that making valid inference from scientific data is not a trivial task, since we are not only uncertain about the candidate scientific models, but also about
Validi, AbdoulAhad
2013-01-01
This study introduces a non-intrusive approach in the context of low-rank separated representation to construct a surrogate of high-dimensional stochastic functions, e.g., PDEs/ODEs, in order to decrease the computational cost of Markov Chain Monte Carlo simulations in Bayesian inference. The surrogate model is constructed via a regularized alternative least-square regression with Tikhonov regularization using a roughening matrix computing the gradient of the solution, in conjunction with a perturbation-based error indicator to detect optimal model complexities. The model approximates a vector of a continuous solution at discrete values of a physical variable. The required number of random realizations to achieve a successful approximation linearly depends on the function dimensionality. The computational cost of the model construction is quadratic in the number of random inputs, which potentially tackles the curse of dimensionality in high-dimensional stochastic functions. Furthermore, this vector valued sep...
Hill, T; Minier, V; Burton, M G; Cunningham, M R
2008-01-01
Concatenating data from the millimetre regime to the infrared, we have performed spectral energy distribution modelling for 227 of the 405 millimetre continuum sources of Hill et al. (2005) which are thought to contain young massive stars in the earliest stages of their formation. Three main parameters are extracted from the fits: temperature, mass and luminosity. The method employed was Bayesian inference, which allows a statistically probable range of suitable values for each parameter to be drawn for each individual protostellar candidate. This is the first application of this method to massive star formation. The cumulative distribution plots of the SED modelled parameters in this work indicate that collectively, the sources without methanol maser and/or radio continuum associations (MM-only cores) display similar characteristics to those of high mass star formation regions. Attributing significance to the marginal distinctions between the MM-only cores and the high-mass star formation sample we draw hypo...
cosmoabc: Likelihood-free inference via Population Monte Carlo Approximate Bayesian Computation
Ishida, E E O; Penna-Lima, M; Cisewski, J; de Souza, R S; Trindade, A M M; Cameron, E
2015-01-01
Approximate Bayesian Computation (ABC) enables parameter inference for complex physical systems in cases where the true likelihood function is unknown, unavailable, or computationally too expensive. It relies on the forward simulation of mock data and comparison between observed and synthetic catalogues. Here we present cosmoabc, a Python ABC sampler featuring a Population Monte Carlo (PMC) variation of the original ABC algorithm, which uses an adaptive importance sampling scheme. The code is very flexible and can be easily coupled to an external simulator, while allowing to incorporate arbitrary distance and prior functions. As an example of practical application, we coupled cosmoabc with the numcosmo library and demonstrate how it can be used to estimate posterior probability distributions over cosmological parameters based on measurements of galaxy clusters number counts without computing the likelihood function. cosmoabc is published under the GPLv3 license on PyPI and GitHub and documentation is availabl...
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr.
2016-07-01
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Mocapy++ - A toolkit for inference and learning in dynamic Bayesian networks
Directory of Open Access Journals (Sweden)
Hamelryck Thomas
2010-03-01
Full Text Available Abstract Background Mocapy++ is a toolkit for parameter learning and inference in dynamic Bayesian networks (DBNs. It supports a wide range of DBN architectures and probability distributions, including distributions from directional statistics (the statistics of angles, directions and orientations. Results The program package is freely available under the GNU General Public Licence (GPL from SourceForge http://sourceforge.net/projects/mocapy. The package contains the source for building the Mocapy++ library, several usage examples and the user manual. Conclusions Mocapy++ is especially suitable for constructing probabilistic models of biomolecular structure, due to its support for directional statistics. In particular, it supports the Kent distribution on the sphere and the bivariate von Mises distribution on the torus. These distributions have proven useful to formulate probabilistic models of protein and RNA structure in atomic detail.
Optimal modeling of 1D azimuth correlations in the context of Bayesian inference
De Kock, Michiel B; Trainor, Thomas A
2015-01-01
Analysis and interpretation of spectrum and correlation data from high-energy nuclear collisions is currently controversial because two opposing physics narratives derive contradictory implications from the same data-one narrative claiming collision dynamics is dominated by dijet production and projectile-nucleon fragmentation, the other claiming collision dynamics is dominated by a dense, flowing QCD medium. Opposing interpretations seem to be supported by alternative data models, and current model-comparison schemes are unable to distinguish between them. There is clearly need for a convincing new methodology to break the deadlock. In this study we introduce Bayesian Inference (BI) methods applied to angular correlation data as a basis to evaluate competing data models. For simplicity the data considered are projections of 2D angular correlations onto 1D azimuth from three centrality classes of 200 GeV Au-Au collisions. We consider several data models typical of current model choices, including Fourier seri...
Chakraborty, Shubhankar; Roy Chaudhuri, Partha; Das, Prasanta Kr
2016-07-01
In this communication, a novel optical technique has been proposed for the reconstruction of the shape of a Taylor bubble using measurements from multiple arrays of optical sensors. The deviation of an optical beam passing through the bubble depends on the contour of bubble surface. A theoretical model of the deviation of a beam during the traverse of a Taylor bubble through it has been developed. Using this model and the time history of the deviation captured by the sensor array, the bubble shape has been reconstructed. The reconstruction has been performed using an inverse algorithm based on Bayesian inference technique and Markov chain Monte Carlo sampling algorithm. The reconstructed nose shape has been compared with the true shape, extracted through image processing of high speed images. Finally, an error analysis has been performed to pinpoint the sources of the errors.
Improving PWR core simulations by Monte Carlo uncertainty analysis and Bayesian inference
Castro, Emilio; Buss, Oliver; Garcia-Herranz, Nuria; Hoefer, Axel; Porsch, Dieter
2016-01-01
A Monte Carlo-based Bayesian inference model is applied to the prediction of reactor operation parameters of a PWR nuclear power plant. In this non-perturbative framework, high-dimensional covariance information describing the uncertainty of microscopic nuclear data is combined with measured reactor operation data in order to provide statistically sound, well founded uncertainty estimates of integral parameters, such as the boron letdown curve and the burnup-dependent reactor power distribution. The performance of this methodology is assessed in a blind test approach, where we use measurements of a given reactor cycle to improve the prediction of the subsequent cycle. As it turns out, the resulting improvement of the prediction quality is impressive. In particular, the prediction uncertainty of the boron letdown curve, which is of utmost importance for the planning of the reactor cycle length, can be reduced by one order of magnitude by including the boron concentration measurement information of the previous...
DEFF Research Database (Denmark)
Heller, Rasmus; Lorenzen, Eline D.; Okello, J.B.A;
2008-01-01
pandemic in the late 1800s, but little is known about the earlier demographic history of the species. We analysed genetic variation at 17 microsatellite loci and a 302-bp fragment of the mitochondrial DNA control region to infer past demographic changes in buffalo populations from East Africa. Two Bayesian...
Giuntini, Michael E.; Giuntini, Ronald E.
1991-01-01
A Bayesian inference process for system logistical planning is presented which provides a method for incorporating actual failures with prediction data for an ongoing and improving reliability estimates. The process uses the Weibull distribution, and provides a means for examining and updating logistical and maintenance support needs.
Duforet-Frebourg, Nicolas; Blum, Michael G B
2014-04-01
Patterns of isolation-by-distance (IBD) arise when population differentiation increases with increasing geographic distances. Patterns of IBD are usually caused by local spatial dispersal, which explains why differences of allele frequencies between populations accumulate with distance. However, spatial variations of demographic parameters such as migration rate or population density can generate nonstationary patterns of IBD where the rate at which genetic differentiation accumulates varies across space. To characterize nonstationary patterns of IBD, we infer local genetic differentiation based on Bayesian kriging. Local genetic differentiation for a sampled population is defined as the average genetic differentiation between the sampled population and fictive neighboring populations. To avoid defining populations in advance, the method can also be applied at the scale of individuals making it relevant for landscape genetics. Inference of local genetic differentiation relies on a matrix of pairwise similarity or dissimilarity between populations or individuals such as matrices of FST between pairs of populations. Simulation studies show that maps of local genetic differentiation can reveal barriers to gene flow but also other patterns such as continuous variations of gene flow across habitat. The potential of the method is illustrated with two datasets: single nucleotide polymorphisms from human Swedish populations and dominant markers for alpine plant species.
Bayesian inference for a wave-front model of the neolithization of Europe.
Baggaley, Andrew W; Sarson, Graeme R; Shukurov, Anvar; Boys, Richard J; Golightly, Andrew
2012-07-01
We consider a wave-front model for the spread of neolithic culture across Europe, and use Bayesian inference techniques to provide estimates for the parameters within this model, as constrained by radiocarbon data from southern and western Europe. Our wave-front model allows for both an isotropic background spread (incorporating the effects of local geography) and a localized anisotropic spread associated with major waterways. We introduce an innovative numerical scheme to track the wave front, and use Gaussian process emulators to further increase the efficiency of our model, thereby making Markov chain Monte Carlo methods practical. We allow for uncertainty in the fit of our model, and discuss the inferred distribution of the parameter specifying this uncertainty, along with the distributions of the parameters of our wave-front model. We subsequently use predictive distributions, taking account of parameter uncertainty, to identify radiocarbon sites which do not agree well with our model. These sites may warrant further archaeological study or motivate refinements to the model.
Bayesian inference in camera trapping studies for a class of spatial capture-recapture models
Royle, J. Andrew; Karanth, K. Ullas; Gopalaswamy, Arjun M.; Kumar, N. Samba
2009-01-01
We develop a class of models for inference about abundance or density using spatial capture-recapture data from studies based on camera trapping and related methods. The model is a hierarchical model composed of two components: a point process model describing the distribution of individuals in space (or their home range centers) and a model describing the observation of individuals in traps. We suppose that trap- and individual-specific capture probabilities are a function of distance between individual home range centers and trap locations. We show that the models can be regarded as generalized linear mixed models, where the individual home range centers are random effects. We adopt a Bayesian framework for inference under these models using a formulation based on data augmentation. We apply the models to camera trapping data on tigers from the Nagarahole Reserve, India, collected over 48 nights in 2006. For this study, 120 camera locations were used, but cameras were only operational at 30 locations during any given sample occasion. Movement of traps is common in many camera-trapping studies and represents an important feature of the observation model that we address explicitly in our application.
Bayesian Inference of the Composition and Inflation Power of Hot Jupiters
Thorngren, Daniel Peter; Fortney, Jonathan J.
2016-10-01
The radius of a planet for a given mass is the result of its composition and thermal evolutionary history. For cooler giants, where thermal evolution is relatively well-understood, we can infer a planet's bulk composition from its mass, radius, stellar insolation and age, since all being equal, more metal-rich planets are smaller and denser. For inflated hot giants, there is a degeneracy between inferred composition and inflation power. Within a Bayesian framework we examine both groups, beginning with the cool giant planets. Among these, we observe that the internal heavy-element mass correlates well with the total planet mass, and the metal enrichment relative to the parent star is correlated negatively with planet mass. However, it appears that there is not a simple relation between the planet heavy-element mass and stellar metallicity. These fundamental "mass-metallicity" results are consistent with the core accretion model of planet formation. For the hotter inflated gas giants, we estimate the functional dependence of inflation power on stellar insolation by demanding that the same metal to mass relation applies to both cold and hot gas giants. We consider various forms for this relation and the resulting outliers. This inflation power result is robust to assumptions about metal placement within the planet and equation of state because it relies only on matching the two groups of planets. These results serve as a new way to connect models of planet inflation to existing observations of giant planets.
Bayesian inference of biochemical kinetic parameters using the linear noise approximation
Directory of Open Access Journals (Sweden)
Finkenstädt Bärbel
2009-10-01
Full Text Available Abstract Background Fluorescent and luminescent gene reporters allow us to dynamically quantify changes in molecular species concentration over time on the single cell level. The mathematical modeling of their interaction through multivariate dynamical models requires the deveopment of effective statistical methods to calibrate such models against available data. Given the prevalence of stochasticity and noise in biochemical systems inference for stochastic models is of special interest. In this paper we present a simple and computationally efficient algorithm for the estimation of biochemical kinetic parameters from gene reporter data. Results We use the linear noise approximation to model biochemical reactions through a stochastic dynamic model which essentially approximates a diffusion model by an ordinary differential equation model with an appropriately defined noise process. An explicit formula for the likelihood function can be derived allowing for computationally efficient parameter estimation. The proposed algorithm is embedded in a Bayesian framework and inference is performed using Markov chain Monte Carlo. Conclusion The major advantage of the method is that in contrast to the more established diffusion approximation based methods the computationally costly methods of data augmentation are not necessary. Our approach also allows for unobserved variables and measurement error. The application of the method to both simulated and experimental data shows that the proposed methodology provides a useful alternative to diffusion approximation based methods.
ESTIMATE OF THE HYPSOMETRIC RELATIONSHIP WITH NONLINEAR MODELS FITTED BY EMPIRICAL BAYESIAN METHODS
Directory of Open Access Journals (Sweden)
Monica Fabiana Bento Moreira
2015-09-01
Full Text Available In this paper we propose a Bayesian approach to solve the inference problem with restriction on parameters, regarding to nonlinear models used to represent the hypsometric relationship in clones of Eucalyptus sp. The Bayesian estimates are calculated using Monte Carlo Markov Chain (MCMC method. The proposed method was applied to different groups of actual data from which two were selected to show the results. These results were compared to the results achieved by the minimum square method, highlighting the superiority of the Bayesian approach, since this approach always generate the biologically consistent results for hipsometric relationship.
Storz, Jay F; Beaumont, Mark A; Alberts, Susan C
2002-11-01
The purpose of this study was to test for evidence that savannah baboons (Papio cynocephalus) underwent a population expansion in concert with a hypothesized expansion of African human and chimpanzee populations during the late Pleistocene. The rationale is that any type of environmental event sufficient to cause simultaneous population expansions in African humans and chimpanzees would also be expected to affect other codistributed mammals. To test for genetic evidence of population expansion or contraction, we performed a coalescent analysis of multilocus microsatellite data using a hierarchical Bayesian model. Markov chain Monte Carlo (MCMC) simulations were used to estimate the posterior probability density of demographic and genealogical parameters. The model was designed to allow interlocus variation in mutational and demographic parameters, which made it possible to detect aberrant patterns of variation at individual loci that could result from heterogeneity in mutational dynamics or from the effects of selection at linked sites. Results of the MCMC simulations were consistent with zero variance in demographic parameters among loci, but there was evidence for a 10- to 20-fold difference in mutation rate between the most slowly and most rapidly evolving loci. Results of the model provided strong evidence that savannah baboons have undergone a long-term historical decline in population size. The mode of the highest posterior density for the joint distribution of current and ancestral population size indicated a roughly eightfold contraction over the past 1,000 to 250,000 years. These results indicate that savannah baboons apparently did not share a common demographic history with other codistributed primate species. PMID:12411607
Strategies for MCMC computation inquantitative genetics
DEFF Research Database (Denmark)
Waagepetersen, Rasmus; Ibánēz-Escriche, Noelia; Sorensen, Daniel
both in size and regarding the inferences concerning the genetic covariance parameters. Section 2 discusses general strategies for obtaining efficient MCMC algorithms while Section 3 considers these strategies in the specific context of the San Cristobal-Gaudy et al. (1998) model. Section 4 presents...... be implemented relatively straightforwardly. The assumptions of normality, linearity, and variance homogeneity are in many cases not valid. One may then consider generalized linear mixed models where the genetic random effects enter at the level of the linear predictor. San Cristobal-Gaudy et al. (1998) proposed...... likelihood inference is complicated since it is not possible to evaluate explicitly the likelihood function and conventional Gibbs sampling is difficult since the full conditional distributions are not anymore of standard forms. The aim of this paper is to discuss strategies to obtain efficient Markov chain...
qPR: An adaptive partial-report procedure based on Bayesian inference
Baek, Jongsoo; Lesmes, Luis Andres; Lu, Zhong-Lin
2016-01-01
Iconic memory is best assessed with the partial report procedure in which an array of letters appears briefly on the screen and a poststimulus cue directs the observer to report the identity of the cued letter(s). Typically, 6–8 cue delays or 600–800 trials are tested to measure the iconic memory decay function. Here we develop a quick partial report, or qPR, procedure based on a Bayesian adaptive framework to estimate the iconic memory decay function with much reduced testing time. The iconic memory decay function is characterized by an exponential function and a joint probability distribution of its three parameters. Starting with a prior of the parameters, the method selects the stimulus to maximize the expected information gain in the next test trial. It then updates the posterior probability distribution of the parameters based on the observer's response using Bayesian inference. The procedure is reiterated until either the total number of trials or the precision of the parameter estimates reaches a certain criterion. Simulation studies showed that only 100 trials were necessary to reach an average absolute bias of 0.026 and a precision of 0.070 (both in terms of probability correct). A psychophysical validation experiment showed that estimates of the iconic memory decay function obtained with 100 qPR trials exhibited good precision (the half width of the 68.2% credible interval = 0.055) and excellent agreement with those obtained with 1,600 trials of the conventional method of constant stimuli procedure (RMSE = 0.063). Quick partial-report relieves the data collection burden in characterizing iconic memory and makes it possible to assess iconic memory in clinical populations. PMID:27580045
Bazin, Eric; Dawson, Kevin J; Beaumont, Mark A
2010-06-01
We address the problem of finding evidence of natural selection from genetic data, accounting for the confounding effects of demographic history. In the absence of natural selection, gene genealogies should all be sampled from the same underlying distribution, often approximated by a coalescent model. Selection at a particular locus will lead to a modified genealogy, and this motivates a number of recent approaches for detecting the effects of natural selection in the genome as "outliers" under some models. The demographic history of a population affects the sampling distribution of genealogies, and therefore the observed genotypes and the classification of outliers. Since we cannot see genealogies directly, we have to infer them from the observed data under some model of mutation and demography. Thus the accuracy of an outlier-based approach depends to a greater or a lesser extent on the uncertainty about the demographic and mutational model. A natural modeling framework for this type of problem is provided by Bayesian hierarchical models, in which parameters, such as mutation rates and selection coefficients, are allowed to vary across loci. It has proved quite difficult computationally to implement fully probabilistic genealogical models with complex demographies, and this has motivated the development of approximations such as approximate Bayesian computation (ABC). In ABC the data are compressed into summary statistics, and computation of the likelihood function is replaced by simulation of data under the model. In a hierarchical setting one may be interested both in hyperparameters and parameters, and there may be very many of the latter--for example, in a genetic model, these may be parameters describing each of many loci or populations. This poses a problem for ABC in that one then requires summary statistics for each locus, which, if used naively, leads to a consequent difficulty in conditional density estimation. We develop a general method for applying
Xu, Chengcheng; Wang, Wei; Liu, Pan; Li, Zhibin
2015-12-01
This study aimed to develop a real-time crash risk model with limited data in China by using Bayesian meta-analysis and Bayesian inference approach. A systematic review was first conducted by using three different Bayesian meta-analyses, including the fixed effect meta-analysis, the random effect meta-analysis, and the meta-regression. The meta-analyses provided a numerical summary of the effects of traffic variables on crash risks by quantitatively synthesizing results from previous studies. The random effect meta-analysis and the meta-regression produced a more conservative estimate for the effects of traffic variables compared with the fixed effect meta-analysis. Then, the meta-analyses results were used as informative priors for developing crash risk models with limited data. Three different meta-analyses significantly affect model fit and prediction accuracy. The model based on meta-regression can increase the prediction accuracy by about 15% as compared to the model that was directly developed with limited data. Finally, the Bayesian predictive densities analysis was used to identify the outliers in the limited data. It can further improve the prediction accuracy by 5.0%.
Comparing rates of springtail predation by web-building spiders using Bayesian inference.
Welch, Kelton D; Schofield, Matthew R; Chapman, Eric G; Harwood, James D
2014-08-01
A major goal of gut-content analysis is to quantify predation rates by predators in the field, which could provide insights into the mechanisms behind ecosystem structure and function, as well as quantification of ecosystem services provided. However, percentage-positive results from molecular assays are strongly influenced by factors other than predation rate, and thus can only be reliably used to quantify predation rates under very restrictive conditions. Here, we develop two statistical approaches, one using a parametric bootstrap and the other in terms of Bayesian inference, to build upon previous techniques that use DNA decay rates to rank predators by their rate of prey consumption, by allowing a statistical assessment of confidence in the inferred ranking. To demonstrate the utility of this technique in evaluating ecological data, we test web-building spiders for predation on a primary prey item, springtails. Using these approaches we found that an orb-weaving spider consumes springtail prey at a higher rate than a syntopic sheet-weaving spider, despite occupying microhabitats where springtails are less frequently encountered. We suggest that spider-web architecture (orb web vs. sheet web) is a primary determinant of prey-consumption rates within this assemblage of predators, which demonstrates the potential influence of predator foraging behaviour on trophic web structure. We also discuss how additional assumptions can be incorporated into the same analysis to allow broader application of the technique beyond the specific example presented. We believe that such modelling techniques can greatly advance the field of molecular gut-content analysis. PMID:24635414
Raue, Andreas; Theis, Fabian Joachim; Timmer, Jens
2012-01-01
Increasingly complex applications involve large datasets in combination with non-linear and high dimensional mathematical models. In this context, statistical inference is a challenging issue that calls for pragmatic approaches that take advantage of both Bayesian and frequentist methods. The elegance of Bayesian methodology is founded in the propagation of information content provided by experimental data and prior assumptions to the posterior probability distribution of model predictions. However, for complex applications experimental data and prior assumptions potentially constrain the posterior probability distribution insufficiently. In these situations Bayesian Markov chain Monte Carlo sampling can be infeasible. From a frequentist point of view insufficient experimental data and prior assumptions can be interpreted as non-identifiability. The profile likelihood approach offers to detect and to resolve non-identifiability by experimental design iteratively. Therefore, it allows one to better constrain t...
Bayesian inference on earthquake size distribution: a case study in Italy
Licia, Faenza; Carlo, Meletti; Laura, Sandri
2010-05-01
This paper is focused on the study of earthquake size statistical distribution by using Bayesian inference. The strategy consists in the definition of an a priori distribution based on instrumental seismicity, and modeled as a power law distribution. By using the observed historical data, the power law is then modified in order to obtain the posterior distribution. The aim of this paper is to define the earthquake size distribution using all the seismic database available (i.e., instrumental and historical catalogs) and a robust statistical technique. We apply this methodology to the Italian seismicity, dividing the territory in source zones as done for the seismic hazard assessment, taken here as a reference model. The results suggest that each area has its own peculiar trend: while the power law is able to capture the mean aspect of the earthquake size distribution, the posterior emphasizes different slopes in different areas. Our results are in general agreement with the ones used in the seismic hazard assessment in Italy. However, there are areas in which a flattening in the curve is shown, meaning a significant departure from the power law behavior and implying that there are some local aspects that a power law distribution is not able to capture.
Bayesian Inference of the Evolution of a Phenotype Distribution on a Phylogenetic Tree
Ansari, M. Azim; Didelot, Xavier
2016-01-01
The distribution of a phenotype on a phylogenetic tree is often a quantity of interest. Many phenotypes have imperfect heritability, so that a measurement of the phenotype for an individual can be thought of as a single realization from the phenotype distribution of that individual. If all individuals in a phylogeny had the same phenotype distribution, measured phenotypes would be randomly distributed on the tree leaves. This is, however, often not the case, implying that the phenotype distribution evolves over time. Here we propose a new model based on this principle of evolving phenotype distribution on the branches of a phylogeny, which is different from ancestral state reconstruction where the phenotype itself is assumed to evolve. We develop an efficient Bayesian inference method to estimate the parameters of our model and to test the evidence for changes in the phenotype distribution. We use multiple simulated data sets to show that our algorithm has good sensitivity and specificity properties. Since our method identifies branches on the tree on which the phenotype distribution has changed, it is able to break down a tree into components for which this distribution is unique and constant. We present two applications of our method, one investigating the association between HIV genetic variation and human leukocyte antigen and the other studying host range distribution in a lineage of Salmonella enterica, and we discuss many other potential applications. PMID:27412711
Bayesian Predictive Inference of a Proportion Under a Twofold Small-Area Model
Directory of Open Access Journals (Sweden)
Nandram Balgobin
2016-03-01
Full Text Available We extend the twofold small-area model of Stukel and Rao (1997; 1999 to accommodate binary data. An example is the Third International Mathematics and Science Study (TIMSS, in which pass-fail data for mathematics of students from US schools (clusters are available at the third grade by regions and communities (small areas. We compare the finite population proportions of these small areas. We present a hierarchical Bayesian model in which the firststage binary responses have independent Bernoulli distributions, and each subsequent stage is modeled using a beta distribution, which is parameterized by its mean and a correlation coefficient. This twofold small-area model has an intracluster correlation at the first stage and an intercluster correlation at the second stage. The final-stage mean and all correlations are assumed to be noninformative independent random variables. We show how to infer the finite population proportion of each area. We have applied our models to synthetic TIMSS data to show that the twofold model is preferred over a onefold small-area model that ignores the clustering within areas. We further compare these models using a simulation study, which shows that the intracluster correlation is particularly important.
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J
2015-10-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Wang, Dong; Tsui, Kwok-Leung; Zhou, Qiang
2016-05-01
Rolling element bearings are commonly used in machines to provide support for rotating shafts. Bearing failures may cause unexpected machine breakdowns and increase economic cost. To prevent machine breakdowns and reduce unnecessary economic loss, bearing faults should be detected as early as possible. Because wavelet transform can be used to highlight impulses caused by localized bearing faults, wavelet transform has been widely investigated and proven to be one of the most effective and efficient methods for bearing fault diagnosis. In this paper, a new Gauss-Hermite integration based Bayesian inference method is proposed to estimate the posterior distribution of wavelet parameters. The innovations of this paper are illustrated as follows. Firstly, a non-linear state space model of wavelet parameters is constructed to describe the relationship between wavelet parameters and hypothetical measurements. Secondly, the joint posterior probability density function of wavelet parameters and hypothetical measurements is assumed to follow a joint Gaussian distribution so as to generate Gaussian perturbations for the state space model. Thirdly, Gauss-Hermite integration is introduced to analytically predict and update moments of the joint Gaussian distribution, from which optimal wavelet parameters are derived. At last, an optimal wavelet filtering is conducted to extract bearing fault features and thus identify localized bearing faults. Two instances are investigated to illustrate how the proposed method works. Two comparisons with the fast kurtogram are used to demonstrate that the proposed method can achieve better visual inspection performances than the fast kurtogram.
Palacios, Julia A; Minin, Vladimir N
2013-03-01
Changes in population size influence genetic diversity of the population and, as a result, leave a signature of these changes in individual genomes in the population. We are interested in the inverse problem of reconstructing past population dynamics from genomic data. We start with a standard framework based on the coalescent, a stochastic process that generates genealogies connecting randomly sampled individuals from the population of interest. These genealogies serve as a glue between the population demographic history and genomic sequences. It turns out that only the times of genealogical lineage coalescences contain information about population size dynamics. Viewing these coalescent times as a point process, estimating population size trajectories is equivalent to estimating a conditional intensity of this point process. Therefore, our inverse problem is similar to estimating an inhomogeneous Poisson process intensity function. We demonstrate how recent advances in Gaussian process-based nonparametric inference for Poisson processes can be extended to Bayesian nonparametric estimation of population size dynamics under the coalescent. We compare our Gaussian process (GP) approach to one of the state-of-the-art Gaussian Markov random field (GMRF) methods for estimating population trajectories. Using simulated data, we demonstrate that our method has better accuracy and precision. Next, we analyze two genealogies reconstructed from real sequences of hepatitis C and human Influenza A viruses. In both cases, we recover more believed aspects of the viral demographic histories than the GMRF approach. We also find that our GP method produces more reasonable uncertainty estimates than the GMRF method.
Evaluation of semi-empirical and non-linear drying models by Bayesian inference
Directory of Open Access Journals (Sweden)
Ricardo C. de Oliveira
2014-09-01
Full Text Available Current analysis investigates thin layer drying experiments on passion fruit seeds. Drying characteristics of passion fruit seeds were examined using ambient air for temperature range between 40 and 65ºC and air flow velocity range between 0.6 and 1.2 m s-1. The semi-empirical models were then fitted to the drying data by Bayesian inference, based on the ratios of the difference between the initial and final moisture contents and equilibrium moisture content. The effective diffusivity varied between 1.70 x 10-10 and 4.68 x 10-10 m2 s-1 with temperature and air flow rate increase. Temperature dependence on the diffusivity coefficient was described by an Arrhenius-type relationship. The activation energy for moisture diffusion was 24.36, 35.24 and 13.86 kJ moL-1 for drying air speeds respectively equal to 0.6, 0.9 and 1.2 m s-1.
The Application of Bayesian Inference to Gravitational Waves from Core-Collapse Supernovae
Gossan, Sarah; Ott, Christian; Kalmus, Peter; Logue, Joshua; Heng, Siong
2013-04-01
The gravitational wave (GW) signature of core-collapse supernovae (CCSNe) encodes important information on the supernova explosion mechanism, the workings of which cannot be explored via observations in the electromagnetic spectrum. Recent research has shown that the CCSNe explosion mechanism can be inferred through the application of Bayesian model selection to gravitational wave signals from supernova explosions powered by the neutrino, magnetorotational and acoustic mechanisms. Extending this work, we apply Principal Component Analysis to the GW spectrograms from CCSNe to take into account also the time-frequency evolution of the emitted signals. We do so in the context of Advanced LIGO, to establish if any improvement on distinguishing between various explosion mechanisms can be obtained. Further to this, we consider a five-detector network of interferometers (comprised of the two Advanced LIGO detectors, Advanced Virgo, LIGO India and KAGRA) and generalize the aforementioned analysis for a source of known position but unknown distance, using realistic, re-colored detector data (as opposed to Gaussian noise), in order to make more reliable statements regarding our ability to distinguish between various explosion mechanisms on the basis of their GW signatures.
Directory of Open Access Journals (Sweden)
Dario Cuevas Rivera
2015-10-01
Full Text Available The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an 'intelligent coincidence detector', which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena.
Cuevas Rivera, Dario; Bitzer, Sebastian; Kiebel, Stefan J.
2015-01-01
The olfactory information that is received by the insect brain is encoded in the form of spatiotemporal patterns in the projection neurons of the antennal lobe. These dense and overlapping patterns are transformed into a sparse code in Kenyon cells in the mushroom body. Although it is clear that this sparse code is the basis for rapid categorization of odors, it is yet unclear how the sparse code in Kenyon cells is computed and what information it represents. Here we show that this computation can be modeled by sequential firing rate patterns using Lotka-Volterra equations and Bayesian online inference. This new model can be understood as an ‘intelligent coincidence detector’, which robustly and dynamically encodes the presence of specific odor features. We found that the model is able to qualitatively reproduce experimentally observed activity in both the projection neurons and the Kenyon cells. In particular, the model explains mechanistically how sparse activity in the Kenyon cells arises from the dense code in the projection neurons. The odor classification performance of the model proved to be robust against noise and time jitter in the observed input sequences. As in recent experimental results, we found that recognition of an odor happened very early during stimulus presentation in the model. Critically, by using the model, we found surprising but simple computational explanations for several experimental phenomena. PMID:26451888
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Pullen, Nick; Morris, Richard J
2014-01-01
Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design. PMID:24523891
Bayesian model comparison and parameter inference in systems biology using nested sampling.
Directory of Open Access Journals (Sweden)
Nick Pullen
Full Text Available Inferring parameters for models of biological processes is a current challenge in systems biology, as is the related problem of comparing competing models that explain the data. In this work we apply Skilling's nested sampling to address both of these problems. Nested sampling is a Bayesian method for exploring parameter space that transforms a multi-dimensional integral to a 1D integration over likelihood space. This approach focuses on the computation of the marginal likelihood or evidence. The ratio of evidences of different models leads to the Bayes factor, which can be used for model comparison. We demonstrate how nested sampling can be used to reverse-engineer a system's behaviour whilst accounting for the uncertainty in the results. The effect of missing initial conditions of the variables as well as unknown parameters is investigated. We show how the evidence and the model ranking can change as a function of the available data. Furthermore, the addition of data from extra variables of the system can deliver more information for model comparison than increasing the data from one variable, thus providing a basis for experimental design.
International Nuclear Information System (INIS)
We have developed a Bayesian approach to the analysis of neural electromagnetic (MEG/EEG) data that can incorporate or fuse information from other imaging modalities and addresses the ill-posed inverse problem by sarnpliig the many different solutions which could have produced the given data. From these samples one can draw probabilistic inferences about regions of activation. Our source model assumes a variable number of variable size cortical regions of stimulus-correlated activity. An active region consists of locations on the cortical surf ace, within a sphere centered on some location in cortex. The number and radi of active regions can vary to defined maximum values. The goal of the analysis is to determine the posterior probability distribution for the set of parameters that govern the number, location, and extent of active regions. Markov Chain Monte Carlo is used to generate a large sample of sets of parameters distributed according to the posterior distribution. This sample is representative of the many different source distributions that could account for given data, and allows identification of probable (i.e. consistent) features across solutions. Examples of the use of this analysis technique with both simulated and empirical MEG data are presented
MCMC for Wind Power Simulation
Papaefthymiou, G.; Klöckl, B.
2008-01-01
This paper contributes a Markov chain Monte Carlo (MCMC) method for the direct generation of synthetic time series of wind power output. It is shown that obtaining a stochastic model directly in the wind power domain leads to reduced number of states and to lower order of the Markov chain at equal p
Duggento, Andrea; McClintock, Peter V E; Stefanovska, Aneta
2012-01-01
Living systems have time-evolving interactions that, until recently, could not be identified accurately from recorded time series in the presence of noise. Stankovski et al. (Phys. Rev. Lett. 109 024101, 2012) introduced a method based on dynamical Bayesian inference that facilitates the simultaneous detection of time-varying synchronization, directionality of influence, and coupling functions. It can distinguish unsynchronized dynamics from noise-induced phase slips. The method is based on phase dynamics, with Bayesian inference of the time- evolving parameters being achieved by shaping the prior densities to incorporate knowledge of previous samples. We now present the method in detail using numerically-generated data, data from an analog electronic circuit, and cardio-respiratory data. We also generalize the method to encompass networks of interacting oscillators and thus demonstrate its applicability to small-scale networks.
A Fatty Acid Based Bayesian Approach for Inferring Diet in Aquatic Consumers.
Galloway, Aaron W E; Brett, Michael T; Holtgrieve, Gordon W; Ward, Eric J; Ballantyne, Ashley P; Burns, Carolyn W; Kainz, Martin J; Müller-Navarra, Doerthe C; Persson, Jonas; Ravet, Joseph L; Strandberg, Ursula; Taipale, Sami J; Alhgren, Gunnel
2015-01-01
We modified the stable isotope mixing model MixSIR to infer primary producer contributions to consumer diets based on their fatty acid composition. To parameterize the algorithm, we generated a 'consumer-resource library' of FA signatures of Daphnia fed different algal diets, using 34 feeding trials representing diverse phytoplankton lineages. This library corresponds to the resource or producer file in classic Bayesian mixing models such as MixSIR or SIAR. Because this library is based on the FA profiles of zooplankton consuming known diets, and not the FA profiles of algae directly, trophic modification of consumer lipids is directly accounted for. To test the model, we simulated hypothetical Daphnia comprised of 80% diatoms, 10% green algae, and 10% cryptophytes and compared the FA signatures of these known pseudo-mixtures to outputs generated by the mixing model. The algorithm inferred these simulated consumers were comprised of 82% (63-92%) [median (2.5th to 97.5th percentile credible interval)] diatoms, 11% (4-22%) green algae, and 6% (0-25%) cryptophytes. We used the same model with published phytoplankton stable isotope (SI) data for δ13C and δ15N to examine how a SI based approach resolved a similar scenario. With SI, the algorithm inferred that the simulated consumer assimilated 52% (4-91%) diatoms, 23% (1-78%) green algae, and 18% (1-73%) cyanobacteria. The accuracy and precision of SI based estimates was extremely sensitive to both resource and consumer uncertainty, as well as the trophic fractionation assumption. These results indicate that when using only two tracers with substantial uncertainty for the putative resources, as is often the case in this class of analyses, the underdetermined constraint in consumer-resource SI analyses may be intractable. The FA based approach alleviated the underdetermined constraint because many more FA biomarkers were utilized (n algae, and cryptophytes) have very characteristic FA compositions, and the FA profiles
A Fatty Acid Based Bayesian Approach for Inferring Diet in Aquatic Consumers.
Directory of Open Access Journals (Sweden)
Aaron W E Galloway
Full Text Available We modified the stable isotope mixing model MixSIR to infer primary producer contributions to consumer diets based on their fatty acid composition. To parameterize the algorithm, we generated a 'consumer-resource library' of FA signatures of Daphnia fed different algal diets, using 34 feeding trials representing diverse phytoplankton lineages. This library corresponds to the resource or producer file in classic Bayesian mixing models such as MixSIR or SIAR. Because this library is based on the FA profiles of zooplankton consuming known diets, and not the FA profiles of algae directly, trophic modification of consumer lipids is directly accounted for. To test the model, we simulated hypothetical Daphnia comprised of 80% diatoms, 10% green algae, and 10% cryptophytes and compared the FA signatures of these known pseudo-mixtures to outputs generated by the mixing model. The algorithm inferred these simulated consumers were comprised of 82% (63-92% [median (2.5th to 97.5th percentile credible interval] diatoms, 11% (4-22% green algae, and 6% (0-25% cryptophytes. We used the same model with published phytoplankton stable isotope (SI data for δ13C and δ15N to examine how a SI based approach resolved a similar scenario. With SI, the algorithm inferred that the simulated consumer assimilated 52% (4-91% diatoms, 23% (1-78% green algae, and 18% (1-73% cyanobacteria. The accuracy and precision of SI based estimates was extremely sensitive to both resource and consumer uncertainty, as well as the trophic fractionation assumption. These results indicate that when using only two tracers with substantial uncertainty for the putative resources, as is often the case in this class of analyses, the underdetermined constraint in consumer-resource SI analyses may be intractable. The FA based approach alleviated the underdetermined constraint because many more FA biomarkers were utilized (n < 20, different primary producers (e.g., diatoms, green algae, and
Directory of Open Access Journals (Sweden)
Hero Alfred
2010-11-01
Full Text Available Abstract Background Nonparametric Bayesian techniques have been developed recently to extend the sophistication of factor models, allowing one to infer the number of appropriate factors from the observed data. We consider such techniques for sparse factor analysis, with application to gene-expression data from three virus challenge studies. Particular attention is placed on employing the Beta Process (BP, the Indian Buffet Process (IBP, and related sparseness-promoting techniques to infer a proper number of factors. The posterior density function on the model parameters is computed using Gibbs sampling and variational Bayesian (VB analysis. Results Time-evolving gene-expression data are considered for respiratory syncytial virus (RSV, Rhino virus, and influenza, using blood samples from healthy human subjects. These data were acquired in three challenge studies, each executed after receiving institutional review board (IRB approval from Duke University. Comparisons are made between several alternative means of per-forming nonparametric factor analysis on these data, with comparisons as well to sparse-PCA and Penalized Matrix Decomposition (PMD, closely related non-Bayesian approaches. Conclusions Applying the Beta Process to the factor scores, or to the singular values of a pseudo-SVD construction, the proposed algorithms infer the number of factors in gene-expression data. For real data the "true" number of factors is unknown; in our simulations we consider a range of noise variances, and the proposed Bayesian models inferred the number of factors accurately relative to other methods in the literature, such as sparse-PCA and PMD. We have also identified a "pan-viral" factor of importance for each of the three viruses considered in this study. We have identified a set of genes associated with this pan-viral factor, of interest for early detection of such viruses based upon the host response, as quantified via gene-expression data.
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
2012-01-01
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture fine scale rules of interaction, which are primarily mediated by physical contact. Conversely, the Markovian self-propelled particle model captures the fine scale rules of interaction but fails to reproduce global dynamics. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
HASSET: a probability event tree tool to evaluate future volcanic scenarios using Bayesian inference
Sobradelo, Rosa; Bartolini, Stefania; Martí, Joan
2014-01-01
Event tree structures constitute one of the most useful and necessary tools in modern volcanology for assessment of hazards from future volcanic scenarios (those that culminate in an eruptive event as well as those that do not). They are particularly relevant for evaluation of long- and short-term probabilities of occurrence of possible volcanic scenarios and their potential impacts on urbanized areas. In this paper, we introduce Hazard Assessment Event Tree (HASSET), a probability tool, built on an event tree structure that uses Bayesian inference to estimate the probability of occurrence of a future volcanic scenario and to evaluate the most relevant sources of uncertainty from the corresponding volcanic system. HASSET includes hazard assessment of noneruptive and nonmagmatic volcanic scenarios, that is, episodes of unrest that do not evolve into volcanic eruption but have an associated volcanic hazard (e.g., sector collapse and phreatic explosion), as well as unrest episodes triggered by external triggers rather than the magmatic system alone. Additionally, HASSET introduces the Delta method to assess precision of the probability estimates, by reporting a 1 standard deviation variability interval around the expected value for each scenario. HASSET is presented as a free software package in the form of a plug-in for the open source geographic information system Quantum Gis (QGIS), providing a graphically supported computation of the event tree structure in an interactive and user-friendly way. We also include further in-depth explanations for each node together with an application of HASSET to Teide-Pico Viejo volcanic complex (Spain).
Energy Technology Data Exchange (ETDEWEB)
Kang, Seongkeun; Seong, Poong Hyun [Korea Advanced Institute of Science and Technology, Daejeon (Korea, Republic of)
2014-05-15
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%.
International Nuclear Information System (INIS)
The purpose of this paper is to confirm if Bayesian inference can properly reflect the situation awareness of real human operators, and find the difference between the situation of ideal and practical operators, and investigate the factors which contributes to those difference. As a results, human can not think like computer. If human can memorize all the information, and their thinking process is same to the CPU of computer, the results of these two experiments come out more than 99%. However the probability of finding right malfunction by humans are only 64.52% in simple experiment, and 51.61% in complex experiment. Cognition is the mental processing that includes the attention of working memory, comprehending and producing language, calculating, reasoning, problem solving, and decision making. There are many reasons why human thinking process is different with computer, but in this experiment, we suggest that the working memory is the most important factor. Humans have limited working memory which has only seven chunks capacity. These seven chunks are called magic number. If there are more than seven sequential information, people start to forget the previous information because their working memory capacity is running over. We can check how much working memory affects to the result through the simple experiment. Then what if we neglect the effect of working memory? The total number of subjects who have incorrect memory is 7 (subject 3, 5, 6, 7, 8, 15, 25). They could find the right malfunction if the memory hadn't changed because of lack of working memory. Then the probability of find correct malfunction will be increased to 87.10% from 64.52%. Complex experiment has similar result. In this case, eight subjects(1, 5, 8, 9, 15, 17, 18, 30) had changed the memory, and it affects to find the right malfunction. Considering it, then the probability would be (16+8)/31 = 77.42%
Multi-scale inference of interaction rules in animal groups using Bayesian model selection.
Directory of Open Access Journals (Sweden)
Richard P Mann
Full Text Available Inference of interaction rules of animals moving in groups usually relies on an analysis of large scale system behaviour. Models are tuned through repeated simulation until they match the observed behaviour. More recent work has used the fine scale motions of animals to validate and fit the rules of interaction of animals in groups. Here, we use a Bayesian methodology to compare a variety of models to the collective motion of glass prawns (Paratya australiensis. We show that these exhibit a stereotypical 'phase transition', whereby an increase in density leads to the onset of collective motion in one direction. We fit models to this data, which range from: a mean-field model where all prawns interact globally; to a spatial Markovian model where prawns are self-propelled particles influenced only by the current positions and directions of their neighbours; up to non-Markovian models where prawns have 'memory' of previous interactions, integrating their experiences over time when deciding to change behaviour. We show that the mean-field model fits the large scale behaviour of the system, but does not capture the observed locality of interactions. Traditional self-propelled particle models fail to capture the fine scale dynamics of the system. The most sophisticated model, the non-Markovian model, provides a good match to the data at both the fine scale and in terms of reproducing global dynamics, while maintaining a biologically plausible perceptual range. We conclude that prawns' movements are influenced by not just the current direction of nearby conspecifics, but also those encountered in the recent past. Given the simplicity of prawns as a study system our research suggests that self-propelled particle models of collective motion should, if they are to be realistic at multiple biological scales, include memory of previous interactions and other non-Markovian effects.
Analysis of simulated data for the KArlsruhe TRItium Neutrino experiment using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, C.
2011-01-01
The KATRIN (Karlsruhe Tritium Neutrino) experiment will analyze the tritium β spectrum to determine the mass of the neutrino with a sensitivity of 0.2 eV (90% C.L.). This approach to a measurement of the absolute value of the neutrino mass relies only on the principle of energy conservation and can...... neutrinos. As an alternative to the frequentist minimization methods used in the analysis of the earlier experiments in Mainz and Troitsk we have been investigating Markov chain Monte Carlo (MCMC) methods which are very well suited for probing multiparameter spaces. We found that implementing the KATRIN χ2...
MCMC joint separation and segmentation of hidden Markov fields
Snoussi, H; Snoussi, Hichem; Mohammad-Djafari, Ali
2002-01-01
In this contribution, we consider the problem of the blind separation of noisy instantaneously mixed images. The images are modelized by hidden Markov fields with unknown parameters. Given the observed images, we give a Bayesian formulation and we propose to solve the resulting data augmentation problem by implementing a Monte Carlo Markov Chain (MCMC) procedure. We separate the unknown variables into two categories: 1. The parameters of interest which are the mixing matrix, the noise covariance and the parameters of the sources distributions. 2. The hidden variables which are the unobserved sources and the unobserved pixels classification labels. The proposed algorithm provides in the stationary regime samples drawn from the posterior distributions of all the variables involved in the problem leading to a flexibility in the cost function choice. We discuss and characterize some problems of non identifiability and degeneracies of the parameters likelihood and the behavior of the MCMC algorithm in this case. F...
Directory of Open Access Journals (Sweden)
Fang-Rong Yan
Full Text Available This article provides a fully bayesian approach for modeling of single-dose and complete pharmacokinetic data in a population pharmacokinetic (PK model. To overcome the impact of outliers and the difficulty of computation, a generalized linear model is chosen with the hypothesis that the errors follow a multivariate Student t distribution which is a heavy-tailed distribution. The aim of this study is to investigate and implement the performance of the multivariate t distribution to analyze population pharmacokinetic data. Bayesian predictive inferences and the Metropolis-Hastings algorithm schemes are used to process the intractable posterior integration. The precision and accuracy of the proposed model are illustrated by the simulating data and a real example of theophylline data.
Directory of Open Access Journals (Sweden)
Edson Sandoval-Castellanos
Full Text Available Inference of population demographic history has vastly improved in recent years due to a number of technological and theoretical advances including the use of ancient DNA. Approximate Bayesian computation (ABC stands among the most promising methods due to its simple theoretical fundament and exceptional flexibility. However, limited availability of user-friendly programs that perform ABC analysis renders it difficult to implement, and hence programming skills are frequently required. In addition, there is limited availability of programs able to deal with heterochronous data. Here we present the software BaySICS: Bayesian Statistical Inference of Coalescent Simulations. BaySICS provides an integrated and user-friendly platform that performs ABC analyses by means of coalescent simulations from DNA sequence data. It estimates historical demographic population parameters and performs hypothesis testing by means of Bayes factors obtained from model comparisons. Although providing specific features that improve inference from datasets with heterochronous data, BaySICS also has several capabilities making it a suitable tool for analysing contemporary genetic datasets. Those capabilities include joint analysis of independent tables, a graphical interface and the implementation of Markov-chain Monte Carlo without likelihoods.
Knuth, K. H.
2001-05-01
We consider the application of Bayesian inference to the study of self-organized structures in complex adaptive systems. In particular, we examine the distribution of elements, agents, or processes in systems dominated by hierarchical structure. We demonstrate that results obtained by Caianiello [1] on Hierarchical Modular Systems (HMS) can be found by applying Jaynes' Principle of Group Invariance [2] to a few key assumptions about our knowledge of hierarchical organization. Subsequent application of the Principle of Maximum Entropy allows inferences to be made about specific systems. The utility of the Bayesian method is considered by examining both successes and failures of the hierarchical model. We discuss how Caianiello's original statements suffer from the Mind Projection Fallacy [3] and we restate his assumptions thus widening the applicability of the HMS model. The relationship between inference and statistical physics, described by Jaynes [4], is reiterated with the expectation that this realization will aid the field of complex systems research by moving away from often inappropriate direct application of statistical mechanics to a more encompassing inferential methodology.
Sankararaman, Shankar
2016-01-01
This paper presents a computational framework for uncertainty characterization and propagation, and sensitivity analysis under the presence of aleatory and epistemic un- certainty, and develops a rigorous methodology for efficient refinement of epistemic un- certainty by identifying important epistemic variables that significantly affect the overall performance of an engineering system. The proposed methodology is illustrated using the NASA Langley Uncertainty Quantification Challenge (NASA-LUQC) problem that deals with uncertainty analysis of a generic transport model (GTM). First, Bayesian inference is used to infer subsystem-level epistemic quantities using the subsystem-level model and corresponding data. Second, tools of variance-based global sensitivity analysis are used to identify four important epistemic variables (this limitation specified in the NASA-LUQC is reflective of practical engineering situations where not all epistemic variables can be refined due to time/budget constraints) that significantly affect system-level performance. The most significant contribution of this paper is the development of the sequential refine- ment methodology, where epistemic variables for refinement are not identified all-at-once. Instead, only one variable is first identified, and then, Bayesian inference and global sensi- tivity calculations are repeated to identify the next important variable. This procedure is continued until all 4 variables are identified and the refinement in the system-level perfor- mance is computed. The advantages of the proposed sequential refinement methodology over the all-at-once uncertainty refinement approach are explained, and then applied to the NASA Langley Uncertainty Quantification Challenge problem.
Gustafson, Paul
2014-01-01
Partially identified models are characterized by the distribution of observables being compatible with a set of values for the target parameter, rather than a single value. This set is often referred to as an identification region. From a non-Bayesian point of view, the identification region is the object revealed to the investigator in the limit of increasing sample size. Conversely, a Bayesian analysis provides the identification region plus the limiting posterior distribution over this reg...
GSEVM v.2: MCMC software to analyse genetically structured environmental variance models
DEFF Research Database (Denmark)
Ibáñez-Escriche, N; Garcia, M; Sorensen, D
2010-01-01
This note provides a description of software that allows to fit Bayesian genetically structured variance models using Markov chain Monte Carlo (MCMC). The gsevm v.2 program was written in Fortran 90. The DOS and Unix executable programs, the user's guide, and some example files are freely availab...
Morrissey, Edward R; Juárez, Miguel A; Denby, Katherine J; Burroughs, Nigel J
2011-10-01
We propose a semiparametric Bayesian model, based on penalized splines, for the recovery of the time-invariant topology of a causal interaction network from longitudinal data. Our motivation is inference of gene regulatory networks from low-resolution microarray time series, where existence of nonlinear interactions is well known. Parenthood relations are mapped by augmenting the model with kinship indicators and providing these with either an overall or gene-wise hierarchical structure. Appropriate specification of the prior is crucial to control the flexibility of the splines, especially under circumstances of scarce data; thus, we provide an informative, proper prior. Substantive improvement in network inference over a linear model is demonstrated using synthetic data drawn from ordinary differential equation models and gene expression from an experimental data set of the Arabidopsis thaliana circadian rhythm.
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. PMID:25404574
Yu, Jianbo
2015-12-01
Prognostics is much efficient to achieve zero-downtime performance, maximum productivity and proactive maintenance of machines. Prognostics intends to assess and predict the time evolution of machine health degradation so that machine failures can be predicted and prevented. A novel prognostics system is developed based on the data-model-fusion scheme using the Bayesian inference-based self-organizing map (SOM) and an integration of logistic regression (LR) and high-order particle filtering (HOPF). In this prognostics system, a baseline SOM is constructed to model the data distribution space of healthy machine under an assumption that predictable fault patterns are not available. Bayesian inference-based probability (BIP) derived from the baseline SOM is developed as a quantification indication of machine health degradation. BIP is capable of offering failure probability for the monitored machine, which has intuitionist explanation related to health degradation state. Based on those historic BIPs, the constructed LR and its modeling noise constitute a high-order Markov process (HOMP) to describe machine health propagation. HOPF is used to solve the HOMP estimation to predict the evolution of the machine health in the form of a probability density function (PDF). An on-line model update scheme is developed to adapt the Markov process changes to machine health dynamics quickly. The experimental results on a bearing test-bed illustrate the potential applications of the proposed system as an effective and simple tool for machine health prognostics.
Bayesian networks in educational assessment
Almond, Russell G; Steinberg, Linda S; Yan, Duanli; Williamson, David M
2015-01-01
Bayesian inference networks, a synthesis of statistics and expert systems, have advanced reasoning under uncertainty in medicine, business, and social sciences. This innovative volume is the first comprehensive treatment exploring how they can be applied to design and analyze innovative educational assessments. Part I develops Bayes nets’ foundations in assessment, statistics, and graph theory, and works through the real-time updating algorithm. Part II addresses parametric forms for use with assessment, model-checking techniques, and estimation with the EM algorithm and Markov chain Monte Carlo (MCMC). A unique feature is the volume’s grounding in Evidence-Centered Design (ECD) framework for assessment design. This “design forward” approach enables designers to take full advantage of Bayes nets’ modularity and ability to model complex evidentiary relationships that arise from performance in interactive, technology-rich assessments such as simulations. Part III describes ECD, situates Bayes nets as ...
Adaptive multiscale MCMC algorithm for uncertainty quantification in seismic parameter estimation
Tan, Xiaosi
2014-08-05
Formulating an inverse problem in a Bayesian framework has several major advantages (Sen and Stoffa, 1996). It allows finding multiple solutions subject to flexible a priori information and performing uncertainty quantification in the inverse problem. In this paper, we consider Bayesian inversion for the parameter estimation in seismic wave propagation. The Bayes\\' theorem allows writing the posterior distribution via the likelihood function and the prior distribution where the latter represents our prior knowledge about physical properties. One of the popular algorithms for sampling this posterior distribution is Markov chain Monte Carlo (MCMC), which involves making proposals and calculating their acceptance probabilities. However, for large-scale problems, MCMC is prohibitevely expensive as it requires many forward runs. In this paper, we propose a multilevel MCMC algorithm that employs multilevel forward simulations. Multilevel forward simulations are derived using Generalized Multiscale Finite Element Methods that we have proposed earlier (Efendiev et al., 2013a; Chung et al., 2013). Our overall Bayesian inversion approach provides a substantial speed-up both in the process of the sampling via preconditioning using approximate posteriors and the computation of the forward problems for different proposals by using the adaptive nature of multiscale methods. These aspects of the method are discussed n the paper. This paper is motivated by earlier work of M. Sen and his collaborators (Hong and Sen, 2007; Hong, 2008) who proposed the development of efficient MCMC techniques for seismic applications. In the paper, we present some preliminary numerical results.
Bayesian inference of the resonance content of p(gamma,K^+)Lambda
De Cruz, Lesley; Vancraeyveld, Pieter; Ryckebusch, Jan
2011-01-01
A Bayesian analysis of the world's p(gamma,K^+)Lambda data is presented. We find that the following nucleon resonances have the highest probability of contributing to the reaction: S11(1535), S11(1650), F15(1680), P13(1720), D13(1900), P13(1900), P11(1900), and F15(2000). We adopt a Regge-plus-resonance framework featuring consistent couplings for nucleon resonances up to spin J=5/2. We evaluate all possible combinations of 11 candidate resonances. The best model is selected from the 2048 model variants by calculating the Bayesian evidence values against the world's p(gamma,K^+)Lambda data.
From least squares to multilevel modeling: A graphical introduction to Bayesian inference
Loredo, Thomas J.
2016-01-01
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
Ge, L.; Asseldonk, van, N.; Valeeva, N.I.; Hennen, W.H.G.J.; Bergevoet, R.H.M.
2011-01-01
Efficient policy intervention to reduce antibiotic use in livestock production requires knowledge about the rationale underlying antibiotic usage. Animal health status and management quality are considered the two most important factors that influence farmersâ¿¿ decision-making concerning antibiotic use. Information on these two factors is therefore crucial in designing incentive mechanisms. In this paper, a Bayesian belief network (BBN) is built to represent the knowledge on how these factor...
DEFF Research Database (Denmark)
Iglesias, Juan Eugenio; Sabuncu, Mert Rory; Van Leemput, Koen
2013-01-01
Many segmentation algorithms in medical image analysis use Bayesian modeling to augment local image appearance with prior anatomical knowledge. Such methods often contain a large number of free parameters that are first estimated and then kept fixed during the actual segmentation process. However...... in an Alzheimer’s disease classification task. As an additional benefit, the technique also allows one to compute informative “error bars” on the volume estimates of individual structures....
Safner, T.; Miller, M.P.; McRae, B.H.; Fortin, M.-J.; Manel, S.
2011-01-01
Recently, techniques available for identifying clusters of individuals or boundaries between clusters using genetic data from natural populations have expanded rapidly. Consequently, there is a need to evaluate these different techniques. We used spatially-explicit simulation models to compare three spatial Bayesian clustering programs and two edge detection methods. Spatially-structured populations were simulated where a continuous population was subdivided by barriers. We evaluated the ability of each method to correctly identify boundary locations while varying: (i) time after divergence, (ii) strength of isolation by distance, (iii) level of genetic diversity, and (iv) amount of gene flow across barriers. To further evaluate the methods' effectiveness to detect genetic clusters in natural populations, we used previously published data on North American pumas and a European shrub. Our results show that with simulated and empirical data, the Bayesian spatial clustering algorithms outperformed direct edge detection methods. All methods incorrectly detected boundaries in the presence of strong patterns of isolation by distance. Based on this finding, we support the application of Bayesian spatial clustering algorithms for boundary detection in empirical datasets, with necessary tests for the influence of isolation by distance. ?? 2011 by the authors; licensee MDPI, Basel, Switzerland.
Linden, Daniel W; Roloff, Gary J
2015-08-01
Pilot studies are often used to design short-term research projects and long-term ecological monitoring programs, but data are sometimes discarded when they do not match the eventual survey design. Bayesian hierarchical modeling provides a convenient framework for integrating multiple data sources while explicitly separating sample variation into observation and ecological state processes. Such an approach can better estimate state uncertainty and improve inferences from short-term studies in dynamic systems. We used a dynamic multistate occupancy model to estimate the probabilities of occurrence and nesting for white-headed woodpeckers Picoides albolarvatus in recent harvest units within managed forests of northern California, USA. Our objectives were to examine how occupancy states and state transitions were related to forest management practices, and how the probabilities changed over time. Using Gibbs variable selection, we made inferences using multiple model structures and generated model-averaged estimates. Probabilities of white-headed woodpecker occurrence and nesting were high in 2009 and 2010, and the probability that nesting persisted at a site was positively related to the snag density in harvest units. Prior-year nesting resulted in higher probabilities of subsequent occurrence and nesting. We demonstrate the benefit of forest management practices that increase the density of retained snags in harvest units for providing white-headed woodpecker nesting habitat. While including an additional year of data from our pilot study did not drastically alter management recommendations, it changed the interpretation of the mechanism behind the observed dynamics. Bayesian hierarchical modeling has the potential to maximize the utility of studies based on small sample sizes while fully accounting for measurement error and both estimation and model uncertainty, thereby improving the ability of observational data to inform conservation and management strategies
Bayesian inference for data assimilation using Least-Squares Finite Element methods
International Nuclear Information System (INIS)
It has recently been observed that Least-Squares Finite Element methods (LS-FEMs) can be used to assimilate experimental data into approximations of PDEs in a natural way, as shown by Heyes et al. in the case of incompressible Navier-Stokes flow. The approach was shown to be effective without regularization terms, and can handle substantial noise in the experimental data without filtering. Of great practical importance is that - unlike other data assimilation techniques - it is not significantly more expensive than a single physical simulation. However the method as presented so far in the literature is not set in the context of an inverse problem framework, so that for example the meaning of the final result is unclear. In this paper it is shown that the method can be interpreted as finding a maximum a posteriori (MAP) estimator in a Bayesian approach to data assimilation, with normally distributed observational noise, and a Bayesian prior based on an appropriate norm of the governing equations. In this setting the method may be seen to have several desirable properties: most importantly discretization and modelling error in the simulation code does not affect the solution in limit of complete experimental information, so these errors do not have to be modelled statistically. Also the Bayesian interpretation better justifies the choice of the method, and some useful generalizations become apparent. The technique is applied to incompressible Navier-Stokes flow in a pipe with added velocity data, where its effectiveness, robustness to noise, and application to inverse problems is demonstrated.
DEFF Research Database (Denmark)
Møller, Jesper; Rasmussen, Jakob Gulddahl
We introduce a flexible spatial point process model for spatial point patterns exhibiting linear structures, without incorporating a latent line process. The model is given by an underlying sequential point process model, i.e. each new point is generated given the previous points. Under this model...... previous points is such that the dependent cluster point is likely to occur closely to a previous cluster point. We demonstrate the flexibility of the model for producing point patterns with linear structures, and propose to use the model as the likelihood in a Bayesian setting when analyzing a spatial...
MCMC with Strings and Branes: The Suburban Algorithm
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we introduce a general suite of Markov chain Monte Carlo (MCMC) "suburban samplers" (i.e., spread out Metropolis). The suburban algorithm involves an ensemble of statistical agents connected together by a random network. Performance of the collective in reaching a fast and accurate inference depends primarily on the average number of nearest neighbor connections. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthink" takes over, and the performance of the sampler declines.
Directory of Open Access Journals (Sweden)
Mariana Inés Pocovi
2015-06-01
Full Text Available Understanding the population structure and genetic diversity in sugarcane (Saccharum officinarum L. accessions from INTA germplasm bank (Argentina will be of great importance for germplasm collection and breeding improvement as it will identify diverse parental combinations to create segregating progenies with maximum genetic variability for further selection. A Bayesian approach, ordination methods (PCoA, Principal Coordinate Analysis and clustering analysis (UPGMA, Unweighted Pair Group Method with Arithmetic Mean were applied to this purpose. Sixty three INTA sugarcane hybrids were genotyped for 107 Simple Sequence Repeat (SSR and 136 Amplified Fragment Length Polymorphism (AFLP loci. Given the low probability values found with AFLP for individual assignment (4.7%, microsatellites seemed to perform better (54% for STRUCTURE analysis that revealed the germplasm to exist in five optimum groups with partly corresponding to their origin. However clusters shown high degree of admixture, F ST values confirmed the existence of differences among groups. Dissimilarity coefficients ranged from 0.079 to 0.651. PCoA separated sugarcane in groups that did not agree with those identified by STRUCTURE. The clustering including all genotypes neither showed resemblance to populations find by STRUCTURE, but clustering performed considering only individuals displaying a proportional membership > 0.6 in their primary population obtained with STRUCTURE showed close similarities. The Bayesian method indubitably brought more information on cultivar origins than classical PCoA and hierarchical clustering method.
Bayesian Inference in Hidden Markov Random Fields for Binary Data Defined on Large Lattices
Friel, N.; Pettitt, A.N.; Reeves, R.; Wit, E.
2009-01-01
Hidden Markov random fields represent a complex hierarchical model, where the hidden latent process is an undirected graphical structure. Performing inference for such models is difficult primarily because the likelihood of the hidden states is often unavailable. The main contribution of this articl
Inversion of hierarchical Bayesian models using Gaussian processes.
Lomakina, Ekaterina I; Paliwal, Saee; Diaconescu, Andreea O; Brodersen, Kay H; Aponte, Eduardo A; Buhmann, Joachim M; Stephan, Klaas E
2015-09-01
Over the past decade, computational approaches to neuroimaging have increasingly made use of hierarchical Bayesian models (HBMs), either for inferring on physiological mechanisms underlying fMRI data (e.g., dynamic causal modelling, DCM) or for deriving computational trajectories (from behavioural data) which serve as regressors in general linear models. However, an unresolved problem is that standard methods for inverting the hierarchical Bayesian model are either very slow, e.g. Markov Chain Monte Carlo Methods (MCMC), or are vulnerable to local minima in non-convex optimisation problems, such as variational Bayes (VB). This article considers Gaussian process optimisation (GPO) as an alternative approach for global optimisation of sufficiently smooth and efficiently evaluable objective functions. GPO avoids being trapped in local extrema and can be computationally much more efficient than MCMC. Here, we examine the benefits of GPO for inverting HBMs commonly used in neuroimaging, including DCM for fMRI and the Hierarchical Gaussian Filter (HGF). Importantly, to achieve computational efficiency despite high-dimensional optimisation problems, we introduce a novel combination of GPO and local gradient-based search methods. The utility of this GPO implementation for DCM and HGF is evaluated against MCMC and VB, using both synthetic data from simulations and empirical data. Our results demonstrate that GPO provides parameter estimates with equivalent or better accuracy than the other techniques, but at a fraction of the computational cost required for MCMC. We anticipate that GPO will prove useful for robust and efficient inversion of high-dimensional and nonlinear models of neuroimaging data. PMID:26048619
van den Hout, Ardo; Fox, Jean-Paul; Klein Entink, Rinke H
2015-12-01
Longitudinal data can be used to estimate the transition intensities between healthy and unhealthy states prior to death. An illness-death model for history of stroke is presented, where time-dependent transition intensities are regressed on a latent variable representing cognitive function. The change of this function over time is described by a linear growth model with random effects. Occasion-specific cognitive function is measured by an item response model for longitudinal scores on the Mini-Mental State Examination, a questionnaire used to screen for cognitive impairment. The illness-death model will be used to identify and to explore the relationship between occasion-specific cognitive function and stroke. Combining a multi-state model with the latent growth model defines a joint model which extends current statistical inference regarding disease progression and cognitive function. Markov chain Monte Carlo methods are used for Bayesian inference. Data stem from the Medical Research Council Cognitive Function and Ageing Study in the UK (1991-2005). PMID:22080595
Energy Technology Data Exchange (ETDEWEB)
Marzouk, Youssef; Fast P. (Lawrence Livermore National Laboratory, Livermore, CA); Kraus, M. (Peterson AFB, CO); Ray, J. P.
2006-01-01
Terrorist attacks using an aerosolized pathogen preparation have gained credibility as a national security concern after the anthrax attacks of 2001. The ability to characterize such attacks, i.e., to estimate the number of people infected, the time of infection, and the average dose received, is important when planning a medical response. We address this question of characterization by formulating a Bayesian inverse problem predicated on a short time-series of diagnosed patients exhibiting symptoms. To be of relevance to response planning, we limit ourselves to 3-5 days of data. In tests performed with anthrax as the pathogen, we find that these data are usually sufficient, especially if the model of the outbreak used in the inverse problem is an accurate one. In some cases the scarcity of data may initially support outbreak characterizations at odds with the true one, but with sufficient data the correct inferences are recovered; in other words, the inverse problem posed and its solution methodology are consistent. We also explore the effect of model error-situations for which the model used in the inverse problem is only a partially accurate representation of the outbreak; here, the model predictions and the observations differ by more than a random noise. We find that while there is a consistent discrepancy between the inferred and the true characterizations, they are also close enough to be of relevance when planning a response.
Blanc, Guillermo A; Vogt, Frédéric P A; Dopita, Michael A
2014-01-01
We present a new method for inferring the metallicity (Z) and ionization parameter (q) of HII regions and star-forming galaxies using strong nebular emission lines (SEL). We use Bayesian inference to derive the joint and marginalized posterior probability density functions for Z and q given a set of observed line fluxes and an input photo-ionization model. Our approach allows the use of arbitrary sets of SELs and the inclusion of flux upper limits. The method provides a self-consistent way of determining the physical conditions of ionized nebulae that is not tied to the arbitrary choice of a particular SEL diagnostic and uses all the available information. Unlike theoretically calibrated SEL diagnostics the method is flexible and not tied to a particular photo-ionization model. We describe our algorithm, validate it against other methods, and present a tool that implements it called IZI. Using a sample of nearby extra-galactic HII regions we assess the performance of commonly used SEL abundance diagnostics. W...
Bayesian inference of non-positive spectral functions in quantum field theory
Rothkopf, Alexander
2016-01-01
We present the generalization to non positive definite spectral functions of a recently proposed Bayesian deconvolution approach (BR method). The novel prior used here retains many of the beneficial analytic properties of the original method, in particular it allows us to integrate out the hyperparameter $\\alpha$ directly. To preserve the underlying axiom of scale invariance, we introduce a second default-model related function, whose role is discussed. Our reconstruction prescription is contrasted with existing direct methods, as well as with an approach where shift functions are introduced to compensate for negative spectral features. A mock spectrum analysis inspired by the study of gluon spectral functions in QCD illustrates the capabilities of this new approach.
Bayesian Inference for LISA Pathfinder using Markov Chain Monte Carlo Methods
Ferraioli, Luigi; Plagnol, Eric
2012-01-01
We present a parameter estimation procedure based on a Bayesian framework by applying a Markov Chain Monte Carlo algorithm to the calibration of the dynamical parameters of a space based gravitational wave detector. The method is based on the Metropolis-Hastings algorithm and a two-stage annealing treatment in order to ensure an effective exploration of the parameter space at the beginning of the chain. We compare two versions of the algorithm with an application to a LISA Pathfinder data analysis problem. The two algorithms share the same heating strategy but with one moving in coordinate directions using proposals from a multivariate Gaussian distribution, while the other uses the natural logarithm of some parameters and proposes jumps in the eigen-space of the Fisher Information matrix. The algorithm proposing jumps in the eigen-space of the Fisher Information matrix demonstrates a higher acceptance rate and a slightly better convergence towards the equilibrium parameter distributions in the application to...
BayesWave: Bayesian Inference for Gravitational Wave Bursts and Instrument Glitches
Cornish, Neil J
2014-01-01
A central challenge in Gravitational Wave Astronomy is identifying weak signals in the presence of non-stationary and non-Gaussian noise. The separation of gravitational wave signals from noise requires good models for both. When accurate signal models are available, such as for binary Neutron star systems, it is possible to make robust detection statements even when the noise is poorly understood. In contrast, searches for "un-modeled" transient signals are strongly impacted by the methods used to characterize the noise. Here we take a Bayesian approach and introduce a multi-component, variable dimension, parameterized noise model that explicitly accounts for non-stationarity and non-Gaussianity in data from interferometric gravitational wave detectors. Instrumental transients (glitches) and burst sources of gravitational waves are modeled using a Morlet-Gabor continuous wavelet basis. The number and placement of the wavelets is determined by a trans-dimensional Reversible Jump Markov Chain Monte Carlo algor...
Multi-Pitch Estimation and Tracking Using Bayesian Inference in Block Sparsity
DEFF Research Database (Denmark)
Karimian-Azari, Sam; Jakobsson, Andreas; Jensen, Jesper Rindom;
2015-01-01
In this paper, we consider the problem of multi-pitch estimation and tracking of an unknown number of harmonic audio sources. The regularized least-squares is a solution for simultaneous sparse source selection and parameter estimation. Exploiting block sparsity, the method allows for reliable...... tracking of the found sources, without posing detailed a priori assumptions of the number of harmonics for each source. The method incorporates a Bayesian prior and assigns data-dependent regularization coefficients to efficiently incorporate both earlier and future data blocks in the tracking of estimates....... In comparison with fix regularization coefficients, the simulation results, using both real and synthetic audio signals, confirm the performance of the proposed method....
Bayesian inference of T Tauri star properties using multi-wavelength survey photometry
Barentsen, Geert; Vink, J. S.; Drew, J. E.; Sale, S. E.
2013-03-01
There are many pertinent open issues in the area of star and planet formation. Large statistical samples of young stars across star-forming regions are needed to trigger a breakthrough in our understanding, but most optical studies are based on a wide variety of spectrographs and analysis methods, which introduces large biases. Here we show how graphical Bayesian networks can be employed to construct a hierarchical probabilistic model which allows pre-main-sequence ages, masses, accretion rates and extinctions to be estimated using two widely available photometric survey data bases (Isaac Newton Telescope Photometric Hα Survey r'/Hα/i' and Two Micron All Sky Survey J-band magnitudes). Because our approach does not rely on spectroscopy, it can easily be applied to ho-mogeneously study the large number of clusters for which Gaia will yield membership lists. We explain how the analysis is carried out using the Markov chain Monte Carlo method and provide PYTHON source code. We then demonstrate its use on 587 known low-mass members of the star-forming region NGC 2264 (Cone Nebula), arriving at a median age of 3.0 Myr, an accretion fraction of 20 ± 2 per cent and a median accretion rate of 10-8.4 M⊙ yr-1. The Bayesian analysis formulated in this work delivers results which are in agreement with spectroscopic studies already in the literature, but achieves this with great efficiency by depending only on photometry. It is a significant step forward from previous photometric studies because the probabilistic approach ensures that nuisance parameters, such as extinction and distance, are fully included in the analysis with a clear picture on any degeneracies.
Directory of Open Access Journals (Sweden)
Oliver Ratmann
Full Text Available A key priority in infectious disease research is to understand the ecological and evolutionary drivers of viral diseases from data on disease incidence as well as viral genetic and antigenic variation. We propose using a simulation-based, Bayesian method known as Approximate Bayesian Computation (ABC to fit and assess phylodynamic models that simulate pathogen evolution and ecology against summaries of these data. We illustrate the versatility of the method by analyzing two spatial models describing the phylodynamics of interpandemic human influenza virus subtype A(H3N2. The first model captures antigenic drift phenomenologically with continuously waning immunity, and the second epochal evolution model describes the replacement of major, relatively long-lived antigenic clusters. Combining features of long-term surveillance data from The Netherlands with features of influenza A (H3N2 hemagglutinin gene sequences sampled in northern Europe, key phylodynamic parameters can be estimated with ABC. Goodness-of-fit analyses reveal that the irregularity in interannual incidence and H3N2's ladder-like hemagglutinin phylogeny are quantitatively only reproduced under the epochal evolution model within a spatial context. However, the concomitant incidence dynamics result in a very large reproductive number and are not consistent with empirical estimates of H3N2's population level attack rate. These results demonstrate that the interactions between the evolutionary and ecological processes impose multiple quantitative constraints on the phylodynamic trajectories of influenza A(H3N2, so that sequence and surveillance data can be used synergistically. ABC, one of several data synthesis approaches, can easily interface a broad class of phylodynamic models with various types of data but requires careful calibration of the summaries and tolerance parameters.
Bayesian large-scale structure inference: initial conditions and the cosmic web
Leclercq, Florent
2014-01-01
We describe an innovative statistical approach for the ab initio simultaneous analysis of the formation history and morphology of the large-scale structure of the inhomogeneous Universe. Our algorithm explores the joint posterior distribution of the many millions of parameters involved via efficient Hamiltonian Markov Chain Monte Carlo sampling. We describe its application to the Sloan Digital Sky Survey data release 7 and an additional non-linear filtering step. We illustrate the use of our findings for cosmic web analysis: identification of structures via tidal shear analysis and inference of dark matter voids.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Energy Technology Data Exchange (ETDEWEB)
Miki, Kenji [NASA Glenn Research Center, OAI, 22800 Cedar Point Rd, Cleveland, OH 44142 (United States); Panesi, Marco, E-mail: mpanesi@illinois.edu [Department of Aerospace Engineering, University of Illinois at Urbana-Champaign, 306 Talbot Lab, 104 S. Wright St., Urbana, IL 61801 (United States); Prudhomme, Serge [Département de mathématiques et de génie industriel, Ecole Polytechnique de Montréal, C.P. 6079, succ. Centre-ville, Montréal, QC, H3C 3A7 (Canada)
2015-10-01
The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Fardal, Mark A; Babul, Arif; Irwin, Mike J; Guhathakurta, Puragra; Gilbert, Karoline M; Ferguson, Annette M N; Ibata, Rodrigo A; Lewis, Geraint F; Tanvir, Nial R; Huxor, Avon P
2013-01-01
M31 has a giant stream of stars extending far to the south and a great deal of other tidal debris in its halo, much of which is thought to be directly associated with the southern stream. We model this structure by means of Bayesian sampling of parameter space, where each sample uses an N-body simulation of a satellite disrupting in M31's potential. We combine constraints on stellar surface densities from the Isaac Newton Telescope survey of M31 with kinematic data and photometric distances. This combination of data tightly constrains the model, indicating a stellar mass at last pericentric passage of log(M_s / Msun) = 9.5+-0.1, comparable to the LMC. Any existing remnant of the satellite is expected to lie in the NE Shelf region beside M31's disk, at velocities more negative than M31's disk in this region. This rules out the prominent satellites M32 or NGC 205 as the progenitor, but an overdensity recently discovered in M31's NE disk sits at the edge of the progenitor locations found in the model. M31's viri...
Park, Taeyoung; Jeong, Jong-Hyeon; Lee, Jae Won
2012-08-15
There is often an interest in estimating a residual life function as a summary measure of survival data. For ease in presentation of the potential therapeutic effect of a new drug, investigators may summarize survival data in terms of the remaining life years of patients. Under heavy right censoring, however, some reasonably high quantiles (e.g., median) of a residual lifetime distribution cannot be always estimated via a popular nonparametric approach on the basis of the Kaplan-Meier estimator. To overcome the difficulties in dealing with heavily censored survival data, this paper develops a Bayesian nonparametric approach that takes advantage of a fully model-based but highly flexible probabilistic framework. We use a Dirichlet process mixture of Weibull distributions to avoid strong parametric assumptions on the unknown failure time distribution, making it possible to estimate any quantile residual life function under heavy censoring. Posterior computation through Markov chain Monte Carlo is straightforward and efficient because of conjugacy properties and partial collapse. We illustrate the proposed methods by using both simulated data and heavily censored survival data from a recent breast cancer clinical trial conducted by the National Surgical Adjuvant Breast and Bowel Project. PMID:22437758
Fajardo, Alvaro; Soñora, Martín; Moreno, Pilar; Moratorio, Gonzalo; Cristina, Juan
2016-10-01
Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May of 2016, the World Health Organization warns over spread of ZIKV beyond this region. Detailed studies on the mode of evolution of ZIKV strains are extremely important for our understanding of the emergence and spread of ZIKV populations. In order to gain insight into these matters, a Bayesian coalescent Markov Chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The results of these studies revealed a mean rate of evolution of 1.20 × 10(-3) nucleotide substitutions per site per year (s/s/y) for ZIKV strains enrolled in this study. Several variants isolated in China are grouped together with all strains isolated in Latin America. Another genetic group composed exclusively by Chinese strains were also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that other factors besides viral genetic differences may play a role for the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672-1676, 2016. © 2016 Wiley Periodicals, Inc. PMID:27278855
Fajardo, Alvaro; Soñora, Martín; Moreno, Pilar; Moratorio, Gonzalo; Cristina, Juan
2016-10-01
Zika virus (ZIKV) is a member of the family Flaviviridae. In 2015, ZIKV triggered an epidemic in Brazil and spread across Latin America. By May of 2016, the World Health Organization warns over spread of ZIKV beyond this region. Detailed studies on the mode of evolution of ZIKV strains are extremely important for our understanding of the emergence and spread of ZIKV populations. In order to gain insight into these matters, a Bayesian coalescent Markov Chain Monte Carlo analysis of complete genome sequences of recently isolated ZIKV strains was performed. The results of these studies revealed a mean rate of evolution of 1.20 × 10(-3) nucleotide substitutions per site per year (s/s/y) for ZIKV strains enrolled in this study. Several variants isolated in China are grouped together with all strains isolated in Latin America. Another genetic group composed exclusively by Chinese strains were also observed, suggesting the co-circulation of different genetic lineages in China. These findings indicate a high level of diversification of ZIKV populations. Strains isolated from microcephaly cases do not share amino acid substitutions, suggesting that other factors besides viral genetic differences may play a role for the proposed pathogenesis caused by ZIKV infection. J. Med. Virol. 88:1672-1676, 2016. © 2016 Wiley Periodicals, Inc.
Directory of Open Access Journals (Sweden)
María Zubillaga
Full Text Available Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanaco/km2, with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of previous winter temperature while sheep density has a strong negative effect on the guanaco population growth. We conclude that there are significant density dependent processes and that climate as well as competition with domestic species have important effects determining the population size of guanacos, with important implications for management and conservation.
Systematic validation of non-equilibrium thermochemical models using Bayesian inference
Miki, Kenji
2015-10-01
© 2015 Elsevier Inc. The validation process proposed by Babuška et al. [1] is applied to thermochemical models describing post-shock flow conditions. In this validation approach, experimental data is involved only in the calibration of the models, and the decision process is based on quantities of interest (QoIs) predicted on scenarios that are not necessarily amenable experimentally. Moreover, uncertainties present in the experimental data, as well as those resulting from an incomplete physical model description, are propagated to the QoIs. We investigate four commonly used thermochemical models: a one-temperature model (which assumes thermal equilibrium among all inner modes), and two-temperature models developed by Macheret et al. [2], Marrone and Treanor [3], and Park [4]. Up to 16 uncertain parameters are estimated using Bayesian updating based on the latest absolute volumetric radiance data collected at the Electric Arc Shock Tube (EAST) installed inside the NASA Ames Research Center. Following the solution of the inverse problems, the forward problems are solved in order to predict the radiative heat flux, QoI, and examine the validity of these models. Our results show that all four models are invalid, but for different reasons: the one-temperature model simply fails to reproduce the data while the two-temperature models exhibit unacceptably large uncertainties in the QoI predictions.
Zubillaga, María; Skewes, Oscar; Soto, Nicolás; Rabinovich, Jorge E.; Colchero, Fernando
2014-01-01
Understanding the mechanisms that drive population dynamics is fundamental for management of wild populations. The guanaco (Lama guanicoe) is one of two wild camelid species in South America. We evaluated the effects of density dependence and weather variables on population regulation based on a time series of 36 years of population sampling of guanacos in Tierra del Fuego, Chile. The population density varied between 2.7 and 30.7 guanaco/km2, with an apparent monotonic growth during the first 25 years; however, in the last 10 years the population has shown large fluctuations, suggesting that it might have reached its carrying capacity. We used a Bayesian state-space framework and model selection to determine the effect of density and environmental variables on guanaco population dynamics. Our results show that the population is under density dependent regulation and that it is currently fluctuating around an average carrying capacity of 45,000 guanacos. We also found a significant positive effect of previous winter temperature while sheep density has a strong negative effect on the guanaco population growth. We conclude that there are significant density dependent processes and that climate as well as competition with domestic species have important effects determining the population size of guanacos, with important implications for management and conservation. PMID:25514510
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji; Kimura, Akatsuki
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that used cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from experimentally obtained flow field of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes were predicted to localize to a narrower cortical region than that with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming.
Niwayama, Ritsuya; Nagao, Hiromichi; Kitajima, Tomoya S.; Hufnagel, Lars; Shinohara, Kyosuke; Higuchi, Tomoyuki; Ishikawa, Takuji
2016-01-01
Cellular structures are hydrodynamically interconnected, such that force generation in one location can move distal structures. One example of this phenomenon is cytoplasmic streaming, whereby active forces at the cell cortex induce streaming of the entire cytoplasm. However, it is not known how the spatial distribution and magnitude of these forces move distant objects within the cell. To address this issue, we developed a computational method that used cytoplasm hydrodynamics to infer the spatial distribution of shear stress at the cell cortex induced by active force generators from experimentally obtained flow field of cytoplasmic streaming. By applying this method, we determined the shear-stress distribution that quantitatively reproduces in vivo flow fields in Caenorhabditis elegans embryos and mouse oocytes during meiosis II. Shear stress in mouse oocytes were predicted to localize to a narrower cortical region than that with a high cortical flow velocity and corresponded with the localization of the cortical actin cap. The predicted patterns of pressure gradient in both species were consistent with species-specific cytoplasmic streaming functions. The shear-stress distribution inferred by our method can contribute to the characterization of active force generation driving biological streaming. PMID:27472658
Coping with Trial-to-Trial Variability of Event Related Signals: A Bayesian Inference Approach
Ding, Mingzhou; Chen, Youghong; Knuth, Kevin H.; Bressler, Steven L.; Schroeder, Charles E.
2005-01-01
In electro-neurophysiology, single-trial brain responses to a sensory stimulus or a motor act are commonly assumed to result from the linear superposition of a stereotypic event-related signal (e.g. the event-related potential or ERP) that is invariant across trials and some ongoing brain activity often referred to as noise. To extract the signal, one performs an ensemble average of the brain responses over many identical trials to attenuate the noise. To date, h s simple signal-plus-noise (SPN) model has been the dominant approach in cognitive neuroscience. Mounting empirical evidence has shown that the assumptions underlying this model may be overly simplistic. More realistic models have been proposed that account for the trial-to-trial variability of the event-related signal as well as the possibility of multiple differentially varying components within a given ERP waveform. The variable-signal-plus-noise (VSPN) model, which has been demonstrated to provide the foundation for separation and characterization of multiple differentially varying components, has the potential to provide a rich source of information for questions related to neural functions that complement the SPN model. Thus, being able to estimate the amplitude and latency of each ERP component on a trial-by-trial basis provides a critical link between the perceived benefits of the VSPN model and its many concrete applications. In this paper we describe a Bayesian approach to deal with this issue and the resulting strategy is referred to as the differentially Variable Component Analysis (dVCA). We compare the performance of dVCA on simulated data with Independent Component Analysis (ICA) and analyze neurobiological recordings from monkeys performing cognitive tasks.
A method of spherical harmonic analysis in the geosciences via hierarchical Bayesian inference
Muir, J. B.; Tkalčić, H.
2015-11-01
The problem of decomposing irregular data on the sphere into a set of spherical harmonics is common in many fields of geosciences where it is necessary to build a quantitative understanding of a globally varying field. For example, in global seismology, a compressional or shear wave speed that emerges from tomographic images is used to interpret current state and composition of the mantle, and in geomagnetism, secular variation of magnetic field intensity measured at the surface is studied to better understand the changes in the Earth's core. Optimization methods are widely used for spherical harmonic analysis of irregular data, but they typically do not treat the dependence of the uncertainty estimates on the imposed regularization. This can cause significant difficulties in interpretation, especially when the best-fit model requires more variables as a result of underestimating data noise. Here, with the above limitations in mind, the problem of spherical harmonic expansion of irregular data is treated within the hierarchical Bayesian framework. The hierarchical approach significantly simplifies the problem by removing the need for regularization terms and user-supplied noise estimates. The use of the corrected Akaike Information Criterion for picking the optimal maximum degree of spherical harmonic expansion and the resulting spherical harmonic analyses are first illustrated on a noisy synthetic data set. Subsequently, the method is applied to two global data sets sensitive to the Earth's inner core and lowermost mantle, consisting of PKPab-df and PcP-P differential traveltime residuals relative to a spherically symmetric Earth model. The posterior probability distributions for each spherical harmonic coefficient are calculated via Markov Chain Monte Carlo sampling; the uncertainty obtained for the coefficients thus reflects the noise present in the real data and the imperfections in the spherical harmonic expansion.
Dead or gone? Bayesian inference on mortality for the dispersing sex.
Barthold, Julia A; Packer, Craig; Loveridge, Andrew J; Macdonald, David W; Colchero, Fernando
2016-07-01
Estimates of age-specific mortality are regularly used in ecology, evolution, and conservation research. However, estimating mortality of the dispersing sex, in species where one sex undergoes natal dispersal, is difficult. This is because it is often unclear whether members of the dispersing sex that disappear from monitored areas have died or dispersed. Here, we develop an extension of a multievent model that imputes dispersal state (i.e., died or dispersed) for uncertain records of the dispersing sex as a latent state and estimates age-specific mortality and dispersal parameters in a Bayesian hierarchical framework. To check the performance of our model, we first conduct a simulation study. We then apply our model to a long-term data set of African lions. Using these data, we further study how well our model estimates mortality of the dispersing sex by incrementally reducing the level of uncertainty in the records of male lions. We achieve this by taking advantage of an expert's indication on the likely fate of each missing male (i.e., likely died or dispersed). We find that our model produces accurate mortality estimates for simulated data of varying sample sizes and proportions of uncertain male records. From the empirical study, we learned that our model provides similar mortality estimates for different levels of uncertainty in records. However, a sensitivity of the mortality estimates to varying uncertainty is, as can be expected, detectable. We conclude that our model provides a solution to the challenge of estimating mortality of the dispersing sex in species with data deficiency due to natal dispersal. Given the utility of sex-specific mortality estimates in biological and conservation research, and the virtual ubiquity of sex-biased dispersal, our model will be useful to a wide variety of applications. PMID:27547322
Final Report: Large-Scale Optimization for Bayesian Inference in Complex Systems
Energy Technology Data Exchange (ETDEWEB)
Ghattas, Omar [The University of Texas at Austin
2013-10-15
The SAGUARO (Scalable Algorithms for Groundwater Uncertainty Analysis and Robust Optimiza- tion) Project focuses on the development of scalable numerical algorithms for large-scale Bayesian inversion in complex systems that capitalize on advances in large-scale simulation-based optimiza- tion and inversion methods. Our research is directed in three complementary areas: efficient approximations of the Hessian operator, reductions in complexity of forward simulations via stochastic spectral approximations and model reduction, and employing large-scale optimization concepts to accelerate sampling. Our efforts are integrated in the context of a challenging testbed problem that considers subsurface reacting flow and transport. The MIT component of the SAGUARO Project addresses the intractability of conventional sampling methods for large-scale statistical inverse problems by devising reduced-order models that are faithful to the full-order model over a wide range of parameter values; sampling then employs the reduced model rather than the full model, resulting in very large computational savings. Results indicate little effect on the computed posterior distribution. On the other hand, in the Texas-Georgia Tech component of the project, we retain the full-order model, but exploit inverse problem structure (adjoint-based gradients and partial Hessian information of the parameter-to- observation map) to implicitly extract lower dimensional information on the posterior distribution; this greatly speeds up sampling methods, so that fewer sampling points are needed. We can think of these two approaches as "reduce then sample" and "sample then reduce." In fact, these two approaches are complementary, and can be used in conjunction with each other. Moreover, they both exploit deterministic inverse problem structure, in the form of adjoint-based gradient and Hessian information of the underlying parameter-to-observation map, to achieve their speedups.
Bayesian inference of selection in a heterogeneous environment from genetic time-series data.
Gompert, Zachariah
2016-01-01
Evolutionary geneticists have sought to characterize the causes and molecular targets of selection in natural populations for many years. Although this research programme has been somewhat successful, most statistical methods employed were designed to detect consistent, weak to moderate selection. In contrast, phenotypic studies in nature show that selection varies in time and that individual bouts of selection can be strong. Measurements of the genomic consequences of such fluctuating selection could help test and refine hypotheses concerning the causes of ecological specialization and the maintenance of genetic variation in populations. Herein, I proposed a Bayesian nonhomogeneous hidden Markov model to estimate effective population sizes and quantify variable selection in heterogeneous environments from genetic time-series data. The model is described and then evaluated using a series of simulated data, including cases where selection occurs on a trait with a simple or polygenic molecular basis. The proposed method accurately distinguished neutral loci from non-neutral loci under strong selection, but not from those under weak selection. Selection coefficients were accurately estimated when selection was constant or when the fitness values of genotypes varied linearly with the environment, but these estimates were less accurate when fitness was polygenic or the relationship between the environment and the fitness of genotypes was nonlinear. Past studies of temporal evolutionary dynamics in laboratory populations have been remarkably successful. The proposed method makes similar analyses of genetic time-series data from natural populations more feasible and thereby could help answer fundamental questions about the causes and consequences of evolution in the wild.
BayesLine: Bayesian Inference for Spectral Estimation of Gravitational Wave Detector Noise
Littenberg, Tyson B
2014-01-01
Gravitational wave data from ground-based detectors is dominated by instrument noise. Signals will be comparatively weak, and our understanding of the noise will influence detection confidence and signal characterization. Mis-modeled noise can produce large systematic biases in both model selection and parameter estimation. Here we introduce a multi-component, variable dimension, parameterized model to describe the Gaussian-noise power spectrum for data from ground-based gravitational wave interferometers. Called BayesLine, the algorithm models the noise power spectral density using cubic splines for smoothly varying broad-band noise and Lorentzians for narrow-band line features in the spectrum. We describe the algorithm and demonstrate its performance on data from the fifth and sixth LIGO science runs. Once fully integrated into LIGO/Virgo data analysis software, BayesLine will produce accurate spectral estimation and provide a means for marginalizing inferences drawn from the data over all plausible noise s...
WHOOMP! (There It Is) Rapid Bayesian position reconstruction for gravitational-wave transients
Singer, Leo P
2015-01-01
Within the next few years, Advanced LIGO and Virgo should detect gravitational waves (GWs) from binary neutron star and neutron star-black hole mergers. These sources are also predicted to power a broad array of electromagnetic transients. Because the X-ray and optical signatures can be faint and fade rapidly, observing them hinges on rapidly inferring the sky location from the gravitational wave observations. Markov chain Monte Carlo (MCMC) methods for gravitational-wave parameter estimation can take hours or more. We introduce BAYESTAR, a rapid, Bayesian, non-MCMC sky localization algorithm that takes just seconds to produce probability sky maps that are comparable in accuracy to the full analysis. Prompt localizations from BAYESTAR will make it possible to search electromagnetic counterparts of compact binary mergers.
Directory of Open Access Journals (Sweden)
Mario A Pardo
Full Text Available We inferred the population densities of blue whales (Balaenoptera musculus and short-beaked common dolphins (Delphinus delphis in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT. Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge. Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more
Pardo, Mario A; Gerrodette, Tim; Beier, Emilio; Gendron, Diane; Forney, Karin A; Chivers, Susan J; Barlow, Jay; Palacios, Daniel M
2015-01-01
We inferred the population densities of blue whales (Balaenoptera musculus) and short-beaked common dolphins (Delphinus delphis) in the Northeast Pacific Ocean as functions of the water-column's physical structure by implementing hierarchical models in a Bayesian framework. This approach allowed us to propagate the uncertainty of the field observations into the inference of species-habitat relationships and to generate spatially explicit population density predictions with reduced effects of sampling heterogeneity. Our hypothesis was that the large-scale spatial distributions of these two cetacean species respond primarily to ecological processes resulting from shoaling and outcropping of the pycnocline in regions of wind-forced upwelling and eddy-like circulation. Physically, these processes affect the thermodynamic balance of the water column, decreasing its volume and thus the height of the absolute dynamic topography (ADT). Biologically, they lead to elevated primary productivity and persistent aggregation of low-trophic-level prey. Unlike other remotely sensed variables, ADT provides information about the structure of the entire water column and it is also routinely measured at high spatial-temporal resolution by satellite altimeters with uniform global coverage. Our models provide spatially explicit population density predictions for both species, even in areas where the pycnocline shoals but does not outcrop (e.g. the Costa Rica Dome and the North Equatorial Countercurrent thermocline ridge). Interannual variations in distribution during El Niño anomalies suggest that the population density of both species decreases dramatically in the Equatorial Cold Tongue and the Costa Rica Dome, and that their distributions retract to particular areas that remain productive, such as the more oceanic waters in the central California Current System, the northern Gulf of California, the North Equatorial Countercurrent thermocline ridge, and the more southern portion of the
Directory of Open Access Journals (Sweden)
Heringstad Bjørg
2010-07-01
Full Text Available Abstract Background In the genetic analysis of binary traits with one observation per animal, animal threshold models frequently give biased heritability estimates. In some cases, this problem can be circumvented by fitting sire- or sire-dam models. However, these models are not appropriate in cases where individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (covariance components within an animal threshold model framework. Methods In the proposed algorithm, individuals are classified as either "informative" or "non-informative" with respect to genetic (covariance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviance from the mid-parent mean being completely confounded with a single residual on the underlying liability scale. For threshold models, residual variance on the underlying scale is not identifiable. Hence, variance of fully confounded Mendelian sampling deviations cannot be identified either, but can be inferred from the between-family variation. In the new algorithm, breeding values are sampled as in a standard animal model using the full relationship matrix, but genetic (covariance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents. Results When applied to simulated data sets, the standard animal threshold model failed to produce useful results since samples of genetic variance always drifted towards infinity, while the new algorithm produced proper parameter estimates essentially identical to the results from a sire-dam model (given the fact that no individual records exist for the parents. Furthermore, the new algorithm showed much faster Markov chain mixing properties for genetic parameters (similar to
Sobradelo, Rosa; Martí, Joan
2015-01-01
One of the most challenging aspects of managing a volcanic crisis is the interpretation of the monitoring data, so as to anticipate to the evolution of the unrest and implement timely mitigation actions. An unrest episode may include different stages or time intervals of increasing activity that may or may not precede a volcanic eruption, depending on the causes of the unrest (magmatic, geothermal or tectonic). Therefore, one of the main goals in monitoring volcanic unrest is to forecast whether or not such increase of activity will end up with an eruption, and if this is the case, how, when, and where this eruption will take place. As an alternative method to expert elicitation for assessing and merging monitoring data and relevant past information, we present a probabilistic method to transform precursory activity into the probability of experiencing a significant variation by the next time interval (i.e. the next step in the unrest), given its preceding evolution, and by further estimating the probability of the occurrence of a particular eruptive scenario combining monitoring and past data. With the 1991 Pinatubo volcanic crisis as a reference, we have developed such a method to assess short-term volcanic hazard using Bayesian inference.
Ata, Metin; Müller, Volker
2014-01-01
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from galaxy redshift data. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motions for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution function of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the Argo code. We have tested our algorithm with the Bolshoi N-body simulation, inferring the underlying dark matter density field from a subsample of the halo catalogue. Our method shows that we can draw closely unbiased samples (compatible within 1-$\\sigma$) from the posterior distribution up to scales of about k~1 h/Mpc in terms of power-spectra and cell-to-cell correlations. We find that a Poisson likelihood yields reconstructions with p...
D'Agostini, G
2010-01-01
Triggered by a recent interesting New Scientist article on the too frequent incorrect use of probabilistic evidence in courts, I introduce the basic concepts of probabilistic inference with a toy model, and discuss several important issues that need to be understood in order to extend the basic reasoning to real life cases. In particular, I emphasize the often neglected point that degrees of beliefs are updated not by `bare facts' alone, but by all available information pertaining to them, including how they have been acquired. In this light I show that, contrary to what claimed in that article, there was no "probabilistic pitfall" in the Columbo's episode pointed as example of "bad mathematics" yielding "rough justice". Instead, such a criticism could have a `negative reaction' to the article itself and to the use of Bayesian reasoning in courts, as well as in all other places in which probabilities need to be assessed and decisions need to be made. Anyway, besides introductory/recreational aspects, the pape...
Sobradelo, Rosa; Bartolini, Stefania; Martí, Joan
2014-05-01
Event tree structures constitute one of the most useful and necessary tools of modern volcanology to assess the volcanic hazard of future volcanic scenarios. They are particularly relevant to evaluate long- and short-term probabilities of occurrence of possible volcanic scenarios and their potential impacts on urbanized areas. Here we introduce HASSET, a Hazard Assessment Event Tree probability tool, built on an event tree structure that uses Bayesian inference to estimate the probability of occurrence of a future volcanic scenario, and to evaluate the most relevant sources of uncertainty from the corresponding volcanic system. HASSET includes hazard assessment of non-eruptive and non-magmatic volcanic scenarios, that is, episodes of unrest that do not evolve into volcanic eruption but have an associated volcanic hazard (eg. sector collapse and phreatic explosion), as well as those with external triggers as primary sources of unrest (as opposed to magmatic unrest alone). Additionally, HASSET introduces the Delta method to assess how precise the probability estimates are, by reporting a one standard deviation variability interval around the expected value for each scenario. HASSET is presented as a free software package in the form of a plugin for the open source geographic information system Quantum Gis (QGIS), providing a graphically supported computation of the event tree structure in an interactive and user-friendly way. We also include an example of HASSET applied to Teide-Pico Viejo volcanic complex (Spain).
Ata, Metin; Kitaura, Francisco-Shu; Müller, Volker
2015-02-01
We present a Bayesian reconstruction algorithm to generate unbiased samples of the underlying dark matter field from halo catalogues. Our new contribution consists of implementing a non-Poisson likelihood including a deterministic non-linear and scale-dependent bias. In particular we present the Hamiltonian equations of motions for the negative binomial (NB) probability distribution function. This permits us to efficiently sample the posterior distribution function of density fields given a sample of galaxies using the Hamiltonian Monte Carlo technique implemented in the ARGO code. We have tested our algorithm with the Bolshoi N-body simulation at redshift z = 0, inferring the underlying dark matter density field from subsamples of the halo catalogue with biases smaller and larger than one. Our method shows that we can draw closely unbiased samples (compatible within 1-σ) from the posterior distribution up to scales of about k ˜ 1 h Mpc-1 in terms of power-spectra and cell-to-cell correlations. We find that a Poisson likelihood including a scale-dependent non-linear deterministic bias can yield reconstructions with power spectra deviating more than 10 per cent at k = 0.2 h Mpc-1. Our reconstruction algorithm is especially suited for emission line galaxy data for which a complex non-linear stochastic biasing treatment beyond Poissonity becomes indispensable.
Directory of Open Access Journals (Sweden)
Yan Gao
2014-01-01
Full Text Available With the rapid development and application of medical sensor networks, the security has become a big challenge to be resolved. Trust mechanism as a method of “soft security” has been proposed to guarantee the network security. Trust models to compute the trustworthiness of single node and each path are constructed, respectively, in this paper. For the trust relationship between nodes, trust value in every interval is quantified based on Bayesian inference. A node estimates the parameters of prior distribution by using the collected recommendation information and obtains the posterior distribution combined with direct interactions. Further, the weights of trust values are allocated through using the ordered weighted vector twice and overall trust degree is represented. With the associated properties of Tsallis entropy, the definition of path Tsallis entropy is put forward, which can comprehensively measure the uncertainty of each path. Then a method to calculate the credibility of each path is derived. The simulation results show that the proposed models can correctly reflect the dynamic of node behavior, quickly identify the malicious attacks, and effectively avoid such path containing low-trust nodes so as to enhance the robustness.
Harris Recurrence and MCMC: A Simplified Approach
DEFF Research Database (Denmark)
Asmussen, Søren; Glynn, Peter W.
A key result underlying the theory of MCMC is that any η-irreducible Markov chain having a transition density with respect to η and possessing a stationary distribution is automatically positive Harris recurrent. This paper provides a short self-contained proof of this fact.......A key result underlying the theory of MCMC is that any η-irreducible Markov chain having a transition density with respect to η and possessing a stationary distribution is automatically positive Harris recurrent. This paper provides a short self-contained proof of this fact....
Bayesian Spatial Modelling with R-INLA
Directory of Open Access Journals (Sweden)
Finn Lindgren
2015-02-01
Full Text Available The principles behind the interface to continuous domain spatial models in the R- INLA software package for R are described. The integrated nested Laplace approximation (INLA approach proposed by Rue, Martino, and Chopin (2009 is a computationally effective alternative to MCMC for Bayesian inference. INLA is designed for latent Gaussian models, a very wide and flexible class of models ranging from (generalized linear mixed to spatial and spatio-temporal models. Combined with the stochastic partial differential equation approach (SPDE, Lindgren, Rue, and Lindstrm 2011, one can accommodate all kinds of geographically referenced data, including areal and geostatistical ones, as well as spatial point process data. The implementation interface covers stationary spatial mod- els, non-stationary spatial models, and also spatio-temporal models, and is applicable in epidemiology, ecology, environmental risk assessment, as well as general geostatistics.
Shariati, M M; Korsgaard, I R; Sorensen, D
2009-04-01
Markov chain Monte Carlo (MCMC) enables fitting complex hierarchical models that may adequately reflect the process of data generation. Some of these models may contain more parameters than can be uniquely inferred from the distribution of the data, causing non-identifiability. The reaction norm model with unknown covariates (RNUC) is a model in which unknown environmental effects can be inferred jointly with the remaining parameters. The problem of identifiability of parameters at the level of the likelihood and the associated behaviour of MCMC chains were discussed using the RNUC as an example. It was shown theoretically that when environmental effects (covariates) are considered as random effects, estimable functions of the fixed effects, (co)variance components and genetic effects are identifiable as well as the environmental effects. When the environmental effects are treated as fixed and there are other fixed factors in the model, the contrasts involving environmental effects, the variance of environmental sensitivities (genetic slopes) and the residual variance are the only identifiable parameters. These different identifiability scenarios were generated by changing the formulation of the model and the structure of the data and the models were then implemented via MCMC. The output of MCMC sampling schemes was interpreted in the light of the theoretical findings. The erratic behaviour of the MCMC chains was shown to be associated with identifiability problems in the likelihood, despite propriety of posterior distributions, achieved by arbitrarily chosen uniform (bounded) priors. In some cases, very long chains were needed before the pattern of behaviour of the chain may signal the existence of problems. The paper serves as a warning concerning the implementation of complex models where identifiability problems can be difficult to detect a priori. We conclude that it would be good practice to experiment with a proposed model and to understand its features
Equifinality of formal (DREAM) and informal (GLUE) bayesian approaches in hydrologic modeling?
Energy Technology Data Exchange (ETDEWEB)
Vrugt, Jasper A [Los Alamos National Laboratory; Robinson, Bruce A [Los Alamos National Laboratory; Ter Braak, Cajo J F [NON LANL; Gupta, Hoshin V [NON LANL
2008-01-01
In recent years, a strong debate has emerged in the hydrologic literature regarding what constitutes an appropriate framework for uncertainty estimation. Particularly, there is strong disagreement whether an uncertainty framework should have its roots within a proper statistical (Bayesian) context, or whether such a framework should be based on a different philosophy and implement informal measures and weaker inference to summarize parameter and predictive distributions. In this paper, we compare a formal Bayesian approach using Markov Chain Monte Carlo (MCMC) with generalized likelihood uncertainty estimation (GLUE) for assessing uncertainty in conceptual watershed modeling. Our formal Bayesian approach is implemented using the recently developed differential evolution adaptive metropolis (DREAM) MCMC scheme with a likelihood function that explicitly considers model structural, input and parameter uncertainty. Our results demonstrate that DREAM and GLUE can generate very similar estimates of total streamflow uncertainty. This suggests that formal and informal Bayesian approaches have more common ground than the hydrologic literature and ongoing debate might suggest. The main advantage of formal approaches is, however, that they attempt to disentangle the effect of forcing, parameter and model structural error on total predictive uncertainty. This is key to improving hydrologic theory and to better understand and predict the flow of water through catchments.
Boers, Niklas; Goswami, Bedartha; Chekroun, Mickael; Svensson, Anders; Rousseau, Denis-Didier; Ghil, Michael
2016-04-01
In the recent past, empirical stochastic models have been successfully applied to model a wide range of climatic phenomena [1,2]. In addition to enhancing our understanding of the geophysical systems under consideration, multilayer stochastic models (MSMs) have been shown to be solidly grounded in the Mori-Zwanzig formalism of statistical physics [3]. They are also well-suited for predictive purposes, e.g., for the El Niño Southern Oscillation [4] and the Madden-Julian Oscillation [5]. In general, these models are trained on a given time series under consideration, and then assumed to reproduce certain dynamical properties of the underlying natural system. Most existing approaches are based on least-squares fitting to determine optimal model parameters, which does not allow for an uncertainty estimation of these parameters. This approach significantly limits the degree to which dynamical characteristics of the time series can be safely inferred from the model. Here, we are specifically interested in fitting low-dimensional stochastic models to time series obtained from paleoclimatic proxy records, such as the oxygen isotope ratio and dust concentration of the NGRIP record [6]. The time series derived from these records exhibit substantial dating uncertainties, in addition to the proxy measurement errors. In particular, for time series of this kind, it is crucial to obtain uncertainty estimates for the final model parameters. Following [7], we first propose a statistical procedure to shift dating uncertainties from the time axis to the proxy axis of layer-counted paleoclimatic records. Thereafter, we show how Maximum Likelihood Estimation in combination with Markov Chain Monte Carlo parameter sampling can be employed to translate all uncertainties present in the original proxy time series to uncertainties of the parameter estimates of the stochastic model. We compare time series simulated by the empirical model to the original time series in terms of standard
Gupta, Cherry; Cobre, Juliana; Polpo, Adriano; Sinha, Debjayoti
2016-09-01
Existing cure-rate survival models are generally not convenient for modeling and estimating the survival quantiles of a patient with specified covariate values. This paper proposes a novel class of cure-rate model, the transform-both-sides cure-rate model (TBSCRM), that can be used to make inferences about both the cure-rate and the survival quantiles. We develop the Bayesian inference about the covariate effects on the cure-rate as well as on the survival quantiles via Markov Chain Monte Carlo (MCMC) tools. We also show that the TBSCRM-based Bayesian method outperforms existing cure-rate models based methods in our simulation studies and in application to the breast cancer survival data from the National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) database.
MCMC and variational approaches for Bayesian inversion in diffraction imaging
Ayasso, Hacheme; Duchêne, Bernard; Mohammad-Djafari, Ali
2015-01-01
International audience The term “diffraction imaging” is meant, herein, in the sense of an “inverse scattering problem” where the goal is to build up an image of an unknown object from measurements of the scattered field that results from its interaction with a known probing wave. This type of problem occurs in many imaging and non-destructive testing applications. It corresponds to the situation where looking for a good trade-off between the image resolution and the penetration of the inc...
Gradient-based MCMC samplers for dynamic causal modelling.
Sengupta, Biswa; Friston, Karl J; Penny, Will D
2016-01-15
In this technical note, we derive two MCMC (Markov chain Monte Carlo) samplers for dynamic causal models (DCMs). Specifically, we use (a) Hamiltonian MCMC (HMC-E) where sampling is simulated using Hamilton's equation of motion and (b) Langevin Monte Carlo algorithm (LMC-R and LMC-E) that simulates the Langevin diffusion of samples using gradients either on a Euclidean (E) or on a Riemannian (R) manifold. While LMC-R requires minimal tuning, the implementation of HMC-E is heavily dependent on its tuning parameters. These parameters are therefore optimised by learning a Gaussian process model of the time-normalised sample correlation matrix. This allows one to formulate an objective function that balances tuning parameter exploration and exploitation, furnishing an intervention-free inference scheme. Using neural mass models (NMMs)-a class of biophysically motivated DCMs-we find that HMC-E is statistically more efficient than LMC-R (with a Riemannian metric); yet both gradient-based samplers are far superior to the random walk Metropolis algorithm, which proves inadequate to steer away from dynamical instability.
Directory of Open Access Journals (Sweden)
Simon Boitard
2016-03-01
Full Text Available Inferring the ancestral dynamics of effective population size is a long-standing question in population genetics, which can now be tackled much more accurately thanks to the massive genomic data available in many species. Several promising methods that take advantage of whole-genome sequences have been recently developed in this context. However, they can only be applied to rather small samples, which limits their ability to estimate recent population size history. Besides, they can be very sensitive to sequencing or phasing errors. Here we introduce a new approximate Bayesian computation approach named PopSizeABC that allows estimating the evolution of the effective population size through time, using a large sample of complete genomes. This sample is summarized using the folded allele frequency spectrum and the average zygotic linkage disequilibrium at different bins of physical distance, two classes of statistics that are widely used in population genetics and can be easily computed from unphased and unpolarized SNP data. Our approach provides accurate estimations of past population sizes, from the very first generations before present back to the expected time to the most recent common ancestor of the sample, as shown by simulations under a wide range of demographic scenarios. When applied to samples of 15 or 25 complete genomes in four cattle breeds (Angus, Fleckvieh, Holstein and Jersey, PopSizeABC revealed a series of population declines, related to historical events such as domestication or modern breed creation. We further highlight that our approach is robust to sequencing errors, provided summary statistics are computed from SNPs with common alleles.
Trovão, Nídia Sequeira; Suchard, Marc A; Baele, Guy; Gilbert, Marius; Lemey, Philippe
2015-12-01
Since its first isolation in 1996 in Guangdong, China, the highly pathogenic avian influenza virus (HPAIV) H5N1 has circulated in avian hosts for almost two decades and spread to more than 60 countries worldwide. The role of different avian hosts and the domestic-wild bird interface has been critical in shaping the complex HPAIV H5N1 disease ecology, but remains difficult to ascertain. To shed light on the large-scale H5N1 transmission patterns and disentangle the contributions of different avian hosts on the tempo and mode of HPAIV H5N1 dispersal, we apply Bayesian evolutionary inference techniques to comprehensive sets of hemagglutinin and neuraminidase gene sequences sampled between 1996 and 2011 throughout Asia and Russia. Our analyses demonstrate that the large-scale H5N1 transmission dynamics are structured according to different avian flyways, and that the incursion of the Central Asian flyway specifically was driven by Anatidae hosts coinciding with rapid rate of spread and an epidemic wavefront acceleration. This also resulted in long-distance dispersal that is likely to be explained by wild bird migration. We identify a significant degree of asymmetry in the large-scale transmission dynamics between Anatidae and Phasianidae, with the latter largely representing poultry as an evolutionary sink. A joint analysis of host dynamics and continuous spatial diffusion demonstrates that the rate of viral dispersal and host diffusivity is significantly higher for Anatidae compared with Phasianidae. These findings complement risk modeling studies and satellite tracking of wild birds in demonstrating a continental-scale structuring into areas of H5N1 persistence that are connected through migratory waterfowl.
Introduction to Bayesian statistics
Bolstad, William M
2016-01-01
There is a strong upsurge in the use of Bayesian methods in applied statistical analysis, yet most introductory statistics texts only present frequentist methods. Bayesian statistics has many important advantages that students should learn about if they are going into fields where statistics will be used. In this Third Edition, four newly-added chapters address topics that reflect the rapid advances in the field of Bayesian staistics. The author continues to provide a Bayesian treatment of introductory statistical topics, such as scientific data gathering, discrete random variables, robust Bayesian methods, and Bayesian approaches to inferenfe cfor discrete random variables, bionomial proprotion, Poisson, normal mean, and simple linear regression. In addition, newly-developing topics in the field are presented in four new chapters: Bayesian inference with unknown mean and variance; Bayesian inference for Multivariate Normal mean vector; Bayesian inference for Multiple Linear RegressionModel; and Computati...
Sraj, Ihab
2014-11-01
Tsunami computational models are employed to explore multiple flooding scenarios and to predict water elevations. However, accurate estimation of water elevations requires accurate estimation of many model parameters including the Manning\\'s n friction parameterization. Our objective is to develop an efficient approach for the uncertainty quantification and inference of the Manning\\'s n coefficient which we characterize here by three different parameters set to be constant in the on-shore, near-shore and deep-water regions as defined using iso-baths. We use Polynomial Chaos (PC) to build an inexpensive surrogate for the G. eoC. law model and employ Bayesian inference to estimate and quantify uncertainties related to relevant parameters using the DART buoy data collected during the Tōhoku tsunami. The surrogate model significantly reduces the computational burden of the Markov Chain Monte-Carlo (MCMC) sampling of the Bayesian inference. The PC surrogate is also used to perform a sensitivity analysis.
Vehicle Trajectory Estimation Using Spatio-Temporal MCMC
Directory of Open Access Journals (Sweden)
Francois Bardet
2010-01-01
Full Text Available This paper presents an algorithm for modeling and tracking vehicles in video sequences within one integrated framework. Most of the solutions are based on sequential methods that make inference according to current information. In contrast, we propose a deferred logical inference method that makes a decision according to a sequence of observations, thus processing a spatio-temporal search on the whole trajectory. One of the drawbacks of deferred logical inference methods is that the solution space of hypotheses grows exponentially related to the depth of observation. Our approach takes into account both the kinematic model of the vehicle and a driver behavior model in order to reduce the space of the solutions. The resulting proposed state model explains the trajectory with only 11 parameters. The solution space is then sampled with a Markov Chain Monte Carlo (MCMC that uses a model-driven proposal distribution in order to control random walk behavior. We demonstrate our method on real video sequences from which we have ground truth provided by a RTK GPS (Real-Time Kinematic GPS. Experimental results show that the proposed algorithm outperforms a sequential inference solution (particle filter.
Griffiths, Thomas L.; Tenenbaum, Joshua B.
2011-01-01
Predicting the future is a basic problem that people have to solve every day and a component of planning, decision making, memory, and causal reasoning. In this article, we present 5 experiments testing a Bayesian model of predicting the duration or extent of phenomena from their current state. This Bayesian model indicates how people should…
Intracluster Moves for Constrained Discrete-Space MCMC
Hamze, Firas
2012-01-01
This paper addresses the problem of sampling from binary distributions with constraints. In particular, it proposes an MCMC method to draw samples from a distribution of the set of all states at a specified distance from some reference state. For example, when the reference state is the vector of zeros, the algorithm can draw samples from a binary distribution with a constraint on the number of active variables, say the number of 1's. We motivate the need for this algorithm with examples from statistical physics and probabilistic inference. Unlike previous algorithms proposed to sample from binary distributions with these constraints, the new algorithm allows for large moves in state space and tends to propose them such that they are energetically favourable. The algorithm is demonstrated on three Boltzmann machines of varying difficulty: A ferromagnetic Ising model (with positive potentials), a restricted Boltzmann machine with learned Gabor-like filters as potentials, and a challenging three-dimensional spi...
Energy Technology Data Exchange (ETDEWEB)
Keats, A.; Lien, F.S. [Waterloo Univ., ON (Canada). Dept. of Mechanical Engineering; Yee, E. [Defence Research and Development Canada, Medicine Hat, AB (Canada)
2006-07-01
A Bayesian probabilistic inferential framework capable of incorporating errors and prior information was presented. Bayesian inference was used to find the posterior probability density function of the source parameters in a set of concentration measurements. A method of calculating the source-receptor relationship required for the determination of direct probability was provided which used the adjoint of the transport equation for the scalar concentration. The posterior distribution of the source parameters was sampled using a Markov chain Monte Carlo method. The inverse source determination method was validated against real data sets obtained from a highly disturbed, complex flow field in an urban environment. Data sets included a water-channel simulation of near-field dispersion of contaminant plumes in a large array of building-like obstacles, and a full-scale experiment in Oklahoma City. It was concluded that the 2 examples validated the proposed approach for inverse source determination.
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-01-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400--407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305--320]. The application of the trajectory averaging estimator to other stochastic approximation MCMC algorithms, for example, a stochastic approximation MLE al...
MCMC Analysis of biases in the interpretation of disk galaxy kinematics
Aquino-Ortíz, E.; Valenzuela, O.; Cano-Díaz, M.; Sánchez-Sánchez, S. F.; Hernández-Toledo, H.
2016-06-01
The new generation of galaxy surveys like SAMI, CALIFA and MaNGA opens up the possibility of studying simultaneously properties of galaxies such as spiral arms, bars, disk geometry and orientation, stellar and gas mass distribution, 2D kinematics, etc. The previous task involves exploring a complicated multi-dimensional parameter space. Puglielli et al. (2010) introduced Bayesian statistics and MCMC (Monte Carlo Markov Chain) techniques to construct dynamical models of spiral galaxies. In our study we used synthetic velocity fields that include non-circular motions and assume different disk orientations in order to produce mock observations. We apply popular reconstruction techniques in order to estimate the geometrical disk parameters, systemic velocities, rotation curve shape and maximum circular velocity which are crucial to construct the scaling relations. We conclude that a detailed analysis of kinematics in galaxies using MCMC technique will be reflected in accurate estimations of galaxy properties and more robust scalings relations, otherwise physical conclusions may be importantly biased.
Christley, Scott; Emr, Bryanna; Ghosh, Auyon; Satalin, Josh; Gatto, Louis; Vodovotz, Yoram; Nieman, Gary F.; An, Gary
2013-06-01
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
International Nuclear Information System (INIS)
Acute respiratory distress syndrome (ARDS) is acute lung failure secondary to severe systemic inflammation, resulting in a derangement of alveolar mechanics (i.e. the dynamic change in alveolar size and shape during tidal ventilation), leading to alveolar instability that can cause further damage to the pulmonary parenchyma. Mechanical ventilation is a mainstay in the treatment of ARDS, but may induce mechano-physical stresses on unstable alveoli, which can paradoxically propagate the cellular and molecular processes exacerbating ARDS pathology. This phenomenon is called ventilator induced lung injury (VILI), and plays a significant role in morbidity and mortality associated with ARDS. In order to identify optimal ventilation strategies to limit VILI and treat ARDS, it is necessary to understand the complex interplay between biological and physical mechanisms of VILI, first at the alveolar level, and then in aggregate at the whole-lung level. Since there is no current consensus about the underlying dynamics of alveolar mechanics, as an initial step we investigate the ventilatory dynamics of an alveolar sac (AS) with the lung alveolar spatial model (LASM), a 3D spatial biomechanical representation of the AS and its interaction with airflow pressure and the surface tension effects of pulmonary surfactant. We use the LASM to identify the mechanical ramifications of alveolar dynamics associated with ARDS. Using graphical processing unit parallel algorithms, we perform Bayesian inference on the model parameters using experimental data from rat lung under control and Tween-induced ARDS conditions. Our results provide two plausible models that recapitulate two fundamental hypotheses about volume change at the alveolar level: (1) increase in alveolar size through isotropic volume change, or (2) minimal change in AS radius with primary expansion of the mouth of the AS, with the implication that the majority of change in lung volume during the respiratory cycle occurs in the
MCM-C Multichip Module Manufacturing Guide
Energy Technology Data Exchange (ETDEWEB)
Blazek, R.J.; Kautz, D.R.; Galichia, J.V.
2000-11-20
Honeywell Federal Manufacturing & Technologies (FM&T) provides complete microcircuit capabilities from design layout through manufacturing and final electrical testing. Manufacturing and testing capabilities include design layout, electrical and mechanical computer simulation and modeling, circuit analysis, component analysis, network fabrication, microelectronic assembly, electrical tester design, electrical testing, materials analysis, and environmental evaluation. This document provides manufacturing guidelines for multichip module-ceramic (MCM-C) microcircuits. Figure 1 illustrates an example MCM-C configuration with the parts and processes that are available. The MCM-C technology is used to manufacture microcircuits for electronic systems that require increased performance, reduced volume, and higher density that cannot be achieved by the standard hybrid microcircuit or printed wiring board technologies. The guidelines focus on the manufacturability issues that must be considered for low-temperature cofired ceramic (LTCC) network fabrication and MCM assembly and the impact that process capabilities have on the overall MCM design layout and product yield. Prerequisites that are necessary to initiate the MCM design layout include electrical, mechanical, and environmental requirements. Customer design data can be accepted in many standard electronic file formats. Other requirements include schedule, quantity, cost, classification, and quality level. Design considerations include electrical, network, packaging, and producibility; and deliverables include finished product, drawings, documentation, and electronic files.
Bayesian Concordance Correlation Coefficient with Application to Repeatedly Measured Data
Directory of Open Access Journals (Sweden)
Atanu BHATTACHARJEE
2015-10-01
Full Text Available Objective: In medical research, Lin's classical concordance correlation coefficient (CCC is frequently applied to evaluate the similarity of the measurements produced by different raters or methods on the same subjects. It is particularly useful for continuous data. The objective of this paper is to propose the Bayesian counterpart to compute CCC for continuous data. Material and Methods: A total of 33 patients of astrocytoma brain treated in the Department of Radiation Oncology at Malabar Cancer Centre is enrolled in this work. It is a continuous data of tumor volume and tumor size repeatedly measured during baseline pretreatment workup and post surgery follow-ups for all patients. The tumor volume and tumor size are measured separately by MRI and CT scan. The agreement of measurement between MRI and CT scan is calculated through CCC. The statistical inference is performed through Markov Chain Monte Carlo (MCMC technique. Results: Bayesian CCC is found suitable to get prominent evidence for test statistics to explore the relation between concordance measurements. The posterior mean estimates and 95% credible interval of CCC on tumor size and tumor volume are observed with 0.96(0.87,0.99 and 0.98(0.95,0.99 respectively. Conclusion: The Bayesian inference is adopted for development of the computational algorithm. The approach illustrated in this work provides the researchers an opportunity to find out the most appropriate model for specific data and apply CCC to fulfill the desired hypothesis.
Fast MCMC sampling for hidden markov models to determine copy number variations
Directory of Open Access Journals (Sweden)
Mahmud Md Pavel
2011-11-01
Full Text Available Abstract Background Hidden Markov Models (HMM are often used for analyzing Comparative Genomic Hybridization (CGH data to identify chromosomal aberrations or copy number variations by segmenting observation sequences. For efficiency reasons the parameters of a HMM are often estimated with maximum likelihood and a segmentation is obtained with the Viterbi algorithm. This introduces considerable uncertainty in the segmentation, which can be avoided with Bayesian approaches integrating out parameters using Markov Chain Monte Carlo (MCMC sampling. While the advantages of Bayesian approaches have been clearly demonstrated, the likelihood based approaches are still preferred in practice for their lower running times; datasets coming from high-density arrays and next generation sequencing amplify these problems. Results We propose an approximate sampling technique, inspired by compression of discrete sequences in HMM computations and by kd-trees to leverage spatial relations between data points in typical data sets, to speed up the MCMC sampling. Conclusions We test our approximate sampling method on simulated and biological ArrayCGH datasets and high-density SNP arrays, and demonstrate a speed-up of 10 to 60 respectively 90 while achieving competitive results with the state-of-the art Bayesian approaches. Availability: An implementation of our method will be made available as part of the open source GHMM library from http://ghmm.org.
Inferring Genotype of DNA Molecular Marker by Bayesian Theorem%应用贝叶斯理论推断DNA分子标记基因型
Institute of Scientific and Technical Information of China (English)
莫惠栋; 姜长鉴
2002-01-01
引入贝叶斯理论用以从DNA分子标记的表现型(电泳谱带)推断其基因型(DNA来源).结果表明,根据标记座位独立假定而确定的遗传信息不完全标记的基因型概率,与根据邻近的遗传信息完全标记的基因型和有关重组率算得的相应贝叶斯概率,通常都有很大的差异.所以在进行数量性状基因定位和标记辅助选择等工作之前,应当计算每一个体基因组上所有遗传信息不完全座位的有关基因型的贝叶斯概率.文中列出计算未知基因型的贝叶斯概率的详细过程,也讨论了贝叶斯概率的若干推广应用.%Bayesian theorem is applied to infer the DNA molecular marker genotype(DNA chain type) from its phenotype (electrophoresis band type). The results indicated that large differences often present in the genotype probability of a molecular marker with incomplete genetic information when it is obtained from the assumption of independence among markers as compared with that inferred from the genotypes of the flanking markers with the complete genetic information and the recombination fractions among them based on the Bayesian theorem. Therefore, before utilizing the marker information, such as in mapping quantitative trait loci (QTL), marker assisted selection (MAS) etc., Bayesian probability of the genotype for all markers with incomplete genetic information must be calculated over the whole genome for every individual. This study provides detailed procedure for the calculation of the Bayesian probability of the unknown genotype. Several extensions were also discussed for the application of the Bayesian theorem.
Use of SAMC for Bayesian analysis of statistical models with intractable normalizing constants
Jin, Ick Hoon
2014-03-01
Statistical inference for the models with intractable normalizing constants has attracted much attention. During the past two decades, various approximation- or simulation-based methods have been proposed for the problem, such as the Monte Carlo maximum likelihood method and the auxiliary variable Markov chain Monte Carlo methods. The Bayesian stochastic approximation Monte Carlo algorithm specifically addresses this problem: It works by sampling from a sequence of approximate distributions with their average converging to the target posterior distribution, where the approximate distributions can be achieved using the stochastic approximation Monte Carlo algorithm. A strong law of large numbers is established for the Bayesian stochastic approximation Monte Carlo estimator under mild conditions. Compared to the Monte Carlo maximum likelihood method, the Bayesian stochastic approximation Monte Carlo algorithm is more robust to the initial guess of model parameters. Compared to the auxiliary variable MCMC methods, the Bayesian stochastic approximation Monte Carlo algorithm avoids the requirement for perfect samples, and thus can be applied to many models for which perfect sampling is not available or very expensive. The Bayesian stochastic approximation Monte Carlo algorithm also provides a general framework for approximate Bayesian analysis. © 2012 Elsevier B.V. All rights reserved.
Olmi, L; Elia, D; Molinari, S; Pestalozzi, M; Pezzuto, S; Schisano, E; Testi, L; Thompson, M
2013-01-01
Context. Stars form in dense, dusty clumps of molecular clouds, but little is known about their origin, their evolution and their detailed physical properties. In particular, the relationship between the mass distribution of these clumps (also known as the "clump mass function", or CMF) and the stellar initial mass function (IMF), is still poorly understood. Aims. In order to better understand how the CMF evolve toward the IMF, and to discern the "true" shape of the CMF, large samples of bona-fide pre- and proto-stellar clumps are required. Two such datasets obtained from the Herschel infrared GALactic Plane Survey (Hi-GAL) have been described in paper I. Robust statistical methods are needed in order to infer the parameters describing the models used to fit the CMF, and to compare the competing models themselves. Methods. In this paper we apply Bayesian inference to the analysis of the CMF of the two regions discussed in Paper I. First, we determine the Bayesian posterior probability distribution for each of...
Institute of Scientific and Technical Information of China (English)
Druzhinina I S; Kubicek C P
2004-01-01
@@ The Hypocrea lixii/Trichoderma harzianum species aggregate contains a group of taxa (H. lixii/T.harzianum , T. aggressivum , T. tomentosum , T. cerinum , T. velutinum , H. tawa ) of which some (e. g. T. harzianum) are important for biocontrol of plant pathogenic fungi in agriculture, whereas others are aggressive pathogens of Agaricus spp. and Pleurotus spp. in mushroom farms (T. aggressivum), or opportunistic pathogens of immunocompromised mammals including humans (T. harzianum). We characterized the evolutionary properties of three genomic regions in Hypocrea/Trichoderma: the internal transcribed spacer regions ITS1 and 2 of rDNA, the large intron of translation elongation factor 1-alpha (tef1a), and a portion of the large exon of the endochitinase 42 gene (ech42 ), selected the best model which describes the evolution of every fragment, tested the molecular clock hypothesis and made an estimation of the usability of the combined three fragments data matrix for the phylogenetic analysis of the genus as a whole as well as on the level of the holomorphic H. liaxii/T. harzianum species clade and separate clonal lineages. To this end, we applied Bayesian phylogenetic inferences to 124 sequences of ITS1 and 2 and of the large tef1a intron, and to 64 ech42 gene sequences to resolve the evolution of H. lixii/T. harzianum with respect to the position of other taxa with closely related phenotypes. The resulting phylogram clearly identified T.aggressivum, T. velutinum, H. tawa, T. cerinum and T. tomentosum as phylogenetic species, and in addition identified three new unknown phylogenetic species as members of this clacle. The clear distinction between T. tomentosum and T. cerinum was not recognized in all trees, but was supported by multivariate analysis of phenotype micro arrays. In contrast, H. lixii/T. harzianum did not form a single phylogenetic species in this study, as its monophyly was not supported in any analysis. Strains morphologically identified as H. lixii
Pujolar, J M; Zane, L; Congiu, L
2012-06-01
The aim of our study is to examine the phylogenetic relationship, divergence times and demographic history of the five close-related Mediterranean and North-eastern Atlantic species/forms of Atherina using the full Bayesian framework for species tree estimation recently implemented in ∗BEAST. The inference is made possible by multilocus data using three mitochondrial genes (12S rRNA, 16S rRNA, control region) and one nuclear gene (rhodopsin) from multiple individuals per species available in GenBank. Bayesian phylogenetic analysis of the complete gene dataset produced a tree with strong support for the monophyly of each species, as well as high support for higher level nodes. An old origin of the Atherina group was suggested (19.2 MY), with deep split events within the Atherinidae predating the Messinian Salinity Crisis. Regional genetic substructuring was observed among populations of A. boyeri, with AMOVA and MultiDimensional Scaling suggesting the existence of five groupings (Atlantic/West Mediterranean, Adriatic, Greece, Black Sea and Tunis). The level of subdivision found might be consequence of the hydrographic isolation within the Mediterranean Sea. Bayesian inference of past demographic histories showed a clear signature of demographic expansion for the European coast populations of A. presbyter, possibly linked to post-glacial colonizations, but not for the Azores/Canary Islands, which is expected in isolated populations because of the impossibility of finding new habitats. Within the Mediterranean, signatures of recent demographic expansion were only found for the Adriatic population of A. boyeri, which could be associated with the relatively recent emergence of the Adriatic Sea. PMID:22425706
Métodos avanzados de muestreo : MCMC
Pascual Del Olmo, Víctor
2011-01-01
Este proyecto se propone estudiar, analizar e investigar las diferentes metodologías de generación de números aleatorios mediante técnicas avanzadas y modernas de Monte Carlo Markov Chain (MCMC). Los métodos de Monte Carlo son métodos numéricos usados para calcular, aproximar y simular expresiones o sistemas matemáticos complejos y difíciles de evaluar. Aunque estos métodos comenzaron a desarrollarse en los años cuarenta, hasta que las computadoras no se hicieron más potentes estuvieron en un...
Institute of Scientific and Technical Information of China (English)
张杨; 赵继广; 陈景鹏; 王亚琪
2014-01-01
在计算初因事件频率的过程中,当初因事件获取的数据较少时,便要考虑不确定性.在概率安全评估(PSA)中,通常以概率分布的形式表示事件的不确定性.本文针对“贮罐内压过大”这个初因事件数据量获取少的问题,采用贝叶斯-马尔科夫链蒙特卡洛方法(Bayesian-MCMC方法)对其频率进行了不确定性分析,得到了初因事件频率的不确定分布图形,并分析了不确定分布图形的特点,同时与直接计算频率方法进行了结果比较,从而验证了该方法的正确性和有效性.
DEFF Research Database (Denmark)
Shariati, M M; Korsgaard, I R; Sorensen, D
2009-01-01
Markov chain Monte Carlo (MCMC) enables fitting complex hierarchical models that may adequately reflect the process of data generation. Some of these models may contain more parameters than can be uniquely inferred from the distribution of the data, causing non-identifiability. The reaction norm ...... of complex models where identifiability problems can be difficult to detect a priori. We conclude that it would be good practice to experiment with a proposed model and to understand its features before embarking on a full MCMC implementation...
Caticha, Ariel
2011-03-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops—the Maximum Entropy and the Bayesian methods—into a single general inference scheme.
Caticha, Ariel
2010-01-01
In this tutorial we review the essential arguments behing entropic inference. We focus on the epistemological notion of information and its relation to the Bayesian beliefs of rational agents. The problem of updating from a prior to a posterior probability distribution is tackled through an eliminative induction process that singles out the logarithmic relative entropy as the unique tool for inference. The resulting method of Maximum relative Entropy (ME), includes as special cases both MaxEnt and Bayes' rule, and therefore unifies the two themes of these workshops -- the Maximum Entropy and the Bayesian methods -- into a single general inference scheme.
Lafrenière-Bérubé, Charles; Chouteau, Michel; Shamsipour, Pejman; Olivo, Gema R.
2016-04-01
Spectral induced polarization (SIP) parameters can be extracted from field or laboratory complex resistivity measurements, and even airborne or ground frequency domain electromagnetic data. With the growing interest in application of complex resistivity measurements to environmental and mineral exploration problems, there is a need for accurate and easy-to-use inversion tools to estimate SIP parameters. These parameters, which often include chargeability and relaxation time may then be studied and related to other rock attributes such as porosity or metallic grain content, in the case of mineral exploration. We present an open source program, available both as a standalone application or Python module, to estimate SIP parameters using Markov-chain Monte Carlo (MCMC) sampling. The Python language is a high level, open source language that is now widely used in scientific computing. Our program allows the user to choose between the more common Cole-Cole (Pelton), Dias, or Debye decomposition models. Simple circuits composed of resistances and constant phase elements may also be used to represent SIP data. Initial guesses are required when using more classic inversion techniques such as the least-squares formulation, and wrong estimates are often the cause of bad curve fitting. In stochastic optimization using MCMC, the effect of the starting values disappears as the simulation proceeds. Our program is then optimized to do batch inversion over large data sets with as little user-interaction as possible. Additionally, the Bayesian formulation allows the user to do quality control by fully propagating the measurement errors in the inversion process, providing an estimation of the SIP parameters uncertainty. This information is valuable when trying to relate chargeability or relaxation time to other physical properties. We test the inversion program on complex resistivity measurements of 12 core samples from the world-class gold deposit of Canadian Malartic. Results show
Trajectory averaging for stochastic approximation MCMC algorithms
Liang, Faming
2010-10-01
The subject of stochastic approximation was founded by Robbins and Monro [Ann. Math. Statist. 22 (1951) 400-407]. After five decades of continual development, it has developed into an important area in systems control and optimization, and it has also served as a prototype for the development of adaptive algorithms for on-line estimation and control of stochastic systems. Recently, it has been used in statistics with Markov chain Monte Carlo for solving maximum likelihood estimation problems and for general simulation and optimizations. In this paper, we first show that the trajectory averaging estimator is asymptotically efficient for the stochastic approximation MCMC (SAMCMC) algorithm under mild conditions, and then apply this result to the stochastic approximation Monte Carlo algorithm [Liang, Liu and Carroll J. Amer. Statist. Assoc. 102 (2007) 305-320]. The application of the trajectory averaging estimator to other stochastic approximationMCMC algorithms, for example, a stochastic approximation MLE algorithm for missing data problems, is also considered in the paper. © Institute of Mathematical Statistics, 2010.
Modeling error distributions of growth curve models through Bayesian methods.
Zhang, Zhiyong
2016-06-01
Growth curve models are widely used in social and behavioral sciences. However, typical growth curve models often assume that the errors are normally distributed although non-normal data may be even more common than normal data. In order to avoid possible statistical inference problems in blindly assuming normality, a general Bayesian framework is proposed to flexibly model normal and non-normal data through the explicit specification of the error distributions. A simulation study shows when the distribution of the error is correctly specified, one can avoid the loss in the efficiency of standard error estimates. A real example on the analysis of mathematical ability growth data from the Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 is used to show the application of the proposed methods. Instructions and code on how to conduct growth curve analysis with both normal and non-normal error distributions using the the MCMC procedure of SAS are provided. PMID:26019004
A Bayesian Approach to Detection of Small Low Emission Sources
Xun, Xiaolei; Carroll, Raymond J; Kuchment, Peter
2011-01-01
The article addresses the problem of detecting presence and location of a small low emission source inside of an object, when the background noise dominates. This problem arises, for instance, in some homeland security applications. The goal is to reach the signal-to-noise ratio (SNR) levels on the order of $10^{-3}$. A Bayesian approach to this problem is implemented in 2D. The method allows inference not only about the existence of the source, but also about its location. We derive Bayes factors for model selection and estimation of location based on Markov Chain Monte Carlo (MCMC) simulation. A simulation study shows that with sufficiently high total emission level, our method can effectively locate the source.
Gelman, Andrew; Stern, Hal S; Dunson, David B; Vehtari, Aki; Rubin, Donald B
2013-01-01
FUNDAMENTALS OF BAYESIAN INFERENCEProbability and InferenceSingle-Parameter Models Introduction to Multiparameter Models Asymptotics and Connections to Non-Bayesian ApproachesHierarchical ModelsFUNDAMENTALS OF BAYESIAN DATA ANALYSISModel Checking Evaluating, Comparing, and Expanding ModelsModeling Accounting for Data Collection Decision AnalysisADVANCED COMPUTATION Introduction to Bayesian Computation Basics of Markov Chain Simulation Computationally Efficient Markov Chain Simulation Modal and Distributional ApproximationsREGRESSION MODELS Introduction to Regression Models Hierarchical Linear
Yuan, Ying; MacKinnon, David P.
2009-01-01
This article proposes Bayesian analysis of mediation effects. Compared to conventional frequentist mediation analysis, the Bayesian approach has several advantages. First, it allows researchers to incorporate prior information into the mediation analysis, thus potentially improving the efficiency of estimates. Second, under the Bayesian mediation analysis, inference is straightforward and exact, which makes it appealing for studies with small samples. Third, the Bayesian approach is conceptua...
Bayesian artificial intelligence
Korb, Kevin B
2010-01-01
Updated and expanded, Bayesian Artificial Intelligence, Second Edition provides a practical and accessible introduction to the main concepts, foundation, and applications of Bayesian networks. It focuses on both the causal discovery of networks and Bayesian inference procedures. Adopting a causal interpretation of Bayesian networks, the authors discuss the use of Bayesian networks for causal modeling. They also draw on their own applied research to illustrate various applications of the technology.New to the Second EditionNew chapter on Bayesian network classifiersNew section on object-oriente
Efficiency of alternative MCMC strategies illustrated using the reaction norm model
DEFF Research Database (Denmark)
Shariati, Mohammad Mahdi; Sørensen, D.
2008-01-01
inferences may be affected. The objective of this study was to compare the efficiency (in terms of the asymptotic variance of features of posterior distributions of chosen parameters, and in terms of computing cost) of six MCMC strategies to sample parameters using simulated data generated with a reaction...... of the parameters, and no method comes out as an overall winner across all parameters. TSG and BG show very good performance in terms of asymptotic variance especially when the posterior correlation between genetic effects is high. In terms of computing cost, TSG performs best except for dispersion parameters...... in the low correlation scenario where SG was the best strategy. The two LH proposals could not compete with any of the Gibbs sampling algorithms. In this study it was not possible to find an MCMC strategy that performs optimally across the range of target distributions and across all possible values...
Directory of Open Access Journals (Sweden)
Navid Feroze
2016-03-01
Full Text Available The families of mixture distributions have a wider range of applications in different fields such as fisheries, agriculture, botany, economics, medicine, psychology, electrophoresis, finance, communication theory, geology and zoology. They provide the necessary flexibility to model failure distributions of components with multiple failure modes. Mostly, the Bayesian procedure for the estimation of parameters of mixture model is described under the scheme of Type-I censoring. In particular, the Bayesian analysis for the mixture models under doubly censored samples has not been considered in the literature yet. The main objective of this paper is to develop the Bayes estimation of the inverse Weibull mixture distributions under doubly censoring. The posterior estimation has been conducted under the assumption of gamma and inverse levy using precautionary loss function and weighted squared error loss function. The comparisons among the different estimators have been made based on analysis of simulated and real life data sets.
mbb_emcee: Modified Blackbody MCMC
Conley, Alexander
2016-02-01
Mbb_emcee fits modified blackbodies to photometry data using an affine invariant MCMC. It has large number of options which, for example, allow computation of the IR luminosity or dustmass as part of the fit. Carrying out a fit produces a HDF5 output file containing the results, which can either be read directly, or read back into a mbb_results object for analysis. Upper and lower limits can be imposed as well as Gaussian priors on the model parameters. These additions are useful for analyzing poorly constrained data. In addition to standard Python packages scipy, numpy, and cython, mbb_emcee requires emcee (ascl:1303.002), Astropy (ascl:1304.002), h5py, and for unit tests, nose.
Directory of Open Access Journals (Sweden)
Kevin McNally
2012-01-01
Full Text Available There are numerous biomonitoring programs, both recent and ongoing, to evaluate environmental exposure of humans to chemicals. Due to the lack of exposure and kinetic data, the correlation of biomarker levels with exposure concentrations leads to difficulty in utilizing biomonitoring data for biological guidance values. Exposure reconstruction or reverse dosimetry is the retrospective interpretation of external exposure consistent with biomonitoring data. We investigated the integration of physiologically based pharmacokinetic modelling, global sensitivity analysis, Bayesian inference, and Markov chain Monte Carlo simulation to obtain a population estimate of inhalation exposure to m-xylene. We used exhaled breath and venous blood m-xylene and urinary 3-methylhippuric acid measurements from a controlled human volunteer study in order to evaluate the ability of our computational framework to predict known inhalation exposures. We also investigated the importance of model structure and dimensionality with respect to its ability to reconstruct exposure.
ABCtoolbox: a versatile toolkit for approximate Bayesian computations
Directory of Open Access Journals (Sweden)
Neuenschwander Samuel
2010-03-01
Full Text Available Abstract Background The estimation of demographic parameters from genetic data often requires the computation of likelihoods. However, the likelihood function is computationally intractable for many realistic evolutionary models, and the use of Bayesian inference has therefore been limited to very simple models. The situation changed recently with the advent of Approximate Bayesian Computation (ABC algorithms allowing one to obtain parameter posterior distributions based on simulations not requiring likelihood computations. Results Here we present ABCtoolbox, a series of open source programs to perform Approximate Bayesian Computations (ABC. It implements various ABC algorithms including rejection sampling, MCMC without likelihood, a Particle-based sampler and ABC-GLM. ABCtoolbox is bundled with, but not limited to, a program that allows parameter inference in a population genetics context and the simultaneous use of different types of markers with different ploidy levels. In addition, ABCtoolbox can also interact with most simulation and summary statistics computation programs. The usability of the ABCtoolbox is demonstrated by inferring the evolutionary history of two evolutionary lineages of Microtus arvalis. Using nuclear microsatellites and mitochondrial sequence data in the same estimation procedure enabled us to infer sex-specific population sizes and migration rates and to find that males show smaller population sizes but much higher levels of migration than females. Conclusion ABCtoolbox allows a user to perform all the necessary steps of a full ABC analysis, from parameter sampling from prior distributions, data simulations, computation of summary statistics, estimation of posterior distributions, model choice, validation of the estimation procedure, and visualization of the results.
Prediction of trajectory based on modified Bayesian inference%基于改进贝叶斯方法的轨迹预测算法研究
Institute of Scientific and Technical Information of China (English)
李万高; 赵雪梅; 孙德厂
2013-01-01
The existing algorithms for trajectory prediction have very low prediction accuracy when there are a limited number of available trajectories.To address this problem,the Modified Bayesian Inference (MBI) approach was proposed,which constructed the Markov model to quantify the correlation between adjacent locations.MBI decomposed historical trajectories into sub-trajectories to get more precise Markov model and the probability formula of Bayesian inference was obtained.The experimental results based on real datasets show that MBI approach is two to three times faster than the existing algorithm,and it has higher prediction accuracy and stability.MBI makes full use of the available trajectories and improves the efficiency and accuracy for the prediction of trajectory.%针对传统轨迹预测方法在历史轨迹数目有限时,预测准确度较低的问题,提出一种改进的贝叶斯推理(MBI)方法,MBI构建了马尔可夫模型来量化相邻位置的相关性,并通过对历史轨迹进行分解来获得更准确的马尔可夫模型,最后得到改进的贝叶斯推理公式.实验结果表明,MBI方法比现有方法的预测速度快2到3倍,并且有较高的准确度和稳定性.MBI方法充分利用现有轨迹信息,不仅提高了查询效率,还保证了较高的预测精度.
Directory of Open Access Journals (Sweden)
Zoltán Csörnyei
2010-01-01
Full Text Available Genetic parameters of number of piglets born alive (NBA and gestation length (GL were analyzed for 39798 Hungarian Landrace (HLA, 141397 records and 70356 Hungarian Large White (HLW, 246961 records sows. Bivariate repeatability animal models were used, applying a Bayesian statistics. Estimated and heritabilitie repeatabilities (within brackets, were low for NBA, 0.07 (0.14 for HLA and 0.08 (0.17 for HLW, but somewhat higher for GL, 0.18 (0.27 for HLA and 0.26 (0.35 for HLW. Estimated genetic correlations between NBA and GL were low, -0.08 for HLA and -0.05 for HLW.
Pitombeira-Neto, Anselmo Ramalho; Loureiro, Carlos Felipe Grangeiro; Carvalho, Luis Eduardo
2016-01-01
Estimation of origin-destination (OD) demand plays a key role in successful transportation studies. In this paper, we consider the estimation of time-varying day-to-day OD flows given data on traffic volumes in a transportation network for a sequence of days. We propose a dynamic linear model (DLM) in order to represent the stochastic evolution of OD flows over time. DLM's are Bayesian state-space models which can capture non-stationarity. We take into account the hierarchical relationships b...
Bayesian target tracking based on particle filter
Institute of Scientific and Technical Information of China (English)
无
2005-01-01
For being able to deal with the nonlinear or non-Gaussian problems, particle filters have been studied by many researchers. Based on particle filter, the extended Kalman filter (EKF) proposal function is applied to Bayesian target tracking. Markov chain Monte Carlo (MCMC) method, the resampling step, etc novel techniques are also introduced into Bayesian target tracking. And the simulation results confirm the improved particle filter with these techniques outperforms the basic one.
Tools for investigating the prior distribution in Bayesian hydrology
Tang, Yating; Marshall, Lucy; Sharma, Ashish; Smith, Tyler
2016-07-01
Bayesian inference is one of the most popular tools for uncertainty analysis in hydrological modeling. While much emphasis has been placed on the selection of appropriate likelihood functions within Bayesian hydrology, few researchers have evaluated the importance of the prior distribution in deriving appropriate posterior distributions. This paper describes tools for the evaluation of parameter sensitivity to the prior distribution to provide guidelines for defining meaningful priors. The tools described here consist of two measurements, the Kullback-Leibler Divergence (KLD) and the prior information elasticity. The Kullback-Leibler Divergence (KLD) is applied to calculate differences between the prior and posterior distributions for different cases. The prior information elasticity is then used to quantify the responsiveness of the KLD values to the change of prior distributions and length of available data. The tools are demonstrated via a Bayesian framework using an MCMC algorithm for a conceptual hydrologic model with both synthetic and real cases. The results of the application of this toolkit suggest the prior distribution can have a significant impact on the posterior distribution and should be more routinely assessed in hydrologic studies.
Bayesian estimation of generalized exponential distribution under noninformative priors
Moala, Fernando Antonio; Achcar, Jorge Alberto; Tomazella, Vera Lúcia Damasceno
2012-10-01
The generalized exponential distribution, proposed by Gupta and Kundu (1999), is a good alternative to standard lifetime distributions as exponential, Weibull or gamma. Several authors have considered the problem of Bayesian estimation of the parameters of generalized exponential distribution, assuming independent gamma priors and other informative priors. In this paper, we consider a Bayesian analysis of the generalized exponential distribution by assuming the conventional noninformative prior distributions, as Jeffreys and reference prior, to estimate the parameters. These priors are compared with independent gamma priors for both parameters. The comparison is carried out by examining the frequentist coverage probabilities of Bayesian credible intervals. We shown that maximal data information prior implies in an improper posterior distribution for the parameters of a generalized exponential distribution. It is also shown that the choice of a parameter of interest is very important for the reference prior. The different choices lead to different reference priors in this case. Numerical inference is illustrated for the parameters by considering data set of different sizes and using MCMC (Markov Chain Monte Carlo) methods.
A Bayesian Alternative for Multi-objective Ecohydrological Model Specification
Tang, Y.; Marshall, L. A.; Sharma, A.; Ajami, H.
2015-12-01
Process-based ecohydrological models combine the study of hydrological, physical, biogeochemical and ecological processes of the catchments, which are usually more complex and parametric than conceptual hydrological models. Thus, appropriate calibration objectives and model uncertainty analysis are essential for ecohydrological modeling. In recent years, Bayesian inference has become one of the most popular tools for quantifying the uncertainties in hydrological modeling with the development of Markov Chain Monte Carlo (MCMC) techniques. Our study aims to develop appropriate prior distributions and likelihood functions that minimize the model uncertainties and bias within a Bayesian ecohydrological framework. In our study, a formal Bayesian approach is implemented in an ecohydrological model which combines a hydrological model (HyMOD) and a dynamic vegetation model (DVM). Simulations focused on one objective likelihood (Streamflow/LAI) and multi-objective likelihoods (Streamflow and LAI) with different weights are compared. Uniform, weakly informative and strongly informative prior distributions are used in different simulations. The Kullback-leibler divergence (KLD) is used to measure the dis(similarity) between different priors and corresponding posterior distributions to examine the parameter sensitivity. Results show that different prior distributions can strongly influence posterior distributions for parameters, especially when the available data is limited or parameters are insensitive to the available data. We demonstrate differences in optimized parameters and uncertainty limits in different cases based on multi-objective likelihoods vs. single objective likelihoods. We also demonstrate the importance of appropriately defining the weights of objectives in multi-objective calibration according to different data types.
Bayesian segmentation of hyperspectral images
Mohammadpour, Adel; Mohammad-Djafari, Ali
2007-01-01
In this paper we consider the problem of joint segmentation of hyperspectral images in the Bayesian framework. The proposed approach is based on a Hidden Markov Modeling (HMM) of the images with common segmentation, or equivalently with common hidden classification label variables which is modeled by a Potts Markov Random Field. We introduce an appropriate Markov Chain Monte Carlo (MCMC) algorithm to implement the method and show some simulation results.
Bayesian segmentation of hyperspectral images
Mohammadpour, Adel; Féron, Olivier; Mohammad-Djafari, Ali
2004-11-01
In this paper we consider the problem of joint segmentation of hyperspectral images in the Bayesian framework. The proposed approach is based on a Hidden Markov Modeling (HMM) of the images with common segmentation, or equivalently with common hidden classification label variables which is modeled by a Potts Markov Random Field. We introduce an appropriate Markov Chain Monte Carlo (MCMC) algorithm to implement the method and show some simulation results.
Kim, Seongryong; Tkalčić, Hrvoje; Rhie, Junkee; Chen, Youlin
2016-08-01
Intraplate volcanism adjacent to active continental margins is not simply explained by plate tectonics or plume interaction. Recent volcanoes in northeast (NE) Asia, including NE China and the Korean Peninsula, are characterized by heterogeneous tectonic structures and geochemical compositions. Here we apply a transdimensional Bayesian tomography to estimate high-resolution images of group and phase velocity variations (with periods between 8 and 70 s). The method provides robust estimations of velocity maps, and the reliability of results is tested through carefully designed synthetic recovery experiments. Our maps reveal two sublithospheric low-velocity anomalies that connect back-arc regions (in Japan and Ryukyu Trench) with current margins of continental lithosphere where the volcanoes are distributed. Combined with evidences from previous geochemical and geophysical studies, we argue that the volcanoes are related to the low-velocity structures associated with back-arc processes and preexisting continental lithosphere.
Degradation monitoring using probabilistic inference
Alpay, Bulent
In order to increase safety and improve economy and performance in a nuclear power plant (NPP), the source and extent of component degradations should be identified before failures and breakdowns occur. It is also crucial for the next generation of NPPs, which are designed to have a long core life and high fuel burnup to have a degradation monitoring system in order to keep the reactor in a safe state, to meet the designed reactor core lifetime and to optimize the scheduled maintenance. Model-based methods are based on determining the inconsistencies between the actual and expected behavior of the plant, and use these inconsistencies for detection and diagnostics of degradations. By defining degradation as a random abrupt change from the nominal to a constant degraded state of a component, we employed nonlinear filtering techniques based on state/parameter estimation. We utilized a Bayesian recursive estimation formulation in the sequential probabilistic inference framework and constructed a hidden Markov model to represent a general physical system. By addressing the problem of a filter's inability to estimate an abrupt change, which is called the oblivious filter problem in nonlinear extensions of Kalman filtering, and the sample impoverishment problem in particle filtering, we developed techniques to modify filtering algorithms by utilizing additional data sources to improve the filter's response to this problem. We utilized a reliability degradation database that can be constructed from plant specific operational experience and test and maintenance reports to generate proposal densities for probable degradation modes. These are used in a multiple hypothesis testing algorithm. We then test samples drawn from these proposal densities with the particle filtering estimates based on the Bayesian recursive estimation formulation with the Metropolis Hastings algorithm, which is a well-known Markov chain Monte Carlo method (MCMC). This multiple hypothesis testing
Bhadra, Dhiman; Daniels, Michael J; Kim, Sungduk; Ghosh, Malay; Mukherjee, Bhramar
2012-06-01
In a typical case-control study, exposure information is collected at a single time point for the cases and controls. However, case-control studies are often embedded in existing cohort studies containing a wealth of longitudinal exposure history about the participants. Recent medical studies have indicated that incorporating past exposure history, or a constructed summary measure of cumulative exposure derived from the past exposure history, when available, may lead to more precise and clinically meaningful estimates of the disease risk. In this article, we propose a flexible Bayesian semiparametric approach to model the longitudinal exposure profiles of the cases and controls and then use measures of cumulative exposure based on a weighted integral of this trajectory in the final disease risk model. The estimation is done via a joint likelihood. In the construction of the cumulative exposure summary, we introduce an influence function, a smooth function of time to characterize the association pattern of the exposure profile on the disease status with different time windows potentially having differential influence/weights. This enables us to analyze how the present disease status of a subject is influenced by his/her past exposure history conditional on the current ones. The joint likelihood formulation allows us to properly account for uncertainties associated with both stages of the estimation process in an integrated manner. Analysis is carried out in a hierarchical Bayesian framework using reversible jump Markov chain Monte Carlo algorithms. The proposed methodology is motivated by, and applied to a case-control study of prostate cancer where longitudinal biomarker information is available for the cases and controls. PMID:22313248
An MCMC Circumstellar Disks Modeling Tool
Wolff, Schuyler; Perrin, Marshall D.; Mazoyer, Johan; Choquet, Elodie; Soummer, Remi; Ren, Bin; Pueyo, Laurent; Debes, John H.; Duchene, Gaspard; Pinte, Christophe; Menard, Francois
2016-01-01
We present an enhanced software framework for the Monte Carlo Markov Chain modeling of circumstellar disk observations, including spectral energy distributions and multi wavelength images from a variety of instruments (e.g. GPI, NICI, HST, WFIRST). The goal is to self-consistently and simultaneously fit a wide variety of observables in order to place constraints on the physical properties of a given disk, while also rigorously assessing the uncertainties in the derived properties. This modular code is designed to work with a collection of existing modeling tools, ranging from simple scripts to define the geometry for optically thin debris disks, to full radiative transfer modeling of complex grain structures in protoplanetary disks (using the MCFOST radiative transfer modeling code). The MCMC chain relies on direct chi squared comparison of model images/spectra to observations. We will include a discussion of how best to weight different observations in the modeling of a single disk and how to incorporate forward modeling from PCA PSF subtraction techniques. The code is open source, python, and available from github. Results for several disks at various evolutionary stages will be discussed.
MCMC curve sampling and geometric conditional simulation
Fan, Ayres; Fisher, John W., III; Kane, Jonathan; Willsky, Alan S.
2008-02-01
We present an algorithm to generate samples from probability distributions on the space of curves. Traditional curve evolution methods use gradient descent to find a local minimum of a specified energy functional. Here, we view the energy functional as a negative log probability distribution and sample from it using a Markov chain Monte Carlo (MCMC) algorithm. We define a proposal distribution by generating smooth perturbations to the normal of the curve, update the curve using level-set methods, and show how to compute the transition probabilities to ensure that we compute samples from the posterior. We demonstrate the benefits of sampling methods (such as robustness to local minima, better characterization of multi-modal distributions, and access to some measures of estimation error) on medical and geophysical applications. We then use our sampling framework to construct a novel semi-automatic segmentation approach which takes in partial user segmentations and conditionally simulates the unknown portion of the curve. This allows us to dramatically lower the estimation variance in low-SNR and ill-posed problems.
Chain ladder method: Bayesian bootstrap versus classical bootstrap
Peters, Gareth W.; Mario V. W\\"uthrich; Shevchenko, Pavel V.
2010-01-01
The intention of this paper is to estimate a Bayesian distribution-free chain ladder (DFCL) model using approximate Bayesian computation (ABC) methodology. We demonstrate how to estimate quantities of interest in claims reserving and compare the estimates to those obtained from classical and credibility approaches. In this context, a novel numerical procedure utilising Markov chain Monte Carlo (MCMC), ABC and a Bayesian bootstrap procedure was developed in a truly distribution-free setting. T...
Yao, Zhewei; Hu, Zixi; Li, Jinglai
2016-07-01
Many scientific and engineering problems require to perform Bayesian inferences in function spaces, where the unknowns are of infinite dimension. In such problems, choosing an appropriate prior distribution is an important task. In particular, when the function to infer is subject to sharp jumps, the commonly used Gaussian measures become unsuitable. On the other hand, the so-called total variation (TV) prior can only be defined in a finite-dimensional setting, and does not lead to a well-defined posterior measure in function spaces. In this work we present a TV-Gaussian (TG) prior to address such problems, where the TV term is used to detect sharp jumps of the function, and the Gaussian distribution is used as a reference measure so that it results in a well-defined posterior measure in the function space. We also present an efficient Markov Chain Monte Carlo (MCMC) algorithm to draw samples from the posterior distribution of the TG prior. With numerical examples we demonstrate the performance of the TG prior and the efficiency of the proposed MCMC algorithm.
Energy Technology Data Exchange (ETDEWEB)
Sigeti, David E. [Los Alamos National Laboratory; Pelak, Robert A. [Los Alamos National Laboratory
2012-09-11
We present a Bayesian statistical methodology for identifying improvement in predictive simulations, including an analysis of the number of (presumably expensive) simulations that will need to be made in order to establish with a given level of confidence that an improvement has been observed. Our analysis assumes the ability to predict (or postdict) the same experiments with legacy and new simulation codes and uses a simple binomial model for the probability, {theta}, that, in an experiment chosen at random, the new code will provide a better prediction than the old. This model makes it possible to do statistical analysis with an absolute minimum of assumptions about the statistics of the quantities involved, at the price of discarding some potentially important information in the data. In particular, the analysis depends only on whether or not the new code predicts better than the old in any given experiment, and not on the magnitude of the improvement. We show how the posterior distribution for {theta} may be used, in a kind of Bayesian hypothesis testing, both to decide if an improvement has been observed and to quantify our confidence in that decision. We quantify the predictive probability that should be assigned, prior to taking any data, to the possibility of achieving a given level of confidence, as a function of sample size. We show how this predictive probability depends on the true value of {theta} and, in particular, how there will always be a region around {theta} = 1/2 where it is highly improbable that we will be able to identify an improvement in predictive capability, although the width of this region will shrink to zero as the sample size goes to infinity. We show how the posterior standard deviation may be used, as a kind of 'plan B metric' in the case that the analysis shows that {theta} is close to 1/2 and argue that such a plan B should generally be part of hypothesis testing. All the analysis presented in the paper is done with a
Advances in Bayesian Model Based Clustering Using Particle Learning
Energy Technology Data Exchange (ETDEWEB)
Merl, D M
2009-11-19
Recent work by Carvalho, Johannes, Lopes and Polson and Carvalho, Lopes, Polson and Taddy introduced a sequential Monte Carlo (SMC) alternative to traditional iterative Monte Carlo strategies (e.g. MCMC and EM) for Bayesian inference for a large class of dynamic models. The basis of SMC techniques involves representing the underlying inference problem as one of state space estimation, thus giving way to inference via particle filtering. The key insight of Carvalho et al was to construct the sequence of filtering distributions so as to make use of the posterior predictive distribution of the observable, a distribution usually only accessible in certain Bayesian settings. Access to this distribution allows a reversal of the usual propagate and resample steps characteristic of many SMC methods, thereby alleviating to a large extent many problems associated with particle degeneration. Furthermore, Carvalho et al point out that for many conjugate models the posterior distribution of the static variables can be parametrized in terms of [recursively defined] sufficient statistics of the previously observed data. For models where such sufficient statistics exist, particle learning as it is being called, is especially well suited for the analysis of streaming data do to the relative invariance of its algorithmic complexity with the number of data observations. Through a particle learning approach, a statistical model can be fit to data as the data is arriving, allowing at any instant during the observation process direct quantification of uncertainty surrounding underlying model parameters. Here we describe the use of a particle learning approach for fitting a standard Bayesian semiparametric mixture model as described in Carvalho, Lopes, Polson and Taddy. In Section 2 we briefly review the previously presented particle learning algorithm for the case of a Dirichlet process mixture of multivariate normals. In Section 3 we describe several novel extensions to the original
On the Markov Chain Monte Carlo (MCMC) method
Indian Academy of Sciences (India)
Rajeeva L Karandikar
2006-04-01
Markov Chain Monte Carlo (MCMC) is a popular method used to generate samples from arbitrary distributions, which may be speciﬁed indirectly. In this article, we give an introduction to this method along with some examples.
CLICK MODEL BASED ON BAYESIAN INFERENCE AND ITS IMPLEMENTATION%基于贝叶斯推理的点击模型及其实现
Institute of Scientific and Technical Information of China (English)
孙付伟; 李娟; 杨达
2013-01-01
为能更好地解释搜索引擎和商务搜索的点击日志中的用户行为,实现一种用于分析日志中包含的用户行为的贝叶斯点击模型.通过分析中国最大电子商务网站的约927万条用户搜索点击日志数据,发现一个的文档的点击是受其上下位置点击过的文档共同影响的,然后基于此发现提出并实现一种新的基于贝叶斯推理的点击模型,并给出并行版本的算法实现.最后通过利用来自用户搜索的一个月日志数据验证,结果表明该模型优于现有的点击模型.%In order to better explain user behaviour from click logs in search engine or sponsored search, we implement a Bayesian click model for analysing user behaviours included in logs. By analysing about 9.27 million click log data collected from a largest e-commerce site of China, there finds that the click probability of a document is affected by the clicked documents above and below it. Then we propose and implement a new click model based on Bayesian inference according to the phenomenon found, together with the implementation of an algorithm in parallel version. At last, we validate the model through a log data set collected about a mouth from user search, and the result shows that the proposed model outperforms existing click models.
Konda Reddy, B.; Gnanasekaran, N.; Balaji, C.
2012-11-01
An inverse methodology is proposed to estimate thermo-physical and transport properties individually and simultaneously from in-house experimental data obtained using the transient Liquid Crystal Thermography (LCT) technique. A vertical rectangular fin made of mild steel and size of 75 × 250 × 3 (L × W × t) (all in mm) has been used. Thermochromic Liquid Crystals (TLCs) are used to obtain transient temperature distribution along the fin surface to determine the temperature dependent heat transfer coefficient, hθ and the thermal diffusivity, α of the fin. The variation of heat transfer coefficient is considered as a power law function of temperature excess (hθ = a''θb(x,t)) and is derived from the basic Nusselt number equation, Nuθ = aRabθ used for laminar natural convection for a vertical plate in ambient air. Using this functional form, the 1-D transient fin equation solved using the finite difference technique for assumed values of 'a' and 'α'. Treating the inverse problem as a one parameter estimation in 'a' or 'α' or a two parameter estimation problem in 'a' and 'α', the sum of the squares of the difference between the TLC measured and simulated temperatures are minimized with the Bayesian frame work in the inverse model to determine the point estimates for 'a' and 'α'. Two point estimates namely the (i) mean and (ii) maximum a posterior (MAP) are used to report the retrieved quantities together with the associated standard deviation.
Directory of Open Access Journals (Sweden)
Watanabe Yukito
2012-01-01
Full Text Available Abstract Background Bayesian networks (BNs have been widely used to estimate gene regulatory networks. Many BN methods have been developed to estimate networks from microarray data. However, two serious problems reduce the effectiveness of current BN methods. The first problem is that BN-based methods require huge computational time to estimate large-scale networks. The second is that the estimated network cannot have cyclic structures, even if the actual network has such structures. Results In this paper, we present a novel BN-based deterministic method with reduced computational time that allows cyclic structures. Our approach generates all the combinational triplets of genes, estimates networks of the triplets by BN, and unites the networks into a single network containing all genes. This method decreases the search space of predicting gene regulatory networks without degrading the solution accuracy compared with the greedy hill climbing (GHC method. The order of computational time is the cube of number of genes. In addition, the network estimated by our method can include cyclic structures. Conclusions We verified the effectiveness of the proposed method for all known gene regulatory networks and their expression profiles. The results demonstrate that this approach can predict regulatory networks with reduced computational time without degrading the solution accuracy compared with the GHC method.
International Nuclear Information System (INIS)
An inverse methodology is proposed to estimate thermo-physical and transport properties individually and simultaneously from in-house experimental data obtained using the transient Liquid Crystal Thermography (LCT) technique. A vertical rectangular fin made of mild steel and size of 75 × 250 × 3 (L × W × t) (all in mm) has been used. Thermochromic Liquid Crystals (TLCs) are used to obtain transient temperature distribution along the fin surface to determine the temperature dependent heat transfer coefficient, hθ and the thermal diffusivity, α of the fin. The variation of heat transfer coefficient is considered as a power law function of temperature excess (hθ = a''θb(x,t)) and is derived from the basic Nusselt number equation, Nuθ = aRabθ used for laminar natural convection for a vertical plate in ambient air. Using this functional form, the 1–D transient fin equation solved using the finite difference technique for assumed values of 'a' and 'α'. Treating the inverse problem as a one parameter estimation in 'a' or 'α' or a two parameter estimation problem in 'a' and 'α', the sum of the squares of the difference between the TLC measured and simulated temperatures are minimized with the Bayesian frame work in the inverse model to determine the point estimates for 'a' and 'α'. Two point estimates namely the (i) mean and (ii) maximum a posterior (MAP) are used to report the retrieved quantities together with the associated standard deviation.
Directory of Open Access Journals (Sweden)
Steven Kelly
Full Text Available BACKGROUND: The rapid increase in the availability of genome information has created considerable demand for both comparative and ab initio predictive bioinformatic analyses. The biology laid bare in the genomes of many organisms is often novel, presenting new challenges for bioinformatic interrogation. A paradigm for this is the collected genomes of the kinetoplastid parasites, a group which includes Trypanosoma brucei the causative agent of human African trypanosomiasis. These genomes, though outwardly simple in organisation and gene content, have historically challenged many theories for gene expression regulation in eukaryotes. METHODOLOGY/PRINCIPLE FINDINGS: Here we utilise a Bayesian approach to identify local changes in nucleotide composition in the genome of T. brucei. We show that there are several elements which are found at the starts and ends of multicopy gene arrays and that there are compositional elements that are common to all intergenic regions. We also show that there is a composition-inversion element that occurs at the position of the trans-splice site. CONCLUSIONS/SIGNIFICANCE: The nature of the elements discovered reinforces the hypothesis that context dependant RNA secondary structure has an important influence on gene expression regulation in Trypanosoma brucei.
DEFF Research Database (Denmark)
Jensen, Finn Verner; Nielsen, Thomas Dyhre
2016-01-01
Mathematically, a Bayesian graphical model is a compact representation of the joint probability distribution for a set of variables. The most frequently used type of Bayesian graphical models are Bayesian networks. The structural part of a Bayesian graphical model is a graph consisting of nodes...... and edges. The nodes represent variables, which may be either discrete or continuous. An edge between two nodes A and B indicates a direct influence between the state of A and the state of B, which in some domains can also be interpreted as a causal relation. The wide-spread use of Bayesian networks...... is largely due to the availability of efficient inference algorithms for answering probabilistic queries about the states of the variables in the network. Furthermore, to support the construction of Bayesian network models, learning algorithms are also available. We give an overview of the Bayesian network...
Institute of Scientific and Technical Information of China (English)
KUNDU Debasis; PRADHAN Biswabrata
2009-01-01
Recently generalized exponential distribution has received considerable attentions. In this paper, we deal with the Bayesian inference of the unknown parameters of the progressively censored generalized exponential distribution. It is assumed that the scale and the shape parameters have independent gamma priors. The Bayes estimates of the unknown parameters cannot be obtained in the closed form. Lindley's approximation and importance sampling technique have been suggested to compute the approximate Bayes estimates. Markov Chain Monte Carlo method has been used to compute the approximate Bayes estimates and also to construct the highest posterior density credible intervals. We also provide different criteria to compare two different sampling schemes and hence to find the optimal sampling schemes. It is observed that finding the optimum censoring procedure is a computationally expensive process. And we have recommended to use the sub-optimal censoring procedure, which can be obtained very easily. Monte Carlo simulations are performed to compare the performances of the different methods and one data analysis has been performed for illustrative purposes.
Amo de Paz, Guillermo; Cubas, Paloma; Divakar, Pradeep K; Lumbsch, H Thorsten; Crespo, Ana
2011-01-01
There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota) by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibrations points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of parmelioid. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic condition of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these clades. However, our
BEAST: Bayesian evolutionary analysis by sampling trees
Directory of Open Access Journals (Sweden)
Drummond Alexei J
2007-11-01
Full Text Available Abstract Background The evolutionary analysis of molecular sequence variation is a statistical enterprise. This is reflected in the increased use of probabilistic models for phylogenetic inference, multiple sequence alignment, and molecular population genetics. Here we present BEAST: a fast, flexible software architecture for Bayesian analysis of molecular sequences related by an evolutionary tree. A large number of popular stochastic models of sequence evolution are provided and tree-based models suitable for both within- and between-species sequence data are implemented. Results BEAST version 1.4.6 consists of 81000 lines of Java source code, 779 classes and 81 packages. It provides models for DNA and protein sequence evolution, highly parametric coalescent analysis, relaxed clock phylogenetics, non-contemporaneous sequence data, statistical alignment and a wide range of options for prior distributions. BEAST source code is object-oriented, modular in design and freely available at http://beast-mcmc.googlecode.com/ under the GNU LGPL license. Conclusion BEAST is a powerful and flexible evolutionary analysis package for molecular sequence variation. It also provides a resource for the further development of new models and statistical methods of evolutionary analysis.
Bayesian object classification of gold nanoparticles
Konomi, Bledar A.
2013-06-01
The properties of materials synthesized with nanoparticles (NPs) are highly correlated to the sizes and shapes of the nanoparticles. The transmission electron microscopy (TEM) imaging technique can be used to measure the morphological characteristics of NPs, which can be simple circles or more complex irregular polygons with varying degrees of scales and sizes. A major difficulty in analyzing the TEM images is the overlapping of objects, having different morphological properties with no specific information about the number of objects present. Furthermore, the objects lying along the boundary render automated image analysis much more difficult. To overcome these challenges, we propose a Bayesian method based on the marked-point process representation of the objects. We derive models, both for the marks which parameterize the morphological aspects and the points which determine the location of the objects. The proposed model is an automatic image segmentation and classification procedure, which simultaneously detects the boundaries and classifies the NPs into one of the predetermined shape families. We execute the inference by sampling the posterior distribution using Markov chainMonte Carlo (MCMC) since the posterior is doubly intractable. We apply our novel method to several TEM imaging samples of gold NPs, producing the needed statistical characterization of their morphology. © Institute of Mathematical Statistics, 2013.
Understanding Computational Bayesian Statistics
Bolstad, William M
2011-01-01
A hands-on introduction to computational statistics from a Bayesian point of view Providing a solid grounding in statistics while uniquely covering the topics from a Bayesian perspective, Understanding Computational Bayesian Statistics successfully guides readers through this new, cutting-edge approach. With its hands-on treatment of the topic, the book shows how samples can be drawn from the posterior distribution when the formula giving its shape is all that is known, and how Bayesian inferences can be based on these samples from the posterior. These ideas are illustrated on common statistic
Bayesian Analysis of Inertial Confinement Fusion Experiments at the National Ignition Facility
Gaffney, J A; Sonnad, V; Libby, S B
2012-01-01
We develop a Bayesian inference method that allows the efficient determination of several interesting parameters from complicated high-energy-density experiments performed on the National Ignition Facility (NIF). The model is based on an exploration of phase space using the hydrodynamic code HYDRA. A linear model is used to describe the effect of nuisance parameters on the analysis, allowing an analytic likelihood to be derived that can be determined from a small number of HYDRA runs and then used in existing advanced statistical analysis methods. This approach is applied to a recent experiment in order to determine the carbon opacity and X-ray drive; it is found that the inclusion of prior expert knowledge and fluctuations in capsule dimensions and chemical composition significantly improve the agreement between experiment and theoretical opacity calculations. A parameterisation of HYDRA results is used to test the application of both Markov chain Monte Carlo (MCMC) and genetic algorithm (GA) techniques to e...
Bayesian Reliability Analysis of Non-Stationarity in Multi-agent Systems
Directory of Open Access Journals (Sweden)
TONT Gabriela
2013-05-01
Full Text Available The Bayesian methods provide information about the meaningful parameters in a statistical analysis obtained by combining the prior and sampling distributions to form the posterior distribution of theparameters. The desired inferences are obtained from this joint posterior. An estimation strategy for hierarchical models, where the resulting joint distribution of the associated model parameters cannotbe evaluated analytically, is to use sampling algorithms, known as Markov Chain Monte Carlo (MCMC methods, from which approximate solutions can be obtained. Both serial and parallel configurations of subcomponents are permitted. The capability of time-dependent method to describe a multi-state system is based on a case study, assessingthe operatial situation of studied system. The rationality and validity of the presented model are demonstrated via a case of study. The effect of randomness of the structural parameters is alsoexamined.
Directory of Open Access Journals (Sweden)
Guillermo Amo de Paz
Full Text Available There is a long-standing debate on the extent of vicariance and long-distance dispersal events to explain the current distribution of organisms, especially in those with small diaspores potentially prone to long-distance dispersal. Age estimates of clades play a crucial role in evaluating the impact of these processes. The aim of this study is to understand the evolutionary history of the largest clade of macrolichens, the parmelioid lichens (Parmeliaceae, Lecanoromycetes, Ascomycota by dating the origin of the group and its major lineages. They have a worldwide distribution with centers of distribution in the Neo- and Paleotropics, and semi-arid subtropical regions of the Southern Hemisphere. Phylogenetic analyses were performed using DNA sequences of nuLSU and mtSSU rDNA, and the protein-coding RPB1 gene. The three DNA regions had different evolutionary rates: RPB1 gave a rate two to four times higher than nuLSU and mtSSU. Divergence times of the major clades were estimated with partitioned BEAST analyses allowing different rates for each DNA region and using a relaxed clock model. Three calibrations points were used to date the tree: an inferred age at the stem of Lecanoromycetes, and two dated fossils: Parmelia in the parmelioid group, and Alectoria. Palaeoclimatic conditions and the palaeogeological area cladogram were compared to the dated phylogeny of parmelioid. The parmelioid group diversified around the K/T boundary, and the major clades diverged during the Eocene and Oligocene. The radiation of the genera occurred through globally changing climatic condition of the early Oligocene, Miocene and early Pliocene. The estimated divergence times are consistent with long-distance dispersal events being the major factor to explain the biogeographical distribution patterns of Southern Hemisphere parmelioids, especially for Africa-Australia disjunctions, because the sequential break-up of Gondwana started much earlier than the origin of these
Directory of Open Access Journals (Sweden)
Cláudio Roberto Thiersch
2013-02-01
Full Text Available Neste trabalho foi considerado o modelo de Curtis para a relação hipsométrica em clones de Eucalyptus sp. com os parâmetros sujeitos a restrições. Para fazer a inferência dos parâmetros do modelo com restrições, utilizou-se uma abordagem bayesiana com densidade a priori construída empiricamente. As estimativas bayesianas são calculadas com a técnica de simulação de Monte Carlo em Cadeia de Markov (MCMC. O método proposto foi aplicado a diferentes conjuntos de dados reais, dos quais foram selecionados cinco para exemplificar os resultados. Estes foram comparados com os resultados obtidos pelo método de mínimos quadrados, destacando-se a superioridade da abordagem bayesiana proposta.The model of Curtis was considered in this work for the hypsometric relationship in Eucalyptus sp. clones with parameters submitted to restrictions. A Bayesian a priori density was empirically constructed to infer the parameters of the models with restrictions. The Bayesian estimates are calculated using Monte Carlo Markov Chain (MCMC Method. The proposed method was applied to different groups of real data from which five were selected to show the results. Those were compared to the achieved results by the minimum square method and the superiority of the Bayesian approach is highlighted.
基于MCMC方法的生物气溶胶袭击施放源项参数反演%Source inversion of bioaerosol attack based on MCMC method
Institute of Scientific and Technical Information of China (English)
许晴; 祖正虎; 张文斗; 徐致靖; 黄培堂; 郑涛
2012-01-01
生物气溶胶施放源项参数反演是生物气溶胶袭击危害评估的反问题,对危害评估及应急响应具有重要指导意义.本文基于贝叶斯推理方法,利用生物传感器检测数据和正向大气扩散模型,构造似然函数,采用结合Metropolis-Hasting算法的马尔可夫链蒙特卡洛(Markov chain Monte Carlo,MCMC)抽样,对施放源位置、高度、施放剂量进行反演.统计分析表明,反演结果和初始源项参数设置吻合非常好,证明了方法的有效性.%The inversion of bioaerosol release source parameters is the inverse problem of hazard assessment of bioaerosol attacks,and is of great significance for hazard assessment and emergency response. Based on observations of biosensors and concentrations predicted by an atmospheric dispersion model, a likelihood function was assigned, with which the Markov chain Monte Carlo ( MCMC) sampling based on Bayesian inference was used to invert the source parameters, including the source location,source height,and dispersion strength, statistic analysis shows that the inversion results fit the initial source parameters very well. The validity of the method is proved.
Bayesian binary regression model: an application to in-hospital death after AMI prediction
Directory of Open Access Journals (Sweden)
Aparecida D. P. Souza
2004-08-01
Full Text Available A Bayesian binary regression model is developed to predict death of patients after acute myocardial infarction (AMI. Markov Chain Monte Carlo (MCMC methods are used to make inference and to evaluate Bayesian binary regression models. A model building strategy based on Bayes factor is proposed and aspects of model validation are extensively discussed in the paper, including the posterior distribution for the c-index and the analysis of residuals. Risk assessment, based on variables easily available within minutes of the patients' arrival at the hospital, is very important to decide the course of the treatment. The identified model reveals itself strongly reliable and accurate, with a rate of correct classification of 88% and a concordance index of 83%.Um modelo bayesiano de regressão binária é desenvolvido para predizer óbito hospitalar em pacientes acometidos por infarto agudo do miocárdio. Métodos de Monte Carlo via Cadeias de Markov (MCMC são usados para fazer inferência e validação. Uma estratégia para construção de modelos, baseada no uso do fator de Bayes, é proposta e aspectos de validação são extensivamente discutidos neste artigo, incluindo a distribuição a posteriori para o índice de concordância e análise de resíduos. A determinação de fatores de risco, baseados em variáveis disponíveis na chegada do paciente ao hospital, é muito importante para a tomada de decisão sobre o curso do tratamento. O modelo identificado se revela fortemente confiável e acurado, com uma taxa de classificação correta de 88% e um índice de concordância de 83%.
Loredo, T J
2004-01-01
I describe a framework for adaptive scientific exploration based on iterating an Observation--Inference--Design cycle that allows adjustment of hypotheses and observing protocols in response to the results of observation on-the-fly, as data are gathered. The framework uses a unified Bayesian methodology for the inference and design stages: Bayesian inference to quantify what we have learned from the available data and predict future data, and Bayesian decision theory to identify which new observations would teach us the most. When the goal of the experiment is simply to make inferences, the framework identifies a computationally efficient iterative ``maximum entropy sampling'' strategy as the optimal strategy in settings where the noise statistics are independent of signal properties. Results of applying the method to two ``toy'' problems with simulated data--measuring the orbit of an extrasolar planet, and locating a hidden one-dimensional object--show the approach can significantly improve observational eff...
A Structure Learning Algorithm for Bayesian Network Using Prior Knowledge
Institute of Scientific and Technical Information of China (English)
徐俊刚; 赵越; 陈健; 韩超
2015-01-01
Learning structure from data is one of the most important fundamental tasks of Bayesian network research. Particularly, learning optional structure of Bayesian network is a non-deterministic polynomial-time (NP) hard problem. To solve this problem, many heuristic algorithms have been proposed, and some of them learn Bayesian network structure with the help of different types of prior knowledge. However, the existing algorithms have some restrictions on the prior knowledge, such as quality restriction and use restriction. This makes it diﬃcult to use the prior knowledge well in these algorithms. In this paper, we introduce the prior knowledge into the Markov chain Monte Carlo (MCMC) algorithm and propose an algorithm called Constrained MCMC (C-MCMC) algorithm to learn the structure of the Bayesian network. Three types of prior knowledge are defined: existence of parent node, absence of parent node, and distribution knowledge including the conditional probability distribution (CPD) of edges and the probability distribution (PD) of nodes. All of these types of prior knowledge are easily used in this algorithm. We conduct extensive experiments to demonstrate the feasibility and effectiveness of the proposed method C-MCMC.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
A Genomic Bayesian Multi-trait and Multi-environment Model.
Montesinos-López, Osval A; Montesinos-López, Abelardo; Crossa, José; Toledo, Fernando H; Pérez-Hernández, Oscar; Eskridge, Kent M; Rutkoski, Jessica
2016-09-08
When information on multiple genotypes evaluated in multiple environments is recorded, a multi-environment single trait model for assessing genotype × environment interaction (G × E) is usually employed. Comprehensive models that simultaneously take into account the correlated traits and trait × genotype × environment interaction (T × G × E) are lacking. In this research, we propose a Bayesian model for analyzing multiple traits and multiple environments for whole-genome prediction (WGP) model. For this model, we used Half-[Formula: see text] priors on each standard deviation term and uniform priors on each correlation of the covariance matrix. These priors were not informative and led to posterior inferences that were insensitive to the choice of hyper-parameters. We also developed a computationally efficient Markov Chain Monte Carlo (MCMC) under the above priors, which allowed us to obtain all required full conditional distributions of the parameters leading to an exact Gibbs sampling for the posterior distribution. We used two real data sets to implement and evaluate the proposed Bayesian method and found that when the correlation between traits was high (>0.5), the proposed model (with unstructured variance-covariance) improved prediction accuracy compared to the model with diagonal and standard variance-covariance structures. The R-software package Bayesian Multi-Trait and Multi-Environment (BMTME) offers optimized C++ routines to efficiently perform the analyses.
Bayesian structured additive regression modeling of epidemic data: application to cholera
Directory of Open Access Journals (Sweden)
Osei Frank B
2012-08-01
Full Text Available Abstract Background A significant interest in spatial epidemiology lies in identifying associated risk factors which enhances the risk of infection. Most studies, however, make no, or limited use of the spatial structure of the data, as well as possible nonlinear effects of the risk factors. Methods We develop a Bayesian Structured Additive Regression model for cholera epidemic data. Model estimation and inference is based on fully Bayesian approach via Markov Chain Monte Carlo (MCMC simulations. The model is applied to cholera epidemic data in the Kumasi Metropolis, Ghana. Proximity to refuse dumps, density of refuse dumps, and proximity to potential cholera reservoirs were modeled as continuous functions; presence of slum settlers and population density were modeled as fixed effects, whereas spatial references to the communities were modeled as structured and unstructured spatial effects. Results We observe that the risk of cholera is associated with slum settlements and high population density. The risk of cholera is equal and lower for communities with fewer refuse dumps, but variable and higher for communities with more refuse dumps. The risk is also lower for communities distant from refuse dumps and potential cholera reservoirs. The results also indicate distinct spatial variation in the risk of cholera infection. Conclusion The study highlights the usefulness of Bayesian semi-parametric regression model analyzing public health data. These findings could serve as novel information to help health planners and policy makers in making effective decisions to control or prevent cholera epidemics.
Bayesian Inference in Statistical Analysis
Box, George E P
2011-01-01
The Wiley Classics Library consists of selected books that have become recognized classics in their respective fields. With these new unabridged and inexpensive editions, Wiley hopes to extend the life of these important works by making them available to future generations of mathematicians and scientists. Currently available in the Series: T. W. Anderson The Statistical Analysis of Time Series T. S. Arthanari & Yadolah Dodge Mathematical Programming in Statistics Emil Artin Geometric Algebra Norman T. J. Bailey The Elements of Stochastic Processes with Applications to the Natural Sciences Rob
Bayesian Variable Selection via Particle Stochastic Search.
Shi, Minghui; Dunson, David B
2011-02-01
We focus on Bayesian variable selection in regression models. One challenge is to search the huge model space adequately, while identifying high posterior probability regions. In the past decades, the main focus has been on the use of Markov chain Monte Carlo (MCMC) algorithms for these purposes. In this article, we propose a new computational approach based on sequential Monte Carlo (SMC), which we refer to as particle stochastic search (PSS). We illustrate PSS through applications to linear regression and probit models.
von der Linden, Wolfgang; Dose, Volker; von Toussaint, Udo
2014-06-01
Preface; Part I. Introduction: 1. The meaning of probability; 2. Basic definitions; 3. Bayesian inference; 4. Combinatrics; 5. Random walks; 6. Limit theorems; 7. Continuous distributions; 8. The central limit theorem; 9. Poisson processes and waiting times; Part II. Assigning Probabilities: 10. Transformation invariance; 11. Maximum entropy; 12. Qualified maximum entropy; 13. Global smoothness; Part III. Parameter Estimation: 14. Bayesian parameter estimation; 15. Frequentist parameter estimation; 16. The Cramer-Rao inequality; Part IV. Testing Hypotheses: 17. The Bayesian way; 18. The frequentist way; 19. Sampling distributions; 20. Bayesian vs frequentist hypothesis tests; Part V. Real World Applications: 21. Regression; 22. Inconsistent data; 23. Unrecognized signal contributions; 24. Change point problems; 25. Function estimation; 26. Integral equations; 27. Model selection; 28. Bayesian experimental design; Part VI. Probabilistic Numerical Techniques: 29. Numerical integration; 30. Monte Carlo methods; 31. Nested sampling; Appendixes; References; Index.
Bayesian theory and applications
Dellaportas, Petros; Polson, Nicholas G; Stephens, David A
2013-01-01
The development of hierarchical models and Markov chain Monte Carlo (MCMC) techniques forms one of the most profound advances in Bayesian analysis since the 1970s and provides the basis for advances in virtually all areas of applied and theoretical Bayesian statistics. This volume guides the reader along a statistical journey that begins with the basic structure of Bayesian theory, and then provides details on most of the past and present advances in this field. The book has a unique format. There is an explanatory chapter devoted to each conceptual advance followed by journal-style chapters that provide applications or further advances on the concept. Thus, the volume is both a textbook and a compendium of papers covering a vast range of topics. It is appropriate for a well-informed novice interested in understanding the basic approach, methods and recent applications. Because of its advanced chapters and recent work, it is also appropriate for a more mature reader interested in recent applications and devel...
A Bayesian Analysis of Spectral ARMA Model
Directory of Open Access Journals (Sweden)
Manoel I. Silvestre Bezerra
2012-01-01
Full Text Available Bezerra et al. (2008 proposed a new method, based on Yule-Walker equations, to estimate the ARMA spectral model. In this paper, a Bayesian approach is developed for this model by using the noninformative prior proposed by Jeffreys (1967. The Bayesian computations, simulation via Markov Monte Carlo (MCMC is carried out and characteristics of marginal posterior distributions such as Bayes estimator and confidence interval for the parameters of the ARMA model are derived. Both methods are also compared with the traditional least squares and maximum likelihood approaches and a numerical illustration with two examples of the ARMA model is presented to evaluate the performance of the procedures.
An MCMC determination of the primordial helium abundance
Aver, Erik; Skillman, Evan D
2011-01-01
Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasinska (2007). To improve the reliability...
Smedmark, Jenny E E; Swenson, Ulf; Anderberg, Arne A
2006-06-01
We used Bayesian phylogenetic analysis of 5 kb of chloroplast DNA data from 68 Sapotaceae species to clarify phylogenetic relationships within Sapotoideae, one of the two major clades within Sapotaceae. Variation in substitution rates through time was shown to be a very important aspect of molecular evolution for this data set. Relative rates tests indicated that changes in overall rate have taken place in several lineages during the history of the group and Bayes factors strongly supported a covarion model, which allows the rate of a site to vary over time, over commonly used models that only allow rates to vary across sites. Rate variation over time was actually found to be a more important model component than rate variation across sites. The covarion model was originally developed for coding gene sequences and has so far only been tested for this type of data. The fact that it performed so well with the present data set, consisting mainly of data from noncoding spacer regions, suggests that it deserves a wider consideration in model based phylogenetic inference. Repeatability of phylogenetic results was very difficult to obtain with the more parameter rich models, and analyses with identical settings often supported different topologies. Overparameterization may be the reason why the MCMC did not sample from the posterior distribution in these cases. The problem could, however, be overcome by using less parameter rich evolutionary models, and adjusting the MCMC settings. The phylogenetic results showed that two taxa, previously thought to belong in Sapotoideae, are not part of this group. Eberhardtia aurata is the sister of the two major Sapotaceae clades, Chrysophylloideae and Sapotoideae, and Neohemsleya usambarensis belongs in Chrysophylloideae. Within Sapotoideae two clades, Sideroxyleae and Sapoteae, were strongly supported. Bayesian analysis of the character history of some floral morphological traits showed that the ancestral type of flower in
Target tracking in glint noise using a MCMC particle filter
Institute of Scientific and Technical Information of China (English)
Hu Hongtao; Jing Zhongliang; Li Anping; Hu Shiqiang; Tian Hongwei
2005-01-01
In radar target tracking application, the observation noise is usually non-Gaussian, which is also referred as glint noise. The performances of conventional trackers degra de severely in the presence of glint noise. An improved particle filter, Markov chain Monte Carlo particle filter (MCMC-PF), is applied to cope with radar target tracking when the measurements are perturbed by glint noise. Tracking performance of the filter is demonstrated in the present of glint noise by computer simulation.
Stochastic Annealing for Variational Inference
Gultekin, San; Zhang, Aonan; Paisley, John
2015-01-01
We empirically evaluate a stochastic annealing strategy for Bayesian posterior optimization with variational inference. Variational inference is a deterministic approach to approximate posterior inference in Bayesian models in which a typically non-convex objective function is locally optimized over the parameters of the approximating distribution. We investigate an annealing method for optimizing this objective with the aim of finding a better local optimal solution and compare with determin...
Minsley, B.J.
2011-01-01
A meaningful interpretation of geophysical measurements requires an assessment of the space of models that are consistent with the data, rather than just a single, 'best' model which does not convey information about parameter uncertainty. For this purpose, a trans-dimensional Bayesian Markov chain Monte Carlo (MCMC) algorithm is developed for assessing frequency-domain electromagnetic (FDEM) data acquired from airborne or ground-based systems. By sampling the distribution of models that are consistent with measured data and any prior knowledge, valuable inferences can be made about parameter values such as the likely depth to an interface, the distribution of possible resistivity values as a function of depth and non-unique relationships between parameters. The trans-dimensional aspect of the algorithm allows the number of layers to be a free parameter that is controlled by the data, where models with fewer layers are inherently favoured, which provides a natural measure of parsimony and a significant degree of flexibility in parametrization. The MCMC algorithm is used with synthetic examples to illustrate how the distribution of acceptable models is affected by the choice of prior information, the system geometry and configuration and the uncertainty in the measured system elevation. An airborne FDEM data set that was acquired for the purpose of hydrogeological characterization is also studied. The results compare favourably with traditional least-squares analysis, borehole resistivity and lithology logs from the site, and also provide new information about parameter uncertainty necessary for model assessment. ?? 2011. Geophysical Journal International ?? 2011 RAS.
Labarbe, Rudi; Janssens, Guillaume; Sterpin, Edmond
2016-09-01
In proton therapy, quantification of the proton range uncertainty is important to achieve dose distribution compliance. The promising accuracy of prompt gamma imaging (PGI) suggests the development of a mathematical framework using the range measurements to convert population based estimates of uncertainties into patient specific estimates with the purpose of plan adaptation. We present here such framework using Bayesian inference. The sources of uncertainty were modeled by three parameters: setup bias m, random setup precision r and water equivalent path length bias u. The evolution of the expectation values E(m), E(r) and E(u) during the treatment was simulated. The expectation values converged towards the true simulation parameters after 5 and 10 fractions, for E(m) and E(u), respectively. E(r) settle on a constant value slightly lower than the true value after 10 fractions. In conclusion, the simulation showed that there is enough information in the frequency distribution of the range errors measured by PGI to estimate the expectation values and the confidence interval of the model parameters by Bayesian inference. The updated model parameters were used to compute patient specific lateral and local distal margins for adaptive re-planning.
Bayesian approach to rough set
Marwala, Tshilidzi
2007-01-01
This paper proposes an approach to training rough set models using Bayesian framework trained using Markov Chain Monte Carlo (MCMC) method. The prior probabilities are constructed from the prior knowledge that good rough set models have fewer rules. Markov Chain Monte Carlo sampling is conducted through sampling in the rough set granule space and Metropolis algorithm is used as an acceptance criteria. The proposed method is tested to estimate the risk of HIV given demographic data. The results obtained shows that the proposed approach is able to achieve an average accuracy of 58% with the accuracy varying up to 66%. In addition the Bayesian rough set give the probabilities of the estimated HIV status as well as the linguistic rules describing how the demographic parameters drive the risk of HIV.
Bayesian kinematic earthquake source models
Minson, S. E.; Simons, M.; Beck, J. L.; Genrich, J. F.; Galetzka, J. E.; Chowdhury, F.; Owen, S. E.; Webb, F.; Comte, D.; Glass, B.; Leiva, C.; Ortega, F. H.
2009-12-01
Most coseismic, postseismic, and interseismic slip models are based on highly regularized optimizations which yield one solution which satisfies the data given a particular set of regularizing constraints. This regularization hampers our ability to answer basic questions such as whether seismic and aseismic slip overlap or instead rupture separate portions of the fault zone. We present a Bayesian methodology for generating kinematic earthquake source models with a focus on large subduction zone earthquakes. Unlike classical optimization approaches, Bayesian techniques sample the ensemble of all acceptable models presented as an a posteriori probability density function (PDF), and thus we can explore the entire solution space to determine, for example, which model parameters are well determined and which are not, or what is the likelihood that two slip distributions overlap in space. Bayesian sampling also has the advantage that all a priori knowledge of the source process can be used to mold the a posteriori ensemble of models. Although very powerful, Bayesian methods have up to now been of limited use in geophysical modeling because they are only computationally feasible for problems with a small number of free parameters due to what is called the "curse of dimensionality." However, our methodology can successfully sample solution spaces of many hundreds of parameters, which is sufficient to produce finite fault kinematic earthquake models. Our algorithm is a modification of the tempered Markov chain Monte Carlo (tempered MCMC or TMCMC) method. In our algorithm, we sample a "tempered" a posteriori PDF using many MCMC simulations running in parallel and evolutionary computation in which models which fit the data poorly are preferentially eliminated in favor of models which better predict the data. We present results for both synthetic test problems as well as for the 2007 Mw 7.8 Tocopilla, Chile earthquake, the latter of which is constrained by InSAR, local high
Bayesian methods for measures of agreement
Broemeling, Lyle D
2009-01-01
Using WinBUGS to implement Bayesian inferences of estimation and testing hypotheses, Bayesian Methods for Measures of Agreement presents useful methods for the design and analysis of agreement studies. It focuses on agreement among the various players in the diagnostic process.The author employs a Bayesian approach to provide statistical inferences based on various models of intra- and interrater agreement. He presents many examples that illustrate the Bayesian mode of reasoning and explains elements of a Bayesian application, including prior information, experimental information, the likelihood function, posterior distribution, and predictive distribution. The appendices provide the necessary theoretical foundation to understand Bayesian methods as well as introduce the fundamentals of programming and executing the WinBUGS software.Taking a Bayesian approach to inference, this hands-on book explores numerous measures of agreement, including the Kappa coefficient, the G coefficient, and intraclass correlation...
Single channel signal component separation using Bayesian estimation
Institute of Scientific and Technical Information of China (English)
Cai Quanwei; Wei Ping; Xiao Xianci
2007-01-01
A Bayesian estimation method to separate multicomponent signals with single channel observation is presented in this paper. By using the basis function projection, the component separation becomes a problem of limited parameter estimation. Then, a Bayesian model for estimating parameters is set up. The reversible jump MCMC (Monte Carlo Markov Chain) algorithmis adopted to perform the Bayesian computation. The method can jointly estimate the parameters of each component and the component number. Simulation results demonstrate that the method has low SNR threshold and better performance.
Bayesian analysis for extreme climatic events: A review
Chu, Pao-Shin; Zhao, Xin
2011-11-01
This article reviews Bayesian analysis methods applied to extreme climatic data. We particularly focus on applications to three different problems related to extreme climatic events including detection of abrupt regime shifts, clustering tropical cyclone tracks, and statistical forecasting for seasonal tropical cyclone activity. For identifying potential change points in an extreme event count series, a hierarchical Bayesian framework involving three layers - data, parameter, and hypothesis - is formulated to demonstrate the posterior probability of the shifts throughout the time. For the data layer, a Poisson process with a gamma distributed rate is presumed. For the hypothesis layer, multiple candidate hypotheses with different change-points are considered. To calculate the posterior probability for each hypothesis and its associated parameters we developed an exact analytical formula, a Markov Chain Monte Carlo (MCMC) algorithm, and a more sophisticated reversible jump Markov Chain Monte Carlo (RJMCMC) algorithm. The algorithms are applied to several rare event series: the annual tropical cyclone or typhoon counts over the central, eastern, and western North Pacific; the annual extremely heavy rainfall event counts at Manoa, Hawaii; and the annual heat wave frequency in France. Using an Expectation-Maximization (EM) algorithm, a Bayesian clustering method built on a mixture Gaussian model is applied to objectively classify historical, spaghetti-like tropical cyclone tracks (1945-2007) over the western North Pacific and the South China Sea into eight distinct track types. A regression based approach to forecasting seasonal tropical cyclone frequency in a region is developed. Specifically, by adopting large-scale environmental conditions prior to the tropical cyclone season, a Poisson regression model is built for predicting seasonal tropical cyclone counts, and a probit regression model is alternatively developed toward a binary classification problem. With a non
Loredo, Thomas J.
2004-04-01
I describe a framework for adaptive scientific exploration based on iterating an Observation-Inference-Design cycle that allows adjustment of hypotheses and observing protocols in response to the results of observation on-the-fly, as data are gathered. The framework uses a unified Bayesian methodology for the inference and design stages: Bayesian inference to quantify what we have learned from the available data and predict future data, and Bayesian decision theory to identify which new observations would teach us the most. When the goal of the experiment is simply to make inferences, the framework identifies a computationally efficient iterative ``maximum entropy sampling'' strategy as the optimal strategy in settings where the noise statistics are independent of signal properties. Results of applying the method to two ``toy'' problems with simulated data-measuring the orbit of an extrasolar planet, and locating a hidden one-dimensional object-show the approach can significantly improve observational efficiency in settings that have well-defined nonlinear models. I conclude with a list of open issues that must be addressed to make Bayesian adaptive exploration a practical and reliable tool for optimizing scientific exploration.
Bayesian modeling using WinBUGS
Ntzoufras, Ioannis
2009-01-01
A hands-on introduction to the principles of Bayesian modeling using WinBUGS Bayesian Modeling Using WinBUGS provides an easily accessible introduction to the use of WinBUGS programming techniques in a variety of Bayesian modeling settings. The author provides an accessible treatment of the topic, offering readers a smooth introduction to the principles of Bayesian modeling with detailed guidance on the practical implementation of key principles. The book begins with a basic introduction to Bayesian inference and the WinBUGS software and goes on to cover key topics, including: Markov Chain Monte Carlo algorithms in Bayesian inference Generalized linear models Bayesian hierarchical models Predictive distribution and model checking Bayesian model and variable evaluation Computational notes and screen captures illustrate the use of both WinBUGS as well as R software to apply the discussed techniques. Exercises at the end of each chapter allow readers to test their understanding of the presented concepts and all ...
Methane emission modeling with MCMC calibration for a boreal peatland
Raivonen, Maarit; Smolander, Sampo; Susiluoto, Jouni; Backman, Leif; Li, Xuefei; Markkanen, Tiina; Kleinen, Thomas; Makela, Jarmo; Aalto, Tuula; Rinne, Janne; Brovkin, Victor; Vesala, Timo
2016-04-01
Natural wetlands, particularly peatlands of the boreal latitudes, are a significant source of methane (CH4). At the moment, the emission estimates are highly uncertain. These natural emissions respond to climatic variability, so it is necessary to understand their dynamics, in order to be able to predict how they affect the greenhouse gas balance in the future. We have developed a model of CH4 production, oxidation and transport in boreal peatlands. It simulates production of CH4 as a proportion of anaerobic peat respiration, transport of CH4 and oxygen between the soil and the atmosphere via diffusion in aerenchymatous plants and in peat pores (water and air filled), ebullition and oxidation of CH4 by methanotrophic microbes. Ultimately, we aim to add the model functionality to global climate models such as the JSBACH (Reick et al., 2013), the land surface scheme of the MPI Earth System Model. We tested the model with measured methane fluxes (using eddy covariance technique) from the Siikaneva site, an oligotrophic boreal fen in southern Finland (61°49' N, 24°11' E), over years 2005-2011. To give the model estimates regional reliability, we calibrated the model using Markov chain Monte Carlo (MCMC) technique. Although the simulations and the research are still ongoing, preliminary results from the MCMC calibration can be described as very promising considering that the model is still at relatively early stage. We will present the model and its dynamics as well as results from the MCMC calibration and the comparison with Siikaneva flux data.
Directory of Open Access Journals (Sweden)
Oliver Ratmann
2007-11-01
Full Text Available Gene duplication with subsequent interaction divergence is one of the primary driving forces in the evolution of genetic systems. Yet little is known about the precise mechanisms and the role of duplication divergence in the evolution of protein networks from the prokaryote and eukaryote domains. We developed a novel, model-based approach for Bayesian inference on biological network data that centres on approximate Bayesian computation, or likelihood-free inference. Instead of computing the intractable likelihood of the protein network topology, our method summarizes key features of the network and, based on these, uses a MCMC algorithm to approximate the posterior distribution of the model parameters. This allowed us to reliably fit a flexible mixture model that captures hallmarks of evolution by gene duplication and subfunctionalization to protein interaction network data of Helicobacter pylori and Plasmodium falciparum. The 80% credible intervals for the duplication-divergence component are [0.64, 0.98] for H. pylori and [0.87, 0.99] for P. falciparum. The remaining parameter estimates are not inconsistent with sequence data. An extensive sensitivity analysis showed that incompleteness of PIN data does not largely affect the analysis of models of protein network evolution, and that the degree sequence alone barely captures the evolutionary footprints of protein networks relative to other statistics. Our likelihood-free inference approach enables a fully Bayesian analysis of a complex and highly stochastic system that is otherwise intractable at present. Modelling the evolutionary history of PIN data, it transpires that only the simultaneous analysis of several global aspects of protein networks enables credible and consistent inference to be made from available datasets. Our results indicate that gene duplication has played a larger part in the network evolution of the eukaryote than in the prokaryote, and suggests that single gene
Bayesian methods for proteomic biomarker development
Directory of Open Access Journals (Sweden)
Belinda Hernández
2015-12-01
In this review we provide an introduction to Bayesian inference and demonstrate some of the advantages of using a Bayesian framework. We summarize how Bayesian methods have been used previously in proteomics and other areas of bioinformatics. Finally, we describe some popular and emerging Bayesian models from the statistical literature and provide a worked tutorial including code snippets to show how these methods may be applied for the evaluation of proteomic biomarkers.
Uniform Modeling of KOIs: MCMC Data Release Notes
Rowe, Jason F.; Thompson, Susan E.
2015-01-01
The Kepler Mission used a 0.95-m aperture space-based telescope to continuously observe more than 150 000 stars for 4 years. We model and analyze most KOIs listed at the Exoplanet Archive using the Kepler data. This document describes data products related to the reported planetary parameters and uncertainties for the Kepler Objects of Interest (KOIs) based on a Markov-Chain-Monte- Carlo (MCMC) analysis. Reported parameters, uncertainties and data products can be found at the NASA Exoplanet A...
Decoding X-ray observations from centres of galaxy clusters using MCMC
Lakhchaura, Kiran; Sharma, Prateek
2016-01-01
Traditionally the thermodynamic profiles (gas density, temperature, etc.) of galaxy clusters are obtained by assuming spherical symmetry and modeling projected X-ray spectra in each annulus. The outer annuli contribute to the inner ones and their contribution needs to be subtracted to obtain the temperature and density of spherical shells. The usual deprojection methods lead to propagation of errors from outside to in and do not model the covariance of parameters in different radial shells. In this paper we describe a method based on a free-form model of clusters with cluster parameters (density, temperature) given in spherical shells, which we {\\it jointly} forward fit to the X-ray data by constructing a Bayesian posterior probability distribution that we sample using the MCMC technique. By systematically marginalising over the nuisance outer shells, we estimate the inner entropy profiles of clusters and fit them to various models for a sample of Chandra X-ray observations of 17 clusters. We show that the en...
Directory of Open Access Journals (Sweden)
Moslem Moradi
2015-06-01
Full Text Available Here in, an application of a new seismic inversion algorithm in one of Iran’s oilfields is described. Stochastic (geostatistical seismic inversion, as a complementary method to deterministic inversion, is perceived as contribution combination of geostatistics and seismic inversion algorithm. This method integrates information from different data sources with different scales, as prior information in Bayesian statistics. Data integration leads to a probability density function (named as a posteriori probability that can yield a model of subsurface. The Markov Chain Monte Carlo (MCMC method is used to sample the posterior probability distribution, and the subsurface model characteristics can be extracted by analyzing a set of the samples. In this study, the theory of stochastic seismic inversion in a Bayesian framework was described and applied to infer P-impedance and porosity models. The comparison between the stochastic seismic inversion and the deterministic model based seismic inversion indicates that the stochastic seismic inversion can provide more detailed information of subsurface character. Since multiple realizations are extracted by this method, an estimation of pore volume and uncertainty in the estimation were analyzed.
Liu, Y.; Pau, G. S. H.; Finsterle, S.
2015-12-01
Parameter inversion involves inferring the model parameter values based on sparse observations of some observables. To infer the posterior probability distributions of the parameters, Markov chain Monte Carlo (MCMC) methods are typically used. However, the large number of forward simulations needed and limited computational resources limit the complexity of the hydrological model we can use in these methods. In view of this, we studied the implicit sampling (IS) method, an efficient importance sampling technique that generates samples in the high-probability region of the posterior distribution and thus reduces the number of forward simulations that we need to run. For a pilot-point inversion of a heterogeneous permeability field based on a synthetic ponded infiltration experiment simulated with TOUGH2 (a subsurface modeling code), we showed that IS with linear map provides an accurate Bayesian description of the parameterized permeability field at the pilot points with just approximately 500 forward simulations. We further studied the use of surrogate models to improve the computational efficiency of parameter inversion. We implemented two reduced-order models (ROMs) for the TOUGH2 forward model. One is based on polynomial chaos expansion (PCE), of which the coefficients are obtained using the sparse Bayesian learning technique to mitigate the "curse of dimensionality" of the PCE terms. The other model is Gaussian process regression (GPR) for which different covariance, likelihood and inference models are considered. Preliminary results indicate that ROMs constructed based on the prior parameter space perform poorly. It is thus impractical to replace this hydrological model by a ROM directly in a MCMC method. However, the IS method can work with a ROM constructed for parameters in the close vicinity of the maximum a posteriori probability (MAP) estimate. We will discuss the accuracy and computational efficiency of using ROMs in the implicit sampling procedure
Bayesian Models of Brain and Behaviour
Penny, William
2012-01-01
This paper presents a review of Bayesian models of brain and behaviour. We first review the basic principles of Bayesian inference. This is followed by descriptions of sampling and variational methods for approximate inference, and forward and backward recursions in time for inference in dynamical models. The review of behavioural models covers work in visual processing, sensory integration, sensorimotor integration, and collective decision making. The review of brain models covers a range of...
Bayesian multiple target tracking
Streit, Roy L
2013-01-01
This second edition has undergone substantial revision from the 1999 first edition, recognizing that a lot has changed in the multiple target tracking field. One of the most dramatic changes is in the widespread use of particle filters to implement nonlinear, non-Gaussian Bayesian trackers. This book views multiple target tracking as a Bayesian inference problem. Within this framework it develops the theory of single target tracking, multiple target tracking, and likelihood ratio detection and tracking. In addition to providing a detailed description of a basic particle filter that implements
Gomes, Guilherme J. C.; Vrugt, Jasper A.; Vargas, Eurípedes A.
2016-04-01
The depth to bedrock controls a myriad of processes by influencing subsurface flow paths, erosion rates, soil moisture, and water uptake by plant roots. As hillslope interiors are very difficult and costly to illuminate and access, the topography of the bedrock surface is largely unknown. This essay is concerned with the prediction of spatial patterns in the depth to bedrock (DTB) using high-resolution topographic data, numerical modeling, and Bayesian analysis. Our DTB model builds on the bottom-up control on fresh-bedrock topography hypothesis of Rempe and Dietrich (2014) and includes a mass movement and bedrock-valley morphology term to extent the usefulness and general applicability of the model. We reconcile the DTB model with field observations using Bayesian analysis with the DREAM algorithm. We investigate explicitly the benefits of using spatially distributed parameter values to account implicitly, and in a relatively simple way, for rock mass heterogeneities that are very difficult, if not impossible, to characterize adequately in the field. We illustrate our method using an artificial data set of bedrock depth observations and then evaluate our DTB model with real-world data collected at the Papagaio river basin in Rio de Janeiro, Brazil. Our results demonstrate that the DTB model predicts accurately the observed bedrock depth data. The posterior mean DTB simulation is shown to be in good agreement with the measured data. The posterior prediction uncertainty of the DTB model can be propagated forward through hydromechanical models to derive probabilistic estimates of factors of safety.
Directory of Open Access Journals (Sweden)
Thomas Christopher M
2011-07-01
Full Text Available Abstract Background IncP-1 plasmids are broad host range plasmids that have been found in clinical and environmental bacteria. They often carry genes for antibiotic resistance or catabolic pathways. The archetypal IncP-1 plasmid RK2 is a well-characterized biological system, with a fully sequenced and annotated genome and wide range of experimental measurements. Its central control operon, encoding two global regulators KorA and KorB, is a natural example of a negatively self-regulated operon. To increase our understanding of the regulation of this operon, we have constructed a dynamical mathematical model using Ordinary Differential Equations, and employed a Bayesian inference scheme, Markov Chain Monte Carlo (MCMC using the Metropolis-Hastings algorithm, as a way of integrating experimental measurements and a priori knowledge. We also compared MCMC and Metabolic Control Analysis (MCA as approaches for determining the sensitivity of model parameters. Results We identified two distinct sets of parameter values, with different biological interpretations, that fit and explain the experimental data. This allowed us to highlight the proportion of repressor protein as dimers as a key experimental measurement defining the dynamics of the system. Analysis of joint posterior distributions led to the identification of correlations between parameters for protein synthesis and partial repression by KorA or KorB dimers, indicating the necessary use of joint posteriors for correct parameter estimation. Using MCA, we demonstrated that the system is highly sensitive to the growth rate but insensitive to repressor monomerization rates in their selected value regions; the latter outcome was also confirmed by MCMC. Finally, by examining a series of different model refinements for partial repression by KorA or KorB dimers alone, we showed that a model including partial repression by KorA and KorB was most compatible with existing experimental data. Conclusions We
Bayesian analysis of heavy-tailed and long-range dependent Processes
Graves, Timothy; Watkins, Nick; Gramacy, Robert; Franzke, Christian
2014-05-01
We have used MCMC algorithms to perform a Bayesian analysis of Auto-Regressive Fractionally-Integrated Moving-Average ARFIMA(p,d,q) processes, which are capable of modelling long range dependence (e.g. Beran et al, 2013). Our principal aim is to obtain inference about the long memory parameter, d, with secondary interest in the scale and location parameters. We have developed a reversible-jump method enabling us to integrate over different model forms for the short memory component. We initially assume Gaussianity, and have tested the method on both synthetic and physical time series. We have extended the ARFIMA model by weakening the Gaussianity assumption, assuming an alpha-stable, heavy tailed, distribution for the innovations, and performing joint inference on d and alpha. We will present a study of the dependence of the posterior variance of the memory parameter d on the length of the time series considered. This will be compared with equivalent error diagnostics for other popular measures of d.
Institute of Scientific and Technical Information of China (English)
黄凯; 张晓玲
2012-01-01
贝叶斯方法是解决不确定问题的新思路,评述了以贝叶斯公式、贝叶斯统计推断及贝叶斯网络为基础的贝叶斯方法在水质评价、水环境模型参数识别、水环境管理及风险决策方面的应用.贝叶斯公式可很好地解决水质评价中监测数据、水质级别、水质标准等蕴含的不确定信息.贝叶斯统计推断耦合水环境模型为模型参数识别提供新方法,其应用难点为贝叶斯后验分布的计算.介绍了后验分布的离散贝叶斯算法、传统及改进MCMC算法.贝叶斯网络在水质评价、模型预测、水环境管理及风险决策中可同时考虑多个变量的综合作用,得到影响管理决策各因素的不确定性信息,为水环境的管理决策提供科学依据.%Bayesian methods provide new ideas for solving uncertainty problems in Water environmental system. Several Bayesian methods, such as Bayesian formula, Bayesian statistical inference and Bayesian networks, are commented on applying to water quality evaluation, parameters identification of water environment model, water environment manage ment and risk decision making. Bayesian formula can solve uncertain information of monitoring data, water quality grade and standard in water quality evaluation. Bayesian statistical inference coupling the water environmental model provides a new approach for model parameter identification. The posterior distribution calculation is the key of application of Bayes ian statistical inference. The Bayesian discrete algorithms based posterior distribution, the traditional and improved MC-MC algorithms are introduced. The application of Bayesian networks to water quality assessment, model prediction, wa ter environment management and risk decision making can take multiple variable into account simultaneously. Then the uncertain information of factors influencing on management decision making is obtained, which provides the scientific ba sis for water environmental
Parameter estimation of general regression neural network using Bayesian approach
Choir, Achmad Syahrul; Prasetyo, Rindang Bangun; Ulama, Brodjol Sutijo Suprih; Iriawan, Nur; Fitriasari, Kartika; Dokhi, Mohammad
2016-02-01
General Regression Neural Network (GRNN) has been applied in a large number of forecasting/prediction problem. Generally, there are two types of GRNN: GRNN which is based on kernel density; and Mixture Based GRNN (MBGRNN) which is based on adaptive mixture model. The main problem on GRNN modeling lays on how its parameters were estimated. In this paper, we propose Bayesian approach and its computation using Markov Chain Monte Carlo (MCMC) algorithms for estimating the MBGRNN parameters. This method is applied in simulation study. In this study, its performances are measured by using MAPE, MAE and RMSE. The application of Bayesian method to estimate MBGRNN parameters using MCMC is straightforward but it needs much iteration to achieve convergence.
MCMC with Strings and Branes: The Suburban Algorithm (Extended Version)
Heckman, Jonathan J; Vigoda, Ben
2016-01-01
Motivated by the physics of strings and branes, we develop a class of Markov chain Monte Carlo (MCMC) algorithms involving extended objects. Starting from a collection of parallel Metropolis-Hastings (MH) samplers, we place them on an auxiliary grid, and couple them together via nearest neighbor interactions. This leads to a class of "suburban samplers" (i.e., spread out Metropolis). Coupling the samplers in this way modifies the mixing rate and speed of convergence for the Markov chain, and can in many cases allow a sampler to more easily overcome free energy barriers in a target distribution. We test these general theoretical considerations by performing several numerical experiments. For suburban samplers with a fluctuating grid topology, performance is strongly correlated with the average number of neighbors. Increasing the average number of neighbors above zero initially leads to an increase in performance, though there is a critical connectivity with effective dimension d_eff ~ 1, above which "groupthin...
An MCMC Fitting Method for Triaxial Dark Matter Haloes
Corless, Virginia L
2008-01-01
Measuring the 3D distribution of mass on galaxy cluster scales is a crucial test of the LCDM model, providing constraints on the behaviour of dark matter. Recent work investigating mass distributions of individual galaxy clusters (e.g. Abell 1689) using weak and strong gravitational lensing has revealed potential inconsistencies between the predictions of structure formation models relating halo mass to concentration and those relationships as measured in massive clusters. However, such analyses employ simple spherical halo models while a growing body of work indicates that triaxial 3D halo structure is both common and important in parameter estimates. The very strong assumptions about the symmetry of the lensing halo implied with circular or perturbative elliptical NFW models are not physically motivated and lead to incorrect parameter estimates with significantly underestimated error bars. We here introduce a Markov Chain Monte Carlo (MCMC) method to fit fully triaxial models to weak lensing data that gives...
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Vedel Jensen, Eva B.
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample...
Bayesian analysis of spatial point processes in the neighbourhood of Voronoi networks
DEFF Research Database (Denmark)
Skare, Øivind; Møller, Jesper; Jensen, Eva B. Vedel
2007-01-01
A model for an inhomogeneous Poisson process with high intensity near the edges of a Voronoi tessellation in 2D or 3D is proposed. The model is analysed in a Bayesian setting with priors on nuclei of the Voronoi tessellation and other model parameters. An MCMC algorithm is constructed to sample...
Directory of Open Access Journals (Sweden)
L.S. Vismara
2007-12-01
Full Text Available A dinâmica da população de plantas daninhas pode ser representada por um sistema de equações que relaciona as densidades de sementes produzidas e de plântulas em áreas de cultivo. Os valores dos parâmetros dos modelos podem ser inferidos diretamente de experimentação e análise estatística ou extraídos da literatura. O presente trabalho teve por objetivo estimar os parâmetros do modelo de densidade populacional de plantas daninhas, a partir de um experimento conduzido na área experimental da Embrapa Milho e Sorgo, Sete Lagoas, MG, via os procedimentos de inferências clássica e Bayesiana.Dynamics of weed populations can be described as a system of equations relating the produced seed and seedling densities in crop areas. The model parameter values can be either directly inferred from experimentation and statistical analysis or obtained from the literature. The objective of this work was to estimate the weed population density model parameters based on experimental field data at Embrapa Milho e Sorgo, Sete Lagoas, MG, using classic and Bayesian inferences.
Topics in Bayesian statistics and maximum entropy
International Nuclear Information System (INIS)
Notions of Bayesian decision theory and maximum entropy methods are reviewed with particular emphasis on probabilistic inference and Bayesian modeling. The axiomatic approach is considered as the best justification of Bayesian analysis and maximum entropy principle applied in natural sciences. Particular emphasis is put on solving the inverse problem in digital image restoration and Bayesian modeling of neural networks. Further topics addressed briefly include language modeling, neutron scattering, multiuser detection and channel equalization in digital communications, genetic information, and Bayesian court decision-making. (author)
Bayesian approach to avoiding track seduction
Salmond, David J.; Everett, Nicholas O.
2002-08-01
The problem of maintaining track on a primary target in the presence spurious objects is addressed. Recursive and batch filtering approaches are developed. For the recursive approach, a Bayesian track splitting filter is derived which spawns candidate tracks if there is a possibility of measurement misassociation. The filter evaluates the probability of each candidate track being associated with the primary target. The batch filter is a Markov-chain Monte Carlo (MCMC) algorithm which fits the observed data sequence to models of target dynamics and measurement-track association. Simulation results are presented.
Oware, E. K.
2015-12-01
Modeling aquifer heterogeneities (AH) is a complex, multidimensional problem that mostly requires stochastic imaging strategies for tractability. While the traditional Bayesian Markov chain Monte Carlo (McMC) provides a powerful framework to model AH, the generic McMC is computationally prohibitive and, thus, unappealing for large-scale problems. An innovative variant of the McMC scheme that imposes priori spatial statistical constraints on model parameter updates, for improved characterization in a computationally efficient manner is proposed. The proposed algorithm (PA) is based on Markov random field (MRF) modeling, which is an image processing technique that infers the global behavior of a random field from its local properties, making the MRF approach well suited for imaging AH. MRF-based modeling leverages the equivalence of Gibbs (or Boltzmann) distribution (GD) and MRF to identify the local properties of an MRF in terms of the easily quantifiable Gibbs energy. The PA employs the two-step approach to model the lithological structure of the aquifer and the hydraulic properties within the identified lithologies simultaneously. It performs local Gibbs energy minimizations along a random path, which requires parameters of the GD (spatial statistics) to be specified. A PA that implicitly infers site-specific GD parameters within a Bayesian framework is also presented. The PA is illustrated with a synthetic binary facies aquifer with a lognormal heterogeneity simulated within each facies. GD parameters of 2.6, 1.2, -0.4, and -0.2 were estimated for the horizontal, vertical, NESW, and NWSE directions, respectively. Most of the high hydraulic conductivity zones (facies 2) were fairly resolved (see results below) with facies identification accuracy rate of 81%, 89%, and 90% for the inversions conditioned on concentration (R1), resistivity (R2), and joint (R3), respectively. The incorporation of the conditioning datasets improved on the root mean square error (RMSE
DEFF Research Database (Denmark)
Mørup, Morten; Schmidt, Mikkel N
2012-01-01
Many networks of scientific interest naturally decompose into clusters or communities with comparatively fewer external than internal links; however, current Bayesian models of network communities do not exert this intuitive notion of communities. We formulate a nonparametric Bayesian model...... for community detection consistent with an intuitive definition of communities and present a Markov chain Monte Carlo procedure for inferring the community structure. A Matlab toolbox with the proposed inference procedure is available for download. On synthetic and real networks, our model detects communities...... consistent with ground truth, and on real networks, it outperforms existing approaches in predicting missing links. This suggests that community structure is an important structural property of networks that should be explicitly modeled....
Decoding X-ray observations from centres of galaxy clusters using MCMC
Lakhchaura, Kiran; Saini, Tarun Deep; Sharma, Prateek
2016-08-01
Traditionally the thermodynamic profiles (gas density, temperature, etc.) of galaxy clusters are obtained by assuming spherical symmetry and modelling projected X-ray spectra in each annulus. The outer annuli contribute to the inner ones and their contribution needs to be subtracted to obtain the temperature and density of spherical shells. The usual deprojection methods lead to propagation of errors from outside to in and typically do not model the covariance of parameters in different radial shells. In this paper we describe a method based on a free-form model of clusters with cluster parameters (density, temperature) given in spherical shells, which we jointly forward fit to the X-ray data by constructing a Bayesian posterior probability distribution that we sample using the MCMC technique. By systematically marginalizing over the nuisance outer shells, we estimate the inner entropy profiles of clusters and fit them to various models for a sample of Chandra X-ray observations of 17 clusters. We show that the entropy profiles in almost all of our clusters are best described as cored power laws. A small subsample is found to be either consistent with a power law, or alternatively their cores are not fully resolved (smaller than, or about few kpc). We find marginal evidence for bimodality in the central values of entropy (and cooling time) corresponding to cool-core and non cool-core clusters. The minimum value of the ratio of the cooling time and the free-fall time (min[tcool/tff]; correlation is much weaker with core entropy) is anti-correlated with H α and radio luminosity. H α emitting cold gas is absent in our clusters with min(tcool/tff) ≳ 10. Our lowest core entropies are systematically and substantially lower than the values quoted by the ACCEPT sample.
Elvira, Clément; Dobigeon, Nicolas
2015-01-01
Sparse representations have proven their efficiency in solving a wide class of inverse problems encountered in signal and image processing. Conversely, enforcing the information to be spread uniformly over representation coefficients exhibits relevant properties in various applications such as digital communications. Anti-sparse regularization can be naturally expressed through an $\\ell_{\\infty}$-norm penalty. This paper derives a probabilistic formulation of such representations. A new probability distribution, referred to as the democratic prior, is first introduced. Its main properties as well as three random variate generators for this distribution are derived. Then this probability distribution is used as a prior to promote anti-sparsity in a Gaussian linear inverse problem, yielding a fully Bayesian formulation of anti-sparse coding. Two Markov chain Monte Carlo (MCMC) algorithms are proposed to generate samples according to the posterior distribution. The first one is a standard Gibbs sampler. The seco...
Bayesian spatial semi-parametric modeling of HIV variation in Kenya.
Directory of Open Access Journals (Sweden)
Oscar Ngesa
Full Text Available Spatial statistics has seen rapid application in many fields, especially epidemiology and public health. Many studies, nonetheless, make limited use of the geographical location information and also usually assume that the covariates, which are related to the response variable, have linear effects. We develop a Bayesian semi-parametric regression model for HIV prevalence data. Model estimation and inference is based on fully Bayesian approach via Markov Chain Monte Carlo (McMC. The model is applied to HIV prevalence data among men in Kenya, derived from the Kenya AIDS indicator survey, with n = 3,662. Past studies have concluded that HIV infection has a nonlinear association with age. In this study a smooth function based on penalized regression splines is used to estimate this nonlinear effect. Other covariates were assumed to have a linear effect. Spatial references to the counties were modeled as both structured and unstructured spatial effects. We observe that circumcision reduces the risk of HIV infection. The results also indicate that men in the urban areas were more likely to be infected by HIV as compared to their rural counterpart. Men with higher education had the lowest risk of HIV infection. A nonlinear relationship between HIV infection and age was established. Risk of HIV infection increases with age up to the age of 40 then declines with increase in age. Men who had STI in the last 12 months were more likely to be infected with HIV. Also men who had ever used a condom were found to have higher likelihood to be infected by HIV. A significant spatial variation of HIV infection in Kenya was also established. The study shows the practicality and flexibility of Bayesian semi-parametric regression model in analyzing epidemiological data.
A new approach for Bayesian model averaging
Institute of Scientific and Technical Information of China (English)
TIAN XiangJun; XIE ZhengHui; WANG AiHui; YANG XiaoChun
2012-01-01
Bayesian model averaging (BMA) is a recently proposed statistical method for calibrating forecast ensembles from numerical weather models.However,successful implementation of BMA requires accurate estimates of the weights and variances of the individual competing models in the ensemble.Two methods,namely the Expectation-Maximization (EM) and the Markov Chain Monte Carlo (MCMC) algorithms,are widely used for BMA model training.Both methods have their own respective strengths and weaknesses.In this paper,we first modify the BMA log-likelihood function with the aim of removing the additional limitation that requires that the BMA weights add to one,and then use a limited memory quasi-Newtonian algorithm for solving the nonlinear optimization problem,thereby formulating a new approach for BMA (referred to as BMA-BFGS).Several groups of multi-model soil moisture simulation experiments from three land surface models show that the performance of BMA-BFGS is similar to the MCMC method in terms of simulation accuracy,and that both are superior to the EM algorithm.On the other hand,the computational cost of the BMA-BFGS algorithm is substantially less than for MCMC and is almost equivalent to that for EM.
Chen, Xi; Jung, Jin-Gyoung; Shajahan-Haq, Ayesha N; Clarke, Robert; Shih, Ie-Ming; Wang, Yue; Magnani, Luca; Wang, Tian-Li; Xuan, Jianhua
2016-04-20
Chromatin immunoprecipitation with massively parallel DNA sequencing (ChIP-seq) has greatly improved the reliability with which transcription factor binding sites (TFBSs) can be identified from genome-wide profiling studies. Many computational tools are developed to detect binding events or peaks, however the robust detection of weak binding events remains a challenge for current peak calling tools. We have developed a novel Bayesian approach (ChIP-BIT) to reliably detect TFBSs and their target genes by jointly modeling binding signal intensities and binding locations of TFBSs. Specifically, a Gaussian mixture model is used to capture both binding and background signals in sample data. As a unique feature of ChIP-BIT, background signals are modeled by a local Gaussian distribution that is accurately estimated from the input data. Extensive simulation studies showed a significantly improved performance of ChIP-BIT in target gene prediction, particularly for detecting weak binding signals at gene promoter regions. We applied ChIP-BIT to find target genes from NOTCH3 and PBX1 ChIP-seq data acquired from MCF-7 breast cancer cells. TF knockdown experiments have initially validated about 30% of co-regulated target genes identified by ChIP-BIT as being differentially expressed in MCF-7 cells. Functional analysis on these genes further revealed the existence of crosstalk between Notch and Wnt signaling pathways.
Nonparametric Bayesian Classification
Coram, M A
2002-01-01
A Bayesian approach to the classification problem is proposed in which random partitions play a central role. It is argued that the partitioning approach has the capacity to take advantage of a variety of large-scale spatial structures, if they are present in the unknown regression function $f_0$. An idealized one-dimensional problem is considered in detail. The proposed nonparametric prior uses random split points to partition the unit interval into a random number of pieces. This prior is found to provide a consistent estimate of the regression function in the $\\L^p$ topology, for any $1 \\leq p < \\infty$, and for arbitrary measurable $f_0:[0,1] \\rightarrow [0,1]$. A Markov chain Monte Carlo (MCMC) implementation is outlined and analyzed. Simulation experiments are conducted to show that the proposed estimate compares favorably with a variety of conventional estimators. A striking resemblance between the posterior mean estimate and the bagged CART estimate is noted and discussed. For higher dimensions, a ...
An MCMC determination of the primordial helium abundance
Aver, Erik; Olive, Keith A.; Skillman, Evan D.
2012-04-01
Spectroscopic observations of the chemical abundances in metal-poor H II regions provide an independent method for estimating the primordial helium abundance. H II regions are described by several physical parameters such as electron density, electron temperature, and reddening, in addition to y, the ratio of helium to hydrogen. It had been customary to estimate or determine self-consistently these parameters to calculate y. Frequentist analyses of the parameter space have been shown to be successful in these parameter determinations, and Markov Chain Monte Carlo (MCMC) techniques have proven to be very efficient in sampling this parameter space. Nevertheless, accurate determination of the primordial helium abundance from observations of H II regions is constrained by both systematic and statistical uncertainties. In an attempt to better reduce the latter, and continue to better characterize the former, we apply MCMC methods to the large dataset recently compiled by Izotov, Thuan, & Stasińska (2007). To improve the reliability of the determination, a high quality dataset is needed. In pursuit of this, a variety of cuts are explored. The efficacy of the He I λ4026 emission line as a constraint on the solutions is first examined, revealing the introduction of systematic bias through its absence. As a clear measure of the quality of the physical solution, a χ2 analysis proves instrumental in the selection of data compatible with the theoretical model. Nearly two-thirds of the observations fall outside a standard 95% confidence level cut, which highlights the care necessary in selecting systems and warrants further investigation into potential deficiencies of the model or data. In addition, the method also allows us to exclude systems for which parameter estimations are statistical outliers. As a result, the final selected dataset gains in reliability and exhibits improved consistency. Regression to zero metallicity yields Yp = 0.2534 ± 0.0083, in broad agreement
Multitarget tracking with IP reversible jump MCMC-PF
Bocquel, Mélanie; Driessen, Hans; Bagchi, Arun
2013-01-01
In this paper we address the problem of tracking multiple targets based on raw measurements by means of Particle filtering. Bayesian multitarget tracking, in the Random Finite Set framework, propagates the multitarget posterior density recursively in time. Sequential Monte Carlo (SMC) approximations
DEFF Research Database (Denmark)
Ødegård, Jørgen; Meuwissen, Theo HE; Heringstad, Bjørg;
2010-01-01
" or "non-informative" with respect to genetic (co)variance components. The "non-informative" individuals are characterized by their Mendelian sampling deviations (deviance from the mid-parent mean) being completely confounded with a single residual on the underlying liability scale. For threshold models...... individual records exist on parents. Therefore, the aim of our study was to develop a new Gibbs sampling algorithm for a proper estimation of genetic (co)variance components within an animal threshold model framework. Methods In the proposed algorithm, individuals are classified as either "informative...... relationship matrix, but genetic (co)variance components are inferred from the sampled breeding values and relationships between "informative" individuals (usually parents) only. The latter is analogous to a sire-dam model (in cases with no individual records on the parents). Results When applied to simulated...
Zhang, Linlin; Guindani, Michele; Versace, Francesco; Vannucci, Marina
2014-07-15
In this paper we present a novel wavelet-based Bayesian nonparametric regression model for the analysis of functional magnetic resonance imaging (fMRI) data. Our goal is to provide a joint analytical framework that allows to detect regions of the brain which exhibit neuronal activity in response to a stimulus and, simultaneously, infer the association, or clustering, of spatially remote voxels that exhibit fMRI time series with similar characteristics. We start by modeling the data with a hemodynamic response function (HRF) with a voxel-dependent shape parameter. We detect regions of the brain activated in response to a given stimulus by using mixture priors with a spike at zero on the coefficients of the regression model. We account for the complex spatial correlation structure of the brain by using a Markov random field (MRF) prior on the parameters guiding the selection of the activated voxels, therefore capturing correlation among nearby voxels. In order to infer association of the voxel time courses, we assume correlated errors, in particular long memory, and exploit the whitening properties of discrete wavelet transforms. Furthermore, we achieve clustering of the voxels by imposing a Dirichlet process (DP) prior on the parameters of the long memory process. For inference, we use Markov Chain Monte Carlo (MCMC) sampling techniques that combine Metropolis-Hastings schemes employed in Bayesian variable selection with sampling algorithms for nonparametric DP models. We explore the performance of the proposed model on simulated data, with both block- and event-related design, and on real fMRI data. PMID:24650600
Statistical ranging of asteroid orbits: efficient MCMC and importance samplers
Virtanen, Jenni; Fedorets, Grigori; Granvik, Mikael; Oszkiewicz, Dagmara Anna; Muinonen, Karri
2015-08-01
We address the asteroid initial orbit computation problem by comparing various versions of the statistical ranging method (Virtanen et al. 2001, Muinonen et al. 2001) developed for exiguous observational data. In particular, the performance of the Markov-chain Monte Carlo (MCMC) ranging (Oszkiewicz et al. 2009) and the importance-sampler ranging methods, including the most recently developed random-walk ranging (Muinonen et al. 2015) are compared. We demonstrate the capabilities of the methods for various classes of asteroids, including near-Earth and main-belt asteroids, as well as transneptunian objects. We also study the performance of our statistical inverse methods as a function of increasing observational time interval, i.e., during the so-called phase transition. We also envision the application of the methods for space-debris orbits.Muinonen, K., Virtanen, J., Bowell, E., 2001. Collision probability for Earth-crossing asteroids using orbital ranging. Celestial Mechanics and Dynamical Astronomy 81, 93-101.Muinonen K., et al. 2015. Asteroid orbits from Gaia astrometry with random-walk statistical ranging, Planetary and Space Science, in preparation.Oszkiewicz, D., Muinonen, K., Virtanen, J., Granvik, M., 2009. Asteroid orbital ranging using Markov-chain Monte Carlo. Meteoritics and Planetary Science 44, 12, 1897-1904.Virtanen, J., Muinonen, K., Bowell, E., 2001. Statistical ranging of asteroid orbits. Icarus 154, 412-431.
Modeling Diagnostic Assessments with Bayesian Networks
Almond, Russell G.; DiBello, Louis V.; Moulder, Brad; Zapata-Rivera, Juan-Diego
2007-01-01
This paper defines Bayesian network models and examines their applications to IRT-based cognitive diagnostic modeling. These models are especially suited to building inference engines designed to be synchronous with the finer grained student models that arise in skills diagnostic assessment. Aspects of the theory and use of Bayesian network models…
Advances in Bayesian Modeling in Educational Research
Levy, Roy
2016-01-01
In this article, I provide a conceptually oriented overview of Bayesian approaches to statistical inference and contrast them with frequentist approaches that currently dominate conventional practice in educational research. The features and advantages of Bayesian approaches are illustrated with examples spanning several statistical modeling…
Institute of Scientific and Technical Information of China (English)
左哲
2015-01-01
为了研究长输管道腐蚀泄漏及蒸气云爆炸事故的演化规律，通过对埋地管道内(外)壁腐蚀失效、燃气泄漏、气体云团扩散及蒸气云爆炸等4阶段事件进行分析，构建埋地管线腐蚀泄漏火灾的贝叶斯网络模型。研究网络结构中节点变量的取值范围及离散化方法，并基于对事故统计和专家分析判断，设定节点变量的先验概率，量化节点关联的条件概率分布。在对贝叶斯网络推理策略研究的基础上，考察节点变量对推理结果的敏感性，验证模型的合理性。结果表明，长输管道腐蚀泄漏及次生灾害事件过程具有较大的不确定性，主要体现在中间事件均具有多种状态，事故演化路径概率受模型输入条件影响较大。贝叶斯网络方法用于描述事故过程中间节点事件间的依赖关系有较大的优势，可以定量衡量事故风险的不确定性。%In order to research evolutionary laws of unconfined vapor cloud explosion ( UVCE) induced by combustible gas leak in long-distance oil and gas pipelines, Bayesian networks on buried pipelines corrosion leak fire were built by analyzing event nodes on inner and outer wall corrosion failure, combustible gas leak, the gas cloud diffusion and UVCE. The state ranges and discrete methods of node variables were studied. Priori probability and conditional probability distribution of the node variables were set by analyzing on accident statistics data and expert judgements. Bayesian network inference strategy was developed, the sensitivities of each network node variable on inference results were analyzed by researching on evolution mechanism of corrosion leak fire, and the rationality of the model was verified. The results show that there are greater uncer-tainty in the process of pipeline corrosion leaks and secondary disaster. The uncertainty presents in diverse intermediate event status value and probability of accident evolutionary
Plug & Play object oriented Bayesian networks
DEFF Research Database (Denmark)
Bangsø, Olav; Flores, J.; Jensen, Finn Verner
2003-01-01
Object oriented Bayesian networks have proven themselves useful in recent years. The idea of applying an object oriented approach to Bayesian networks has extended their scope to larger domains that can be divided into autonomous but interrelated entities. Object oriented Bayesian networks have...... been shown to be quite suitable for dynamic domains as well. However, processing object oriented Bayesian networks in practice does not take advantage of their modular structure. Normally the object oriented Bayesian network is transformed into a Bayesian network and, inference is performed...... by constructing a junction tree from this network. In this paper we propose a method for translating directly from object oriented Bayesian networks to junction trees, avoiding the intermediate translation. We pursue two main purposes: firstly, to maintain the original structure organized in an instance tree...
The bugs book a practical introduction to Bayesian analysis
Lunn, David; Best, Nicky; Thomas, Andrew; Spiegelhalter, David
2012-01-01
Introduction: Probability and ParametersProbabilityProbability distributionsCalculating properties of probability distributionsMonte Carlo integrationMonte Carlo Simulations Using BUGSIntroduction to BUGSDoodleBUGSUsing BUGS to simulate from distributionsTransformations of random variablesComplex calculations using Monte CarloMultivariate Monte Carlo analysisPredictions with unknown parametersIntroduction to Bayesian InferenceBayesian learningPosterior predictive distributionsConjugate Bayesian inferenceInference about a discrete parameterCombinations of conjugate analysesBayesian and classica
Bayesian Agglomerative Clustering with Coalescents
Teh, Yee Whye; Daumé III, Hal; Roy, Daniel
2009-01-01
We introduce a new Bayesian model for hierarchical clustering based on a prior over trees called Kingman's coalescent. We develop novel greedy and sequential Monte Carlo inferences which operate in a bottom-up agglomerative fashion. We show experimentally the superiority of our algorithms over others, and demonstrate our approach in document clustering and phylolinguistics.
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for salt and pepper noise. The inference in the model is discussed...
Bayesian image restoration, using configurations
DEFF Research Database (Denmark)
Thorarinsdottir, Thordis Linda
2006-01-01
configurations are expressed in terms of the mean normal measure of the random set. These probabilities are used as prior probabilities in a Bayesian image restoration approach. Estimation of the remaining parameters in the model is outlined for the salt and pepper noise. The inference in the model is discussed...
A tutorial introduction to Bayesian models of cognitive development
Perfors, Amy; Tenenbaum, Joshua B.; Griffiths, Thomas L.; Xu, Fei
2010-01-01
We present an introduction to Bayesian inference as it is used in probabilistic models of cognitive development. Our goal is to provide an intuitive and accessible guide to the what, the how, and the why of the Bayesian approach: what sorts of problems and data the framework is most relevant for, and how and why it may be useful for developmentalists. We emphasize a qualitative understanding of Bayesian inference, but also include information about additional resources for those interested in...
Ryu, Duchwan
2010-09-28
We consider nonparametric regression analysis in a generalized linear model (GLM) framework for data with covariates that are the subject-specific random effects of longitudinal measurements. The usual assumption that the effects of the longitudinal covariate processes are linear in the GLM may be unrealistic and if this happens it can cast doubt on the inference of observed covariate effects. Allowing the regression functions to be unknown, we propose to apply Bayesian nonparametric methods including cubic smoothing splines or P-splines for the possible nonlinearity and use an additive model in this complex setting. To improve computational efficiency, we propose the use of data-augmentation schemes. The approach allows flexible covariance structures for the random effects and within-subject measurement errors of the longitudinal processes. The posterior model space is explored through a Markov chain Monte Carlo (MCMC) sampler. The proposed methods are illustrated and compared to other approaches, the "naive" approach and the regression calibration, via simulations and by an application that investigates the relationship between obesity in adulthood and childhood growth curves. © 2010, The International Biometric Society.
A Bayesian hierarchical framework for modeling brain connectivity for neuroimaging data.
Chen, Shuo; Bowman, F DuBois; Mayberg, Helen S
2016-06-01
We propose a novel Bayesian hierarchical model for brain imaging data that unifies voxel-level (the most localized unit of measure) and region-level brain connectivity analyses, and yields population-level inferences. Functional connectivity generally refers to associations in brain activity between distinct locations. The first level of our model summarizes brain connectivity for cross-region voxel pairs using a two-component mixture model consisting of connected and nonconnected voxels. We use the proportion of connected voxel pairs to define a new measure of connectivity strength, which reflects the breadth of between-region connectivity. Furthermore, we evaluate the impact of clinical covariates on connectivity between region-pairs at a population level. We perform parameter estimation using Markov chain Monte Carlo (MCMC) techniques, which can be executed quickly relative to the number of model parameters. We apply our method to resting-state functional magnetic resonance imaging (fMRI) data from 32 subjects with major depression and simulated data to demonstrate the properties of our method. PMID:26501687
Applications of Bayesian approach in modelling risk of malaria-related hospital mortality
Directory of Open Access Journals (Sweden)
Simbeye Jupiter S
2008-02-01
Full Text Available Abstract Background Malaria is a major public health problem in Malawi, however, quantifying its burden in a population is a challenge. Routine hospital data provide a proxy for measuring the incidence of severe malaria and for crudely estimating morbidity rates. Using such data, this paper proposes a method to describe trends, patterns and factors associated with in-hospital mortality attributed to the disease. Methods We develop semiparametric regression models which allow joint analysis of nonlinear effects of calendar time and continuous covariates, spatially structured variation, unstructured heterogeneity, and other fixed covariates. Modelling and inference use the fully Bayesian approach via Markov Chain Monte Carlo (MCMC simulation techniques. The methodology is applied to analyse data arising from paediatric wards in Zomba district, Malawi, between 2002 and 2003. Results and Conclusion We observe that the risk of dying in hospital is lower in the dry season, and for children who travel a distance of less than 5 kms to the hospital, but increases for those who are referred to the hospital. The results also indicate significant differences in both structured and unstructured spatial effects, and the health facility effects reveal considerable differences by type of facility or practice. More importantly, our approach shows non-linearities in the effect of metrical covariates on the probability of dying in hospital. The study emphasizes that the methodological framework used provides a useful tool for analysing the data at hand and of similar structure.
Lesaffre, Emmanuel
2012-01-01
The growth of biostatistics has been phenomenal in recent years and has been marked by considerable technical innovation in both methodology and computational practicality. One area that has experienced significant growth is Bayesian methods. The growing use of Bayesian methodology has taken place partly due to an increasing number of practitioners valuing the Bayesian paradigm as matching that of scientific discovery. In addition, computational advances have allowed for more complex models to be fitted routinely to realistic data sets. Through examples, exercises and a combination of introd
Comparison of the Bayesian and Frequentist Approach to the Statistics
Hakala, Michal
2015-01-01
The Thesis deals with introduction to Bayesian statistics and comparing Bayesian approach with frequentist approach to statistics. Bayesian statistics is modern branch of statistics which provides an alternative comprehensive theory to the frequentist approach. Bayesian concepts provides solution for problems not being solvable by frequentist theory. In the thesis are compared definitions, concepts and quality of statistical inference. The main interest is focused on a point estimation, an in...
Probability and Bayesian statistics
1987-01-01
This book contains selected and refereed contributions to the "Inter national Symposium on Probability and Bayesian Statistics" which was orga nized to celebrate the 80th birthday of Professor Bruno de Finetti at his birthplace Innsbruck in Austria. Since Professor de Finetti died in 1985 the symposium was dedicated to the memory of Bruno de Finetti and took place at Igls near Innsbruck from 23 to 26 September 1986. Some of the pa pers are published especially by the relationship to Bruno de Finetti's scientific work. The evolution of stochastics shows growing importance of probability as coherent assessment of numerical values as degrees of believe in certain events. This is the basis for Bayesian inference in the sense of modern statistics. The contributions in this volume cover a broad spectrum ranging from foundations of probability across psychological aspects of formulating sub jective probability statements, abstract measure theoretical considerations, contributions to theoretical statistics an...
Draper, D.
2001-01-01
© 2012 Springer Science+Business Media, LLC. All rights reserved. Article Outline: Glossary Definition of the Subject and Introduction The Bayesian Statistical Paradigm Three Examples Comparison with the Frequentist Statistical Paradigm Future Directions Bibliography
Bayesian Analysis of Multiple Populations I: Statistical and Computational Methods
Stenning, D C; Robinson, E; van Dyk, D A; von Hippel, T; Sarajedini, A; Stein, N
2016-01-01
We develop a Bayesian model for globular clusters composed of multiple stellar populations, extending earlier statistical models for open clusters composed of simple (single) stellar populations (vanDyk et al. 2009, Stein et al. 2013). Specifically, we model globular clusters with two populations that differ in helium abundance. Our model assumes a hierarchical structuring of the parameters in which physical properties---age, metallicity, helium abundance, distance, absorption, and initial mass---are common to (i) the cluster as a whole or to (ii) individual populations within a cluster, or are unique to (iii) individual stars. An adaptive Markov chain Monte Carlo (MCMC) algorithm is devised for model fitting that greatly improves convergence relative to its precursor non-adaptive MCMC algorithm. Our model and computational tools are incorporated into an open-source software suite known as BASE-9. We use numerical studies to demonstrate that our method can recover parameters of two-population clusters, and al...
BASE-9: Bayesian Analysis for Stellar Evolution with nine variables
Robinson, Elliot; von Hippel, Ted; Stein, Nathan; Stenning, David; Wagner-Kaiser, Rachel; Si, Shijing; van Dyk, David
2016-08-01
The BASE-9 (Bayesian Analysis for Stellar Evolution with nine variables) software suite recovers star cluster and stellar parameters from photometry and is useful for analyzing single-age, single-metallicity star clusters, binaries, or single stars, and for simulating such systems. BASE-9 uses a Markov chain Monte Carlo (MCMC) technique along with brute force numerical integration to estimate the posterior probability distribution for the age, metallicity, helium abundance, distance modulus, line-of-sight absorption, and parameters of the initial-final mass relation (IFMR) for a cluster, and for the primary mass, secondary mass (if a binary), and cluster probability for every potential cluster member. The MCMC technique is used for the cluster quantities (the first six items listed above) and numerical integration is used for the stellar quantities (the last three items in the above list).
A Bayesian approach to the modelling of alpha Cen A
Bazot, M; Christensen-Dalsgaard, J
2012-01-01
Determining the physical characteristics of a star is an inverse problem consisting in estimating the parameters of models for the stellar structure and evolution, knowing certain observable quantities. We use a Bayesian approach to solve this problem for alpha Cen A, which allows us to incorporate prior information on the parameters to be estimated, in order to better constrain the problem. Our strategy is based on the use of a Markov Chain Monte Carlo (MCMC) algorithm to estimate the posterior probability densities of the stellar parameters: mass, age, initial chemical composition,... We use the stellar evolutionary code ASTEC to model the star. To constrain this model both seismic and non-seismic observations were considered. Several different strategies were tested to fit these values, either using two or five free parameters in ASTEC. We are thus able to show evidence that MCMC methods become efficient with respect to more classical grid-based strategies when the number of parameters increases. The resul...
Wang, Yuan; Zhou, Xiaobo; Wang, Honghui; Li, King; Yao, Lixiu; Wong, Stephen T C
2008-07-01
Mass spectrometry (MS) has shown great potential in detecting disease-related biomarkers for early diagnosis of stroke. To discover potential biomarkers from large volume of noisy MS data, peak detection must be performed first. This article proposes a novel automatic peak detection method for the stroke MS data. In this method, a mixture model is proposed to model the spectrum. Bayesian approach is used to estimate parameters of the mixture model, and Markov chain Monte Carlo method is employed to perform Bayesian inference. By introducing a reversible jump method, we can automatically estimate the number of peaks in the model. Instead of separating peak detection into substeps, the proposed peak detection method can do baseline correction, denoising and peak identification simultaneously. Therefore, it minimizes the risk of introducing irrecoverable bias and errors from each substep. In addition, this peak detection method does not require a manually selected denoising threshold. Experimental results on both simulated dataset and stroke MS dataset show that the proposed peak detection method not only has the ability to detect small signal-to-noise ratio peaks, but also greatly reduces false detection rate while maintaining the same sensitivity. PMID:18586741
Symbolic Probabilistic Inference with Continuous Variables
Chang, Kuo-Chu; Fung, Robert
2013-01-01
Research on Symbolic Probabilistic Inference (SPI) [2, 3] has provided an algorithm for resolving general queries in Bayesian networks. SPI applies the concept of dependency directed backward search to probabilistic inference, and is incremental with respect to both queries and observations. Unlike traditional Bayesian network inferencing algorithms, SPI algorithm is goal directed, performing only those calculations that are required to respond to queries. Research to date on SPI applies to B...
International Nuclear Information System (INIS)
Estimating uncertainties on doses from bioassay data is of interest in epidemiology studies that estimate cancer risk from occupational exposures to radionuclides. Bayesian methods provide a logical framework to calculate these uncertainties. However, occupational exposures often consist of many intakes, and this can make the Bayesian calculation computationally intractable. This paper describes a novel strategy for increasing the computational speed of the calculation by simplifying the intake pattern to a single composite intake, termed as complex intake regime (CIR). In order to assess whether this approximation is accurate and fast enough for practical purposes, the method is implemented by the Weighted Likelihood Monte Carlo Sampling (WeLMoS) method and evaluated by comparing its performance with a Markov Chain Monte Carlo (MCMC) method. The MCMC method gives the full solution (all intakes are independent), but is very computationally intensive to apply routinely. Posterior distributions of model parameter values, intakes and doses are calculated for a representative sample of plutonium workers from the United Kingdom Atomic Energy cohort using the WeLMoS method with the CIR and the MCMC method. The distributions are in good agreement: posterior means and Q 0.025 and Q 0.975 quantiles are typically within 20 %. Furthermore, the WeLMoS method using the CIR converges quickly: a typical case history takes around 10-20 min on a fast workstation, whereas the MCMC method took around 12-hr. The advantages and disadvantages of the method are discussed. (authors)
A Bayesian Reflection on Surfaces
Directory of Open Access Journals (Sweden)
David R. Wolf
1999-10-01
Full Text Available Abstract: The topic of this paper is a novel Bayesian continuous-basis field representation and inference framework. Within this paper several problems are solved: The maximally informative inference of continuous-basis fields, that is where the basis for the field is itself a continuous object and not representable in a finite manner; the tradeoff between accuracy of representation in terms of information learned, and memory or storage capacity in bits; the approximation of probability distributions so that a maximal amount of information about the object being inferred is preserved; an information theoretic justification for multigrid methodology. The maximally informative field inference framework is described in full generality and denoted the Generalized Kalman Filter. The Generalized Kalman Filter allows the update of field knowledge from previous knowledge at any scale, and new data, to new knowledge at any other scale. An application example instance, the inference of continuous surfaces from measurements (for example, camera image data, is presented.
A Bayesian approach to combining animal abundance and demographic data
Directory of Open Access Journals (Sweden)
Brooks, S. P.
2004-06-01
Full Text Available In studies of wild animals, one frequently encounters both count and mark-recapture-recovery data. Here, we consider an integrated Bayesian analysis of ring¿recovery and count data using a state-space model. We then impose a Leslie-matrix-based model on the true population counts describing the natural birth-death and age transition processes. We focus upon the analysis of both count and recovery data collected on British lapwings (Vanellus vanellus combined with records of the number of frost days each winter. We demonstrate how the combined analysis of these data provides a more robust inferential framework and discuss how the Bayesian approach using MCMC allows us to remove the potentially restrictive normality assumptions commonly assumed for analyses of this sort. It is shown how WinBUGS may be used to perform the Bayesian analysis. WinBUGS code is provided and its performance is critically discussed.
MCMC-ODPR: Primer design optimization using Markov Chain Monte Carlo sampling
Directory of Open Access Journals (Sweden)
Kitchen James L
2012-11-01
Full Text Available Abstract Background Next generation sequencing technologies often require numerous primer designs that require good target coverage that can be financially costly. We aimed to develop a system that would implement primer reuse to design degenerate primers that could be designed around SNPs, thus find the fewest necessary primers and the lowest cost whilst maintaining an acceptable coverage and provide a cost effective solution. We have implemented Metropolis-Hastings Markov Chain Monte Carlo for optimizing primer reuse. We call it the Markov Chain Monte Carlo Optimized Degenerate Primer Reuse (MCMC-ODPR algorithm. Results After repeating the program 1020 times to assess the variance, an average of 17.14% fewer primers were found to be necessary using MCMC-ODPR for an equivalent coverage without implementing primer reuse. The algorithm was able to reuse primers up to five times. We compared MCMC-ODPR with single sequence primer design programs Primer3 and Primer-BLAST and achieved a lower primer cost per amplicon base covered of 0.21 and 0.19 and 0.18 primer nucleotides on three separate gene sequences, respectively. With multiple sequences, MCMC-ODPR achieved a lower cost per base covered of 0.19 than programs BatchPrimer3 and PAMPS, which achieved 0.25 and 0.64 primer nucleotides, respectively. Conclusions MCMC-ODPR is a useful tool for designing primers at various melting temperatures at good target coverage. By combining degeneracy with optimal primer reuse the user may increase coverage of sequences amplified by the designed primers at significantly lower costs. Our analyses showed that overall MCMC-ODPR outperformed the other primer-design programs in our study in terms of cost per covered base.
Multivariate Student -t Regression Models : Pitfalls and Inference
Fernández, C.; Steel, M.F.J.
1997-01-01
We consider likelihood-based inference from multivariate regression models with independent Student-t errors. Some very intruiging pitfalls of both Bayesian and classical methods on the basis of point observations are uncovered. Bayesian inference may be precluded as a consequence of the coarse natu
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2013-08-01
Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics), and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC), another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation, can
Technical Note: Approximate Bayesian parameterization of a process-based tropical forest model
Hartig, F.; Dislich, C.; Wiegand, T.; Huth, A.
2014-02-01
Inverse parameter estimation of process-based models is a long-standing problem in many scientific disciplines. A key question for inverse parameter estimation is how to define the metric that quantifies how well model predictions fit to the data. This metric can be expressed by general cost or objective functions, but statistical inversion methods require a particular metric, the probability of observing the data given the model parameters, known as the likelihood. For technical and computational reasons, likelihoods for process-based stochastic models are usually based on general assumptions about variability in the observed data, and not on the stochasticity generated by the model. Only in recent years have new methods become available that allow the generation of likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional Markov chain Monte Carlo (MCMC) sampler, performs well in retrieving known parameter values from virtual inventory data generated by the forest model. We analyze the results of the parameter estimation, examine its sensitivity to the choice and aggregation of model outputs and observed data (summary statistics), and demonstrate the application of this method by fitting the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss how this approach differs from approximate Bayesian computation (ABC), another method commonly used to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter estimation
Technical Note: Approximate Bayesian parameterization of a complex tropical forest model
Directory of Open Access Journals (Sweden)
F. Hartig
2013-08-01
Full Text Available Inverse parameter estimation of process-based models is a long-standing problem in ecology and evolution. A key problem of inverse parameter estimation is to define a metric that quantifies how well model predictions fit to the data. Such a metric can be expressed by general cost or objective functions, but statistical inversion approaches are based on a particular metric, the probability of observing the data given the model, known as the likelihood. Deriving likelihoods for dynamic models requires making assumptions about the probability for observations to deviate from mean model predictions. For technical reasons, these assumptions are usually derived without explicit consideration of the processes in the simulation. Only in recent years have new methods become available that allow generating likelihoods directly from stochastic simulations. Previous applications of these approximate Bayesian methods have concentrated on relatively simple models. Here, we report on the application of a simulation-based likelihood approximation for FORMIND, a parameter-rich individual-based model of tropical forest dynamics. We show that approximate Bayesian inference, based on a parametric likelihood approximation placed in a conventional MCMC, performs well in retrieving known parameter values from virtual field data generated by the forest model. We analyze the results of the parameter estimation, examine the sensitivity towards the choice and aggregation of model outputs and observed data (summary statistics, and show results from using this method to fit the FORMIND model to field data from an Ecuadorian tropical forest. Finally, we discuss differences of this approach to Approximate Bayesian Computing (ABC, another commonly used method to generate simulation-based likelihood approximations. Our results demonstrate that simulation-based inference, which offers considerable conceptual advantages over more traditional methods for inverse parameter
An MCMC Study of General Squark Flavour Mixing in the MSSM
Energy Technology Data Exchange (ETDEWEB)
Herrmann, Björn [Annecy, LAPTH; De Causmaecker, Karen [Intl. Solvay Inst., Brussels; Fuks, Benjamin [UPMC, Paris (main); Mahmoudi, Farvah [Lyon, Ecole Normale Superieure; O' Leary, Ben [Wurzburg U.; Porod, Werner [Wurzburg U.; Sekmen, Sezen [Kyungpook Natl. U.; Strobbe, Nadja [Fermilab
2015-10-05
We present an extensive study of non-minimally flavour violating (NMFV) terms in the Lagrangian of the Minimal Supersymmetric Standard Model (MSSM). We impose a variety of theoretical and experimental constraints and perform a detailed scan of the parameter space by means of a Markov Chain Monte-Carlo (MCMC) setup. This represents the first study of several non-zero flavour-violating elements within the MSSM. We present the results of the MCMC scan with a special focus on the flavour-violating parameters. Based on these results, we define benchmark scenarios for future studies of NMFV effects at the LHC.
Analysis of KATRIN data using Bayesian inference
DEFF Research Database (Denmark)
Riis, Anna Sejersen; Hannestad, Steen; Weinheimer, Christian
2011-01-01
and can in some sense be called model-independent as compared to cosmology and neutrino-less double beta decay. However by model independent we only mean in case of the minimal extension of the standard model. One should therefore also analyse the data for non-standard couplings to e.g. righthanded...
A Bayesian Approach for Parameter Estimation and Prediction using a Computationally Intensive Model
Higdon, Dave; Schunck, Nicolas; Sarich, Jason; Wild, Stefan M
2014-01-01
Bayesian methods have been very successful in quantifying uncertainty in physics-based problems in parameter estimation and prediction. In these cases, physical measurements y are modeled as the best fit of a physics-based model $\\eta(\\theta)$ where $\\theta$ denotes the uncertain, best input setting. Hence the statistical model is of the form $y = \\eta(\\theta) + \\epsilon$, where $\\epsilon$ accounts for measurement, and possibly other error sources. When non-linearity is present in $\\eta(\\cdot)$, the resulting posterior distribution for the unknown parameters in the Bayesian formulation is typically complex and non-standard, requiring computationally demanding computational approaches such as Markov chain Monte Carlo (MCMC) to produce multivariate draws from the posterior. While quite generally applicable, MCMC requires thousands, or even millions of evaluations of the physics model $\\eta(\\cdot)$. This is problematic if the model takes hours or days to evaluate. To overcome this computational bottleneck, we pr...
Bayesian Analysis for Exponential Random Graph Models Using the Adaptive Exchange Sampler*
Jin, Ick Hoon; Yuan, Ying; Liang, Faming
2013-01-01
Exponential random graph models have been widely used in social network analysis. However, these models are extremely difficult to handle from a statistical viewpoint, because of the intractable normalizing constant and model degeneracy. In this paper, we consider a fully Bayesian analysis for exponential random graph models using the adaptive exchange sampler, which solves the intractable normalizing constant and model degeneracy issues encountered in Markov chain Monte Carlo (MCMC) simulati...
Frequentism and Bayesianism: A Python-driven Primer
VanderPlas, Jake
2014-01-01
This paper presents a brief, semi-technical comparison of the essential features of the frequentist and Bayesian approaches to statistical inference, with several illustrative examples implemented in Python. The differences between frequentism and Bayesianism fundamentally stem from differing definitions of probability, a philosophical divide which leads to distinct approaches to the solution of statistical problems as well as contrasting ways of asking and answering questions about unknown parameters. After an example-driven discussion of these differences, we briefly compare several leading Python statistical packages which implement frequentist inference using classical methods and Bayesian inference using Markov Chain Monte Carlo.
Variational bayesian method of estimating variance components.
Arakawa, Aisaku; Taniguchi, Masaaki; Hayashi, Takeshi; Mikawa, Satoshi
2016-07-01
We developed a Bayesian analysis approach by using a variational inference method, a so-called variational Bayesian method, to determine the posterior distributions of variance components. This variational Bayesian method and an alternative Bayesian method using Gibbs sampling were compared in estimating genetic and residual variance components from both simulated data and publically available real pig data. In the simulated data set, we observed strong bias toward overestimation of genetic variance for the variational Bayesian method in the case of low heritability and low population size, and less bias was detected with larger population sizes in both methods examined. The differences in the estimates of variance components between the variational Bayesian and the Gibbs sampling were not found in the real pig data. However, the posterior distributions of the variance components obtained with the variational Bayesian method had shorter tails than those obtained with the Gibbs sampling. Consequently, the posterior standard deviations of the genetic and residual variances of the variational Bayesian method were lower than those of the method using Gibbs sampling. The computing time required was much shorter with the variational Bayesian method than with the method using Gibbs sampling.
Dynamic Bayesian Combination of Multiple Imperfect Classifiers
Simpson, Edwin; Psorakis, Ioannis; Smith, Arfon
2012-01-01
Classifier combination methods need to make best use of the outputs of multiple, imperfect classifiers to enable higher accuracy classifications. In many situations, such as when human decisions need to be combined, the base decisions can vary enormously in reliability. A Bayesian approach to such uncertain combination allows us to infer the differences in performance between individuals and to incorporate any available prior knowledge about their abilities when training data is sparse. In this paper we explore Bayesian classifier combination, using the computationally efficient framework of variational Bayesian inference. We apply the approach to real data from a large citizen science project, Galaxy Zoo Supernovae, and show that our method far outperforms other established approaches to imperfect decision combination. We go on to analyse the putative community structure of the decision makers, based on their inferred decision making strategies, and show that natural groupings are formed. Finally we present ...
Bayesian analysis of contingency tables
Gómez Villegas, Miguel A.; González Pérez, Beatriz
2005-01-01
The display of the data by means of contingency tables is used in different approaches to statistical inference, for example, to broach the test of homogeneity of independent multinomial distributions. We develop a Bayesian procedure to test simple null hypotheses versus bilateral alternatives in contingency tables. Given independent samples of two binomial distributions and taking a mixed prior distribution, we calculate the posterior probability that the proportion of successes in the first...